More on Six Sigma and Data Quality

In two previous blogs, I wrote about intersections between Six Sigma, internal control and data quality. By way of background information, my department performs compliance functions, where we monitor information delivered by third parties and created through internal operations.

For example, we receive property-address information and derive new information, such as geospatial position, through “geocoding” processes. Since we monitor data for compliance, our process inputs are hundreds of data elements, and our output is a systematic, timely determination of whether data quality is acceptable or needs improvement. Anyone who works in the “governance, risk and compliance” arena will appreciate the unique challenges of designing and optimizing monitoring processes. Characterizing measurement error is a constant challenge.

In a future blog, I will share our experiences with assessing measurement precision (Gage R&R) and understanding stability and bias. Here I want to focus on an immediate challenge and request insight on best practices. Our compliance monitoring is very new. We are designing processes that will move compliance to the front end of our value chain, so that we measure data quality at the “point of truth” and reconcile these data to the points of consumption used by risk, control and compliance functions. Our focus right now is on designing, piloting and calibrating our compliance monitoring.

Our approach is highlighted in my earlier blog on Six Sigma and data quality. We are beginning to produce expected-versus-actual defect rate observations for our critical data elements. These statistics are generating lots of interest and questions about how we define an expected defect rate (voice of the customer) and determine the importance of a higher-than-expected defect rate (the focus of my writing). Two questions perennially come up:
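To make the expected-versus-actual comparison concrete, here is a minimal sketch in Python. The numbers and the element are hypothetical, and the one-sided normal approximation to the binomial is my own choice of test, not necessarily what any given shop uses:

```python
import math

def defect_rate_variance(defects, n, expected_rate):
    """Compare an observed defect rate for one critical data element
    against its expected rate, using a one-sided normal approximation
    to the binomial (reasonable when n * expected_rate is not tiny)."""
    actual_rate = defects / n
    # Standard error of the proportion under the expected rate.
    se = math.sqrt(expected_rate * (1 - expected_rate) / n)
    z = (actual_rate - expected_rate) / se
    # One-sided p-value: chance of a defect rate this high (or higher)
    # if the true rate really equals the expected rate.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return actual_rate, z, p_value

# Hypothetical element: 34 defects in 1,000,000 geocoded records,
# against an expected defect rate of 20 per million.
actual, z, p = defect_rate_variance(34, 1_000_000, 20e-6)
```

For small samples, or when n * expected_rate is only a handful of defects, an exact binomial test would be the safer choice than this approximation.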

First, does a higher-than-expected defect rate indicate a high, medium or low level of risk? Some critical data elements are more important than others and more sensitive to variance. Second, how do we come up with a risk rating?

We are now beginning to explore these questions. One approach would be to quantify the financial risk associated with an unfavorable variance in data quality, but our enterprise risk management processes have not matured to the point where a reliable methodology is available to us. A broader perspective would consider the reputational risk associated with an unfavorable variance in data quality. Other than benchmarking internal data quality against our industry, judgment prevails, because methodological scoring for reputational risk is not feasible. In practice, risk assessment frameworks seem to offer broad criteria or rules of thumb from which we can draw conclusions about risk exposure.
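As an illustration of the rule-of-thumb approach, the fragment below maps a defect-rate variance to a high/medium/low rating. The thresholds and the per-element sensitivity weight are hypothetical placeholders for whatever criteria a risk assessment framework would actually supply:

```python
def risk_rating(actual_rate, expected_rate, sensitivity=1.0):
    """Map a defect-rate variance to a high/medium/low risk rating.
    Thresholds and the sensitivity weight are hypothetical rules of
    thumb, of the kind a risk framework might provide per element."""
    if expected_rate == 0:
        return "high" if actual_rate > 0 else "low"
    ratio = (actual_rate / expected_rate) * sensitivity
    if ratio >= 2.0:      # defects at least double the expectation
        return "high"
    if ratio >= 1.2:      # modestly above expectation
        return "medium"
    return "low"

# A geocoding element assumed twice as sensitive to variance as baseline.
rating = risk_rating(actual_rate=30e-6, expected_rate=20e-6, sensitivity=2.0)
```

The design choice worth debating is the sensitivity weight: it lets more critical elements cross the “high” threshold at smaller variances, which matches the observation that some elements are more sensitive to variance than others.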

Another challenge is connecting these criteria to defect-rate observations. We are exploring alternative tools, including FMEA. Your insight about the following will be appreciated:

  • Are there best practices for assessing risk or cost of poor data quality? Are these best practices applicable to measurement observations?
  • Are there lessons to be learned from manufacturing settings (e.g., techniques to estimate risk of product liability or cost of poor quality from raw-material defect rate observations)?
  • How are companies using FMEA to assess process risks, based on process metrics? After all, data quality is a type of process metric.
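One way FMEA machinery could consume defect-rate observations is to translate the observed rate into an occurrence score and compute a classic risk priority number (severity × occurrence × detection). The occurrence banding below is hypothetical; real FMEA occurrence tables tie the scores to process capability:

```python
def occurrence_score(defect_rate):
    """Translate an observed defect rate into a 1-10 FMEA occurrence
    score. The band edges here are hypothetical; published occurrence
    tables tie scores to capability or defects per opportunity."""
    bands = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 5e-2, 1e-1, 2e-1]
    return 1 + sum(defect_rate > b for b in bands)

def risk_priority_number(severity, defect_rate, detection):
    """Classic FMEA RPN: severity x occurrence x detection, each 1-10."""
    return severity * occurrence_score(defect_rate) * detection

# Hypothetical: a high-severity address element (8/10) with an observed
# defect rate of 34 per million and moderate detection difficulty (3/10).
rpn = risk_priority_number(8, 34e-6, 3)
```

Severity and detection still require judgment, but deriving occurrence directly from the defect-rate observations is the piece that connects FMEA to the monitoring metrics discussed above.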

Your comments are encouraged, or please email me.
