Companion Article

This is one of two articles by David Wetzel that explore the value of developing a data management plan as the initial step in the Measure phase of the Six Sigma DMAIC methodology. The other article is “Data Management Plans Can Improve Collection/Validation.”

Long project cycle times, frequently cited as an impediment to Six Sigma success, can be shortened through the use of data management plans. When employed as the first step in the Measure phase of the DMAIC process, the plan provides Six Sigma project teams with a highly structured planning tool. It not only helps teams ask all the right questions in the Measure phase, but also gives insight into potential analysis and project validation activities that will follow in later phases. A data management plan (DMP) is a pivotal document for project success, second only to a good scope charter.

DMP – Pivotal Measure Document

The data management plan should be developed either during a training week or in a project team meeting. It typically takes a team about two to four hours to complete. The DMP can reduce project cycle times significantly because a plan of attack – which navigates a phase in which many project teams get lost – has been quickly generated and agreed upon. The plan helps the team to characterize, balance, validate, graphically display and stratify its main project metric. As shown in Table 1, the DMP has four major sections: data description, data collection and validation, graphical display and project validation, and analysis. 

Table 1

1. Data Description

| Metric (Unit of Measure) | Operational Definition (Verbal or Formula) | Family of Measure | Data Type |
|---|---|---|---|
| Setup time (minutes) | Time elapsed between last good part to acceptance of new part | | |
| Raw materials dimensions (indices) | Process capability Cp/Cpk per print | | |
| Scrap (units) | Non-conforming parts | | |

2. Data Collection and Validation

| Data Source or Location | Collector | Sampling Plan | MSA Plan |
|---|---|---|---|
| Log | Operator | 1/Set-up | Paired t-test and time study |
| Tag | Receiving inspector | 1/Lot | Gage R&R |
| Routing | Operator | 1/Batch | Paired t-test |

3. Graphical Display and Project Validation

| Graphical Display of Central Tendency and Variation | Graphical Display of Time Ordered Data | Main Desirability | Comparative Experiment |
|---|---|---|---|
| Histogram | IMR | Decrease mean | Median test and f-test |
| Histogram | Xbar and sigma | More capable | Gage R&R and 2 sample t-test |
| Bar chart | p chart | Decrease mean | Paired t-test and attribute MSA |

4. Analysis

| Stratification Factors (ANOVAs, et al) |
|---|
| By machine and operator |
| By critical dimension |
| By month, center and machine |
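For teams that keep their plan in a script or spreadsheet export, the four sections of Table 1 could be captured as one record per metric. The sketch below is only an illustration: the class and field names are hypothetical, and the family of measure and data type are left unset because those cells are blank in Table 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataDescription:
    metric: str                               # e.g. "Setup time"
    unit_of_measure: str                      # e.g. "minutes"
    operational_definition: str               # verbal description or formula
    family_of_measure: Optional[str] = None   # blank in Table 1
    data_type: Optional[str] = None           # blank in Table 1


@dataclass
class CollectionAndValidation:
    source: str
    collector: str
    sampling_plan: str
    msa_plan: str


@dataclass
class DisplayAndProjectValidation:
    central_tendency_chart: str
    time_ordered_chart: str
    main_desirability: str
    comparative_experiment: str


@dataclass
class DmpEntry:
    """One metric's row across all four DMP sections (hypothetical layout)."""
    description: DataDescription
    collection: CollectionAndValidation
    display: DisplayAndProjectValidation
    stratification_factors: List[str] = field(default_factory=list)

    def open_items(self) -> List[str]:
        """Return the data-description fields the team has not filled in yet."""
        missing = []
        if self.description.family_of_measure is None:
            missing.append("family_of_measure")
        if self.description.data_type is None:
            missing.append("data_type")
        return missing


# First row of Table 1, copied as given.
setup_time = DmpEntry(
    description=DataDescription(
        metric="Setup time",
        unit_of_measure="minutes",
        operational_definition=(
            "Time elapsed between last good part to acceptance of new part"
        ),
    ),
    collection=CollectionAndValidation(
        source="Log",
        collector="Operator",
        sampling_plan="1/Set-up",
        msa_plan="Paired t-test and time study",
    ),
    display=DisplayAndProjectValidation(
        central_tendency_chart="Histogram",
        time_ordered_chart="IMR",
        main_desirability="Decrease mean",
        comparative_experiment="Median test and f-test",
    ),
    stratification_factors=["machine", "operator"],
)

print(setup_time.open_items())
```

Listing the open items gives the team a quick view of which questions in the plan still need an answer before data collection starts.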

Even before one can discuss collecting and validating data, understanding the data description – or the data inputs and outputs – is critical for a successful data management plan. 

Data Description

Fully defining process parameters (input) or product characteristics (output) is the beginning of data-based decision-making. Every metric has three components which are required to fully describe it. Identification of these three components should be: conducted with the entire core and extended team so any confusion is eliminated; reviewed by the project sponsor, financial analyst and mentor; and reviewed with any other relevant stakeholders. Completely describing the main project metric communicates to all stakeholders exactly what metric the project is trying to improve and what balancing or countermeasures are being employed.

The three components required to fully describe and communicate main project metrics are the:

  1. Metric itself: The attribute, variable or rate that is changing. This is one word (e.g., time, rate, temperature, weight, concentration, speed, pounds, index).
  2. Descriptor: The information needed to specifically identify the metric so it is not confused with others. The descriptor should be kept simple. Imagine describing the main project metric to your parents, a child or a complete stranger.
  3. Unit of measure: This is the count or scale (percent, count, units, dollars, centimeters, rpm). The unit of measure should be enclosed in parentheses after the metric, as in “Setup time (minutes)” in Table 1. It helps identify the correct “family of measure” later on.

This section of the DMP requires the team to ask three key questions designed to fully describe the metrics: operational definitions, family of measure and data type. There are many ramifications of, and benefits to, answering these questions thoughtfully and accurately. The answers give clues to further activities as far down the project road as the Control phase.

Operational Definitions

The first question, operational definitions, can be handled in one of two ways: 

  1. As a description in words, using a clear, precise description of the factor being measured.
  2. As a mathematical formula, an operational calculation or “an expression” in symbolic logic.

One way to facilitate this effort, and to prove that clear operational definitions are important, is to have everyone anonymously write their interpretation of the main project metric: How is the metric calculated and explained? Responses often surprise team members, who instantly note the inconsistency, creativity and misunderstandings that exist among even those closest to the project metric and its effects. The benefit of clear operational definitions is explored further in measurement system analysis, where poor operational definitions can, at best, confuse and, at worst, completely kill a project.
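As an illustration of the second form, the verbal definition of setup time in Table 1 (“time elapsed between last good part to acceptance of new part”) could be written as a calculation. This is a minimal sketch: the function name, the argument names and the timestamps are hypothetical, and a real team would still need to agree on what counts as “last good part” and “acceptance.”

```python
from datetime import datetime


def setup_time_minutes(last_good_part: datetime, new_part_accepted: datetime) -> float:
    """Setup time = acceptance of new part minus last good part, in minutes."""
    if new_part_accepted < last_good_part:
        raise ValueError("acceptance cannot precede the last good part")
    return (new_part_accepted - last_good_part).total_seconds() / 60.0


# Hypothetical log entries for one changeover.
last_good = datetime(2024, 1, 15, 8, 5)
accepted = datetime(2024, 1, 15, 8, 47)
print(setup_time_minutes(last_good, accepted))  # 42.0
```

Writing the definition this way leaves little room for the differing interpretations that the anonymous exercise tends to expose.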

Family of Measure

The second question to be considered at this point in the DMP is family of measure (FoM), a much underutilized concept. There are four basic families of measure: productivity, financial, quality and timeliness. Simpler versions are often used (e.g., quality, cost, speed), but the fuller version, containing all four families, is the most beneficial. FoM also includes hybrids, which are combinations of any two families. The hybrids are rates, readily identifiable as percentages, unit-less measures or numerator/denominator ratios. Each FoM answers a different question and has a different unit of measure. It can be argued that differentiating between productivity and quality is what Six Sigma was originally all about. It is the story behind the story when someone explains the Taguchi loss function. It is what Motorola was after when it started using process capability indices to describe “how well” its processes, products and services performed. Process capability and customer satisfaction indices are very different from productivity metrics, and represent true quality metrics. Table 2 helps clarify the differences between the FoMs.

Table 2

| Family of Measure | Question Answered | Unit of Measure |
|---|---|---|
| Productivity | How many? | Counts, units produced, scrap, rework, repair |
| Financial | How much? | Cost, budgets, benefits, profits |
| Quality | How well? | Customer satisfaction, process capability indices |
| Timeliness | How fast? | Cycle time, duration, dwell time, shipping time |
| Any Combination | At what rate? | Labor rates, yield, return rates |

An important use for FoM is to balance project measures, that is, to ensure that a team does not focus on just one measure at the expense of others. Examples of imbalance include speeding up the assembly line at the expense of yield, or shortening call time goals at the expense of the customer experience. FoM also can be used to identify project types. For example, a project type within productivity can focus on increasing yield or reducing scrap, while a project type within quality can focus on improving customer satisfaction or reducing variation.
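A balance check of this kind can be sketched in a few lines. The example below is an assumption-laden illustration, not a prescribed procedure: the function name is hypothetical, each metric is assigned to a single family by the team, and an uncovered family is only a prompt for discussion, not a rule that every project must track all four.

```python
from typing import Dict, List

# The four basic families of measure named in the article.
FAMILIES = ("productivity", "financial", "quality", "timeliness")


def missing_families(metric_families: Dict[str, str]) -> List[str]:
    """List the families that no project metric currently covers."""
    covered = set(metric_families.values())
    unknown = covered - set(FAMILIES)
    if unknown:
        raise ValueError(f"unknown family of measure: {sorted(unknown)}")
    return [family for family in FAMILIES if family not in covered]


# Hypothetical call-center project that tracks only call time.
metrics = {"Call time": "timeliness"}
print(missing_families(metrics))  # ['productivity', 'financial', 'quality']

# Adding a balancing measure for the customer experience.
metrics["Customer satisfaction"] = "quality"
print(missing_families(metrics))  # ['productivity', 'financial']
```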

Some organizations use a FoM matrix to guide project selection. For example, if the Six Sigma effort is internally driven by productivity and inventory initiatives, few customer and quality projects are going to be selected.

Data Type

The third question teams should ask at this point in the DMP development focuses on data type. One of the biggest problems with data type is the many different ways to describe it, ranging from the scientific, statistical way to the layperson’s way. Helping team members distinguish between variable data (where they measure the variable) and attribute data (where they simply count it) can be useful. 

  • Continuous or variable data – Any variable measured on a continuum or scale that can be infinitely divided
    • Examples: time, weight, temperature, speed
    • Generally preferred over discrete/attribute data
  • Discrete or attribute data – A count, proportion or percentage of a characteristic or category
    • Nominal (name)
    • Ordinal (order)
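One place the distinction shows up is in the choice of time-ordered chart. Table 1 pairs setup time with an IMR chart, raw material dimensions with an Xbar and sigma chart, and scrap with a p chart. The sketch below encodes that pattern as a rule of thumb. The function name, the data type labels and the subgroup-size rule are assumptions added for illustration; other attribute charts (for example, for counts of defects) are outside its scope.

```python
def suggest_control_chart(data_type: str, subgroup_size: int = 1) -> str:
    """Suggest a time-ordered chart from the data type (rule of thumb only)."""
    if subgroup_size < 1:
        raise ValueError("subgroup_size must be at least 1")
    if data_type == "continuous":
        # Individual readings get an IMR chart; subgrouped readings
        # get an Xbar and sigma chart (assumed convention).
        return "IMR" if subgroup_size == 1 else "Xbar and sigma"
    if data_type == "attribute":
        # Proportion of non-conforming units, as with scrap in Table 1.
        return "p chart"
    raise ValueError(f"unknown data type: {data_type!r}")


print(suggest_control_chart("continuous"))                   # IMR
print(suggest_control_chart("continuous", subgroup_size=5))  # Xbar and sigma
print(suggest_control_chart("attribute"))                    # p chart
```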

Conclusion: Why Project Teams Should Care

The million-dollar question for project teams is: “Why are we going to all this trouble to describe the main project metric?” The most important reason why teams should fully describe their metrics and understand what they are improving in their process is simple: Procedures and methods differ by data type. Understanding data descriptions is the first step in a successful Six Sigma deployment.
