Wheeler devotes a whole chapter to this.

Wheeler, Donald J. and Chambers, David S., Understanding Statistical Process Control, SPC Press, 1992

Statistically, it makes sense to use transformations, but they are of very limited value on the shop floor.

Transformation should not be the last resort; it should not be used at all.

Control charts for variable data require normality, so it is wise to check normality first. Otherwise, one may think the data are unstable when in reality the apparent instability is the result of a skewed distribution. In fact, this advice is valid for any statistical analysis: always check the shape of the distribution first, because it will determine the method.

First, check normality and then, if the data are normal, check stability using an SPC chart. If the data are stable, then calculate Cp/Cpk.

If the data are not normal, you may use a nonparametric test such as a run chart to determine stability. If the data are stable, you may then calculate capability using DPMO.

If the data are not stable, don’t proceed until the process is fixed.

‘Data’ in this context means a data point, whether it is a single reading or a sample average.
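For anyone who wants to see that sequence written down, here is a minimal sketch of the decision flow described in this post. It is only an illustration: the function names are made up, the normality and stability verdicts are passed in as flags (use whatever test you prefer), the overall sample standard deviation stands in for sigma, and each data point is counted as one opportunity for DPMO.

```python
from statistics import mean, stdev

def cp_cpk(data, lsl, usl):
    """Return (Cp, Cpk). Assumption: sigma is the overall sample standard
    deviation; a within-subgroup estimate could be used instead."""
    m, s = mean(data), stdev(data)
    cp = (usl - lsl) / (6 * s)
    cpk = min(usl - m, m - lsl) / (3 * s)
    return cp, cpk

def dpmo(data, lsl, usl):
    """Observed defects per million, counting each point as one opportunity."""
    defects = sum(1 for x in data if x < lsl or x > usl)
    return 1_000_000 * defects / len(data)

def assess(data, lsl, usl, is_normal, is_stable):
    """Follow the order in the post: stability gates everything,
    normality decides which capability measure to report."""
    if not is_stable:
        return {"action": "fix the process first"}
    if is_normal:
        cp, cpk = cp_cpk(data, lsl, usl)
        return {"action": "report Cp/Cpk", "cp": cp, "cpk": cpk}
    return {"action": "report DPMO", "dpmo": dpmo(data, lsl, usl)}
```

For example, `assess(readings, lsl, usl, is_normal=False, is_stable=True)` would skip Cp/Cpk and report the observed DPMO instead.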

Normality is not a requirement to determine process capability. Stability is.

Yes, I’ve advocated converting ppm defective to a Cpk or Ppk, and I’ve worked in places where that was done. I’ve never been as comfortable with “the closest spec”, since the other side could be just as bad.

ppm defective and the equivalent Z are so much more descriptive without histograms.
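As a rough illustration of the conversion, the sketch below turns a total ppm defective (both sides of the spec combined) into an equivalent Z and an equivalent Ppk. The function names are hypothetical, and the convention is an assumption: all defects are placed in one tail of a standard normal, no 1.5 sigma shift is applied, and the equivalent Ppk is taken as Z/3.

```python
from statistics import NormalDist

def ppm_to_z(ppm):
    """Equivalent Z for a total defect rate in ppm.
    Assumption: all defects are placed in one tail of a standard normal."""
    if not 0 < ppm < 1_000_000:
        raise ValueError("ppm must be strictly between 0 and 1,000,000")
    return NormalDist().inv_cdf(1 - ppm / 1_000_000)

def z_to_equivalent_ppk(z):
    """Equivalent Ppk under the convention Ppk = Z / 3."""
    return z / 3

z = ppm_to_z(1350)  # roughly 3, so the equivalent Ppk is roughly 1
print(round(z, 2), round(z_to_equivalent_ppk(z), 2))
```

Because the input is the total ppm, defects beyond either spec limit count, which addresses the “closest spec” concern above.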

We’re aligned.

I know what you mean about the transforms and totally agree. It should be a last resort after stability is demonstrated. The capability number attained is of little practical value; however, it would allow a QE to get by a company’s requirement for a capability number and move on to problem solving. You would be appalled at how huge an issue this was at my previous company and how much time was wasted by QEs.

This is very, very rarely a problem when the process is stable and in control.

I am horrified if I interpret you correctly as saying: apply a Box-Cox transform just to get the data “normal” in order to get a Cpk or Ppk. There are much better techniques, including using ppm defective.

The cars example is a nice visual, of course.

1. Stable process

2. Large sample size

3. Normal distribution

When these assumptions are not met, the values are not valid. If your process is not stable, the results will be meaningless. Most capability index estimates are valid only if the sample size used is ‘large enough’, which is generally thought to be about 30 independent data values. The indices that we have considered thus far are based on normality of the process distribution, which poses a problem when the process distribution is not normal. Without going into the specifics, we can list some remedies.

One remedy is to transform the data so that they become approximately normal; a popular transformation is the Box-Cox transformation. Another is to use indices that apply to non-normal distributions; one such statistic is called Cnpk (for non-parametric Cpk). For additional information on non-normal distributions, see:

Johnson and Kotz (1993). Process Capability Indices, Chapman & Hall, London.

There is, of course, much more that can be said about the case of non-normal data. However, if a Box-Cox transformation can be successfully performed, one is encouraged to use it.
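To make the Box-Cox remedy concrete, here is a minimal sketch of the transformation and a simple grid search for lambda using the usual profile log-likelihood. The function names and the grid from -2 to 2 in steps of 0.1 are illustrative assumptions, not something taken from the reference above; statistical packages normally provide their own routine for this.

```python
import math

def box_cox(data, lam):
    """Box-Cox transform of strictly positive data for a given lambda."""
    if any(x <= 0 for x in data):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(x) for x in data]
    return [(x ** lam - 1) / lam for x in data]

def box_cox_loglik(data, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    y = box_cox(data, lam)
    n = len(y)
    m = sum(y) / n
    var = sum((v - m) ** 2 for v in y) / n  # maximum-likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1) * sum(math.log(x) for x in data)

def best_lambda(data, grid=None):
    """Pick the lambda on the grid with the highest log-likelihood.
    Assumption: a coarse grid from -2 to 2 in steps of 0.1 is good enough."""
    grid = grid or [i / 10 for i in range(-20, 21)]
    return max(grid, key=lambda lam: box_cox_loglik(data, lam))
```

Spec limits have to be transformed with the same lambda before any index is computed on the transformed data.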

A method to test whether a given data set can be described as “normal” is the normal probability plot. If a distribution is close to normal, the normal probability plot will be close to a straight line. Minitab and many other common software packages report the Anderson-Darling statistic. The null hypothesis for this test is that the distribution is normal; thus, for the data to be treated as normal, the p-value must be greater than 0.05 (typically).
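For readers without Minitab, the sketch below computes the Anderson-Darling statistic for normality by hand, with the mean and standard deviation estimated from the data and the usual small-sample adjustment applied. It returns the statistic rather than a p-value. The function name is hypothetical, and the 5% critical value of about 0.752 for the adjusted statistic is quoted from commonly published tables as an assumption, so check it against your own reference before relying on it.

```python
import math
from statistics import NormalDist, mean, stdev

# Assumption: commonly tabulated approximate 5% critical value for the
# adjusted statistic when mean and standard deviation are estimated.
CRITICAL_5PCT = 0.752

def anderson_darling_normal(data):
    """Return the small-sample-adjusted Anderson-Darling statistic."""
    n = len(data)
    m, s = mean(data), stdev(data)
    std_normal = NormalDist()
    eps = 1e-12  # keep CDF values away from 0 and 1 so log() is defined
    f = [min(max(std_normal.cdf((x - m) / s), eps), 1 - eps)
         for x in sorted(data)]
    total = sum((2 * i + 1) * (math.log(f[i]) + math.log(1 - f[n - 1 - i]))
                for i in range(n))
    a2 = -n - total / n
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)

def looks_normal(data):
    """True if normality is not rejected at roughly the 5% level."""
    return anderson_darling_normal(data) < CRITICAL_5PCT
```

A result below the critical value only means normality was not rejected; it does not prove the data are normal.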