How Does the CLT Apply to Capability Analysis?




    Hi there,

    I have a question about how the Central Limit Theorem applies to normal Capability Analysis. My current employer has a WI which states, and I’m paraphrasing, that based on the CLT, a minimum sample size of n ≥ 30 (randomly and independently sampled parts) is required for a capability analysis. The stated rationale is that generally speaking, n ≥ 30 is a “large enough sample for the CLT to take effect”.

    My point of confusion is that the CLT deals with the distribution of mean values, not individual samples. So for the CLT to “take effect”, would you not need to pull samples of n parts each, and repeat that pull x times, so that you end up with a total of n*x parts? Then the distribution of the mean values of the sample pulls would be approximately normal. But I clearly can’t use mean values in calculating process capability.

    I realize the infamous “30 samples” is somewhat arbitrary in statistics and considered a rule of thumb, so I’m not questioning whether 30 samples is sufficient for a capability analysis. I’m questioning the rationale that was given to me, and how the CLT applies to capability analysis when the CLT deals with the distribution of mean values, while Cpk uses a single mean value derived from individual samples in its calculation.

    Maybe I’m missing something? Any insight would be appreciated!




    Robert Butler

    The short answer is the CLT doesn’t have anything to do with the capability analysis.

    You are correct – the CLT refers to distributions of means not to the distribution of any kind of single samples from anything.  The statement from the WI (whatever that is)  “based on the CLT, a minimum sample size of n ≥ 30 (randomly and independently sampled parts) is required for a capability analysis. The stated rationale is that generally speaking, n ≥ 30 is a “large enough sample for the CLT to take effect”.”  comes from a complete misunderstanding of the CLT.  As for the second part about the 30 samples rule of thumb – that’s not worth much either.

    If you need a counter-example to help someone understand the absurdity of that statement consider the case where the choices are binary and you take a sample of 30 and get a group of 0’s and a group of 1’s.  If you make a histogram of your measures you will have two bars – one at the value of 0 and the other at the value of 1 and no matter how hard you try that histogram cannot be made to look normal.
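    The counter-example above can be checked with a quick simulation. This is only a sketch: the probability of a 1 (`P_ONE`) and the number of repeated samples are made-up values for illustration, not anything from the WI.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

P_ONE = 0.3   # assumed probability of a 1 (illustrative only)
N = 30        # sample size quoted in the work instruction


def draw_sample(n):
    """Return n individual binary readings."""
    return [1 if random.random() < P_ONE else 0 for _ in range(n)]


# One sample of 30 individuals: the histogram can only ever have two bars.
sample = draw_sample(N)
distinct_individuals = sorted(set(sample))

# 2000 repeated samples of 30, keeping only each sample's mean.
# This is the quantity the CLT actually talks about.
means = [statistics.fmean(draw_sample(N)) for _ in range(2000)]
sd_of_means = statistics.stdev(means)
clt_sd = (P_ONE * (1 - P_ONE) / N) ** 0.5  # sigma / sqrt(n)

print("distinct individual values:", distinct_individuals)
print("distinct sample means:", len(set(means)))
print("std dev of sample means:", round(sd_of_means, 4))
print("sigma / sqrt(n):", round(clt_sd, 4))
```

    The individuals stay stuck at 0 and 1 no matter how many you collect, while the sample means spread out over many values with a standard deviation close to sigma / sqrt(n). The CLT describes the second set of numbers, not the first.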

    For a capability analysis you have to have some sort of confirmation that the single samples you have gathered exhibit an acceptable degree of normality.  The easiest way to do this is to take the data and plot it on normal probability paper and apply the fat pencil test.  If it passes that test you can use the data to generate an estimate of capability.
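    The fat pencil test is a visual check, so it cannot really be reduced to code. As a rough numeric stand-in, the sketch below computes the correlation between the sorted readings and the normal quantiles they would be plotted against (a value near 1 means the points hug a straight line), and then a capability index from the individual readings. The spec limits, the simulated data, the `(i - 0.5) / n` plotting positions, and the function names are all assumptions for illustration.

```python
import random
import statistics


def normal_plot_correlation(data):
    """Correlation between sorted data and theoretical normal quantiles."""
    xs = sorted(data)
    n = len(xs)
    std_normal = statistics.NormalDist()
    # Plotting positions (i - 0.5) / n; other conventions exist.
    qs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, mq = statistics.fmean(xs), statistics.fmean(qs)
    sxq = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    return sxq / (sxx * sqq) ** 0.5


def capability_index(data, lsl, usl):
    """min(USL - mean, mean - LSL) / (3 * s), using the overall sample s."""
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    return min(usl - mean, mean - lsl) / (3 * s)


random.seed(2)
normal_data = [random.gauss(10.0, 0.1) for _ in range(50)]
skewed_data = [random.expovariate(1.0) for _ in range(50)]

r_normal = normal_plot_correlation(normal_data)
r_skewed = normal_plot_correlation(skewed_data)

print("plot correlation, normal data:", round(r_normal, 3))
print("plot correlation, skewed data:", round(r_skewed, 3))
# Only the data that looks normal gets the normal-theory index.
print("capability index:", round(capability_index(normal_data, 9.6, 10.4), 2))
```

    Note that the index here uses the overall standard deviation of the individual readings. Nothing in it involves a distribution of means.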

    If it doesn’t, in particular if the plot diverges wildly from the perfect normal line and exhibits some kind of extreme curvature, you should consult Chapter 8 in Measuring Process Capability by Bothe. It is titled “Measuring Capability for Non-Normal Variable Data”.  That chapter provides the method for capability calculations when the data is non-normal.



    Hi Robert,

    Thank you for your response! That definitely clarifies it.

    Sorry about the WI confusion. Stands for Work Instruction.




    Mike Carnell

    @Timmaaay I don’t normally say anything after Robert @rbutler posts because when it comes to the numbers you aren’t going to get better advice than what you get from Robert. He has this down pretty solid and he understands application, not just theory.

    I am not trying to make excuses for whoever wrote the WI. You might want to check out the difference between Cpk and Ppk, that is, the within-group and between-group variation. I could see how someone could get to a point involving the CLT.
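    To make the within versus overall distinction concrete, here is a minimal sketch. It assumes 20 subgroups of 5 readings whose centers drift a little, made-up spec limits, and a plain pooled within-subgroup standard deviation with no unbiasing constants, so it will not match any particular software package exactly.

```python
import random
import statistics

random.seed(3)

LSL, USL = 9.0, 11.0  # assumed spec limits (illustrative only)

# 20 subgroups of 5 readings; the subgroup centers drift,
# so there is between-group variation on top of within-group variation.
subgroups = []
for _ in range(20):
    center = random.gauss(10.0, 0.15)
    subgroups.append([random.gauss(center, 0.1) for _ in range(5)])

all_readings = [x for group in subgroups for x in group]
grand_mean = statistics.fmean(all_readings)

# Pooled within-subgroup sigma (equal subgroup sizes, so a plain
# average of the subgroup variances is the pooled variance).
sigma_within = statistics.fmean([statistics.variance(g) for g in subgroups]) ** 0.5
# Overall sigma ignores the subgrouping entirely.
sigma_overall = statistics.stdev(all_readings)


def index(sigma):
    """min(USL - mean, mean - LSL) / (3 * sigma) for a given sigma estimate."""
    return min(USL - grand_mean, grand_mean - LSL) / (3 * sigma)


cpk = index(sigma_within)   # within-subgroup variation only
ppk = index(sigma_overall)  # within plus between variation

print("sigma within:", round(sigma_within, 3), "Cpk:", round(cpk, 2))
print("sigma overall:", round(sigma_overall, 3), "Ppk:", round(ppk, 2))
```

    Both indices still compare individual readings against the spec limits. The subgrouping only changes which estimate of sigma goes in the denominator.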

    If you are using Minitab the help section does a good job of explaining this.

    Just my opinion


    Thomas Pyzdek

    @Timmaaay  While I agree with @Mike-Carnell and @rbutler, I might also mention that there is a place for the CLT in process capability analysis. Namely, the CLT can be useful when assessing process stability. X-bar control charts will show normal-ish patterns if the subgroup sizes are large enough, and 30 is certainly large enough. Back in the days of data scarcity, subgroup sizes of n ≥ 30 were unheard of, but these subgroup means will certainly be normally distributed for processes in statistical control. Therefore, if you are using rational subgroups, these charts should show statistical control and the runs on the chart should follow normal data patterns. For example, you might consider such large subgroup sizes for call centers, where millions of data points are available for very skewed data such as call handle times. Plot data for sufficiently representative time periods, look for special causes when points are out of control, and perform the process capability analysis when the charts are stable. Use non-normal process capability analysis methods (or just count defects) on the data from stable processes.
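    A rough sketch of the call-center idea, assuming exponentially distributed handle times with a 300-second mean and a hypothetical 900-second upper limit (both numbers invented for illustration). It compares the skewness of the individual readings with the skewness of subgroup means of size 30, then simply counts defects on the individuals.

```python
import random
import statistics

random.seed(4)

SUBGROUP_SIZE = 30
MEAN_HANDLE_TIME = 300.0  # seconds, assumed
USL = 900.0               # assumed upper limit on handle time


def skewness(data):
    """Population skewness: mean of cubed z-scores."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return statistics.fmean([((x - m) / s) ** 3 for x in data])


# 500 subgroups of 30 heavily skewed handle times.
subgroups = [
    [random.expovariate(1.0 / MEAN_HANDLE_TIME) for _ in range(SUBGROUP_SIZE)]
    for _ in range(500)
]
individuals = [x for group in subgroups for x in group]
xbars = [statistics.fmean(group) for group in subgroups]

skew_individuals = skewness(individuals)
skew_xbars = skewness(xbars)

# "Just count defects": fraction of individual calls over the limit.
defect_fraction = sum(x > USL for x in individuals) / len(individuals)

print("skewness of individuals:", round(skew_individuals, 2))
print("skewness of subgroup means:", round(skew_xbars, 2))
print("fraction of calls over the limit:", round(defect_fraction, 4))
```

    The subgroup means come out much closer to symmetric than the individual readings, which is what makes the X-bar chart usable for judging stability. The defect count is still taken from the individual calls.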


    Chris Seider

    Control charts are great. I just don’t want folks to think they are doing capability analysis on means, which is where the CLT applies. Capability analysis is always done on individual readings.

