# non-normal distribution of CPK

ROSS
Hello:
How should I handle continuous data that is non-normal (even after a Box-Cox transformation) when calculating Ppk and Cpk?
Also, as we all know, Minitab by default treats the raw data as long-term data and then reports Cpk and Ppk. If we only have short-term data, how should we interpret the Minitab result (can we use the Ppk as the Cpk)?

Cone
When you only have short-term data, the Pp and Ppk are actually your Cp and Cpk.
As for non-normal data, first understand why it is non-normal.
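To make the Cp/Cpk versus Pp/Ppk distinction concrete, here is a minimal Python sketch. It assumes the usual convention that Cp/Cpk use a within-subgroup sigma and Pp/Ppk use the overall sample standard deviation. The R-bar/d2 estimator is only one common choice for the within-subgroup sigma, and Minitab's default estimator may differ. The spec limits `lsl` and `usl` are hypothetical (none were given in this thread), and grouping consecutive readings into subgroups of 5 is an assumption made for illustration.

```python
from statistics import mean, stdev

D2_N5 = 2.326  # standard control-chart constant d2 for subgroups of size 5


def capability(subgroups, lsl, usl):
    """Return Cp, Cpk (within-subgroup sigma) and Pp, Ppk (overall sigma)."""
    values = [x for group in subgroups for x in group]
    grand_mean = mean(values)

    # Within-subgroup sigma estimated from the average range (R-bar / d2).
    r_bar = mean([max(group) - min(group) for group in subgroups])
    sigma_within = r_bar / D2_N5

    # Overall sigma: plain sample standard deviation of all readings.
    sigma_overall = stdev(values)

    def indices(sigma):
        potential = (usl - lsl) / (6 * sigma)
        actual = min(usl - grand_mean, grand_mean - lsl) / (3 * sigma)
        return potential, actual

    cp, cpk = indices(sigma_within)
    pp, ppk = indices(sigma_overall)
    return {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}


# First ten readings of column "one", grouped as two subgroups of 5 (assumed).
subgroups = [
    [601.4, 601.6, 598.0, 601.4, 599.4],
    [600.0, 600.2, 601.2, 598.4, 599.0],
]

# Hypothetical spec limits, for illustration only.
result = capability(subgroups, lsl=595.0, usl=605.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

When the data covers only a short window, the two sigma estimates tend to be close, so the two pairs of indices come out similar.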

ROSS
Still, we may have non-normal raw data when we are calculating Cpk and Ppk. How can we deal with it then?

Cone
Post your data here. I will let you know what is going on. I don’t need details, just how the data was taken.

ROSS
OK! Please see the attachment.
one     two
601.4   598.0
601.6   599.8
598.0   600.0
601.4   599.8
599.4   600.0
600.0   600.0
600.2   598.8
601.2   598.2
598.4   599.4
599.0   599.6
601.2   599.4
601.0   599.4
600.8   600.0
597.6   598.8
601.6   599.2
599.4   599.4
601.2   599.6
598.4   599.0
599.2   599.2
598.8   600.6
601.4   598.8
599.0   598.8
601.0   599.8
601.6   599.2
601.4   599.4
601.4   600.0
598.8   600.2
601.4   600.2
598.4   599.6
601.6   599.0
598.8   599.0
601.2   599.8
599.6   600.8
601.2   598.8
598.2   598.2
598.8   600.0
597.8   599.2
598.2   599.8
598.2   601.2
598.2   600.4
601.2   600.2
600.0   599.6
598.8   599.6
599.4   599.6
597.2   600.2
600.8   599.2
600.6   599.0
599.6   599.6
599.4   600.4
598.0   600.0
600.8   599.0
597.8   599.6
599.2   599.4
599.2   599.2
600.6   597.8
598.0   600.4
598.0   599.6
598.8   600.0
601.0   600.8
600.8   600.4
598.8   599.4
599.4   599.0
601.0   598.4
598.8   599.0
599.6   599.6
599.0   598.8
600.4   599.2
598.4   599.6
602.2   598.6
601.0   599.8
601.4   599.6
601.0   599.2
601.2   599.6
601.4   600.2
601.8   599.8
601.6   599.6
601.0   600.0
600.2   599.6
599.0   599.2
601.2   598.6
601.2   599.6
601.2   601.2
601.2   599.6
601.0   600.2
601.0   600.0
601.4   600.0
601.4   599.4
598.8   599.8
598.8   599.2
598.8   599.6
598.2   599.4
601.8   600.0
601.0   600.0
601.4   599.2
601.4   599.4
599.0   599.6
601.4   599.8
601.8   599.0
601.6   599.6
601.2   599.4

ROSS
Sorry, the subgroup size is 5.

Robert Butler
Both the normal probability plot and a histogram of “one” indicate that you have a bimodal distribution. This strongly suggests a special cause: data from two different production lines, two different suppliers, etc.
The normal probability plot of “two” looks bimodal and a histogram of the data certainly looks that way. In addition to looking like a mix of data from two different sources, “two” gives the impression that the sources are not generating the same kind of data distribution. It gives the impression that one source may have approximately normal data and the second is tending to a log normal.
My first thought would be to go back and check your data source.  As it stands your data does not look like it is proper data for computing any kind of a capability estimate.
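A quick way to look for the two humps described above, without any plotting software, is a text histogram. The sketch below is only an illustration: the 1.0-unit bin width is an arbitrary choice, and the sample is just the first twenty readings of column “one”, so it is not a substitute for a histogram and normal probability plot of the full data set.

```python
import math
from collections import Counter


def bin_counts(values, width=1.0):
    """Count readings per bin; each bin is labelled by its lower edge."""
    return Counter(math.floor(v / width) * width for v in values)


def print_histogram(values, width=1.0):
    """Print one row of asterisks per occupied bin, lowest bin first."""
    counts = bin_counts(values, width)
    for edge in sorted(counts):
        print(f"{edge:7.1f} | {'*' * counts[edge]}")


# First twenty readings of column "one" from the posted data.
sample_one = [
    601.4, 601.6, 598.0, 601.4, 599.4,
    600.0, 600.2, 601.2, 598.4, 599.0,
    601.2, 601.0, 600.8, 597.6, 601.6,
    599.4, 601.2, 598.4, 599.2, 598.8,
]

counts = bin_counts(sample_one)
print_histogram(sample_one)
```

Two separated clusters of asterisks with a dip between them would point toward the mixed-source explanation, which should then be checked against how the data was collected.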

ROSS
Robert Butler:
Thanks very much. Looking at the raw data and getting some findings from it is very helpful.
But sometimes the histogram tells us nothing, and further study of the raw data gives no more information. Minitab provides two choices for non-normal data: one is capability analysis (Weibull), which only provides Ppk; the other is the Box-Cox transformation, which, if it works, provides both Cpk and Ppk. But sometimes the transformation fails. What can we do once all of the above (raw data research, Box-Cox transformation, …) has been tried?

ROSS
Hello:
Could an expert please answer the question quoted below? Thank you!
“But sometimes the histogram tells us nothing, and further study of the raw data gives no more information. Minitab provides two choices for non-normal data: one is capability analysis (Weibull), which only provides Ppk; the other is the Box-Cox transformation, which, if it works, provides both Cpk and Ppk. But sometimes the transformation fails. What can we do once all of the above (raw data research, Box-Cox transformation, …) has been tried?”

Shree
Dear Tony,
If the data you posted was collected as a time series with subgroups of 5, just go ahead with an X-bar R chart to understand the special causes in the process, and forget about normality: for a subgroup of 5, the central limit theorem will take care of the non-normality issue. Moreover, if you are not making a control chart, how can you calculate Cp or Cpk? Please get help from a good MBB with good fundamental knowledge.
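As a rough illustration of the X-bar R chart suggested here, the sketch below computes the center lines and control limits using the standard Shewhart constants for subgroups of 5 (A2 = 0.577, D3 = 0, D4 = 2.114). It assumes that each consecutive block of five readings in column “one” forms one subgroup, which the thread does not confirm, and it uses only the first twenty readings.

```python
from statistics import mean

# Standard Shewhart control-chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """Return (LCL, center line, UCL) for the X-bar chart and the R chart."""
    xbars = [mean(group) for group in subgroups]
    ranges = [max(group) - min(group) for group in subgroups]
    grand_mean = mean(xbars)
    r_bar = mean(ranges)
    return {
        "xbar": (grand_mean - A2 * r_bar, grand_mean, grand_mean + A2 * r_bar),
        "range": (D3 * r_bar, r_bar, D4 * r_bar),
    }


def out_of_control(subgroups, limits):
    """Indices of subgroups whose mean or range falls outside the limits."""
    x_lcl, _, x_ucl = limits["xbar"]
    r_lcl, _, r_ucl = limits["range"]
    flagged = []
    for i, group in enumerate(subgroups):
        group_range = max(group) - min(group)
        if not (x_lcl <= mean(group) <= x_ucl and r_lcl <= group_range <= r_ucl):
            flagged.append(i)
    return flagged


# First twenty readings of column "one", assumed to be four subgroups of 5.
subgroups = [
    [601.4, 601.6, 598.0, 601.4, 599.4],
    [600.0, 600.2, 601.2, 598.4, 599.0],
    [601.2, 601.0, 600.8, 597.6, 601.6],
    [599.4, 601.2, 598.4, 599.2, 598.8],
]

limits = xbar_r_limits(subgroups)
flagged = out_of_control(subgroups, limits)
print(limits, flagged)
```

In practice the limits would be computed from the full data set, and any flagged subgroups investigated for special causes before a capability index is attempted.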

Robert Butler
It appears that you have the same problem with both of the data sets. For ease of discussion I’ll confine my comments to data set “one”. The data is very bimodal. If I understand you correctly, each data point represents an average of a subgroup of 5. A plot of the data you provided on “one” shows that your averages of 5 exhibit a strong bimodal distribution. As Shree observed, you can still control chart such data, but it’s my understanding that you are interested in trying to compute a measure of process capability using this data.
What the data is telling you is that your process has problems that need to be addressed before you attempt to compute any kind of capability index. Box-Cox transforms work on skewed data with long tails. Your data does not meet that criterion, so there is no point in attempting to use one. Similar issues surround the other Minitab choices. Simply put, unless you are under some kind of extreme pressure to do otherwise, don’t bother with a capability calculation at this time.
If you do have to meet some unavoidable demand with respect to a capability calculation, I would recommend that you do the wrong thing and use the standard Cpk computation. Unless you have some very generous limits, the Cpk will be terrible. This will at least answer whoever is asking for such a calculation. So that both you and they do not lose sight of why the Cpk is terrible and why it is the wrong thing to compute, you should document and present your calculations. Your report should include an analysis of the data complete with histograms and normal probability plots. It should list the things that you have checked and it should outline a plan of action for additional work. Properly presented, a report of this type may get you the backing you need for a proper investigation of the process.
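For reference, the point about Box-Cox above can be read off the standard one-parameter form of the transform, written here for a positive reading $x$ and a power parameter $\lambda$ (the symbols are the conventional ones, not taken from this thread):

```latex
y^{(\lambda)} =
\begin{cases}
  \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[1.5ex]
  \ln x, & \lambda = 0.
\end{cases}
```

Because this is a monotone power transform, it can stretch or compress a long tail, but it generally cannot merge two well-separated clusters of readings into a single hump, which is consistent with the advice not to use it on bimodal data.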

