Cpk vs Ppk Calculation with Johnson Transformation
- August 1, 2018 at 3:23 pm #56057
Hi, I’m currently evaluating a process whose data don’t fit a normal distribution. I ran a distribution identification test and found that a 3-parameter Weibull distribution was the best-fitting non-normal distribution.
Normally I’d prefer to analyze my data with a non-normal distribution (Ppk). However, since the goodness-of-fit test showed a p-value very close to my significance level (alpha = 0.05), I tried to calculate Cpk using the Johnson transformation to normalize my data, so that I could compare the Ppk calculated from the non-normal fit against the Cpk calculated from the Johnson-transformed data.
I’m using Minitab for this analysis and found that its process capability analysis with the Johnson transformation cannot calculate Cpk, only Ppk. The Box-Cox transformation allows both Cpk and Ppk calculations; unfortunately, it was not powerful enough to normalize my data.
I understand that Cpk requires the data to fit a normal distribution; so, if I’m using the Johnson transformation to normalize the data, shouldn’t a Cpk calculation be valid?
I can store the transformed data and do the Cpk analysis myself, but I believe there must be a clear reason why Cpk can’t (or shouldn’t) be calculated on data normalized by a Johnson transformation.
Sorry if this topic has been addressed before, but I couldn’t find a thorough explanation of why the Johnson transformation does not allow a Cpk analysis.
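For readers who want to try the manual route mentioned above (storing the transformed data and computing Cpk by hand), here is a minimal sketch. It assumes individual observations in time order, a Johnson SU transform with made-up parameters, and the average moving range divided by 1.128 as the within (short-term) sigma estimate. The data, the parameters, and the spec limits are all hypothetical, and the sketch does not explain why Minitab omits Cpk for Johnson-transformed data; it only shows how the two indices differ in the sigma they use.

```python
import math
import statistics


def johnson_su(x, gamma, delta, xi, lam):
    """Johnson SU transform: z = gamma + delta * asinh((x - xi) / lam)."""
    return gamma + delta * math.asinh((x - xi) / lam)


def capability(values, lsl, usl):
    """Return (Cpk, Ppk) for individual observations given in time order."""
    mean = statistics.fmean(values)
    # Overall (long-term) sigma: ordinary sample standard deviation.
    sigma_overall = statistics.stdev(values)
    # Within (short-term) sigma: average moving range / d2, with d2 = 1.128 for n = 2.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_within = statistics.fmean(moving_ranges) / 1.128

    def index(sigma):
        return min(usl - mean, mean - lsl) / (3 * sigma)

    return index(sigma_within), index(sigma_overall)


# Hypothetical measurements, spec limits, and Johnson SU parameters.
raw = [10.1, 10.4, 9.9, 10.8, 11.5, 10.2, 10.0, 12.3, 10.6, 10.3, 9.8, 11.0]
lsl, usl = 8.0, 14.0
params = dict(gamma=-1.0, delta=1.5, xi=9.5, lam=0.8)

# The spec limits must go through the same transform as the data.
z = [johnson_su(x, **params) for x in raw]
z_lsl = johnson_su(lsl, **params)
z_usl = johnson_su(usl, **params)

cpk, ppk = capability(z, z_lsl, z_usl)
print(f"Cpk (within sigma) = {cpk:.3f}, Ppk (overall sigma) = {ppk:.3f}")
```

Both indices come out in transformed units, so they describe the transformed data and the transformed limits, not the raw measurements.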
Thanks.
- August 1, 2018 at 3:38 pm #202880
Rather than try to force your data to look more or less normal and spend a lot of time trying to transform and back-transform, why not just calculate the equivalent Cpk for non-normal data? The book Measuring Process Capability by Bothe has all of the details on how to do this in Chapter 8, whose title is “Measuring Capability for Non-Normal Variable Data”. You should be able to get this book through inter-library loan.
- August 1, 2018 at 4:00 pm #202881
Thanks @rbutler. I’ll look into that book.
I have already analyzed the data with a non-normal distribution (Weibull as my best fit), and the book should help me better understand the topic.
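As an illustration of what a non-normal capability index can look like, here is a minimal sketch of the percentile approach, in which the median and the 0.135th and 99.865th percentiles of the fitted distribution replace the mean and the ±3 sigma points. It assumes a 3-parameter Weibull whose shape, scale, and threshold have already been estimated; the parameter values and spec limits below are hypothetical, and this is not necessarily the method used by Minitab or described in Bothe's book.

```python
import math


def weibull3_quantile(p, shape, scale, threshold):
    """Inverse CDF of a 3-parameter Weibull distribution."""
    return threshold + scale * (-math.log(1.0 - p)) ** (1.0 / shape)


def percentile_ppk(lsl, usl, shape, scale, threshold):
    """Ppk from fitted percentiles instead of mean +/- 3 sigma."""
    lower = weibull3_quantile(0.00135, shape, scale, threshold)
    median = weibull3_quantile(0.5, shape, scale, threshold)
    upper = weibull3_quantile(0.99865, shape, scale, threshold)
    ppu = (usl - median) / (upper - median)
    ppl = (median - lsl) / (median - lower)
    return min(ppl, ppu)


# Hypothetical fitted parameters and spec limits, in original units.
ppk = percentile_ppk(lsl=8.0, usl=16.0, shape=1.8, scale=2.5, threshold=9.0)
print(f"Percentile-based Ppk = {ppk:.3f}")
```

Because everything stays in the original units, the spec limits need no transformation.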
I was just curious about how different the results would be when analyzing the data as normal vs non-normal.
In any case, is it valid to calculate Cpk with data normalized with a Johnson transformation, as it is for Box-Cox?
- August 1, 2018 at 4:16 pm #202882
I’m afraid I don’t have an answer to that question. I’ve never done either. My GUESS would be it wouldn’t matter. The problem is how you are going to present your Cpk in terms of your original distribution. If you run a capability analysis on transformed data, the chances are very good you will not be able to back-transform to actual units. What this means is all of your reporting will be in transformed units, and I do know from personal experience this does not work very well. People don’t understand that you have transformed the data, they don’t understand the Cpk is in terms of the transformed data, and they don’t understand you can’t apply the Cpk to the actual system measurements.
One of the other problems I have with Box-Cox and Johnson transforms (and that is all of the Box-Cox and Johnson transforms) is the change of transform parameters over time. As you gather more data, there is a good chance your transform parameters will change, which means you have to go back, recalculate the new parameters, and then adjust all of your past reporting to make it match the present.
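The parameter drift described above can be checked directly by re-estimating the transform parameter as data accumulate. The following is a minimal sketch, assuming a Box-Cox transform whose lambda is chosen by maximizing the usual profile log-likelihood over a coarse grid; the simulated Weibull data, the grid range, and the function names are all hypothetical stand-ins, not a real process or a library API.

```python
import math
import random


def boxcox_loglik(data, lam):
    """Box-Cox profile log-likelihood (up to a constant) for positive data."""
    n = len(data)
    logs = [math.log(x) for x in data]
    if abs(lam) < 1e-12:
        y = logs  # lambda = 0 is the log transform
    else:
        y = [(x ** lam - 1.0) / lam for x in data]
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    return (lam - 1.0) * sum(logs) - 0.5 * n * math.log(var_y)


def estimate_lambda(data, lo=-2.0, hi=2.0, step=0.01):
    """Grid search for the lambda that maximizes the log-likelihood."""
    steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(steps + 1)]
    return max(grid, key=lambda lam: boxcox_loglik(data, lam))


# Simulated skewed data standing in for measurements collected over time.
random.seed(1)
data = [random.weibullvariate(10.0, 1.5) for _ in range(200)]

# Re-estimate lambda each time more data become available.
for n in (25, 50, 100, 200):
    print(f"first {n:3d} points: lambda = {estimate_lambda(data[:n]):.2f}")
```

If the printed lambda values move noticeably between sample sizes, any capability index reported on the earlier transformed data would have to be recomputed to stay comparable.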
Of course, you can have a process where the addition of more data does not have a major impact on the parameters, but if you don’t, then you may find yourself up a well-known tributary without sufficient means of propulsion.
- August 2, 2018 at 6:22 am #202883
Thanks for the feedback, @rbutler. I can relate to the situations you mentioned. I’m sticking with my non-normal analysis.
- August 2, 2018 at 5:01 pm #202884
@andreslp Just because you can transform data doesn’t mean you should transform data. This link is to an article written by Bill Hathaway. Bill is the founder of Moresteam, one of the leading suppliers of eLearning for Six Sigma and Lean. You should find this interesting.
- August 3, 2018 at 1:03 am #202885
Have you considered asking why your data isn’t normal? Sometimes we get so hung up in the minutiae of analysis that we fail to see the underlying root cause.
- August 3, 2018 at 7:09 am #202888
Thanks a lot, @mike-carnell, I found that article truly helpful.
@Straydog Yes, I observed a particular behaviour in the process: at steady time intervals, a significantly low value is obtained, which makes the data resemble a Weibull distribution.
This pattern is consistent and easily detectable through a control chart. I am still trying to identify the variable that may be responsible for this particular variation to see if it can be controlled to make the process more predictable, or in a better state of statistical control.
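A pattern like the one described here can be sketched with an individuals (I) control chart. The example below is a minimal illustration, assuming limits of mean ± 2.66 times the average moving range and entirely made-up data in which every tenth reading drops; it flags the out-of-limit points and reports the spacing between them.

```python
import statistics


def individuals_chart(values):
    """Return (center, lcl, ucl, flagged indices) for an individuals chart."""
    center = statistics.fmean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of span 2.
    lcl = center - 2.66 * mr_bar
    ucl = center + 2.66 * mr_bar
    flagged = [i for i, v in enumerate(values) if v < lcl or v > ucl]
    return center, lcl, ucl, flagged


# Hypothetical readings: a stable level with a low value every tenth point.
readings = [10.0 if i % 2 == 0 else 10.2 for i in range(30)]
for i in range(9, 30, 10):
    readings[i] = 7.0

center, lcl, ucl, flagged = individuals_chart(readings)
gaps = [b - a for a, b in zip(flagged, flagged[1:])]
print(f"center = {center:.2f}, LCL = {lcl:.2f}, UCL = {ucl:.2f}")
print("out-of-limit points at indices:", flagged)
print("spacing between them:", gaps)
```

A constant spacing between flagged points suggests a periodic special cause, which is worth tracing before fitting any distribution to the pooled data.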