# Should Cpk always be less than Ppk?


Viewing 13 posts - 1 through 13 (of 13 total)
• #30741

sudarshan Kumar
Member

Should Cpk always be less than Ppk? I have a software package, NWQ Analysist, and when I calculate Ppk it shows a Ppk around or greater than 1.67, with Cpk always greater than Ppk, i.e. 2.0 or more. My consultant says that Cpk will always be less than Ppk. I enter only the data and everything else is calculated by the software. Please advise: is the software right, or my consultant?

#80463

Mikel
Member

Sudarshan,
Your consultant is a bozo, go get a good consultant.
Go read the definitions of Cpk and Ppk. Cpk uses short-term data; Ppk uses long-term data. Long-term variation is always worse than short-term, which results in a Ppk that is less than Cpk.
This is all true unless you post under the name of Withheld. Withheld has Ppk = Cpk.

#80464

John J. Flaig
Participant

Stan,
Unfortunately, some software packages report the sample estimate Cpk(hat) as Cpk, the population value. These two things are not the same: Cpk has sigma in the denominator, whereas Cpk(hat) has the sample standard deviation. In fact, statistically, Cpk(hat) is NOT an unbiased estimator of Cpk.
Further, it is possible for Ppk(hat) to be greater than Cpk(hat), because the confidence intervals for the two estimators overlap. However, when the sample size gets large enough this situation goes away (sampling error goes to zero), and in the end we get the relation between the parameters: Ppk <= Cpk always.
John J. Flaig, Ph.D.
Applied Technology (e-AT-USA.com).
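
The sampling-error point above can be illustrated with a small simulation. This is only a sketch under assumed conditions that are not from the thread: normal data, subgroups of size 5, a within-subgroup sigma of 1, and a hypothetical between-subgroup sigma of 0.3. The function name `fraction_ppk_above_cpk` is made up for illustration.

```python
import random
import statistics

D2_N5 = 2.326  # standard d2 constant for subgroups of size 5

def fraction_ppk_above_cpk(num_subgroups, sigma_between, trials, rng):
    """Share of simulated studies in which Ppk(hat) exceeds Cpk(hat)."""
    count = 0
    for _ in range(trials):
        subgroups = []
        for _ in range(num_subgroups):
            # Each subgroup gets its own centre (the assumed long-term drift).
            centre = rng.gauss(0.0, sigma_between)
            subgroups.append([rng.gauss(centre, 1.0) for _ in range(5)])
        ranges = [max(sg) - min(sg) for sg in subgroups]
        sigma_within_hat = statistics.mean(ranges) / D2_N5
        values = [x for sg in subgroups for x in sg]
        sigma_overall_hat = statistics.stdev(values)
        # Both indices use the same mean and limits, so for a process centred
        # inside the limits Ppk(hat) > Cpk(hat) exactly when the overall
        # sigma estimate is the smaller of the two.
        if sigma_overall_hat < sigma_within_hat:
            count += 1
    return count / trials

rng = random.Random(1)
frac_small = fraction_ppk_above_cpk(10, 0.3, 200, rng)   # 10 subgroups per study
frac_large = fraction_ppk_above_cpk(200, 0.3, 200, rng)  # 200 subgroups per study
print(frac_small, frac_large)
```

Under these assumptions the reversal should show up in a noticeable share of the small studies and only rarely in the large ones, which is the behaviour described in the post.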

#80466

mcleod
Member

To support Stan: it boils down to using the sum-of-squares method for the standard deviation to compute Ppk and the estimated (short-term) method to calculate Cpk. Ppk is more sensitive to shifts and drifts in the process and will usually yield a lower number. The two should be used together, along with Pp and Cp, to give you a good idea of your process and what is happening there: whether it is centered, and whether it shows only random noise or also special-cause variation.
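
A minimal sketch of the four indices as they are described in this thread, assuming equally sized subgroups, Rbar/d2 for the short-term sigma and the overall sample standard deviation for the long-term sigma. The function name `capability`, the data and the specification limits are made up for illustration.

```python
import statistics

# Standard d2 constants for converting the mean subgroup range
# into a sigma estimate (subgroup sizes 2 to 6).
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534}

def capability(subgroups, lsl, usl):
    """Return estimated Cp, Cpk, Pp and Ppk for equally sized subgroups."""
    n = len(subgroups[0])
    values = [x for sg in subgroups for x in sg]
    mean = statistics.mean(values)
    # Short-term ("within") sigma: mean range divided by d2.
    ranges = [max(sg) - min(sg) for sg in subgroups]
    sigma_within = statistics.mean(ranges) / D2[n]
    # Long-term ("overall") sigma: sample standard deviation of all values.
    sigma_overall = statistics.stdev(values)
    nearest = min(usl - mean, mean - lsl)  # distance to the closer limit
    return {
        "Cp": (usl - lsl) / (6 * sigma_within),
        "Cpk": nearest / (3 * sigma_within),
        "Pp": (usl - lsl) / (6 * sigma_overall),
        "Ppk": nearest / (3 * sigma_overall),
    }

# Made-up data: three subgroups whose averages drift upward.
data = [[9.9, 10.0, 10.1], [10.2, 10.3, 10.4], [10.5, 10.6, 10.7]]
result = capability(data, lsl=9.0, usl=12.0)
print({name: round(value, 2) for name, value in result.items()})
```

Because the subgroup averages drift while each subgroup stays tight, the overall sigma is much larger than the within sigma here, so Pp and Ppk come out well below Cp and Cpk.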

#80470

Mikel
Member

John,
Anytime range is used as a surrogate for standard deviation, there is a possibility of overestimating. It is a statistical anomaly.
The real thing the poster needs to know is that long-term standard deviation is always larger than short-term. That we do not know how to accurately capture the number is not relevant.
The reality is that Ppk < Cpk always. Forget the equal sign you put in; it does not happen.

#80474

John J. Flaig
Participant

Stan,
The issue has nothing to do with using the range to estimate sigma. You could (and should) use the sample standard deviation. The only difference would be the width of the confidence interval for the estimators. This is because the range is a less efficient estimator of sigma.
I understand your belief that there will always be a non-zero value for long-term variation. However, I can easily construct a case (and so can you) where the variation between subgroups is zero. It is true that such cases would be very rare in practice, but the probability of their existence is NOT zero. Therefore, the equal sign stays.
Regards,
John J. Flaig, Ph.D.
Applied Technology (e-AT-USA.com)

#80487

Hemanth
Participant

Hi,
Your software is right and your consultant is wrong. Why? Let's go back to your data collection methodology. What do you do? You collect 3-5 samples that are produced very close together, and these form one subgroup. Why do you do this? In a control chart the objective is to detect a shift in your process, and for this your between-subgroup variation should be higher than the within-subgroup variation. So, to keep the within-subgroup variation as low as possible, we collect samples that are produced under conditions that are as similar as possible. Why am I telling you all this? Because when you calculate Cpk you consider only the between-subgroup variation (Tolerance / (6 * std deviation), where std deviation is equal to Rbar / d2), whereas in Ppk you consider the overall variation (i.e. within-subgroup variation + between-subgroup variation, the RMS of the deviations of the individual points from the mean). Thus the denominator is higher when you are calculating Ppk, and following this logic we find that Ppk will be smaller than Cpk. But if you are not sampling properly and your within-subgroup variation is much higher than the between-subgroup variation, then you may find Ppk higher than Cpk.
So most often you will find your Cpk higher than your Ppk, but it is not always true.
Hope this further confused you (just joking!)

#80488

Gabriel
Participant

John and Stan:
Give me a “pretty stable” process with a “pretty normal” distribution, and 50% of the time I will give you a “calculated Ppk” (or Ppk(hat)) that is greater than the “calculated Cpk” (or Cpk(hat)), with the calculated Cpk being greater the other 50% of the time.
In fact, you don’t need to give me such a process. There are a few processes like this where I work. In those processes, we calculate the “within subgroup variation” (Rbar/d2, used for Cpk) and the “total variation” (S of the whole sample, used for Ppk). We do it monthly. Then we plot both “variations” on a timeline, and on the same chart we also plot what we call the “assumed variation” (the variation that defines the control limits, or what we assume to be the “actual process variation”), which is a constant horizontal line unless we change the control limits. It is nice to see how both variations (within subgroup and total) oscillate around the “assumed variation” randomly and independently of each other (showing, by the way, that the process is stable and that the control limits were well defined).
At the beginning I asked for a “pretty normal distribution”. This is because Rbar/d2 is an estimator of sigma for the normal distribution only. In stable processes that are not normally distributed, you can see how the “within variation” and the “total variation” stabilize around two different levels, with only the “within variation” oscillating around the “assumed variation” (because for the control limits you also use Rbar/d2).
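
The "50% of the time" behaviour can be checked with a rough simulation. This sketch assumes a perfectly stable normal process with sigma 1 and a made-up sampling plan of 25 subgroups of size 5 per "month"; the helper name `monthly_estimates` is hypothetical and none of these numbers come from the post.

```python
import random
import statistics

D2_N5 = 2.326  # standard d2 constant for subgroups of size 5

def monthly_estimates(rng, num_subgroups=25):
    """Simulate one month of a stable process; return (within, total) sigma estimates."""
    subgroups = [[rng.gauss(0.0, 1.0) for _ in range(5)]
                 for _ in range(num_subgroups)]
    ranges = [max(sg) - min(sg) for sg in subgroups]
    within = statistics.mean(ranges) / D2_N5   # Rbar/d2, used for Cpk
    values = [x for sg in subgroups for x in sg]
    total = statistics.stdev(values)           # S of the whole sample, used for Ppk
    return within, total

rng = random.Random(2024)
months = [monthly_estimates(rng) for _ in range(1000)]
# Ppk(hat) > Cpk(hat) exactly when the total estimate is below the within estimate.
share = sum(total < within for within, total in months) / len(months)
print(f"Ppk(hat) > Cpk(hat) in {share:.0%} of simulated months")
```

With no between-subgroup variation in the simulated process, the share should land near one half, in line with the claim above.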

#80736

rams
Participant

Hemanth,
Now I am confused :-).
I thought Cpk computation is based on “within subgroup” variation because of R-bar.
Enlighten me.
rams

#80739

Bess Kitan
Participant

In Six Sigma, Ppk represents long-term process capability and Cpk represents short-term capability. In that case Ppk will always be less than Cpk, considering the long-term process variation of a 1.5 sigma shift.
Hope it is clear now.
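
As a rough worked example of the 1.5 sigma shift mentioned above: this assumes the common rule-of-thumb reading in which the long-term sigma level is the short-term level minus 1.5, and each index is one third of its sigma level. The symbols $Z_{ST}$ and $Z_{LT}$ are introduced here for illustration only, and the result is a convention, not something measured from data.

```latex
\begin{align*}
Z_{ST} &= 3\,C_{pk}, \qquad Z_{LT} = Z_{ST} - 1.5 \\
P_{pk} &\approx \frac{Z_{LT}}{3} = C_{pk} - 0.5 \\
\text{e.g. } C_{pk} = 2.0 \;&\Rightarrow\; Z_{ST} = 6,\quad Z_{LT} = 4.5,\quad P_{pk} \approx 1.5
\end{align*}
```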

#80740

zhou
Participant

First of all, allow me to correct what Hemanth said:
Cpk is min(UPL - CL, CL - LPL) / (3 sigma within), not Tol / (6 sigma between).

We can say that Pp <= Cp ALWAYS. Why?
Cp = tolerance / (6 sigma within); Pp = tolerance / (6 sigma overall).
The difference between Cp and Pp is the difference between “sigma within” and “sigma overall”.
(Sigma overall)^2 = (sigma within)^2 + (sigma between)^2.
Sigma is always >= zero, agreed?
So sigma overall >= sigma within. Here we get the relation Pp <= Cp always.
When it comes to Cpk and Ppk, Ppk can be either less than or greater than Cpk. This is what I have encountered in the real world. Sorry, I can’t illustrate it statistically.

Thanks
Lawrence
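
The Pp <= Cp argument in this post can be written out as a short derivation. It applies to the true (population) sigmas, not to sample estimates, and $T$ is a symbol introduced here for the tolerance width.

```latex
\begin{align*}
\sigma_{\text{overall}}^{2}
  &= \sigma_{\text{within}}^{2} + \sigma_{\text{between}}^{2}
  \;\ge\; \sigma_{\text{within}}^{2}
  && \text{since } \sigma_{\text{between}}^{2} \ge 0 \\
\Rightarrow\quad \sigma_{\text{overall}} &\ge \sigma_{\text{within}} \\
\Rightarrow\quad P_p = \frac{T}{6\,\sigma_{\text{overall}}}
  &\le \frac{T}{6\,\sigma_{\text{within}}} = C_p
\end{align*}
```

Equality holds only when the between-subgroup sigma is zero.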

#80743

Hemanth
Participant

Hi,
I am sorry. Yes, you are right that we consider the within-subgroup variation when calculating control limits. I guess it was a long vacation for me and the hangover from it did this. However, the explanation for Cpk being greater than Ppk still holds: when calculating Cpk you take only a part of the overall variation in the denominator, whereas in Ppk it is the overall variation.
regards
hemanth

#80751

Gabriel
Participant

Lawrence:
What you said is true:
“Sigma overall >= sigma within. Here we get the relation Pp <= Cp always.”
The problem is that you can know neither the actual values of “sigma overall” and “sigma within” nor the actual values of Pp and Cp, only estimators based on sampling.
Imagine that you have a stable process. In this case the actual values of “sigma overall” and “sigma within” are equal. Let’s call this value just “sigma”.
In this case, Sqrt(Sum((X - Xbar)^2) / (n - 1)), i.e. the estimator of “sigma overall”, will be an estimator of what we called “sigma”, and Rbar/d2, i.e. the estimator of “sigma within”, will be another estimator of the same “sigma”. An estimator can be higher or lower than the actual value that it estimates. So both estimators can be, independently of each other, higher or lower than “sigma”. On average, each estimator will be the higher of the two about half of the time. This is not just “statistical theory”. This happens in the real world.
Conclusion: if we are talking about the actual (not knowable) values, Pp <= Cp and “sigma overall” >= “sigma within”, with the “=” cases happening only in a stable process. If we are talking about the calculated values, the probability of Pp < Cp is about one half if you have a stable process, close to certain if you have a clearly unstable process (where the actual “sigma overall” >> “sigma within”), and something in the middle if you have a “pretty stable” (or “a bit unstable”) process (where the actual “sigma overall” is just a bit higher than the actual “sigma within”).
