iSixSigma

Could anyone tell me when I should use Cpk or Ppk? Thanks!

  • #66871

    sundeep singh
    Member

Cpk is for the short term.
Ppk is for the long term.
If we feel that short-term data can solve our problem, we can use Cpk; otherwise, use Ppk.

    0
    #66873

    Praneet
    Participant

    Hi Preston,
I shall attempt to answer your query. Basically, let us get the fundamentals clear.
Cpk = "Process Capability Index", an index of how good your process is.

Ppk = "Process Performance Index". This index basically tries to verify whether the sample you have generated from the process is capable of meeting the customer CTQs (read: requirements). It differs from process capability in that process performance only applies to a specific batch of material. Samples from the batch may need to be quite large to be representative of the variation in the batch. Process performance is only used when process control cannot be evaluated, for example in a short pre-production run.
Process performance generally uses the sample sigma in its calculation; process capability uses the process sigma value determined from either the moving range, range, or sigma control charts.

    The calculated Process Performance and Process Capability indices will likely be quite similar when the process is in statistical control.

One more suggestion: instead of manually computing the Cpk and Ppk figures, try using Minitab, a statistical software package, for your computations.
When you enter the data for a process capability study, the software generates a graph that reports both the Cpk and the Ppk.
Hope I have helped you in clearing your doubts about Cpk and Ppk.
    Regards
    Praneet

    0
    #66885

    “Ken”
    Participant

    Preston,

If the data are in time order, Cpk would be the estimate of choice. If the data are not in time order, or are randomly distributed, then Ppk is your alternative estimate of choice.

If the data are in time order then both estimates can be made. While there is debate over the value of one versus the other, the general line of thought is that Ppk gets closer to what the customer ultimately sees as the output from your process. You should always try to look at both estimates, if possible, before making an official report. If Ppk varies by more than about 20% from Cpk, then ensure the data are statistically stable. If the data are not stable, then it's best not to report these estimates until you can identify and remove the sources of special-cause variation.

    Ken

    0
    #27362

    preston
    Participant

As we all know, Cpk means "production capacity" and Ppk means "production performance". What I want to know is whether there is any difference in the conditions under which they apply, and whether we could use both of them at the same time even though they have different calculating formulas.

    0
    #66895

    Michael Whaley
    Participant

    Preston:

    Ppk produces an index number (like 1.33) for the process variation. Cpk references the variation to your specification limits. If you just want to know how much variation the process exhibits, a Ppk measurement is fine. If you want to know how that variation will affect the ability of your process to meet customer requirements (CTQ’s), you should use Cpk. As a practical consideration, the Ppk study requires less work. In Six Sigma, I prefer to do several Ppk studies to identify the points in the process with the highest variation. After attacking those processes, I’ll do a Cpk study to confirm the process’ ability to consistently meet (or exceed) the customer’s requirements.

    Hope this helps,

    Michael

    0
    #66902

    Eoin
    Participant

Interesting discussion. I find that the similarities and differences between Ppk and Cpk can give a lot of insight into a process and how to approach improving it.

Cpk is calculated using an estimate of the standard deviation computed as R-bar/d2.
Ppk uses the usual form of the standard deviation, i.e. the square root of the variance (the sum of squared deviations divided by n-1).

The R-bar/d2 estimate of the standard deviation has a smoothing effect, and the Cpk statistic is less sensitive than Ppk to points that lie far from the mean.

As Ppk and Cpk converge, the reliability of the process increases. This has implications for the effectiveness of your experimentation.

    It could be argued that the use of Ppk and Cpk (with sufficient sample size) are far more valid estimates of long and short term capability of processes since the 1.5 sigma shift has a shaky statistical foundation.

    I prefer to use Ppk when speaking to our internal customers when determining the performance of the process and when setting goals and metrics.

    Hope this is useful.

    Best wishes,
    Eoin

    0
    #66907

    Jim Parnella
    Participant

    Preston,
    To add to the information already posted:
Cpk is calculated using RBar/d2 or SBar/c4 for sigma in the denominator of your equation. This calculation of sigma REQUIRES the process to be in a state of statistical control. If it is not in control, your calculation of sigma (and hence Cpk) is useless; it is only valid when the process is in control. Cpk is called the "process CAPABILITY index". Think of it this way: Cpk tells you what the process is CAPABLE of doing in the future, assuming it remains in a state of statistical control.

Ppk uses the sample standard deviation in the denominator. It does not require statistical control to be valid. Ppk is called the "process PERFORMANCE index". Think of it this way: Ppk tells you how the process has performed in the past. You cannot use it to predict the future, as you can with Cpk, because the process is not in a state of control.

    Think of it a different way: You can use Cpk as a measure of PAST performance and FUTURE capability (assuming statistical control is maintained). Ppk however can only be used as a measure of PAST performance.

Also, as stated previously, the values of Cpk and Ppk will converge to almost the same value when the process is in statistical control. That is because sigma and the sample standard deviation will be identical (at least as far as can be distinguished by an F-test). When the process is out of control, the values will be distinctly different, perhaps by a very wide margin.
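As a rough illustration of the two denominators, here is a minimal Python sketch (not from the original post; the spec limits and the data are invented):

```python
# Minimal sketch: Cpk from Rbar/d2 (within-subgroup sigma) vs. Ppk from the
# overall sample standard deviation. Spec limits and data are assumptions.
import numpy as np

D2 = 2.326  # bias-correction constant for subgroups of size 5

def capability_indices(subgroups, lsl, usl):
    """subgroups: 2-D array with one row per subgroup of equal size."""
    data = subgroups.ravel()
    mean = data.mean()
    rbar = np.mean(subgroups.max(axis=1) - subgroups.min(axis=1))
    sigma_within = rbar / D2           # requires statistical control to be meaningful
    sigma_overall = data.std(ddof=1)   # plain sample standard deviation
    cpk = min(usl - mean, mean - lsl) / (3 * sigma_within)
    ppk = min(usl - mean, mean - lsl) / (3 * sigma_overall)
    return cpk, ppk

rng = np.random.default_rng(1)
in_control = rng.normal(10, 1, size=(25, 5))
print(capability_indices(in_control, lsl=7, usl=13))  # Cpk and Ppk come out close
```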

    I hope this helps you decide which to use.

    Jim

    0
    #66911

    ed rosa
    Participant

I've always reported Ppk:
1. At the beginning of a long production run of a part, so I could capture both the assignable and the non-assignable causes.

2. Cpk (process in control) is calculated and reported after finding the "root causes" and eliminating them from the process.

Absolutely important:
3. Now let's assume that there are safety or critical dimensions or parameters in the part, and the process still produces "stragglers", which are detected and removed from the system and not included in the chart, so the process charts show "in control". A reasonable Cpk is therefore reported. The customer will sooner or later find out that this is happening, will be alarmed that your process is not robust, and will require that a Ppk, which includes the stragglers, be reported, so the actual extent of a potential problem can be quantified. Avoid this situation at all costs!

    0
    #66935

    SAMIR MISTRY
    Member

A Ppk analysis is a must to understand the process behaviour and hence its performance. The process should remain untouched while doing the study. The behaviour should then be studied to remove all assignable causes until a Ppk value >= 1.67 is achieved. After achieving Ppk 1.67, the process behaviour is known, with all assignable causes eliminated. Use the control limits calculated in the Ppk chart for the ongoing process capability (Cpk). It then becomes a monitoring tool for whether the process operates within the given control limits, which are realistic and statistically calculated.

    0
    #66970

    Kim Niles
    Participant

Montgomery states in his book (Douglas C. Montgomery, "Introduction to Statistical Quality Control", 4th ed., Wiley & Sons, New York, 2001, p. 372) that in 1991 the Automotive Industry Action Group (AIAG) was formed, with one of its objectives being to standardize industry reporting requirements. He says that they recommend Cpk when the process is in control and Ppk when it isn't. Montgomery goes on to get really personal and emotional about this, which is unique to this page among this book and the other books of his that I have. He thinks Ppk is baloney, as he states: "Ppk is actually more than a step backwards. They are a waste of engineering and management effort – they tell you nothing".

While Montgomery gets frustrated over the use of Ppk, he does a poor job of explaining what a stable process is. I respect his work just the same, as he alone has even attempted to explain the difference between a stable process and a non-stable one.

    I plan to continue this debate under a different heading (see: What is a stable process?).

    I hope this helps.
    Sincerely,
    KN – http://www.znet.com/~sdsampe/kimn.htm

    0
    #67497

    Chantal
    Participant

    Preston,
In fact there are four such measures: Cp, Cpk, Pp and Ppk. The first two compute the index with respect to the subgrouping of your data (different shifts, machines, operators, etc.), while the last two are for the whole process (no subgrouping). For both Cpk and Ppk, the "k" stands for the centering ("centralizing") factor: it means the index takes into consideration the fact that your data may not be centered (and hence your index will be smaller).
It is more realistic to use Pp and Ppk than Cp or Cpk, as the process variation cannot then be tampered with by inappropriate subgrouping. However, Cp and Cpk can be very useful for knowing whether, under the best conditions, the process is capable of fitting into the specs or not. They basically give you the best-case scenario for the existing process.
I hope this helped you a bit. If you have any more questions, don't hesitate to contact me.
    Chantal
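To make the "k" centering factor concrete, here is a small numeric sketch (the limits, mean, and sigma below are made up, not from Chantal's post):

```python
# The centering factor: Cp ignores where the mean sits; Cpk = Cp * (1 - k),
# where k is the distance of the mean from the tolerance midpoint as a
# fraction of half the tolerance. All numbers below are assumptions.
lsl, usl = 7.0, 13.0
mean, sigma = 10.9, 1.0

cp = (usl - lsl) / (6 * sigma)
k = abs(mean - (usl + lsl) / 2) / ((usl - lsl) / 2)
cpk = cp * (1 - k)

# Equivalent one-sided form used elsewhere in this thread
cpk_check = min(usl - mean, mean - lsl) / (3 * sigma)
print(cp, k, cpk, cpk_check)  # 1.0 0.3 0.7 0.7
```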
     

    0
    #78282

    Binh Nguyen
    Participant

Hi Chantal,
Why do I see in the spec requirements for some tests Ppk > 1.66 and for others Cpk > 1.33? For the calculation of Cpk and Ppk, what is the difference in the formula, or are they the same except for the quantity of samples?
Another question: could I use a DE histogram to calculate Ppk, or only Cpk?
Thank you in advance.
    B. Nguyen
     

    0
    #78287

    Quainoo
    Member

Hi Binh,
Ppk and Cpk are similar; the only difference is the method by which we calculate the sigma for each.
Normally we just use Cpk, with the formula we practiced: min(mean – LSL, USL – mean) / 3 sigma.
Only if you have many sample data in small subgroups do you see Ppk and Cpk together.
For the sigma in Cpk, it is calculated as R-bar / d2.
For the sigma in Ppk, it is calculated from all the individual data.
Hope this is helpful.
     
    Vincent, China
     
     

    0
    #78382

    Binh Nguyen
    Participant

Thanks, Vincent.
Another question: how should I choose the sample size?
I have 25 wafers in one lot and each wafer has 2000 die.
Could I calculate Ppk from a sample of 2000 units built from one wafer to represent the lot? How big should the sample be to get the Ppk?
    Best Rgds.
    B. Nguyen 

    0
    #86669

    E Hargrove
    Participant

    Under “ideal” conditions the result of Cpk and Ppk should be the same.  Ideal conditions being:  normal distribution, sufficient data set size, non-autocorrelated data, data in statistical control.
    Strictly speaking the use of moving range is an inferential for standard deviation.  Anything that would cause moving range / d2 to deviate from the standard deviation would, again strictly speaking, invalidate the use of moving range and therefore Ppk.  However, the characteristic that Ppk is less sensitive to these deviations does provide some pragmatic uses of Ppk whether it’s theoretically “valid” as an inferential for Cpk or not.  These results would only have merit, or use, if the differences were compared against other Ppk results.  In other words, if repeatability or trend improvement is the primary concern then Ppk is quite adequate, but if statistical accuracy is required Cpk will be the most “defendable.”
    Ppk is less sensitive to deviations from these ideal conditions because it is based on a linear equation whereas Cpk is based on a second-order equation.
    The choice of which to use depends on how sensitive you wish to be to these non-ideal conditions occurring in your data.  If you prefer stable results where long-term trends are of more interest then Ppk is preferable.  If you prefer quick notification of variation then Cpk may be preferable.  If you choose Cpk and you have frequent incursions of variability then your Cpk may involve a large amount of noise or variation within itself.
In control applications in the chemical, refining, and pulp & paper industries, I prefer to use Ppk since it is more stable with respect to the multiple sources of variability in the process and the usually lengthy amount of time needed to put a corrective action in place. Particularly relevant is that infrequent but "sharp" spikes in the data (common in many control situations) will translate into very large changes in Cpk, whereas Ppk will mitigate these to what I consider a more meaningful qualitative assessment of the control.
    It may help to put forth some effects of non-ideal conditions:
Non-normal distributions: for a normal distribution, Cpk=1.0 implies that 0.27% of the product will be off-spec. For a non-normal distribution, Cpk=1.0 implies that the off-spec product will be equal to the sum of the % below 3s and the % above 3s. While the tails of the non-normal distribution "could" sum to 0.27%, it is unlikely. Ppk uses the value of d2 in its calculation, and d2 is only valid for normal distributions. So strictly speaking, Ppk should not be used for non-normal distributions, but again there are useful applications for comparing Ppk values amongst themselves, as long as you understand that you have departed from using Ppk to infer Cpk.
    Having an insufficient size of data will impact both values adversely, but I believe the influence of the square root in the Cpk calculation for standard deviation would be more likely to require a larger data set to settle on a stable value.
    Autocorrelated data.  As long as the data set is large enough, autocorrelation should not significantly impact Cpk.  However, autocorrelated data will have a severe impact on Ppk.  If the data is autocorrelated the moving ranges will tend to be smaller thus understating the implied standard deviation and overstating the respective implied Cpk.
    Data not in statistical control invalidates both Cpk and Ppk and neither result should be considered reliable.
    Hope some of this helps.  Any comments are more than welcome.
     

    0
    #86671

    Gabriel
    Participant

Some comments:
We seem to have inverted definitions of Cp and Pp. For me (and for most people in this forum) Cp uses Rbar/d2 and Pp uses the square root of the variance (ref: AIAG's SPC manual). If you swap Cpk and Ppk in your text, most things look fine to me.
The estimation of sigma using the sample standard deviation is considerably more efficient than using Rbar/d2. That means that to get the same confidence interval width in the estimation you will require a larger sample size when using Rbar/d2. If I understood correctly, this is the opposite of what you said.
You said: "Non-normal distributions: for a normal distribution, Cpk=1.0 implies that 0.27% of the product will be off-spec. For a non-normal distribution, Cpk=1.0 implies that the off-spec product will be equal to the sum of the % below 3s and % above 3s." This is correct only if Cp=1.0 too. For example, if Cp=10.0 and Cpk=1.0 you will have only 0.135% off-spec if normal; the other specification limit would be too far from the distribution to have any part of the tail beyond it.
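Those percentages are easy to verify under the normality assumption; a quick check (my own sketch, not part of the original exchange):

```python
# Tail areas of a normal distribution beyond 3 sigma, matching the 0.27% and
# 0.135% figures quoted above.
from scipy.stats import norm

one_tail = norm.sf(3.0)       # ~0.00135
print(2 * one_tail)           # Cp = Cpk = 1.0: both limits at 3 sigma -> ~0.27%
print(one_tail)               # Cp >> 1 but Cpk = 1.0: only one limit nearby -> ~0.135%
```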

    0
    #86686

    Hemanth
    Participant

Hi Preston. Hmmm, a VERY INTERESTING discussion. I got many new things to learn here. BUT if I were you I would go back and check some AUTHENTIC literature on capability (I strongly recommend that). One such book is by Montgomery, which is mentioned in one of the messages.
Now it's my turn to confuse (I am sorry for that), but as I understand it the Ppk index is used when you are validating a newly developed process, something which you do by conducting a PPAP (refer to a QS9000 manual). The Cpk index is used when you have a continuous production process and you wish to know if your process is still capable of meeting your specs. I would suggest that if you are conducting a PPAP you use the Ppk index, and when you have a continuous process use the Cpk index for regular monitoring; don't monitor them simultaneously, it will only confuse you.

    0
    #93599

    Chandrasekar
    Participant

Hello sir,
Cpk is the long-term study. Ppk is the short-term study.
Take 25 to 50 pieces = Ppk.
Take one month of data = Cpk. (Please correct your message.)
Hello sir, only one doubt: why in particular does Six Sigma use the Pp value formula?
    Thanks&Regards
    Chandrasekar 

    0
    #93610

    Ron
    Member

    Use Cpk always!
     
    Cpk tells you what is happening NOW!
     
Ppk only gives you a history lesson; in a dynamic, ever-changing process, data from a year ago is of little value.

    0
    #93620

    Gabriel
    Participant

    Nonsense

    0
    #93622

    Sinnicks
    Participant

    Ron,
     
    Sigma long-term is what the customer ultimately sees (ppk), so I wouldn’t necessarily throw it out.  cpk, while very useful, is a snapshot in time of your process, and is used ONLY for the case where the process is in a state of statistical process control (all variation can only be explained by common causes). 
    Mark

    0
    #93663

    Rick Pastor
    Member

    Kim:
     I am not sure why Montgomery would say that Ppk is a step backward.  I have several of his books but not the one in question.  I would love to know how Montgomery defines Cpk and Ppk and to see the section in question.   
    The fact is, if you calculate Cpk and Ppk using the same set of data, the two values should be very close.  The difference in the two lies in how you calculate the standard deviation.  In what I call the old days, before everyone had computers, the standard deviation was calculated using a short cut — R_bar/d2.  With computers there is no need to use a short cut to calculate the standard deviations.
Short-Term vs Long-Term Capability: Here is where, I think, the confusion gets even worse. Since the same set of data gives nearly the same value for Cpk and Ppk, we have to ask what short-term and long-term capability really mean.
Short-term: Examines the data over a short period of time. For example, last week's data, or for a high-volume process maybe the data collected in a single day.
Long-term: Examines the data over a longer period of time, say a month or a year.
The longer-term capability takes into account the drift in the mean. In contrast, the short term is a snapshot of the process. Clearly, a plot of short-term capability as a function of time shows how the capability changed over time. The long-term capability summarizes the change. It is how you use the data to improve the process that is really important.

    0
    #93695

    Gabriel
    Participant

No. The difference between the two methods of estimating the standard deviation is that Rbar/d2 is insensitive to shifts in the mean because it is based on the ranges alone.
    You said: “The fact is, if you calculate Cpk and Ppk using the same set of data, the two values should be very close.” This is true only if there are no shifts in the mean. Have a look:
Dataset #1 (10 subgroups of size 5; each column is one subgroup):

10.38  10.26  11.36   9.80  11.13   9.22  11.19   9.40   9.14   8.70
 9.34  11.18  11.93   9.00  10.80  10.29  10.31   8.90   9.79   9.66
 9.82   9.42   8.01  10.55  10.41   9.60  10.96   9.32   9.07   8.02
10.00  10.60  11.54  10.90  12.86  11.44   7.77  10.42  12.21  10.94
10.75  11.56   9.90   9.51   9.23  12.05  11.01   9.69   9.78   9.40

Sigma estimates: Rbar/d2 = 1.15, S = 1.11

Xbar values: 10.60, 10.55, 10.88, 10.52, 10.25, 10.06, 9.95, 9.55, 10.00, 9.34
Xbarbar = 10.17, UCL(Xbar) = 11.72, LCL(Xbar) = 8.62

R values: 3.92, 3.63, 2.84, 3.42, 3.14, 2.92, 1.41, 2.13, 1.89, 1.52
Rbar = 2.68, UCL(R) = 5.75, LCL(R) = 0.00

Dataset #2 (10 subgroups of size 5; each column is one subgroup):

 9.38   9.26  12.36   8.80  11.63  10.22  12.19   8.40   9.14   8.20
 8.34  10.18  12.93   8.00  11.30  11.29  11.31   7.90   9.79   9.16
 8.82   8.42   9.01   9.55  10.91  10.60  11.96   8.32   9.07   7.52
 9.00   9.60  12.54   9.90  13.36  12.44   8.77   9.42  12.21  10.44
 9.75  10.56  10.90   8.51   9.73  13.05  12.01   8.69   9.78   8.90

Sigma estimates: Rbar/d2 = 1.15, S = 1.56

Xbar values: 11.55, 11.38, 11.52, 11.25, 9.06, 9.60, 8.95, 8.55, 10.00, 8.84
Xbarbar = 10.07, UCL(Xbar) = 11.62, LCL(Xbar) = 8.52

R values: 3.92, 3.63, 2.84, 3.42, 3.14, 2.92, 1.41, 2.13, 1.89, 1.52
Rbar = 2.68, UCL(R) = 5.75, LCL(R) = 0.00

Each dataset has 10 subgroups of size 5 (50 data in total). Below the individual values you can see the values of Xbar together with Xbarbar and the control limits. And below the Xbar data you can see the values of R together with Rbar and the control limits. Note that the Xbarbar and the control limits for Xbar of both sets are about the same

    0
    #93706

    Gabriel
    Participant

Somehow the last paragraph was truncated when posting… Here is the full version:
Each dataset has 10 subgroups of size 5 (50 data in total). Below the individual values you can see the values of Xbar together with Xbarbar and the control limits. And below the Xbar data you can see the values of R together with Rbar and the control limits. Note that the Xbarbar and the control limits for Xbar of both sets are about the same. The values of R are identical in both sets, so Rbar and the control limits for R are also identical. In set #1, the values of Rbar/d2 and S (the sample standard deviation of all 50 individual values together) are very close. Not only that, S is even slightly smaller than Rbar/d2 (something that many people think and say is impossible). In set #2, Rbar/d2 is identical to the same value in set #1 (it must be, since Rbar was identical), but S is 36% larger. Finally, note that all values of Xbar and R are within the control limits in both sets.
So your statement "if you calculate Cpk and Ppk using the same set of data, the two values should be very close" is true in one case, but not in the other. Do you know why?
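A quick way to reproduce the effect behind these two datasets is to simulate it; a sketch along those lines (the data and shift sizes here are invented, not Gabriel's actual values):

```python
# Adding a constant to every value of a subgroup leaves that subgroup's range
# unchanged, so Rbar/d2 stays put while the overall standard deviation S grows.
import numpy as np

D2 = 2.326  # subgroup size 5
rng = np.random.default_rng(0)

base = rng.normal(10, 1, size=(10, 5))               # stable: common causes only
shifted = base + rng.normal(0, 1, size=(10, 1))      # one constant added per subgroup

for name, x in (("stable", base), ("shifted", shifted)):
    rbar_d2 = np.mean(np.ptp(x, axis=1)) / D2        # within-subgroup estimate
    s = x.ravel().std(ddof=1)                        # total (overall) estimate
    print(f"{name}: Rbar/d2 = {rbar_d2:.2f}, S = {s:.2f}")
```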

    0
    #93721

    Statman
    Member

    Hi Gabriel,
     
Is it because process B has a 1.5 sigma mean shift? :-)
     
    Just to add to your point.  The variation calculated within the subgroups for Cpk is an estimate of the inherent, or entitlement level of variation.  The variation calculated as the RMS of all the data is an estimate of the total variation including any temporal, cyclical, or transient special cause variation.  There is risk in using the Ppk as a prediction of future process performance, particularly if the special causes are primarily of the random transient variety.  Even though the Ppk will take in to account the drifts or shifts over the period of data collection, there is no way to predict that the pattern of special causes will exist in the future.  I think this is why Montgomery gets frustrated over the use of Ppk.
     
    One good use of these metrics is in components of variation context.  If I perform a COV on your data:
     
    Process A:
    Variance Components
     
    Source   Var Comp.   % of Total  StDev
    Between     -0.014*        0.00  0.000
    Within       1.256       100.00  1.121
    Total        1.256               1.121
     
    Process B :
    Variance Components
     
Source   Var Comp.   % of Total  StDev
    Between      1.274  50.37  1.129
    Within       1.256  49.63  1.121
    Total        2.530         1.591
     
You can see that Process A is performing at its level of entitlement, whereas Process B sees 50.37% of its total variation contributed by process management (for lack of a better term), with the same potential performance level (entitlement) as Process A.

The proper use of Ppk should be to assess the amount of improvement that can be made through DMAIC projects versus the need for process redesign. Makes more sense than adding a 1.5 shift to process A.
     
    Cheers,
     
    Statman
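For readers who want to reproduce this kind of components-of-variation table, here is a rough sketch (my own, on invented data, not Statman's actual output) using one-way ANOVA mean squares:

```python
# Between- and within-subgroup variance components from subgrouped data,
# estimated from one-way ANOVA mean squares. Data are invented.
import numpy as np

rng = np.random.default_rng(2)
g, n = 10, 5
data = rng.normal(10, 1, size=(g, n)) + rng.normal(0, 1, size=(g, 1))

ms_within = data.var(axis=1, ddof=1).mean()           # pooled within-subgroup variance
ms_between = n * data.mean(axis=1).var(ddof=1)        # n * variance of subgroup means
var_within = ms_within
var_between = max((ms_between - ms_within) / n, 0.0)  # ANOVA estimator, floored at zero
var_total = var_within + var_between

for name, v in (("Between", var_between), ("Within", var_within), ("Total", var_total)):
    print(f"{name:8s} {v:8.3f}  {100 * v / var_total:6.2f}%  {np.sqrt(v):6.3f}")
```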

    0
    #93735

    Rick Pastor
    Member

    You, Gabriel, have provided two excellent sets of data.  The two sets of data show why estimating sigma from R_bar/d2 is so dangerous (d2=2.326 for a subgroup of 5).  Estimating s from R_bar/d2 assumes that the distribution is normal.  The following will show that it is better to use the definition of standard deviation, the formula, to calculate s. 
     
Before I go through what I have seen in the data, let me state that I used both JMP and Excel for my calculations. Both data sets were examined in two ways: a distribution analysis and a control chart. I make the arbitrary assumption that the USL and LSL equal 12.4 and 7.4 respectively.
     
Data Set 1: The data is normally distributed and Cpk ≈ Ppk.
Distribution analysis:
1. 10 subgroup averages: goodness-of-fit p-value = 0.8061 (JMP)
2. 50 individual points: goodness-of-fit p-value = 0.8598 (JMP)
3. Summary: the data is normally distributed.
Control chart:
4. Subgroup size of 5, applying Shewhart's eight tests: the data indicates that the process is in control. The data passed all eight tests. (JMP)
Excel calculations:
5. Cpk = 0.644 (Excel)
6. Ppk = 0.668 (Excel) (agrees with the value calculated in JMP)
7. Summary: Cpk ≈ Ppk

Data Set 2: The data is "not" normally distributed and Cpk > Ppk.
Distribution analysis:
1. 10 subgroup averages: goodness-of-fit p-value = 0.0601 (JMP)
2. 50 individual points: goodness-of-fit p-value = 0.0098 (JMP)
3. Summary: the data is not normally distributed.
Control chart:
4. Subgroup size of 5, applying Shewhart's eight tests: the data indicates a process that is out of control. (Two out of three points in a row in Zone A or beyond detects a shift in the process average or an increase in the standard deviation. Any two out of three points provide a positive test.) (JMP)
Excel calculations:
5. Cpk = 0.673 (Excel)
6. Ppk = 0.499 (Excel) (agrees with the value calculated in JMP)
7. Summary: the process is not stable; Cpk > Ppk
     
When a process is not stable, the data is not normally distributed, and the value of Cpk can be larger than Ppk. What this means is that the estimate of s does not pick up the shift in the process mean. In contrast, the calculated value of s captures the variability of the process and gives a smaller value for Ppk.

Gabriel, I stand corrected for failing to reveal the assumptions behind my statement. Cpk and Ppk are roughly equal when the same set of data is used and the data is normally distributed. In other words, a stable process with normally distributed data gives Cpk and Ppk that are roughly equal. Therefore, Cpk for a process that is not stable can make the process appear better than it actually is. That is why reported values of Cpk or Ppk should include the calculation method.
     
    Because many managers and engineers are barely familiar with Cpk and even less familiar with Ppk, I calculate Ppk while I use the word Cpk in discussions. 
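The same checks can be scripted; a rough Python analogue of the workflow above (this is my own sketch, not Rick's JMP/Excel workbook, reusing the arbitrary limits LSL = 7.4 and USL = 12.4 and standing in a Shapiro-Wilk test for JMP's goodness-of-fit test):

```python
# Normality check on the individual points, then Cpk (Rbar/d2) and Ppk
# (overall sample standard deviation) against the assumed spec limits.
import numpy as np
from scipy import stats

D2 = 2.326            # subgroup size 5
LSL, USL = 7.4, 12.4  # the arbitrary limits assumed in the post above

def capability_study(subgroups):
    data = subgroups.ravel()
    _, p_normal = stats.shapiro(data)                     # goodness-of-fit p-value
    mean = data.mean()
    sigma_within = np.mean(np.ptp(subgroups, axis=1)) / D2
    sigma_overall = data.std(ddof=1)
    cpk = min(USL - mean, mean - LSL) / (3 * sigma_within)
    ppk = min(USL - mean, mean - LSL) / (3 * sigma_overall)
    return p_normal, cpk, ppk

rng = np.random.default_rng(3)
print(capability_study(rng.normal(10, 1, size=(10, 5))))  # stable case: Cpk close to Ppk
```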
     
     
     

    0
    #93757

    Rick Pastor
    Member

    Statman:
I confess that I am a little confused by the statement that it is better than adding a 1.5 sigma shift to process A. Nevertheless, I would say that Montgomery and I would have different points of frustration (I would love to see the exact pages where Montgomery discusses the issue). When I examine the difference between Cpk and Ppk, I just see that Cpk uses a shortcut to calculate sigma. That shortcut requires the data to be normally distributed.

    0
    #93773

    Statman
    Member

    Hi Rick,
     
    The 1.5 comment was my feeble attempt at sarcasm.
     
    I have not read the Montgomery’s discussion of the issue.  I am only assuming based on my experience why he would have an issue with how Pp can be misused.
     
    I understand where you are coming from on the issue of “normality” with Gabriel’s distribution B.  However, I think that there needs to be clarity in that you are talking about the shape of the histogram of all the data points, not that the data is normal.  If you subtract Gabriel’s process A data from his process B data you will find that Gabriel has simply taken a random normal data series (Process A) and added constants to the subgroups.  So the underlying distribution in each process is normal but process B is not stable due to shifting the subgroups in various directions to various degrees.  Therefore, Process A is a random normal and stable process where process B is a process that is inherently normal but not stable.  In other words, process B has subgroups with normally distributed data but the averages of the subgroups do not differ due to only random normal variation.
     
    There is a risk in using the histogram of all the data and subjecting it to a normality test.  A process that is not stable can have a normal shape in the histogram of all the data.  A process that is stable can have a non-normal shape in the histogram.   This is because you can have any of the four situations: Stable/Normal, Stable/Non-normal, Not stable/normal, Not stable/non-normal.
     
You are right that there is a risk in predicting the future capability of a process based solely on the Cpk, as the Cpk assumes a stable process. However, both Cpk and Pp assume normality when used to predict the non-conformance rate relative to the specs. There is no advantage (from a normality perspective) in one metric over the other.
     
    As I showed in my last post, my preference is to use a COV approach to establish the percent contribution to the process due to process management and to get an estimate of the entitlement level of capability.  If, as in the case of Gabriel’s data, the process can be assumed inherently normal, then the Cp will indicate the best possible performance of the process and the predicted non-conformance rate at that level. 
     
    Statman

    0
    #93778

    Statman Too
    Member

Based on the short-term standard deviation, we compute Cp and Cpk. Using the long-term standard deviation, we compute Pp and Ppk. Using Rbar/d2 is a rather pointless exercise given the proliferation of computers. Besides, one should use d2* from the Mil-Stds and not d2, for reasons one can easily trace through the literature. Rather than recapitulate what has already been written, I am providing the following from the Dr. Harry Q&A forum:

"The capability of a process has two distinct but interrelated dimensions. First, there is 'short-term capability,' or simply Z.st. Second, we have the dimension 'long-term capability,' or just Z.lt. The short-term (instantaneous) form of Z is given as Z.st = |SL – T| / S.st, where SL is the specification limit, T is the nominal specification and S.st is the short-term standard deviation. The short-term standard deviation would be computed as S.st = sqrt[SS.w / (g(n – 1))], where SS.w is the sum of squares due to variation occurring within subgroups, g is the number of subgroups, and n is the number of observations within a subgroup. It should be fairly apparent that Z.st assesses the ability of a process to repeat (or otherwise replicate) any given performance condition, at any arbitrary moment in time. Owing to the merits of a rational sampling strategy, and given that SS.w captures only momentary influences of a transient and random nature, we are compelled to recognize that Z.st is a measure of 'instantaneous reproducibility.' In other words, the sampling strategy must be designed such that Z.st does not capture or otherwise reflect temporal influences (time-related sources of error). The metric Z.st must echo only pure error (random influences). Now considering Z.lt, we understand that this metric is intended to expose how well the process can replicate a given performance condition over many cycles of the process. In its purest form, Z.lt is intended to capture and 'pool' all of the observed instantaneous effects as well as the longitudinal influences. Thus, we compute Z.lt = |SL – M| / S.lt, where SL is the specification limit, M is the mean (average) and S.lt is the long-term standard deviation. The long-term standard deviation is given as S.lt = sqrt[SS.t / (ng – 1)], where SS.t is the total sum of squares. In this context, SS.t captures two sources of variation – errors that occur within subgroups (SS.w) as well as those that are created between subgroups (SS.b). Given the absence of covariance, we are able to compute the quantity SS.t = SS.b + SS.w. In this context, we see that Z.lt provides a global sense of capability, not just a 'slice in time' snapshot. Consequently, we recognize that Z.lt is time-sensitive, whereas Z.st is relatively independent of time."

From Dr. Harry's definitions, we have Cp = |SL – T| / S.st, Cpk = Cp*(1 – |M – T| / |SL – T|), Pp = |SL – T| / S.lt, and Ppk = Pp*(1 – |M – T| / |SL – T|). From these equations, it would seem that Cp and Cpk are short-term measures of capability, while Pp and Ppk are long-term measures. These same statistics and indices of capability are embedded in the Minitab software under the tool header "Six Sigma" and spelled out in the help menu. From what I can see, Cpk captures "static but momentary" shifts of a transient nature, while "dynamic shifts and drifts" are captured within the Ppk calculation.
Hope this helps
Statman Too
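A small sketch of those quoted definitions in code (the data, target, and spec limit are invented assumptions):

```python
# S.st = sqrt(SS.w / (g*(n-1))), S.lt = sqrt(SS.t / (n*g - 1)), and the
# corresponding Z.st and Z.lt, following the definitions quoted above.
import numpy as np

rng = np.random.default_rng(4)
g, n = 10, 5
x = rng.normal(10, 1, size=(g, n)) + rng.normal(0, 0.8, size=(g, 1))

ss_w = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum()  # within-subgroup sum of squares
ss_t = ((x - x.mean()) ** 2).sum()                       # total sum of squares
s_st = np.sqrt(ss_w / (g * (n - 1)))                     # short-term standard deviation
s_lt = np.sqrt(ss_t / (n * g - 1))                       # long-term standard deviation

SL, T = 13.0, 10.0                                       # assumed spec limit and target
M = x.mean()
z_st = abs(SL - T) / s_st
z_lt = abs(SL - M) / s_lt
print(s_st, s_lt, z_st, z_lt)
```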

    0
    #93802

    Gabriel
    Participant

    Rick:
"When a process is not stable, the data is not normally distributed"
    That’s wrong again. I’ve prepared the following dataset:
Dataset #3 (10 subgroups of size 5; each column is one subgroup):

10.83  11.13  10.00   8.38  11.08   8.56  10.52  10.16   9.11   5.66
 9.79  12.05  10.57   7.58  10.75   9.63   9.64   9.66   9.76   6.62
10.27  10.29   6.65   9.13  10.36   8.94  10.29  10.08   9.04   4.98
10.45  11.47  10.18   9.48  12.81  10.78   7.10  11.18  12.18   7.90
11.20  12.43   8.54   8.09   9.18  11.39  10.34  10.45   9.75   6.36

Sigma estimates: Rbar/d2 = 1.15, S = 1.71

Xbar values: 11.48, 10.51, 10.83, 9.86, 10.31, 9.97, 9.19, 8.53, 9.58, 6.30
Xbarbar = 9.66, UCL(Xbar) = 11.20, LCL(Xbar) = 8.11

R values: 3.92, 3.63, 2.84, 3.42, 3.14, 2.92, 1.41, 2.13, 1.89, 1.52
Rbar = 2.68, UCL(R) = 5.75, LCL(R) = 0.00

    Please make exactly the same analysis you performed for the datasets #1 and #2 and tell me what you find.
    The dangerous thing is not to use one indicator or the other. The dangerous thing is not knowing what the indicator you are using means.
The lack of normality might be a cause of some minor difference between Rbar/d2 and S, because d2 is a constant defined for the normal distribution. However, there is something much more important in the concepts behind Rbar/d2 and S that makes S greater than Rbar/d2 regardless of the shape of the distribution, as this example shows (the distribution of set #3 has a shape that is pretty close to that of set #1). Do you know why?
     

    0
    #93948

    Rick Pastor
    Member

    I have made the same calculation for all three sets of data:  The results are below:  In JMP, the distributions (all 50 points) for sets 1, 2, and 3 appear different. 
     
Now, I have a question for you. What rules do you use to determine if a set of data represents a stable process?
     
     
Data Set 1: The data is normally distributed and Cpk ≈ Ppk.
Distribution analysis:
1. 10 subgroup averages: goodness-of-fit p-value = 0.8061 (JMP)
2. 50 individual points: goodness-of-fit p-value = 0.8598 (JMP)
3. Summary: the data is normally distributed.
Control chart:
4. Subgroup size of 5, applying Shewhart's eight tests: the data indicates that the process is in control. The data passed all eight tests. (JMP)
Excel calculations:
5. Cpk = 0.644 (Excel)
6. Ppk = 0.668 (Excel) (agrees with the value calculated in JMP)
7. Summary: Cpk ≈ Ppk

Data Set 2: The data is "not" normally distributed and Cpk > Ppk.
Distribution analysis:
1. 10 subgroup averages: goodness-of-fit p-value = 0.0601 (JMP)
2. 50 individual points: goodness-of-fit p-value = 0.0098 (JMP)
3. Summary: the data is not normally distributed.
Control chart:
4. Subgroup size of 5, applying Shewhart's eight tests: the data indicates a process that is out of control. (Two out of three points in a row in Zone A or beyond detects a shift in the process average or an increase in the standard deviation. Any two out of three points provide a positive test.) (JMP)
Excel calculations:
5. Cpk = 0.673 (Excel)
6. Ppk = 0.499 (Excel) (agrees with the value calculated in JMP)
7. Summary: the process is not stable; Cpk > Ppk

Data Set 3: The data is not normally distributed and Cpk > Ppk.
Distribution analysis:
1. 10 subgroup averages: goodness-of-fit p-value = 0.1890 (JMP)
2. 50 individual points: goodness-of-fit p-value = 0.0492 (JMP)
3. Summary: the data is normally distributed when viewed as subgroup averages (the central limit theorem). On the other hand, the 50-point distribution indicates that the data is not normally distributed.
Control chart:
4. Subgroup size of 5, applying Shewhart's eight tests: the data indicates that the process is not in control. Subgroups 1 and 10. (One point beyond Zone A detects a shift in the mean, an increase in the standard deviation, or a single aberration in the process. For interpreting Test 1, the R chart can be used to rule out increases in variation.) (JMP)
Excel calculations:
5. Cpk = 0.711 (Excel)
6. Ppk = 0.432 (Excel) (agrees with the value calculated in JMP)
7. Summary: Cpk > Ppk
     

    0
    #93958

    Gabriel
    Participant

Hi Rick,
OK, my 3rd example has failed.
About your question to me, I use the WECO rules.
But what I wanted to point out was not about how stability was determined, or whether the datasets in my examples were stable or not. Here is the story:
You said:
"The fact is, if you calculate Cpk and Ppk using the same set of data, the two values should be very close. The difference in the two lies in how you calculate the standard deviation. In what I call the old days, before everyone had computers, the standard deviation was calculated using a short cut – R_bar/d2. With computers there is no need to use a short cut to calculate the standard deviations."
From my point of view, this is all wrong.
The first two examples showed that, even when computed from the same dataset, Cpk and Ppk (or Rbar/d2 and S) can be not "very close" but pretty far apart. At the end of those examples I asked if you knew why the values were close in one example and not in the other. In your answer, you said things such as:
"The two sets of data show why estimating sigma from R_bar/d2 is so dangerous (d2=2.326 for a subgroup of 5). Estimating s from R_bar/d2 assumes that the distribution is normal" … "When a process is not stable, the data is not normally distributed, and the value of Cpk can be larger than Ppk" … "Gabriel, I stand corrected for failing to reveal the assumptions behind my statement. Cpk and Ppk are roughly equal when the same set of data is used and the data is normally distributed."
I think that many of these things are wrong too, and the other things are misleading in this context.
So I provided a third example that was supposed to have different values of Cpk and Ppk (or Rbar/d2 and S) and still be normally distributed (I succeeded in the first and failed in the second).
So far I have said what is wrong from my point of view, but not why, what is correct, or how I created the examples. Here it goes (if you want a summary, go to the end of the post).
All three examples were created in Excel.
The first was just a set of random values from a normal distribution (µ=10, sigma=1), arranged in subgroups of size 5. So it is fully random (i.e. no special causes) and, whether they pass a normality test or not, the values do belong to a normal distribution (of course the chances are that they would pass, as they did). But normality was not intended to be an issue in this example.
The second example used exactly the same values, in the same positions, as example one, except that I added a constant to each subgroup. For example, the first subgroup of example 1 is 10.38, 9.34, 9.82, 10.00 and 10.75, while in example 2 the same subgroup is 9.38, 8.34, 8.82, 9.00 and 9.75. As you can see, the constant added in this case is -1. The constants (mostly 1, 0 or -1) were added "by hand" to each subgroup, trying to keep the average close to that of the first example and keeping the points within the control limits. I know that this does not mean that the process is stable. In this case, the process is clearly not stable, since in each subgroup there is one special cause of variation (the constant that I added). Again, I hadn't even thought of "normality" in this example.
Why are Rbar/d2 and S so close in the first example but not in the second? Because in the first example the values show only random variation, while in the second example they show non-random variation. Not because the first values are normally distributed and the second values are not. It is true that the d2 values are derived for a normal distribution, but using Rbar/d2 on a non-normal distribution is a second-order error.
The key thing is that, in both examples, all the ranges are identical (because adding a constant to all the values of one subgroup does not change its range), and hence Rbar/d2 is identical. Rbar/d2 is the estimate of the "within subgroup" variation (also called "internal", "inherent", or "short term"), and the within-subgroup variation is identical in both examples. There is always "between subgroup" variation (i.e. even if the process were perfectly stable, not all averages would be equal). In general, part of the "between subgroup" variation can be explained by the "within subgroup" variation (i.e. the subgroup averages are unequal just by chance). In the first example, all of the between-subgroup variation falls in this category, since chance was the only factor involved. Then the estimate of the "total" variation, S (which includes both the within-subgroup and the between-subgroup variation), is very close to the estimate of the "within subgroup" variation, Rbar/d2. In the second example, however, the "between subgroup" variation was way beyond what could be explained by the "within subgroup" variation, because there was more "between subgroup" variation "added by hand" by me when I added a different constant to each subgroup. And that's why, in the second example, the total variation S is much greater than the within-subgroup variation Rbar/d2. Normality or lack of normality is a second-order factor here.
In short, when all (or most) of the variation is due to common causes only, Rbar/d2 and S should be pretty close. When a significant part of the total variation is due to special causes, Rbar/d2 will fall short of the total variation S.
However, this does not make Rbar/d2 "dangerous", or "useless", or "old fashioned". In fact, Rbar/d2 tells you what you could get if you identified and eliminated the special causes of variation. If you need further improvement beyond Rbar/d2, that will not be achieved by using the control chart to identify instability, because even the stable process will not be satisfactorily capable.
After you blamed the lack of normality for the disagreement between Cpk and Ppk, I made the third example. Again, it used exactly the same values, in the same positions, as example 1, but with a constant added to the values of each subgroup. The difference was that, in this case, the constants were not arbitrarily chosen by me. Instead, I took 10 random values from another normal distribution (this one with average 0 and an SD I don't remember) and added each one of these values to the values of each subgroup. The idea was that the shift of the subgroup means would be normally distributed. Well, it seems it failed. But the concept is still there. A set of data can look normal when all the values are put together in a histogram or a normality test, yet the process can still be unstable when you plot the data over time. And then Rbar/d2 will be smaller than S, even when the data pass a normality test. I guess there were too few subgroups and the "special" variation was too large, leaving "holes" in the histogram that made it look non-normal. But the distribution these values were taken from IS normal (it is the sum of two normal distributions).
Summary:
– Rbar/d2 and S are estimators of two different types of variation: respectively, the "within subgroup" variation and the "total" variation.
– When there is significantly more "between subgroup" variation than can be explained by the "within subgroup" variation, the "total" variation and its estimator S will be clearly larger than the "within subgroup" variation and its estimator Rbar/d2. Such a process is unstable. If the process is stable (all variation is random), then the total variation equals the within-subgroup variation, and their estimators are close.
– The gap between these two types of variation gives you an idea of the magnitude of the variation due to special causes present in the process. Such variation due to special causes can sometimes pass undetected in the control chart, so it is important to compare the two values. Rbar/d2 (or Cpk) is the best you can get from S (or Ppk) by removing the special causes of variation.
– Stability and normality are two different and independent concepts. A dataset may be stable or not and, independently, look normal or not. In fact, if a process is not normally distributed, it NEEDS to stay non-normally distributed to be stable (stability = equality of behaviour over time).
– It is not true that Rbar/d2 is used instead of S because it was easier to calculate before computers were available. As said before, they are estimators of different types (or dimensions) of the process variation.
– Instead of Rbar/d2, one could use Sbar/c4 as an estimator of the within-subgroup variation. Rbar/d2 is easier to calculate without the aid of specific software, but Sbar/c4 is more efficient. In this sense, it can be said that Rbar/d2 is more popular because it was easier to calculate when such software was not available. However, it must be understood that the loss of efficiency of Rbar/d2 relative to Sbar/c4 is less than 10% for subgroups of size up to 10.
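A tiny sketch of the two within-subgroup estimators mentioned in that last point (invented in-control data; for subgroups of 5, d2 = 2.326 and c4 = 0.9400):

```python
# Rbar/d2 vs. Sbar/c4: two estimators of the within-subgroup sigma.
import numpy as np

D2, C4 = 2.326, 0.9400
rng = np.random.default_rng(5)
x = rng.normal(10, 1, size=(10, 5))      # 10 in-control subgroups of 5

rbar_d2 = np.mean(np.ptp(x, axis=1)) / D2
sbar_c4 = np.mean(x.std(axis=1, ddof=1)) / C4
print(rbar_d2, sbar_c4)                  # both should land near the true sigma of 1
```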

    0
    #93973

    Gastala
    Participant

    Hi
    Let me put up an actual case I came across recently, which is consistent with Gabriel’s third simulation listed below.
    It concerns a production line that fills bottles of detergent. Every hour the supervisor weighed a subgroup of four bottles for control charting. The variation within each subgroup was very small.
However, the bottle-filling machine also had longer-term fluctuations (over many hours), themselves conforming to a normal distribution. These arose from environmental changes, the pressure of the compressed air, and so on. In any case the company didn't think it worth trying to fix them.
    The supervisor was having difficulty using control charts. The limits were based on the subgroup range and so points were frequently exceeding the limits on the mean chart because of the long term fluctuations. This led to over-adjusting the process (this was apparent from the data).
    What were the options?
     

    0
    #93980

    Savage
    Participant

    Glen,
     
There are a few options that I see. One is to not use a subgroup size of 4. Since the 4 readings have been shown to be very similar, the range will be small, thus the control limits will be tight, thus many samples will be out of control, thus process adjustments are made, which leads to over-control. A sample size of 1 might provide enough good information to help you control, run, and improve your process (see the sketch after this post).
    Another option: instead of taking 4 consecutive readings, space out the time between each of the 4 readings. (This will likely cause the range to increase, and thus the control limits will increase.) The chance of an out-of-control condition will be reduced. Process adjustments will be made, but only when it is shown to be out-of-control.
    Another option: Use Three-Way control charts. This is discussed in Donald Wheeler’s text, Understanding Statistical Process Control, Second edition.
    Another option: Take your samples, but don’t make any adjustments to the process. This won’t be popular to many readers of this forum, but you will discover something about the process. It sounds like there is a lot of over-control, based on signals (out-of-control conditions), that leads to greater problems.
    Matt
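For that first option, the chart of individual values takes its limits from the moving range between successive hourly measurements, which (unlike the range within four back-to-back bottles) reflects the hour-to-hour variation. A minimal sketch with made-up fill weights:

```python
# Individuals (I-MR) chart limits from the average moving range (d2 = 1.128 for n = 2).
import numpy as np

rng = np.random.default_rng(6)
drift = np.cumsum(rng.normal(0, 0.05, size=100))      # slow wander of the filler
x = rng.normal(10, 0.2, size=100) + drift             # hourly individual fill weights

mr_bar = np.abs(np.diff(x)).mean()                    # average moving range
sigma_hat = mr_bar / 1.128
center = x.mean()
print(center - 3 * sigma_hat, center, center + 3 * sigma_hat)   # LCL, centerline, UCL
```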

    0
    #95938

    rich barber
    Member

I'm trying to calculate our process capabilities: X-bar, X-double-bar, range, average range, Cp and Cpk. If you could e-mail me, I could send you the data I have collected in an Excel spreadsheet.

    0
    #95957

    tottow
    Member

    Rich,
         There are other much more qualified people on this forum than me, but if you want me to I will try and help.
     
    [email protected]

    0
    #108246

    vpschroeder
    Member

    Hi Praneet
    What is your opinion on reporting cpk or ppk for chemical processes in which nominal values are maintained manually through chemical additions? We sample chemical baths twice daily and plot the results on an individuals chart.  Additions are made to the baths to bring them close to their nominal values.  The additions are calculated from a formula that takes into account the amount of product exposed to the chemicals.
The internal argument centers on whether we can call this a "random process" in the first place, and then it gets complicated after that when we attempt to cite Cpk values derived from the data.
    what is your thought??
     

    0
    #108253

    migs
    Participant

    I have the same concern. The data points are not obviously independent.
Things get more complicated with the frequency and timing of measurement: before the chemical top-up, after the top-up, or at random?
I would appreciate it if anyone could provide an opinion.

    0
    #108259

    Anonymous
    Guest

    Migs,
There are many processes that have to be adjusted from run to run, such as ink mixing and electro-chemical processes. These processes do not satisfy the requirements of Shewhart charts, although trend charts can be useful.
For electro-chemical processes, a more important concern is the 'state' of the process, such as normality and activity, and the amount of 'carry-over.' Since mass spectroscopy systems and NIR systems are relatively inexpensive these days, these can be used to monitor the amount of contamination, and the best way of doing this is to use multivariate data analysis to study the spectra (PCR). Camo provides chemometric software. (No association, although I have attended their course in Norway.)
A good approach was taken by a colleague who developed a VB software program to record all the monitored data and to act as a kind of 'process supervisor.' In other words, it examines the data and recommends what should be added and the amount. The software also provides 'trend charts' and other diagnostic tools. It then recommends a corrective action and what other tests should be performed. This approach 'transformed' the process! Most work these days aims to reduce variation across the load by changing the chemistry, jigs, etc.
    Andy

    0
    #116984

    Taufik Subagja
    Member

What does Cpk stand for (the acronym of Cpk)?
Does anyone know what the k stands for?

    0
    #119306

    Don LaDue
    Participant

I'm no expert and have forgotten quite a bit because I don't get to use my statistics knowledge a whole lot, but I understand that the Cpk acronym is taken from the Spanish words for "Process Capability Index", hence the "strange" (to English speakers) order of the letters and the "k".
    Does anybody else know for sure?

    0
    #119309

    Anonymous
    Guest

k stands for "katayori"; it means bias…
    Andy

    0
    #131902

    BOISROND
    Participant

Hello Chantal,
Do you understand French? If so, can you confirm the formulas for Cp and Cpk?
Cp = capability expressing the sample dispersion.
Cpk = capability expressing both the dispersion and the centering.
Thanks,
Pascal
     
     
     

    0
    #131918

    Eddie
    Participant

When I translate from French to English:
Cpk indicates how capable the process is of producing consistent and stable output that meets customer requirements and expectations when the process is not centered, but falls on or between the specifications. Ppk does the same job, but on a long-term basis.
Hope it helps.
    Eddie

    0
    #131919

    Flying Sigma Circus
    Participant

    “but falls on or between specifications.”
    Not a true statement.  Research the cause of negative Cpk values.

    0
    #131965

    Eddie
    Participant

My statement is true, but incomplete. It should be:
"but falls on or between the specifications, or some distance outside them." This definition gives meaning to both positive and negative Cpk.
Agree?

    0
    #149600

    Mike Robinson
    Participant

    We use a chemical software program to maintain our baths. We perform periodic tests, daily, weekly, monthly, etc. to determine chemical adds/dilutes. The software allows us to calculate CPK values for each tank.
Our issue is whether to make adds/dilutes to meet the chemical manufacturer's min/max values or to make adds/dilutes to maintain a Cpk >= 1.33.
     

    0
    #155188

    manmath shankare
    Participant

Dear all,
The main difference between Cpk and Ppk is that the index calculated while both assignable and common causes are considered is the preliminary capability index (Ppk), while the index calculated considering only common (chance, or random) causes is the capability index (Cpk); it is calculated when the process is routine.
The sigma in the two formulas also differs: for Ppk it is the sample sigma (s), and for Cpk it is the estimated process sigma only.
I have tried to explain this as best I can.
    Best Regards,
    Manmath.

    0
    #159643

    Robert Tipton
    Member

    CPk stands for …
    Capability Index
    “CP”  for Capability of the process
    Small “k” is used mathematically as an index number.
    Proper use is “CPk”, not Cpk, not cpk, not CPK.
    Method to quickly understand if the process is acceptable or not.
     

    0
    #159645

    The New MB
    Member

    Are  you  sure?

    0
    #159646

    Robert Butler
    Participant

      It’s my understanding that “pk” is supposed to stand for “Process” and K is for the Japanese word “katayori” which means deviation or offset.  There was a nice thread on this issue on the forum awhile back.  The lead post is the following:
    https://www.isixsigma.com/forum/showmessage.asp?messageID=47302

    0
    #159647

    Robert Tipton
    Member

    Yes, I am sure.
    PPk is an index (k) number for the “Process Potential (PP)” study.  
    Typically, this is accomplished via a 30 to 100 piece, short-term capability study.
    The most popular sample size for Process Potential capability studies is 50 pieces.
    The formula is really quite simple…  PPk  =  Z-Minimum divided by 3.   
     
     
    During the early stages of implementing the CPk concept, Ford’s acceptance criterion was +/- 3 sigma, or 99.73% good parts. In general, Larry Sullivan (Ford) is given credit for inventing the CPk concept. Mr. Sullivan felt that Z-values and Z-tables would be too complicated to introduce widely, along with statistics, at Ford. At that time a cost-of-living index was in use at Ford: a simple formula under which, if the index was higher than 1, employees got more money. Larry took this simple idea of “more than 1 is good” and conceived of dividing the minimum
    Z-value by 3, since the acceptance criterion at Ford was +/- 3 sigma. If the worst case (Z-minimum) divided by 3 gave a CPk value greater than 1, all was good. This was simple to train people on: less than 1 is bad, more than 1 is good. As they say, the rest is history. Since that time Ford has increased the acceptance criterion to +/- 4 sigma (99.994%), so the minimum acceptance criterion for CPk is now 1.33. Again, the formula is easy: +/- 4 sigma divided by 3 = 1.33 CPk, minimum.
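    For readers following along, the arithmetic above can be checked in a few lines. The sketch below (in Python, which nobody in the thread is using; the specification limits, mean and sigma are invented illustration values) simply shows that “Z-minimum divided by 3” reduces to the familiar Cpk formula.

```python
# Minimal sketch of the "Z-minimum divided by 3" arithmetic described above.
# The specification limits, mean and sigma are made-up illustration values.
usl, lsl = 10.6, 9.4      # specification limits
mean, sigma = 10.1, 0.15  # estimated process mean and standard deviation

z_upper = (usl - mean) / sigma   # distance to the upper limit in sigma units
z_lower = (mean - lsl) / sigma   # distance to the lower limit in sigma units
z_min = min(z_upper, z_lower)

cpk_from_z = z_min / 3.0                                   # Z-minimum / 3
cpk_direct = min(usl - mean, mean - lsl) / (3.0 * sigma)   # textbook Cpk formula

print(z_min, cpk_from_z, cpk_direct)  # the last two values agree
```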
     
     
    Note: Larry Sullivan went on to establish the American Supplier Institute, a fantastic training organization.
     
     
    During the early 1980s General Motors was using Machine Capability Ratios (MCR) and Process Capability Ratios (PCR), while most people associated CPk with Ford Motor Company. There were some errors in the General Motors formulas, primarily in the calculation of unilateral tolerances. Since we did not think G.M. would want to adopt the Ford CPk standards, we corrected the G.M. formulas so they could keep their MCR and PCR identity. However, to my surprise, G.M., and now the world, has adopted CPk as the standard. It is so much simpler to have one standard to work with; we should all be happy for that.
       
     

    0
    #159686

    Mikel
    Member

    Mr. Tipton,
    You may be correct about the origins and the original notations, but you are way out of date on both the formulas and the notations as they have been used at least since the AIAG standards have been in place. Some differences:
    Cp, not CP
    Cpk, not CPk
    Pp, not PP
    Ppk, not PPk
    Cp and Cpk are now short term (I know you were correct about Pp and Ppk once being short term in Ford dogma).
    Pp and Ppk are now long term.
    Reference page 80 of the AIAG SPC book, where these terms are defined.
    A sample size for a short-term study is also much bigger than you stated.

    0
    #159889

    Robert Tipton
    Member

    I would like to make a couple of clarification points regarding my previous post and the subject matter.
    The original abbreviation and definition of “CPk” was correctly stated. “CPk” stood for Capability of the Process (CP) as an index number (k). CPk was abbreviated as I have shown it.
    Over time this has changed into many versions, such as cpk, Cpk, and CPK. The AIAG SPC training manual typically shows it as “Cpk”. Things do tend to change over time. Based upon the original meaning, I will continue to use CPk; however, it is a personal choice based upon history.
    PPk follows this line of thought. PPk originally stood for the Process Potential study, or Preliminary Process study.
    The intention was to set a higher expectation for short-term capability studies in an effort to better predict long-term performance.
    Typically, the default minimum value for PPk is 1.67 and the minimum for CPk is 1.33.
    The reason is that there will almost certainly be more variation over time than will be discovered in a short-term study.
    Therefore PPk should be greater than 1.67 and CPk greater than 1.33. This really equals one standard deviation of difference: if you attain +/- 5 sigma in a short-term study, you will attain approximately +/- 4 sigma in a long-term study.
    This was the original intention of PPk and CPk, and it is still a logical path today.
    In conclusion, short-term studies should use PPk and long-term studies should use CPk. The definitions of short term and long term vary somewhat in sample size and length of time.
    I hope this helps other people to effectively use these tools.
    Best Regards,
    Robert Tipton
     

    0
    #159892

    Jim Ace
    Participant

    Robert Tipton, your recent post concerning process capability ratios was nicely written, but there is an arguable error contained within your description.  Your post stated that: “In conclusion, short term studies should be using PPk and long term studies should be using CPk.”
    I believe the generally accepted interpretation of capability ratios within the larger community of Six Sigma practitioners can be characterized by the following descriptions.
    1. Cp and Cpk represent short-term capability, also called within-group capability.  These two capability metrics rely solely on Rbar/d2 or the within-group sums of squares to calculate the standard deviation (see the Minitab Help manual).  Calculating in this way ensures that the between-group variation is excluded from influencing the size of the standard deviation.  In this way the resulting capability metrics (Cp and Cpk) represent the best possible condition of the process.
    Cp = (USL - LSL) / (6 * S.within)
    Cpk = Cp(1 - k)
    2. Pp and Ppk represent long-term capability, also called overall capability.  These two capability metrics rely on the total sums of squares, not just the within-group sums of squares.  This means the within-group AND between-group sums of squares are used to estimate the standard deviation.  In this way the overall standard deviation (that is, the long-term standard deviation) includes all the variation that occurred over the total period of sampling, not just the variation that occurred within groups.  The resulting capability metrics (Pp and Ppk) therefore represent the actual capability of the process, because both random causes and assignable causes contribute to the estimate.
    Pp = (USL - LSL) / (6 * S.total)
    Ppk = Pp(1 - k)
    You will find that the reference manual contained within Minitab refers to Cp and Cpk as potential metrics.  This manual also indicates that Cp and Cpk depend solely on within-group variation, whereas Pp and Ppk are overall metrics that rely on the total variation.  Generally you will find that:
    1. Cp will be greater than or equal to Cpk. 
    2. Pp will be greater than or equal to Ppk. 
    3. Pp will be greater than or equal to Cp.
    Although I have restricted my frame of reference for this post, there are other metrics, like Cpm, that should be considered.
    Jim Ace
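    To see the formulas in the post above in action, here is a minimal sketch (Python, with invented subgroup data and invented specification limits; the pooled within-subgroup standard deviation stands in for the “within group sums of squares” estimate). It also cross-checks the Cpk = Cp(1 - k) form quoted above.

```python
import statistics as st

# Made-up subgrouped data: 5 subgroups of 5 measurements each.
subgroups = [
    [10.1, 10.2, 9.9, 10.0, 10.1],
    [10.3, 10.2, 10.4, 10.3, 10.2],
    [9.8, 9.9, 10.0, 9.9, 9.8],
    [10.0, 10.1, 10.0, 9.9, 10.1],
    [10.2, 10.3, 10.2, 10.1, 10.3],
]
usl, lsl = 10.75, 9.25   # illustration-only specification limits

all_data = [x for g in subgroups for x in g]
mean = st.mean(all_data)

# Within-subgroup sigma: pooled from the subgroup variances, so between-subgroup
# shifts are excluded (the "short term" estimate).
pooled_var = sum(st.variance(g) for g in subgroups) / len(subgroups)
s_within = pooled_var ** 0.5

# Overall sigma: ordinary sample standard deviation of all readings,
# i.e. within AND between variation (the "long term" estimate).
s_total = st.stdev(all_data)

cp  = (usl - lsl) / (6 * s_within)
cpk = min(usl - mean, mean - lsl) / (3 * s_within)
pp  = (usl - lsl) / (6 * s_total)
ppk = min(usl - mean, mean - lsl) / (3 * s_total)

# Cross-check the Cpk = Cp(1 - k) form, where k measures how far the mean
# sits from the midpoint of the tolerance, relative to half the tolerance.
k = abs((usl + lsl) / 2 - mean) / ((usl - lsl) / 2)
print(cp, cpk, cp * (1 - k))   # the last two agree
print(pp, ppk)
```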

    0
    #159893

    JENNIFER
    Participant

    With no disrespect to the previous post, I find it troublesome that we insist on making what should be an easy concept difficult for the masses.  Cpk is now used for short-term studies, Ppk for long-term studies.  While both sides make compelling arguments, go with the modern definitions. 
     
     

    0
    #159895

    Greg Kao
    Participant

    Jim Ace, you have great knowledge of process capability; your post helps Six Sigma fans understand process capability more clearly. However, I think you made a typing mistake.
    In your post,
    “Pp will be greater than or equal to Cp” should be modified to
    “Cp will be greater than or equal to Pp”
     

    0
    #159898

    Jim Ace
    Participant

    Greg Kao, you are totally correct! I made a mistake, and it makes me feel a little silly.  I even read the post 3 times before hitting the Post Message button.  I guess this just reinforces the idea that several heads are far smarter than one!  Great catch and thank you very much.  The wording should have been:
     
    Cp will be greater than or equal to Cpk
    Pp will be greater than or equal to Ppk
    Cp will be greater than or equal to Pp
     
    Again, I appreciate your attention to detail.
    Jim Ace

    0
    #159924

    Savage
    Participant

    I would prefer that this be written as:
    Cp will always be greater than or equal to Cpk.
    Pp will always be greater than or equal to Ppk.
    Cp will usually be greater than Pp (there are cases where Pp > Cp). Similarly, Cpk is usually greater than Ppk, but not always.

    0
    #159925

    Hardik
    Participant

    “Cp will always be greater than or equal to Cpk”: this statement is not entirely true! No offense intended. It holds only for a bilateral tolerance and normal data. If the tolerance is unilateral or your distribution is skewed, the Cp value would be less than the Cpk.
    With due respect,
    Hardik

    0
    #159927

    Savage
    Participant

    Hardik,
    No offense taken. Thank you for your comments.
    My opinion is that when you have a unilateral tolerance, Cp is either missing, unnecessary to calculate, or equal to Cpk. I suggest this because the only difference between Cp and Cpk is process centering, and when only one specification exists the notion of centering between the specs does not apply. Also, Cp = (USL - LSL) / (6 * sigma), where sigma is estimated as R-bar (or MR-bar) / d2. So, if we agree on this formula, how should Cp be calculated when one of the limits is missing?
    Matt
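    A small sketch of the R-bar/d2 estimate Matt refers to (Python; the subgroup data and limits are made up, and d2 = 2.326 is the standard control-chart constant for subgroups of five):

```python
# Estimating the within-subgroup sigma from the average range (R-bar / d2).
# The subgroup data are invented; d2 = 2.326 is the usual constant for n = 5.
subgroups = [
    [4.9, 5.1, 5.0, 5.2, 4.8],
    [5.0, 5.1, 5.2, 5.0, 4.9],
    [5.1, 5.3, 5.0, 5.2, 5.1],
    [4.8, 5.0, 4.9, 5.1, 5.0],
]
D2_N5 = 2.326

ranges = [max(g) - min(g) for g in subgroups]   # range of each subgroup
r_bar = sum(ranges) / len(ranges)               # average range
sigma_within = r_bar / D2_N5                    # R-bar / d2 estimate of sigma

usl, lsl = 5.6, 4.4                             # illustration-only limits
cp = (usl - lsl) / (6 * sigma_within)
print(r_bar, sigma_within, cp)
```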

    0
    #159928

    Hardik
    Participant

    Matt,
    Thanks for pointing it out. You are correct; all the SQC manuals say the same. In fact, the Cp value does not matter for a unilateral tolerance. However, what I have seen in my experience is that when you have a unilateral tolerance, or a maximum material condition for some dimension, a lot of CMM reports still give a Cp value, and in all those cases the Cp value is less than Cpk.
    This is the way they calculate it:
    Nominal = 0
    USL = 0.75
    Hence it is a unilateral tolerance, but they take LSL = 0.
    Moreover, you can’t deny the fact that skewed distributions can have Cp less than Cpk. Let me know to what extent my understanding is true!
    Cheers,
    Hardik

    0
    #159929

    Savage
    Participant

    Yikes!
    “the way they calculate it. Nominal=0 USL=0.75 hence it’s unilateral tolerance Take LSL=0”
    Wow. They are artificially creating an LSL when it should be missing.
    For example, if they set LSL = 0 and the X-bar of the process is 0.375, no harm is done. But if the process X-bar moves closer to 0 (which is better), the Cpk gets worse. In other words, as the process improves, the Cpk gets worse. This doesn’t make much sense to me. I would ask the manufacturer of the CMM software not to add an artificial limit, such as 0, when it should be left missing.
    I guess the problem is that setting LSL = 0 is wrong because it changes the result for Cpk and Ppk once the process average is closer to the artificial LSL than to the USL.
    Just my 2¢
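    A quick numeric sketch of Matt’s point (Python; the sigma value is arbitrary and only for illustration): with the artificial LSL = 0 in place, the two-sided Cpk falls as the process mean improves toward the nominal of 0, while a properly one-sided calculation keeps improving.

```python
# With USL = 0.75 and an artificial LSL = 0 (nominal is 0), watch Cpk fall
# as the mean improves toward 0.  Sigma is an arbitrary illustration value.
usl, artificial_lsl = 0.75, 0.0
sigma = 0.05

for mean in (0.375, 0.25, 0.10, 0.05):
    cpk_two_sided = min(usl - mean, mean - artificial_lsl) / (3 * sigma)
    cpk_one_sided = (usl - mean) / (3 * sigma)   # what a unilateral spec should report
    print(mean, round(cpk_two_sided, 2), round(cpk_one_sided, 2))
```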

    0
    #159931

    Hardik
    Participant

    Yeah Matt, I totally agree with you. I found a “mean deviation” value on the same report I am referring to. That’s something new to me. Do you have any idea about it? Is there any relation between mean deviation and the mean? Awaiting your valuable response.

    0
    #159948

    Savage
    Participant

    No, I am not familiar with the term “mean deviation.” Perhaps others in this forum can help.
    The following may be unrelated, but I will see if this might be what they are referring to.
    I am familiar with “mean” and “standard deviation” and when you take (standard deviation / mean) this is often referred to as the coefficient of variation.
    The coefficient of variation is a measure of how much variation exists in relation to the mean. Another way to describe it is a measure of the significance of the sigma in relation to the mean. The larger the coefficient of variation, the more significant the sigma, relative to the mean.
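    For what it’s worth, a tiny sketch (Python, made-up data) of both quantities: the “mean deviation” the CMM report most likely refers to (the mean absolute deviation, per the links posted a few replies below) and the coefficient of variation described above.

```python
import statistics as st

data = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0]   # made-up measurements

mean = st.mean(data)
sigma = st.stdev(data)

# "Mean deviation" is usually the mean absolute deviation about the mean.
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)

# Coefficient of variation: sigma expressed relative to the mean.
cv = sigma / mean

print(mean, mean_abs_dev, cv)
```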

    0
    #159949

    accrington
    Participant

    Type ‘mean deviation’ into the ‘with the exact phrase’ box in Google’s advanced search. You’ll get about 397,000 entries.
    Good Luck

    0
    #159950

    Hardik
    Participant

    Matt, I googled it and found the following results; thought I’d share them with you:
    http://en.wikipedia.org/wiki/Absolute_deviation
    http://mathworld.wolfram.com/MeanDeviation.html
    Moreover, on the topic of unilateral tolerance, I agreed with you as a statistician, but when I think as an engineer it sounds weird to me. Let’s say there is a nominal dimension of 10 mm with a positive tolerance of 1 mm (10 +1). In this case the dimension can fall anywhere between 10 and 11 but cannot be less than 10. Hence 10 is your LSL, isn’t it?

    0
    #159951

    Hardik
    Participant

    Thanks, Accrington for your advice!!

    0
    #159954

    Savage
    Participant

    I agree that if the target is 10.0 and you are allowed + 1.0, then the USL=11.0 and the target = 10.0. If, in addition, anything less than 10.0 is unacceptable, then the LSL = 10.0. So the specifications would be LSL=10.0, Target=10.0, USL=11.0

    0
    #166187

    Mahesh Patil
    Participant

    You have it the opposite way around.
    Cpk is long term while Ppk is short term.
     

    0
    #166195

    Lolita
    Participant

    Correct

    0
    #166202

    Taylor
    Participant

    Mahesh & Lolita
    Not only have you two answered a post that is 4 YEARS old, you have answered it WRONG.

    0
    #166211

    Lolita
    Participant

    Are you sure?
    Please take your time and check.

    0
    #166226

    GB
    Participant

    Lolita,/Marlon/Dog Sxxt/All Things…
    The data indicates that you are wrong…

    0
    #166324

    Lolita
    Participant

    Is that all?

    0
    #166329

    Bucking bronco
    Participant

    Lolita is none of those …
    Lolita is either one of the girls with a wet-T shirt, or one of the guys with the water-pistol.

    0
    #171782

    mohammed amrani
    Participant

    Can you explain what I should do to study the capability of a product or a process when it has only one tolerance (a minimum)?

    0
    #175331

    Chugh
    Participant

    This is very confusing, sir!
    I am requesting all of you to please make one final decision and then communicate it.
     
    Manish

    0
    #175333

    Craig
    Participant

    These are 2 indices for estimating process capability from historical data. The tricky part is how to estimate the process standard deviation.
    Cpk uses only the within-subgroup method of estimating the standard deviation.
    Ppk uses both within- and between-subgroup variation to estimate the standard deviation. (Drive your process to a stable condition, and use this to show your process capability.) Even a stable process will show some movement of the mean, so why would you not want to encompass all of the variation when you estimate the standard deviation for process capability reporting?
    I personally think the short term / long term mumbo-jumbo just confuses things. These indices are simply based on two methods for estimating standard deviation from historical data.
    Conceptually the “within” subgroup method makes sense for control charts, which will make them sensitive enough to detect assignable causes. I am sure that Shewhart would be glad that I agree with him (ha ha).
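    To make the point concrete, here is a small sketch (Python; the data are simulated, not from any real process) in which a shift between subgroups leaves the within-subgroup sigma, and hence Cpk, essentially unchanged while it inflates the overall sigma and drags Ppk down.

```python
import random
import statistics as st

random.seed(1)
usl, lsl = 106.0, 94.0   # illustration-only specification limits

# Simulate 10 subgroups of 5; the subgroup means drift, the within-spread does not.
subgroups = []
for i in range(10):
    centre = 100.0 + (1.5 if i >= 5 else 0.0)   # a shift appears halfway through
    subgroups.append([random.gauss(centre, 1.0) for _ in range(5)])

all_data = [x for g in subgroups for x in g]
mean = st.mean(all_data)

s_within = (sum(st.variance(g) for g in subgroups) / len(subgroups)) ** 0.5
s_total = st.stdev(all_data)

cpk = min(usl - mean, mean - lsl) / (3 * s_within)   # within only: looks better
ppk = min(usl - mean, mean - lsl) / (3 * s_total)    # all variation: what was shipped

print(round(s_within, 2), round(s_total, 2), round(cpk, 2), round(ppk, 2))
```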
     

    0
    #179758

    anil yadav
    Participant

    Could anyone tell me when I should use Cpk or Ppk?

    0
    #179763

    Darth
    Participant

    Ppk is used to get a sense of long term…whatever that means…capability while Cpk looks at short term…whatever that means…variation. I would suggest using Cpk.

    0
    #186738

    Rookie
    Member

    I think you meant vice versa.
    If you look a few rows before: “Process Performance is only used when process control cannot be evaluated. An example of this is for a short pre-production run. “
    And this is also how it comes out when an analysis is performed in QS STAT.

    0
    #186739

    S
    Participant

    Pp and Ppk are called performance indices. The performance is measured with a quick test: once a batch of parts has been produced, take a random sample and compute T/6s, (USL - Xbar)/3s and (Xbar - LSL)/3s; these give Pp and Ppk. No assessment of trends, runs or outliers is required (stability is not checked), and no prediction of process behavior is possible, so to minimize risk and keep a safety margin the minimum acceptable value is 1.67. The distribution need not be normal. The standard deviation is taken from the individual sample: s = sqrt( Σ(x - Xbar)² / (n - 1) ).
    Cp and Cpk are capability indices based on the long term, assessed only for a stable, corrected process (the sampling technique is random systematic sampling, after monitoring and correcting the process by eliminating special causes). The capability is calculated for a stable process in which variation comes only from common causes and which behaves naturally, so the distribution is close to normal; hence the standard deviation is evaluated from the sample distribution. This index is used for predicting process behavior.
     

    0
    #186741

    Mikel
    Member

    Wrong

    0
    #186742

    Isaac
    Participant

    Wow this is still going?
    Are you all aware that way back, the consultants at Ford started using Cpk as long term and Ppk as short term.
    The rest of the world used Cpk as short term and Ppk as long term?
    I am too young to be able to comment on who was right and who was wrong.  I only know that before my time there was a split in how the nomenclature was used.  Hence all this horrible confusion.
    The calculation for Cpk in general uses a pooled standard deviation of the subgroups.  This in effect gives you a standard deviation as if all subgroups had the same average, i.e. as if they were in control (short explanation).
    Ppk in general uses the standard deviation calculated from all the data.  This standard deviation is affected by shifts in the process and by special causes. 
    If the process is stable then Cpk and Ppk will appear similar. 
    Now, as a flippant aside, Cpk and Ppk are one-sided measures; they only care about the closest specification limit and not where the other limit is found. 
    Z.Bench is a much better number to use instead because it takes into account the position of both specification limits.
    Besides, you should be looking at the control charts and the capability histograms before any decision is made; that way you have a visual check on the numbers reported and you can judge what the process stability is like and where the process is.
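    Since Z.Bench came up, here is a minimal sketch of its usual definition (Python; normality is assumed and the numbers are invented): fold the out-of-spec probability from both tails into a single equivalent Z value, rather than looking only at the nearest limit as Cpk and Ppk do.

```python
from statistics import NormalDist

# Z.Bench under an assumed normal distribution: fold the out-of-spec
# probability from BOTH tails into one equivalent Z value.
usl, lsl = 10.6, 9.4       # illustration-only specification limits
mean, sigma = 10.15, 0.18  # invented process estimates

nd = NormalDist(mean, sigma)
p_below = nd.cdf(lsl)            # probability below the lower limit
p_above = 1.0 - nd.cdf(usl)      # probability above the upper limit
p_total = p_below + p_above

z_bench = NormalDist().inv_cdf(1.0 - p_total)

# Compare with the one-sided view that Cpk/Ppk take (nearest limit only).
z_min = min(usl - mean, mean - lsl) / sigma
print(round(z_bench, 2), round(z_min, 2))
```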

    0
    #186748

    Tan
    Member

    Z-bench is nonsense

    0
    #186788

    Severino
    Participant

    Cpk can underestimate the total variation if the between-subgroup variation is high.  Therefore, I always use Ppk.  Always ensure your process is in control and the data are normally distributed before reporting the index, though. 

    0
    #187104

    Rishi Hans
    Member

    Cpk stands for Continuous Process Kurtosis
    Rishi Hans

    0
    #187107

    Anonymous
    Guest

    I don’t have time to read through all the replies, but the k stands for “katayori”, which I believe means deviation in Japanese. I think the word kurtosis is Greek and means bulge (shape).
    Do you know what the word fiat means? :-)

    0
    #187115

    Darth
    Participant

    CPK California Pizza Kitchen
    CPK Creatine Phosphokinase (muscle enzyme)
    Cpk Process Capability (statistical process control measurement of process capability)
    Cpk Process Capability (statistical measurement of how capable a process)
    Cpk Capability Index
    CPK Center for Personkommunikation
    CPK Cabbage Patch Kid
    CPK Corey, Pauling, Koltun (molecular modeling coloring scheme)
    CPK Chesapeake Regional Airport (airport code, VA)
    CPK Crystal Park (Crystal City, Arlington, VA)
    CPK Circuit Pack (Nortel)
    CPK Common Platform Kit (Pirelli)
    CPK Coefficient of Process Capability

    0
    #187116

    Darth
    Participant

    Butler and others dealt with this back in 2007, from an original post in 2005. I think the latest post is a plant from Webby to get us all agitated again.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=123559

    0
    #187119

    Stevo
    Member

    I once worked on a project that had a huge CPK (Cabbage Patch Kid) problem as well as employee issues.  Luckily it was in a 3rd world country, so we got creative and now I’m responsible for many of the shoes you wear.
     
    Six Sigma at its best.
     
    Stevo

    0
Viewing 93 posts - 1 through 93 (of 93 total)

The forum ‘General’ is closed to new topics and replies.