Cp, Cpk, Pp and Ppk: Know How and When to Use Them

For many years industries have used C_p, C_pk, P_p and P_pk as statistical measures of process quality capability. Some segments in manufacturing have specified minimal requirements for these parameters, even for some of their key documents, such as advanced product quality planning and ISO/TS-16949. Six Sigma, however, suggests a different evaluation of process capability by measuring against a sigma level, also known as sigma capability.

Incorporating metrics that differ from traditional ones may lead some companies to wonder whether these metrics are necessary and how to adopt them. It is important to emphasize that traditional capability studies and sigma capability measures serve a similar purpose. Once the process is under statistical control and showing only common causes of variation, it is predictable. This is when it becomes interesting for companies to predict the current process’s probability of meeting customer specifications or requirements.

Capability Studies

Traditional capability rates are calculated when a product or service feature is measured through a quantitative continuous variable, assuming the data follows a normal probability distribution. A normal distribution is fully described by its mean and standard deviation, making it possible to estimate the probability of a value falling within any given range.

The most interesting values relate to the probability of data occurring outside of customer specifications. These are data appearing below the lower specification limit (LSL) or above the upper specification limit (USL). A common mistake lies in using capability studies to deal with categorical data by turning the data into rates or percentages. In such cases, determining specification limits becomes complex. For example, a billing process may generate correct or incorrect invoices. These represent categorical variables, which by definition carry an ideal USL of 100 percent error-free processing, rendering the traditional statistical measures (C_p, C_pk, P_p and P_pk) inapplicable to categorical variables.

When working with continuous variables, the traditional statistical measures are quite useful, especially in manufacturing. The difference between capability rates (C_p and C_pk) and performance rates (P_p and P_pk) is the method of estimating the statistical population standard deviation. The difference between the centralized rates (C_p and P_p) and unilateral rates (C_pk and P_pk) is the impact of the mean decentralization over process performance estimates.

The following example details the impact that the different forms of calculating capability may have on the results of a process study. A company manufactures a product whose acceptable dimensions, previously specified by the customer, range from 155 mm to 157 mm. The machine that manufactures the product works during one period per day only, and the first 10 parts it made each day were collected as samples over 28 days. Evaluation data taken from these parts was used to make an Xbar-S control chart (Figure 1).

Figure 1: Xbar-S Control Chart of Evaluation Data

This chart presents only common cause variation and as such, leads to the conclusion that the process is predictable. Calculation of process capability presents the results in Figure 2.
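The limits behind an Xbar-S chart such as Figure 1 can be sketched as follows. This is a minimal illustration, assuming equal-sized subgroups and the conventional 3-sigma limits derived from the c₄ constant; the function names are hypothetical and the article's raw data is not reproduced here.

```python
import math
from statistics import mean, stdev

def c4(n: int) -> float:
    """Bias-correction constant c4 for a sample standard deviation of size n."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def xbar_s_limits(subgroups):
    """Return (lower, center, upper) limits for the Xbar chart and the S chart.

    Assumes every subgroup has the same size n >= 2.
    """
    n = len(subgroups[0])
    grand_mean = mean(mean(g) for g in subgroups)   # center line of the Xbar chart
    s_bar = mean(stdev(g) for g in subgroups)       # center line of the S chart
    k = c4(n)
    a3 = 3.0 / (k * math.sqrt(n))                   # Xbar-chart factor
    spread = 3.0 * math.sqrt(1.0 - k * k) / k
    b3 = max(0.0, 1.0 - spread)                     # S-chart lower factor (floored at 0)
    b4 = 1.0 + spread                               # S-chart upper factor
    return {
        "xbar": (grand_mean - a3 * s_bar, grand_mean, grand_mean + a3 * s_bar),
        "s": (b3 * s_bar, s_bar, b4 * s_bar),
    }
```

A process would be read as showing only common cause variation when every subgroup mean and every subgroup standard deviation falls inside its respective limits and no systematic pattern appears.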

Figure 2: Process Capability of Dimension

Calculating C_p

The C_p rate of capability is calculated from the formula:

C_p = (USL - LSL) / (6s)

where s represents the estimated standard deviation of the population, taken from s = s-bar / c₄, with s-bar representing the mean of the standard deviations of the rational subgroups and c₄ representing a statistical correction coefficient.

In this case, the formula considers the quantity of variation given by standard deviation and an acceptable gap allowed by specified limits despite the mean. The results reflect the population’s standard deviation, estimated from the mean of the standard deviations within the subgroups as 0.413258, which generates a C_p of 0.81.
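The arithmetic can be checked with a few lines of code. This is only a sketch: the function name `cp` is an illustrative choice, and the standard deviation is the within-subgroup estimate quoted above.

```python
def cp(usl: float, lsl: float, sigma: float) -> float:
    """Ratio of the specification width to six standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (usl - lsl) / (6.0 * sigma)

# Values quoted in the article: limits of 155 mm and 157 mm,
# within-subgroup standard deviation of 0.413258.
print(f"{cp(157.0, 155.0, 0.413258):.2f}")  # 0.81
```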

Rational Subgroups

A rational subgroup is a concept developed by Shewhart while he was defining control charts. It consists of a sample in which the differences in the data within a subgroup are minimized and the differences between subgroups are maximized. This allows a clearer identification of how the process parameters change over time. In the example above, the process used to collect the samples allows each daily collection to be treated as a separate rational subgroup.
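To make the idea concrete, the sketch below compares the within-subgroup estimate (s-bar / c₄) with the standard deviation of all data pooled together. The measurements are hypothetical, invented for illustration with a deliberate day-to-day shift in the mean; 0.9400 is the tabulated c₄ value for subgroups of five.

```python
from statistics import mean, stdev

C4_N5 = 0.9400  # tabulated c4 for subgroups of size 5

# Hypothetical data: three daily subgroups of five parts each.
subgroups = [
    [155.6, 155.8, 155.7, 155.9, 155.5],
    [156.1, 156.0, 156.3, 156.2, 155.9],
    [155.4, 155.3, 155.6, 155.5, 155.2],
]

s_bar = mean(stdev(g) for g in subgroups)           # mean of subgroup deviations
sigma_within = s_bar / C4_N5                        # excludes day-to-day shifts
sigma_overall = stdev([x for g in subgroups for x in g])  # includes them

print(f"within:  {sigma_within:.3f}")
print(f"overall: {sigma_overall:.3f}")
```

Because the subgroup means differ from day to day, the overall estimate comes out larger than the within-subgroup estimate, which is the gap the capability and performance rates are designed to expose.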

The C_pk capability rate is calculated by the formula:

C_pk = min[(USL - x-bar) / (3s), (x-bar - LSL) / (3s)]

where x-bar is the process mean, considering the same criteria of standard deviation.

In this case, besides the amount of variation, the process mean also affects the index. Because the process is not perfectly centered, the mean is closer to one of the limits and, as a consequence, the process is more likely to miss the specification on that side. In the example above, the specification limits are 155 mm and 157 mm. The mean (155.74) is closer to the LSL (0.74 mm away) than to the USL (1.26 mm away), leading to a C_pk (0.60) that is lower than the C_p (0.81). This implies that the LSL is more difficult to meet than the USL. Non-conformities exist at both ends of the histogram.
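A short sketch of the one-sided comparison, using the values quoted in the text. The function name and the returned tuple are illustrative choices, not part of any standard library.

```python
def cpk(usl: float, lsl: float, mean: float, sigma: float):
    """Return (index, upper-side ratio, lower-side ratio).

    The index is the smaller of the two one-sided ratios, so it is
    driven by whichever specification limit lies closer to the mean.
    """
    upper = (usl - mean) / (3.0 * sigma)
    lower = (mean - lsl) / (3.0 * sigma)
    return min(upper, lower), upper, lower

index, upper, lower = cpk(157.0, 155.0, 155.74, 0.413258)
print(f"upper side: {upper:.2f}")  # 1.02
print(f"lower side: {lower:.2f}")  # 0.60
print(f"C_pk:       {index:.2f}")  # 0.60
```

The lower-side ratio is the smaller one, which matches the observation that the LSL is the harder limit to meet.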

Estimating P_p

Similar to the C_p calculation, the performance P_p rate is found as follows:

P_p = (USL - LSL) / (6s)

where s is the standard deviation of all data.

The main difference between the P_p and C_p studies is that within a rational subgroup, where samples are produced at practically the same time, the standard deviation is lower. In the P_p study, variation between subgroups increases the s value over time, so including between-subgroup variation makes the P_p estimate more conservative than the C_p estimate.

With regard to centralization, P_p and C_p measures have the same limitation, where neither considers process centralization (mean) problems. However, it is worth mentioning that C_p and P_p estimates are only possible when upper and lower specification limits exist. Many processes, especially in transactional or service areas, have only one specification limit, which makes using C_p and P_p impossible (unless the process has a physical boundary [not a specification] on the other side). In the example above, the population’s standard deviation, taken from the standard deviation of all data from all samples, is 0.436714 (overall), giving a P_p of 0.76, which is lower than the obtained value for C_p.

Estimating P_pk

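The difference between C_p and P_p lies in the method for calculating s, and whether or not the existence of rational subgroups is considered. Calculating P_pk presents similarities with the calculation of C_pk. The capability rate for P_pk is calculated using the formula:

P_pk = min[(USL - x-bar) / (3s), (x-bar - LSL) / (3s)]

where x-bar is the process mean and s is the standard deviation of all data, as for P_p.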

Once more it becomes clear that this estimate is able to diagnose centering problems in addition to the amount of process variation. Following the tendencies detected in C_pk, notice that the P_p value (0.76) is higher than the P_pk value (0.56), because the process mean sits closer to the LSL and the rate of nonconformance there is higher. Because the standard deviation is calculated from all data rather than within rational subgroups, it is higher, resulting in a P_pk (0.56) lower than the C_pk (0.60), which gives a more pessimistic performance projection.
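The performance rates can be checked the same way, this time with the overall standard deviation. The sketch below uses the figures quoted in the text; the function and constant names are illustrative.

```python
def pp(usl: float, lsl: float, sigma_overall: float) -> float:
    """Specification width over six overall standard deviations."""
    return (usl - lsl) / (6.0 * sigma_overall)

def ppk(usl: float, lsl: float, mean: float, sigma_overall: float) -> float:
    """Distance from the mean to the nearer limit over three overall deviations."""
    return min(usl - mean, mean - lsl) / (3.0 * sigma_overall)

SIGMA_WITHIN = 0.413258   # from s-bar / c4, as quoted in the article
SIGMA_OVERALL = 0.436714  # from all data, as quoted in the article

print(f"P_p:  {pp(157.0, 155.0, SIGMA_OVERALL):.2f}")          # 0.76
print(f"P_pk: {ppk(157.0, 155.0, 155.74, SIGMA_OVERALL):.2f}")  # 0.56

# Ratio of the two estimates; equals P_p / C_p and P_pk / C_pk.
print(f"within / overall: {SIGMA_WITHIN / SIGMA_OVERALL:.3f}")
```

The closer the last ratio is to 1, the less variation there is between subgroups and the closer the performance rates come to the capability rates.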

Calculating Sigma Capability

In the example above, it is possible to observe nonconformities at both the upper and lower specification limits. Although nonconformities below the LSL are more likely, those above the USL will continue to occur. C_pk and P_pk do not account for this, because they are always calculated from the more critical side of the distribution.

In order to calculate the sigma level of this process it is necessary to estimate the Z bench. This will allow the conversion of the data distribution to a normal and standardized distribution while adding the probabilities of failure above the USL and below the LSL. The calculation is as follows:

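Above the USL:

P(X > USL) = 1 - Φ((USL - x-bar) / s)

Below the LSL:

P(X < LSL) = Φ((LSL - x-bar) / s)

where Φ is the cumulative distribution function of the standard normal distribution.

Summing both kinds of flaws produces the following result:

P_total = P(X > USL) + P(X < LSL), with Z_bench = Φ⁻¹(1 - P_total)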

(Figure 3)

The calculation to achieve the sigma level is represented below:

Sigma level = Z_bench + 1.5 = 1.51695 + 1.5 = 3.01695
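The Z bench idea can be sketched with Python's standard library, assuming a normal distribution. The function names are illustrative, and because the exact inputs behind the article's Z bench value are not shown in the text, the sketch is not expected to reproduce 1.51695 from the rounded figures quoted earlier.

```python
from statistics import NormalDist

def z_bench(usl: float, lsl: float, mean: float, sigma: float):
    """Return (Z bench, P above USL, P below LSL) for a normal process."""
    std_normal = NormalDist()                       # mean 0, standard deviation 1
    p_above = 1.0 - std_normal.cdf((usl - mean) / sigma)
    p_below = std_normal.cdf((lsl - mean) / sigma)
    p_total = p_above + p_below                     # both tails combined
    # Z value of a one-sided process with the same total defect probability.
    return std_normal.inv_cdf(1.0 - p_total), p_above, p_below

def sigma_level(z: float, shift: float = 1.5) -> float:
    """Add the conventional 1.5 shift to a Z bench value."""
    return z + shift

z, p_above, p_below = z_bench(usl=157.0, lsl=155.0, mean=155.74, sigma=0.436714)
print(f"P above USL: {p_above:.4f}")
print(f"P below LSL: {p_below:.4f}")
print(f"Z bench:     {z:.3f}")
print(f"sigma level: {sigma_level(z):.3f}")
```

Unlike C_pk and P_pk, this calculation counts the defects expected in both tails, so it never reports a better result than the one-sided view of the more critical limit.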

There is great controversy about the 1.5 deviation that is usually added to the sigma level. When a great amount of data is collected over a long period of time, multiple sources of variability will appear. Many of these sources are not present when the projection is ranged to a period of some weeks or months. The benefit of adding 1.5 to the sigma level is seen when assessing a database with a long historical data view. The short-term performance is typically better as many of the variables will change over time to reflect changes in business strategy, systems enhancements, customer requirements, etc. The addition of the 1.5 value was intentionally chosen by Motorola for this purpose and the practice is now common throughout many sigma level studies.

Comparing the Methods

When calculating C_p and P_p, the evaluation considers only the quantity of process variation related to the specification limit ranges. This method, besides being applicable only in processes with upper and lower specification limits, does not provide information about process centralization. At this point, C_pk and P_pk metrics are wider ranging because they set rates according to the most critical limit.

The difference between C_p and P_p, as well as between C_pk and P_pk, results from the method of calculating standard deviation. C_p and C_pk consider the deviation mean within rational subgroups, while P_p and P_pk set the deviation based on studied data. It is worth working with more conservative P_p and P_pk data in case it is unclear if the sample criteria follow all the prerequisites necessary to create a rational subgroup.

C_pk and P_pk rates assess process capability based on process variation and centralization. However, here only one specification limit is considered, different from the sigma metric. When a process has only one specification limit, or when the incidence of flaws over one of the two specification limits is insignificant, sigma level, C_pk and P_pk bring very similar results. When faced with a situation where both specification limits are identified and both have a history of bringing restrictions to the product, calculating a sigma level gives a more precise view of the risk of not achieving the quality desired by customers.

As seen in the examples above, traditional capability rates are only valid when using quantitative variables. In cases using categorical variables, calculating a sigma level based on flaws, defective products or flaws per opportunity, is recommended.
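For categorical data, the same conversion can start from a defect count instead of a mean and standard deviation. The sketch below assumes the conventional 1.5 shift; the function name and the invoice counts are hypothetical.

```python
from statistics import NormalDist

def sigma_level_from_defects(defects: int, units: int,
                             opportunities_per_unit: int = 1,
                             shift: float = 1.5) -> float:
    """Convert a defect count into a sigma level via defects per opportunity."""
    dpo = defects / (units * opportunities_per_unit)
    if not 0.0 < dpo < 1.0:
        raise ValueError("defect rate must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - dpo) + shift

# Hypothetical billing example: 1,200 incorrect invoices out of 50,000 issued.
print(f"{sigma_level_from_defects(1200, 50_000):.2f}")
```

A rate of 3.4 defects per million opportunities maps to a sigma level of about 6 under this convention, which is the usual reference point for the scale.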

References

Breyfogle, Forrest W., Implementing Six Sigma: Smarter Solutions Using Statistical Methods, New York: Wiley & Sons, 1999.

Montgomery, Douglas C., Introdução ao Controle Estatístico da Qualidade, New York: Wiley & Sons, 2001.

About the Author

Daniela Marzagão
