For many years, industries have used *C*_{p}, *C*_{pk}, *P*_{p} and *P*_{pk} as statistical measures of process capability. Some manufacturing segments specify minimum requirements for these indices in key documents, such as advanced product quality planning (APQP) and ISO/TS 16949. Six Sigma, however, suggests a different evaluation of process capability: measuring against a sigma level, also known as sigma capability.

Introducing metrics that differ from the traditional ones may lead some companies to question whether they are necessary and how to adopt them. It is important to emphasize that traditional capability studies and sigma capability measures serve a similar purpose. Once the process is under statistical control and shows only common causes of variation, it is predictable. This is when it becomes useful for companies to estimate the probability that the current process will meet customer specifications or requirements.

### Capability Studies

Traditional capability indices are calculated when a product or service feature is measured as a quantitative continuous variable, assuming the data follows a normal probability distribution. A normal distribution is fully described by its mean and standard deviation, which makes it possible to estimate the probability of a value falling within any given range.

The most interesting values relate to the probability of data occurring outside of customer specifications, that is, below the lower specification limit (LSL) or above the upper specification limit (USL). A common mistake is to apply capability studies to categorical data by turning the data into rates or percentages. In such cases, determining specification limits becomes complex. For example, a billing process may generate correct or incorrect invoices. These are categorical variables, which by definition carry an ideal target of 100 percent error-free processing, so the traditional statistical measures (*C*_{p}, *C*_{pk}, *P*_{p} and *P*_{pk}) do not apply to them.

When working with continuous variables, the traditional statistical measures are quite useful, especially in manufacturing. The difference between the capability indices (*C*_{p} and *C*_{pk}) and the performance indices (*P*_{p} and *P*_{pk}) is the method used to estimate the population standard deviation. The difference between the indices that ignore centering (*C*_{p} and *P*_{p}) and the indices calculated against the nearer specification limit (*C*_{pk} and *P*_{pk}) is the effect that an off-center mean has on the process performance estimate.

The following example shows how the different ways of calculating capability can affect the results of a process study. A company manufactures a product whose acceptable dimensions, previously specified by the customer, range from 155 mm to 157 mm. The machine that makes the product operates during one shift only. Each day for 28 days, the first 10 parts it produced were collected as a sample. Measurements taken from these parts were used to build an Xbar-S control chart (Figure 1).

The chart shows only common cause variation, which leads to the conclusion that the process is predictable. The process capability results are shown in Figure 2.

### Calculating *C*_{p}

The *C*_{p} capability index is calculated from the formula:

$$C_p = \frac{USL - LSL}{6\hat{\sigma}}$$

where $\hat{\sigma}$ is the estimate of the population standard deviation, taken from $\hat{\sigma} = \bar{s} / c_4$, with $\bar{s}$ representing the mean of the standard deviations of the rational subgroups and $c_4$ a statistical correction coefficient.

In this case, the formula considers only the amount of variation, given by the standard deviation, and the tolerance width allowed by the specification limits, regardless of where the mean lies. In the example, the population standard deviation, estimated from the mean of the standard deviations within the subgroups, is 0.413258, which gives a *C*_{p} of 0.81.
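As a quick check on the arithmetic, the *C*_{p} calculation can be sketched in Python. The function name `cp_index` is illustrative and not taken from any library; the inputs are the specification limits and the within-subgroup standard deviation quoted in the example.

```python
def cp_index(lsl: float, usl: float, sigma_within: float) -> float:
    """Potential capability: specification width divided by six within-subgroup sigmas."""
    if sigma_within <= 0:
        raise ValueError("sigma_within must be positive")
    return (usl - lsl) / (6.0 * sigma_within)


# Values quoted in the example: limits of 155 mm and 157 mm, sigma of 0.413258.
cp = cp_index(lsl=155.0, usl=157.0, sigma_within=0.413258)
print(round(cp, 2))  # 0.81
```

Note that the mean does not appear anywhere in the function, which is why *C*_{p} cannot reveal an off-center process.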

### Rational Subgroups

The rational subgroup is a concept developed by Shewhart while he was defining control charts. It is a sample chosen so that the differences in the data within a subgroup are minimized and the differences between subgroups are maximized. This makes it easier to see how the process parameters change over time. In the example above, the sampling procedure allows each daily collection to be treated as one rational subgroup.
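The within-subgroup estimate of the standard deviation (the mean subgroup standard deviation divided by the *c*_{4} correction) can be sketched as follows. This is a minimal illustration that assumes equal-sized subgroups; the function names and the measurements are hypothetical, and the closed form used for *c*_{4} is the standard gamma-function expression for normally distributed data.

```python
import math
import statistics


def c4(n: int) -> float:
    """Unbiasing constant for the sample standard deviation of n normal values."""
    if n < 2:
        raise ValueError("subgroup size must be at least 2")
    # lgamma keeps the ratio of gamma functions stable for large n
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def sigma_within(subgroups: list[list[float]]) -> float:
    """Estimate sigma as s-bar / c4, assuming equal-sized rational subgroups."""
    sizes = {len(group) for group in subgroups}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal subgroup sizes")
    s_bar = statistics.mean([statistics.stdev(group) for group in subgroups])
    return s_bar / c4(sizes.pop())


# Hypothetical measurements (mm): three daily subgroups of four parts each.
days = [
    [155.6, 155.9, 155.4, 156.1],
    [155.8, 155.3, 156.0, 155.7],
    [155.5, 156.2, 155.9, 155.6],
]
print(round(c4(10), 4))  # 0.9727, for the subgroup size of 10 used in the example
print(sigma_within(days))
```

Because each subgroup contributes only its own internal spread, any drift of the mean from day to day is deliberately left out of this estimate.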

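### Calculating *C*_{pk}

The *C*_{pk} capability index is calculated by the formula:

$$C_{pk} = \min\left(\frac{USL - \bar{x}}{3\hat{\sigma}},\ \frac{\bar{x} - LSL}{3\hat{\sigma}}\right)$$

where $\bar{x}$ is the process mean and $\hat{\sigma}$ is the standard deviation estimated within rational subgroups, as for *C*_{p}.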

In this case, besides the amount of variation, the process mean also affects the index. Because the process is not perfectly centered, the mean is closer to one of the limits and, as a consequence, the process is more likely to miss its capability targets. In the example above, the specification limits are 155 mm and 157 mm. The mean (155.74) is closer to the LSL than to the USL, leading to a *C*_{pk} (0.60) that is lower than the *C*_{p} (0.81). This implies that the LSL is more difficult to meet than the USL. Nonconformities exist at both ends of the histogram.
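The effect of the off-center mean can be checked with a short sketch. The function name `cpk_sides` is hypothetical; it returns the index against each limit so that the smaller one, which is *C*_{pk}, is visible alongside the other. The inputs are the rounded values quoted in the example.

```python
def cpk_sides(lsl: float, usl: float, mean: float, sigma_within: float) -> tuple[float, float]:
    """Return (lower, upper) one-sided capability; Cpk is the smaller of the two."""
    if sigma_within <= 0:
        raise ValueError("sigma_within must be positive")
    lower = (mean - lsl) / (3.0 * sigma_within)
    upper = (usl - mean) / (3.0 * sigma_within)
    return lower, upper


lower, upper = cpk_sides(lsl=155.0, usl=157.0, mean=155.74, sigma_within=0.413258)
cpk = min(lower, upper)
print(round(lower, 2), round(upper, 2))  # 0.6 1.02
print(round(cpk, 2))                     # 0.6
```

The lower side governs here because the mean sits 0.74 mm above the LSL but 1.26 mm below the USL.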

### Estimating *P*_{p}

Similar to the *C*_{p} calculation, the *P*_{p} performance index is found as follows:

$$P_p = \frac{USL - LSL}{6s}$$

where *s* is the standard deviation of all data.

The main difference between the *P*_{p} and *C*_{p} studies is that within a rational subgroup, where samples are produced at practically the same time, the standard deviation is lower. In the *P*_{p} study, variation between subgroups over time increases the value of *s*. Including this between-subgroup variation normally makes the *P*_{p} estimate more conservative than the *C*_{p} estimate.

With regard to centering, *P*_{p} and *C*_{p} share the same limitation: neither takes into account how far the process mean is from the center of the specification range. It is also worth mentioning that *C*_{p} and *P*_{p} can only be estimated when both upper and lower specification limits exist. Many processes, especially in transactional or service areas, have only one specification limit, which makes using *C*_{p} and *P*_{p} impossible (unless the process has a physical boundary, not a specification, on the other side). In the example above, the population standard deviation calculated from all data in all samples (overall) is 0.436714, giving a *P*_{p} of 0.76, which is lower than the value obtained for *C*_{p}.
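The overall calculation can be sketched in the same way. The helper names are illustrative and the small data set is hypothetical; the final lines use the overall standard deviation quoted in the example.

```python
import statistics


def overall_std(subgroups: list[list[float]]) -> float:
    """Sample standard deviation of all measurements pooled, ignoring subgroups."""
    pooled = [value for group in subgroups for value in group]
    return statistics.stdev(pooled)


def pp_index(lsl: float, usl: float, s_overall: float) -> float:
    """Performance index: specification width divided by six overall sigmas."""
    if s_overall <= 0:
        raise ValueError("s_overall must be positive")
    return (usl - lsl) / (6.0 * s_overall)


# Hypothetical data: pooling mixes within-day and between-day variation.
days = [[155.6, 155.9, 155.4], [155.8, 155.3, 156.0]]
print(overall_std(days))

# Overall standard deviation quoted in the example.
print(round(pp_index(lsl=155.0, usl=157.0, s_overall=0.436714), 2))  # 0.76
```

The only difference from the *C*_{p} calculation is which standard deviation goes into the denominator.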

### Estimating *P*_{pk}

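As with *C*_{p} and *P*_{p}, the difference between *C*_{pk} and *P*_{pk} lies in the method for calculating the standard deviation, that is, whether or not rational subgroups are taken into account. Otherwise, the calculation of *P*_{pk} mirrors that of *C*_{pk}. The *P*_{pk} performance index is calculated using the formula:

$$P_{pk} = \min\left(\frac{USL - \bar{x}}{3s},\ \frac{\bar{x} - LSL}{3s}\right)$$

where $\bar{x}$ is the process mean and *s* is the standard deviation of all data.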

Once more it becomes clear that this estimate can diagnose centering problems as well as the amount of process variation. Following the pattern seen with *C*_{pk}, the *P*_{p} value (0.76) is higher than the *P*_{pk} value (0.56) because the mean lies closer to the LSL, where the nonconformance rate is higher. Because the standard deviation is calculated from all data rather than within rational subgroups, it is larger, resulting in a *P*_{pk} (0.56) that is lower than the *C*_{pk} (0.60) and therefore a more pessimistic performance projection.
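The comparison between the two one-sided indices can be sketched with a single helper, since only the standard deviation changes. The name `nearest_limit_index` is hypothetical; the inputs are the rounded values quoted in the example.

```python
def nearest_limit_index(lsl: float, usl: float, mean: float, sigma: float) -> float:
    """Distance from the mean to the nearer limit, in units of three sigmas."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return min(usl - mean, mean - lsl) / (3.0 * sigma)


LSL, USL, MEAN = 155.0, 157.0, 155.74

cpk = nearest_limit_index(LSL, USL, MEAN, sigma=0.413258)  # within-subgroup sigma
ppk = nearest_limit_index(LSL, USL, MEAN, sigma=0.436714)  # overall sigma
print(round(cpk, 2), round(ppk, 2))  # 0.6 0.56
```

Passing the within-subgroup estimate yields *C*_{pk}, while passing the overall standard deviation yields *P*_{pk}.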

### Calculating Sigma Capability

In the example above, nonconforming parts occur at both the upper and the lower specification limits. Although defects at the LSL are more likely, defects at the USL will continue to occur. *C*_{pk} and *P*_{pk} do not take this into account, because they are always calculated from the more critical side of the distribution.

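To calculate the sigma level of this process, it is necessary to estimate Z bench. Z bench adds the probabilities of failure above the USL and below the LSL and expresses the total as the equivalent Z value on a standard normal distribution. With $\Phi$ denoting the standard normal cumulative distribution function, $\bar{x}$ the process mean and $\sigma$ the estimated process standard deviation, the calculation is as follows:

- Above the USL:

$$P(x > USL) = 1 - \Phi\left(\frac{USL - \bar{x}}{\sigma}\right)$$

- Below the LSL:

$$P(x < LSL) = \Phi\left(\frac{LSL - \bar{x}}{\sigma}\right)$$

Summing both kinds of defects and converting the total back to a Z value produces the following result:

$$Z_{bench} = \Phi^{-1}\left(1 - \left[P(x > USL) + P(x < LSL)\right]\right)$$

(Figure 3)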

The calculation to achieve the sigma level is represented below:

*Sigma level = Z_{bench} + 1.5 = 1.51695 + 1.5 = 3.01695*
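The Z bench logic can be sketched with Python's standard library. This is a minimal illustration that assumes normally distributed data; the function names are hypothetical. It is fed with the rounded mean and overall standard deviation quoted earlier, so its output may differ from the Z bench value shown in Figure 3, whose exact inputs are not reproduced in the text.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_bench(lsl: float, usl: float, mean: float, sigma: float) -> float:
    """Z value whose upper-tail area equals the total out-of-spec probability."""
    p_above = 1.0 - STD_NORMAL.cdf((usl - mean) / sigma)
    p_below = STD_NORMAL.cdf((lsl - mean) / sigma)
    # inv_cdf raises StatisticsError if the total is exactly 0 or at least 1
    return STD_NORMAL.inv_cdf(1.0 - (p_above + p_below))


def sigma_level(z: float, shift: float = 1.5) -> float:
    """Sigma level as defined in the text: Z bench plus the conventional 1.5 shift."""
    return z + shift


z = z_bench(lsl=155.0, usl=157.0, mean=155.74, sigma=0.436714)
print(z, sigma_level(z))

# Arithmetic from the text, using the Z bench value it reports.
print(round(sigma_level(1.51695), 5))  # 3.01695
```

Unlike *C*_{pk} and *P*_{pk}, both tails contribute to the result, so a process with defects at both limits scores lower than its nearer-limit index alone would suggest.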

There is considerable controversy about the 1.5 shift that is usually added to the sigma level. When a large amount of data is collected over a long period of time, multiple sources of variability appear, many of which are not present when the study is limited to a period of some weeks or months. Adding 1.5 to the sigma level is therefore useful when assessing a database with a long history: short-term performance is typically better, because many variables change over time to reflect changes in business strategy, system enhancements, customer requirements and so on. Motorola intentionally chose the 1.5 value for this purpose, and the practice is now common in sigma level studies.

### Comparing the Methods

When calculating *C*_{p} and *P*_{p}, the evaluation considers only the amount of process variation relative to the width of the specification range. This method, besides being applicable only to processes with both upper and lower specification limits, provides no information about process centering. In this respect, the *C*_{pk} and *P*_{pk} metrics are more comprehensive, because they are calculated against the most critical limit.

The difference between *C*_{p} and *P*_{p}, as well as between *C*_{pk} and *P*_{pk}, results from the method of calculating the standard deviation. *C*_{p} and *C*_{pk} use the mean standard deviation within rational subgroups, while *P*_{p} and *P*_{pk} use the standard deviation of all the data studied. When it is unclear whether the sampling scheme meets all the prerequisites for a rational subgroup, it is safer to work with the more conservative *P*_{p} and *P*_{pk}.

The *C*_{pk} and *P*_{pk} indices assess process capability based on both process variation and centering. However, unlike the sigma metric, they consider only one specification limit. When a process has only one specification limit, or when the incidence of defects at one of the two specification limits is insignificant, the sigma level, *C*_{pk} and *P*_{pk} give very similar results. When both specification limits are defined and both have a history of causing nonconformities, calculating a sigma level gives a more precise view of the risk of not achieving the quality desired by customers.
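The point about one-sided versus two-sided defects can be illustrated with a small sketch. The function names and both scenarios are hypothetical. The comparison relies on the fact that the Z distance to the nearer limit equals three times *C*_{pk} (or *P*_{pk}, depending on which standard deviation is used).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_nearest(lsl: float, usl: float, mean: float, sigma: float) -> float:
    """Z distance to the nearer limit, equal to 3 * Cpk or 3 * Ppk."""
    return min(usl - mean, mean - lsl) / sigma


def z_bench(lsl: float, usl: float, mean: float, sigma: float) -> float:
    """Z value equivalent to the combined defect probability at both limits."""
    p_total = (1.0 - STD_NORMAL.cdf((usl - mean) / sigma)) + STD_NORMAL.cdf((lsl - mean) / sigma)
    return STD_NORMAL.inv_cdf(1.0 - p_total)


# Hypothetical case A: mean near the LSL, defects at the USL are negligible.
one_sided = (155.0, 157.0, 155.3, 0.2)
# Hypothetical case B: centered process, both limits contribute equally.
two_sided = (155.0, 157.0, 156.0, 0.5)

for name, case in (("one-sided", one_sided), ("two-sided", two_sided)):
    print(name, round(z_nearest(*case), 3), round(z_bench(*case), 3))
```

In case A the two values are practically identical, while in case B Z bench is noticeably lower than the nearer-limit Z because both tails produce defects.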

As seen in the examples above, traditional capability indices are only valid for quantitative variables. For categorical variables, calculating a sigma level based on defects, defective units or defects per opportunity is recommended.
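For categorical data such as the invoice example, one common way to obtain a sigma level is to convert the observed defect proportion into a Z value and add the 1.5 shift. The sketch below assumes that convention; the function name and the invoice counts are hypothetical.

```python
from statistics import NormalDist


def sigma_level_from_defects(defects: int, units: int,
                             opportunities_per_unit: int = 1,
                             shift: float = 1.5) -> float:
    """Sigma level from defects per opportunity, using the conventional 1.5 shift."""
    dpo = defects / (units * opportunities_per_unit)
    if not 0.0 < dpo < 1.0:
        raise ValueError("defect proportion must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - dpo) + shift


# Hypothetical billing data: 35 incorrect invoices out of 1,000 issued.
print(sigma_level_from_defects(defects=35, units=1000))

# The familiar benchmark of 3.4 defects per million opportunities.
print(round(sigma_level_from_defects(defects=34, units=10_000_000), 1))  # 6.0
```

No specification limits are needed here, which is what makes this approach usable where *C*_{pk} and *P*_{pk} are not.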

