# How to calculate Cpk with the Minitab tool?

Six Sigma – iSixSigma Forums Old Forums General How to calculate the Cpk by minitab tool?

Viewing 10 posts - 1 through 10 (of 10 total)
• #33574

wang
Member

Hi, can anyone show me how to analyze the Cpk value for flatness/position data with Minitab? As you know, flatness and position data have only a one-sided tolerance, so how should this be done? Please explain. Thanks for your help!
wang

#91010

Doc
Participant

Enter the tolerance limit you DO have (upper or lower) and then leave the other tolerance limit field blank (lower or upper). Minitab will do the right calculations.
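For a one-sided tolerance, Minitab reports Cpk as the index for the single limit you entered (CPU or CPL). A minimal sketch of that arithmetic in Python, with made-up flatness data and spec value; note this sketch uses the overall sample standard deviation, while Minitab's Cpk uses a within-subgroup sigma estimate (the overall-sigma version is closer to Ppk):

```python
import numpy as np

def one_sided_cpk(data, usl=None, lsl=None):
    """Capability index when only one spec limit exists.

    With only a USL, Cpk = CPU = (USL - mean) / (3 * sigma);
    with only an LSL, Cpk = CPL = (mean - LSL) / (3 * sigma).
    """
    mean = np.mean(data)
    sigma = np.std(data, ddof=1)  # overall sigma; Minitab's Cpk uses within-subgroup sigma
    indices = []
    if usl is not None:
        indices.append((usl - mean) / (3 * sigma))
    if lsl is not None:
        indices.append((mean - lsl) / (3 * sigma))
    return min(indices)

# Hypothetical flatness readings with a one-sided upper tolerance of 0.02
flatness = [0.012, 0.008, 0.015, 0.010, 0.011, 0.009, 0.013, 0.014]
print(round(one_sided_cpk(flatness, usl=0.02), 2))  # prints 1.16
```

Leaving the other limit blank in Minitab's dialog does exactly this: only the one-sided index is computed and reported as Cpk.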

#91026

Dieter
Participant

First check whether your flatness and position data are close to a normal distribution (run a normality test in Minitab 13). If they are not, you'd better do a Box-Cox transformation and enter the transformation factor in the Options dialog.

#91045

wang
Member

Thanks! Yes, the normality test shows that the data aren't normal. I should transform the data with the Box-Cox tool, then analyze them with the best lambda value. BTW, can I analyze my data with the Weibull tool? Someone told me both methods can be used with non-normal data, but I want to know which one is better: Box-Cox or Weibull?
Thanks again!

#91056

Doc
Participant

Dieter brought up a very important point. It is critical that you assess the normality of the data. For most situations this is easiest and most intuitively done using a probability plot (Stat > Basic Statistics > Normality Test), but you can also use the Anderson-Darling normality test. If the data points fall roughly on a straight line, you can assume the data are normal. If the p-value for the AD test is greater than 0.05, you can also assume normality, BUT keep in mind that for very large sample sizes the AD test can get picky and tend to call even fairly normally distributed data non-normal. That is why I tend to prefer the normal probability plot.
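The two checks described above can also be sketched outside Minitab; here is a rough Python equivalent using scipy, with simulated data for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=10, scale=2, size=50)  # simulated, roughly normal data

# Probability plot: if the ordered data fall roughly on a straight line
# against the normal quantiles, the data can be treated as normal.
(osm, osr), (slope, intercept, r) = stats.probplot(data, dist='norm')
print('probability-plot correlation r =', round(r, 3))  # near 1 suggests normality

# Anderson-Darling: reject normality when the statistic exceeds the
# critical value at the chosen significance level (5% here).
ad = stats.anderson(data, dist='norm')
crit_5 = ad.critical_values[list(ad.significance_level).index(5.0)]
print('A-squared =', round(ad.statistic, 3), '(5% critical value:', crit_5, ')')
```

Minitab reports a p-value for the AD test; scipy's `anderson` reports critical values instead, but the accept/reject decision is the same comparison.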
You also need to assess the stability of the process. This is best done using a control chart. Watch for out-of-control points.
If your data are not normally distributed, you have several choices:
1. If using a mean chart, you could increase your subgroup sample size so that the means are “driven” to normality (via the central limit theorem).
2. Try to find another distribution that fits your data. Use the Weibull Capability Sixpack to check the probability plot, check the control chart, and then assess capability. Minitab Release 14 now provides a distribution identification tool and offers capability analysis for a large number of other distributions (13, I think) in addition to the Weibull.
3. Use the Box-Cox transformation to try to normalize your data. Find a lambda value that is inside the red confidence interval lines. Some use the value listed next to “Est” – this is the best-fitting value. Others prefer a “nice” lambda that is still inside the confidence interval, such as 0.5 (square root), 2 (square), -1 (reciprocal), or 0 (natural log). Then enter the lambda you chose into the capability analysis using the Options button.
WARNING – I wish the Box-Cox transformation tool also offered a probability plot of the transformed data so the user could assess the new fit. Make sure you store the Box-Cox transformed data and run it through the normality test tool, just to be sure the transformed data really are normal.
4. Minitab Release 14 now provides a Johnson transformation that seems to do a better job of normalizing data, though the form of the transformation is messier than that of the Box-Cox transformation. The output for this transformation does show a probability plot of the transformed data, and it is easier to use too – just click the option in the main Nonnormal Capability Analysis dialog box.
5. If your sample sizes are large enough, you could theoretically use empirical (nonparametric) percentiles to calculate the capability indices. Minitab doesn’t offer a tool for this, likely because most users simply will not have the needed sample sizes (1,000? 10,000?).
None of these methods is better than the others. Increasing the sample size costs money. The distribution-based analysis is somewhat easier and “cleaner”, but it does not provide both Cpk and Ppk metrics. Transformations don’t always work and can be confusing to customers. Empirical percentiles require VERY large sample sizes.
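One detail worth spelling out for the Box-Cox option: when the data are transformed, the spec limit must be transformed with the same lambda, which is what Minitab does behind the scenes when you enter the lambda under Options. A sketch in Python with simulated skewed data and a made-up one-sided upper spec:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
raw = rng.lognormal(mean=0.0, sigma=0.4, size=200)  # simulated right-skewed data
usl = 4.0                                           # hypothetical one-sided upper spec

# boxcox estimates the lambda that best normalizes the data (the "Est" value)
transformed, lam = stats.boxcox(raw)

# Apply the SAME transformation to the spec limit
usl_t = np.log(usl) if lam == 0 else (usl ** lam - 1) / lam

mean, sigma = transformed.mean(), transformed.std(ddof=1)
cpu = (usl_t - mean) / (3 * sigma)  # one-sided capability on the transformed scale
print('lambda =', round(lam, 3), ' Cpk (upper) =', round(cpu, 2))

# Re-check normality of the transformed data, as recommended above
print('A-squared after transform =', round(stats.anderson(transformed).statistic, 3))
```

The final Anderson-Darling check addresses the WARNING in point 3: store the transformed data and confirm the new fit rather than trusting the transformation blindly.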

#91058

Statman
Member

Doc,
The normality tests you are suggesting assume that the sample is an independent random sample from a defined population.  Since process data are sequential and subject to possible instability (i.e., the mean and/or variation is changing due to outside influences), if you reject the null, isn’t it quite possible that the test failed due to special causes while the underlying distribution is normal?
Wouldn’t it be prudent to test for process stability prior to testing for normality?  Would it not be better to assume normality a priori to testing for stability, unless another distribution is suggested by the type of data?
Can a Cpk calculated from data that are transformed, when the process is not stable but assessed as non-normal, be used as a predictive statistic?

#91059

Statman
Member

Doc,
Posted before I completed my thought.
If a process is deemed to be stable, passing all eight of the Western Electric/Nelson run rules, then wouldn’t that already say that the data can be modeled as normal, since testing stability is testing that the process fits a normal distribution?
So, if we require stability before testing normality, testing normality is redundant.

#91063

Spiderman
Member

Statman…
I don’t think it is wise to assume normality because a data set is stable.  Tell me if I am wrong on this, Doc… but stability is there to ensure we have predictability, while normality is there to ensure we account for the right central tendency and use the right measure of variation in the data set.  One cannot be used to determine the other.

#91075

Statman
Member

Yes Spiderman,

Stability is to ensure we have predictability. Or, to put it another way, a stable process will behave in a predictable manner. But how can we predict how it will behave? For variables data free of autocorrelation, a stable process can be described by the following Shewhart-Deming model:

y_t = m + e_t

where the e_t are the deviations from the process mean at time t and are i.i.d. normal with mean 0 and variance equal to the process variance. Our knowledge of the distribution of e_t allows prediction of process behavior.

So when we are testing to see if a process is stable, we are testing to see if it fits the above model.  Therefore, we are testing to see if the deviations can be modeled as normal.
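A small simulation of that model, with made-up values for m and the process sigma, using the standard moving-range estimate of sigma for an individuals chart:

```python
import numpy as np

rng = np.random.default_rng(7)
m, sigma = 50.0, 2.0
y = m + rng.normal(0.0, sigma, size=100)  # y_t = m + e_t, with e_t iid N(0, sigma^2)

# Individuals-chart limits estimated from the average moving range:
# sigma_hat = MRbar / d2, where d2 = 1.128 for moving ranges of size 2
mr_bar = np.mean(np.abs(np.diff(y)))
sigma_hat = mr_bar / 1.128
ucl = y.mean() + 3 * sigma_hat
lcl = y.mean() - 3 * sigma_hat
beyond = int(((y > ucl) | (y < lcl)).sum())
print('UCL =', round(ucl, 2), 'LCL =', round(lcl, 2), 'points beyond limits:', beyond)
```

Because the simulated process actually fits the model, essentially all points fall inside the limits; a process with special causes would violate the limits or the run rules even though each e_t is drawn from a normal distribution.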

Of course, a process can be stable and not normal.  For example, it can be stable and log normal, but we would want to fit a different model to that data.  There is a tremendous amount of risk involved in determining that the data fits another distribution empirically if we do not know that the process is stable since the non-stability will affect the assessment of normality.  Therefore, if the data meets the requirements of variables data type, we should assume normality a priori to testing stability.  If we have reason to believe that the data type is best modeled with another distribution, then we should choose a different model for the behavior of the process or transform the data so that it can be modeled as normal prior to assessment of stability.

For example, suppose we are measuring biological oxygen demand (BOD) in a treatment plant effluent.  Prior knowledge of this measurement tells us that it is best modeled as lognormal.  Therefore, we should transform the data or fit a lognormal model before we control chart and test for stability.
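Sketching that BOD workflow (the numbers are invented for illustration): take logs first, then chart and test stability on the log scale.

```python
import numpy as np

rng = np.random.default_rng(3)
bod = rng.lognormal(mean=2.0, sigma=0.6, size=60)  # hypothetical BOD readings, mg/L

log_bod = np.log(bod)  # lognormal data become normal on the log scale

# Control chart and test stability on log_bod; any limits computed on the
# log scale can be back-transformed with exp() for reporting in mg/L
center = log_bod.mean()
spread = log_bod.std(ddof=1)
print('log-scale center:', round(center, 2), ' back-transformed:', round(np.exp(center), 1))
```

The key point from the post above is the ordering: the lognormal model is chosen from prior knowledge of the measurement, before any stability assessment, not inferred empirically from possibly unstable data.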

Final note, the Box-Cox transformation is a variance stabilizing transformation not a normalizing transformation.  If the variation of the subgroups is a function of the mean level of the subgroup then Box-Cox will suggest a transformation that will make them independent.  However, if the subgroups contain special causes, both the mean and range of the subgroup will be affected.  So the Box-Cox transformation will transform the data to compensate for the special cause.  This transformation may not be appropriate for prediction of process performance as the special causes may not be predictable.

Statman

#91076

Spiderman
Member

Great points Statman!!
I was just trained to check the distribution first… and then proceed to check for stability.  And I think you alluded to it: the distribution will dictate how you check for stability.
Question: I am definitely not as strong as you are in the statistical arena (I know enough to be dangerous, but definitely not in your league). Can you expand on the different models for process behavior, and which I should use for which distribution?  I was taught the basics – control charts – and which to use depending on the type of data (continuous vs. attribute).

Thanks for sharing…!!

