Non-Normal Data
 This topic has 23 replies, 14 voices, and was last updated 15 years ago by Chris Seider.


May 5, 2006 at 7:02 am #43322
Hey Folks,
Need your help here!
If my data turns out to be non-normal after a normality test, how can I transform it into normal data? I tried the Box-Cox transformation in Minitab, but it doesn’t work with negative numbers. Is it my Minitab version that has a problem, or is the Box-Cox transformation not applicable to negative numbers?

May 5, 2006 at 8:59 am #137323
new guy
My understanding is that “transforming” is to be used only if the data is discrete.
If your data is continuous, you can try “segmentation” or “subgrouping” to normalize it.

May 5, 2006 at 12:07 pm #137326
Thanks for the comment. Any other information from you guys? Thanks.
May 5, 2006 at 12:10 pm #137327
Diego,
Don’t automatically assume you can transform the data successfully. Have you looked for outliers with assignable causes? Have you looked for evidence of a multimodal distribution? Sometimes there are too many sources of variation and the data simply does not fit a normal distribution.

May 5, 2006 at 12:30 pm #137328
Thank you for the response. I understand what you mean. But what if I don’t have enough resources to measure a certain process? Suppose taking a measurement is limited to once or twice, and my data turns out to be non-normal after a normality test. I know the Box-Cox transformation in Minitab uses power functions (log, square root, square, inverse, etc.), but they don’t work with negative numbers. I tried running a Weibull process capability analysis, but it also fails with a negative data response. I tried normal process capability analysis with the Box-Cox transformation option, but got the same error. Can anyone explain why, and which Minitab function I should use?
May 5, 2006 at 12:37 pm #137329
Robert Butler
Three observations:
1. The Box-Cox transform won’t work with negative numbers.
2. The Box-Cox transform is for data from a skewed distribution.
3. Transforms are used with both continuous and discrete data.
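The first observation is easy to verify directly. A minimal sketch (my own illustration, not from the thread) using SciPy’s boxcox with made-up data:

```python
# Illustration only: Box-Cox requires strictly positive data.
import numpy as np
from scipy import stats

positive = np.array([1.2, 0.8, 2.5, 3.1, 0.9, 1.7])
transformed, lam = stats.boxcox(positive)  # fine: every value is > 0

with_negatives = np.array([1.2, -0.8, 2.5, 3.1, -0.9, 1.7])
try:
    stats.boxcox(with_negatives)           # raises: data must be positive
except ValueError as err:
    print("Box-Cox rejected the data:", err)
```

This mirrors the error Minitab reports when the sample contains negative values.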
Some comments:
You said “…data turned out to be a nonnormal after normality test”. Your post gives the impression that this is all you have done with respect to checking your data. If this is the case you need to stop and start over.
1. To HACL’s points – plot the data – first, last, and always plot the data. Plot it using a histogram, plot it on a normal probability plot, and plot it any other way that makes physical sense. Plots will allow you to address everything HACL mentioned. Most importantly, the plots will put your normality test in perspective and, if the data fails the test, allow you to make a decision concerning the importance of that failure.
There are many reasonably symmetric distributions which fail a normality test but which can be analyzed with the usual tools without fear of reaching incorrect conclusions.

May 5, 2006 at 3:44 pm #137335
Ernesto Garcia
Hi! Just to support Bob’s statements:
1) Box-Cox does not work with data containing negative or zero values. If you have negative or zero values, add a constant term to all your data points to make them positive.
2) Box-Cox works for distributions that are unimodal (one peak), and where the ratio between the maximum and the minimum value is greater than 2 or 3 (see Box, Hunter and Hunter on this). Thus, when faced with a distribution that has many “peaks” (multimodal), use your process experience and stratify.
3) Minitab has another transformation available when Box-Cox does not work, called the Johnson transformation. I do not know the nature of your data/process, but you might be able to use it.
Best regards!
Er
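Ernesto’s first point can be sketched as follows (an assumed illustration using SciPy rather than Minitab; the data, shift choice, and spec limit are all made up):

```python
# Sketch: shift the data so the minimum becomes positive, apply
# Box-Cox, and apply the same shift to any spec limits.
import numpy as np
from scipy import stats

data = np.array([-3.2, -1.1, 0.0, 0.7, 2.4, 5.9, 1.3])
shift = 1.0 - data.min()        # constant that makes the smallest value 1.0
shifted = data + shift          # now strictly positive
transformed, lam = stats.boxcox(shifted)

# A spec limit must be shifted the same way before transforming it:
lsl = -2.0                      # hypothetical lower spec limit
lsl_transformed = stats.boxcox(np.array([lsl + shift]), lmbda=lam)[0]
```

Remember to reverse the shift (and the transform) before reporting results in original units, as noted later in the thread.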
May 15, 2006 at 5:20 am #137609
Hi Diego,
If you want to compute a process capability value, group your data (grouping size should be at least 4), and then check for normality.
If the data is still not normal, identify the distribution of the grouped data through Minitab:
Quality Tools > Individual Distribution Identification
A p-value greater than 0.05 indicates a good fit.
If the data is still non-normal, increase the grouping size.
Then use the process capability analysis for non-normal data, select the distribution you identified, and compute Cp or Pp.
Regards,
Sathiya

May 15, 2006 at 9:12 am #137611
Sathiya,
Are you sure?
Andy

May 15, 2006 at 9:33 am #137612
S.S.Choudhury
Diego,
First check your basic process stability for the presence of mixtures, clusters, trends and oscillations. You may use Minitab’s run chart for this. Sometimes the shape of the normal probability plot indicates a lot of things: for example, an S-shaped curve indicates multiple populations. Only once you are sure these factors are not present should you use a transformation.
As regards the negative values: the Box-Cox transformation normally does not accept them, as it is intended for distributions bounded by zero. However, if required in extreme cases, you may add the magnitude of the largest negative reading to all your readings and do the analysis, as this will not disturb the distribution. Be sure to verify the distribution of the transformed data before proceeding, and to reverse the addition before publishing your results.
Regards,
S.S.Choudhury
May 15, 2006 at 9:43 am #137613
Sathiya’s approach is correct. When data fails a normality test, individual distribution identification should be applied, and then capability analysis can be done for that distribution.
May 15, 2006 at 10:04 am #137616
I disagree. Try reading his post more carefully. He does not mention a distribution of individuals. On the contrary, he implies using the distribution of sampled averages, which is incorrect.
I also disagree with ‘fitting a distribution’ to individual data, even though some others take that view. My reasoning for taking this position is that process capability is a Japanese invention, based on a study of Shewhart charts, not on some US academic’s musing.

May 15, 2006 at 9:09 pm #137633
Paul Keller
A couple of points that have been missed or even misapplied in the thread I reviewed:
1. You need to know the underlying shape of the process distribution to calculate a meaningful Process Capability index. The standard calculations apply only to a process whose observations are normally distributed. To properly calculate a capability index for nonnormal data, you either need to transform the data to normal, or use special case calculations for nonnormal processes.
2. You should never do a transformation, or calculate Process Capability, until you have determined the process is in the state of statistical control. If the process is not in control, then it is not stable, and cannot be predicted using capability indices. Likewise, an out of control situation is evidence that multiple distributions are in place, so a single transformation for all the process data would be meaningless.
3. Subgrouping the data and analyzing the subgroup averages for normality will not tell you anything about the underlying distribution of the observations. It will merely demonstrate the Central Limit Theorem, and allow you to use the XBar chart with confidence to test for process control.
So how do you handle this data?
1. First investigate the process stability using a control chart. We could use an XBar chart with a subgroup size of 5. Why five? The CLT tells us the average of five observations from even pretty nonnormal processes will tend to be normally distributed. You can do a Normality test on these averages to verify. You might also go with a subgroup size 3 if that works, which it often does. Another approach is to use an EWMA chart with your original subgroup size of one (a lambda of 0.4 works well). This chart should handle even nonnormal data well.
2. If the process is out of control, stop there and improve the process. Don’t bother with a capability analysis or with transformation, as they will be meaningless.
3. If the process is in control, then you can estimate capability. You could either transform the data to Normal and use the standard calculations for capability applied to the normalized data, or fit a distribution to the data and calculate the capability using the percentiles of the distribution. The Johnson technique applies this latter approach.
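Step 1 above can be demonstrated with a quick simulation (my own sketch, not from the thread; the exponential data and the Shapiro-Wilk test are stand-ins for whatever process and normality test you use):

```python
# Sketch: subgroup means of a skewed process move toward normality
# (Central Limit Theorem), even when the individuals clearly fail.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
observations = rng.exponential(scale=2.0, size=500)   # clearly non-normal

_, p_individuals = stats.shapiro(observations)        # individuals: tiny p

means = observations.reshape(-1, 5).mean(axis=1)      # subgroups of n = 5
_, p_means = stats.shapiro(means)                     # means: far larger p

print(f"p (individuals) = {p_individuals:.2e}, p (means) = {p_means:.3f}")
```

As point 3 warns, a passing test on the means says nothing about the distribution of the individuals.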
I hope this is helpful. There’s more information on each of these topics in Quality America’s Knowledge Center (http://www.qualityamerica.com/knowledgecente/knowctrKnowledge_Center.htm).

May 15, 2006 at 9:59 pm #137636
The shape of the distribution is the distribution of sampled means. We all know what it is: it is normal.
Process capability can be determined by ‘back-calculating’ the variance of individuals. This was the original method used by the Japanese engineers who invented Cp and Cpk.
Just because a US academic decides to redefine a Japanese metric doesn’t make it right.
In fact, it can be most useful to ‘back-calculate’ individual variance, since it allows one to compare observed values with the back-calculated individual spread to check independence assumptions.
You also claim the optimal subgroup size is five, but Shewhart states that the optimum subgroup size is between 3 and 5. In any case, the distribution of sample means will depend more on the number of subgroups, g, than on the difference between using 3 or 4 rather than the 5 you suggest.
There is also the other issue you raised, the question of stability, which is usually determined from a Shewhart chart. Therefore, the use of a Shewhart chart should be ‘central’ to any calculation of process capability, not the distribution of the individuals, since without it there is no way of knowing whether the data is independent and homogeneous or not.
May 15, 2006 at 10:56 pm #137639
Paul Keller
I think Andy may have misinterpreted some of my comments:
The shape of the distribution of the means is predicted to be Normal by the Central Limit Theorem. In practice, it is often Normal, with larger subgroup sizes tending to be more closely Normal than smaller subgroup sizes. However, the distribution of the underlying observations is still unknown, and it is that distribution which is pertinent to the discussion of capability indices.
I did not mean to suggest that a subgroup size of five is ‘optimal’. I mentioned the subgroup size of three may give similar results. The smaller subgroup size is often preferred for economic considerations. Yet, a subgroup of size five is more likely to be normally distributed if the observations are more nonnormal, so is an acceptable place to start.
Regarding the number of subgroups, the standard control limit calculations using the tabulated control chart “constants” are based on the assumption that there is a large number of subgroups. The “constants” A2, D4, etc. (or their more basic d2, d3 and c4) are really not constant, but depend on the number of subgroups (g). If you had a small number of subgroups, you would see differences in the distribution of the averages based on the number of subgroups at a fixed subgroup size, which is why it is important to have a sufficient number of subgroups to estimate your control limits.
Andy made a statement that is unclear to me: “Process capability can be determined by ‘back-calculating’ the variance of individuals. This was the original method used by the Japanese engineers who invented Cp and Cpk.”
I think he is referring to the fact that the standard deviation of the observations equals the standard deviation of the averages times the square root of n. Yes, this is part of the accepted method used to calculate process sigma based on the XBar chart’s control limits, and this value of process sigma is used in the capability calculation for a normal distribution. However, that doesn’t help us understand anything about the shape of the distribution of the observations. Four parameters are commonly used to characterize a distribution: mean, standard deviation, skewness and kurtosis. We can mathematically “back-calculate” the mean and standard deviation of the observations from the mean and standard deviation of the averages, but we still need to know something about the shape of the distribution of the observations (i.e. its skewness and kurtosis).
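The back-calculation in question is just arithmetic; a sketch with made-up numbers (the subgroup size and standard deviation here are hypothetical):

```python
# Sketch: std dev of individuals = std dev of subgroup averages * sqrt(n).
import math

n = 5                   # subgroup size (hypothetical)
sigma_xbar = 0.9        # std dev of the subgroup averages (made up)
sigma_individuals = sigma_xbar * math.sqrt(n)
print(f"estimated sigma of individuals: {sigma_individuals:.3f}")  # ~2.012
```

As the post notes, this recovers the spread of the individuals but not their shape: skewness and kurtosis remain unknown.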
In the standard calculation of process capability, we consider the plus and minus 3 sigma levels of the process. Why? For the normal distribution, this accounts for 99.73% of the process variation. Yet in a non-normal case, plus and minus 3 sigma may provide a completely different level of protection. While the plus-three-sigma value coincides with the 99.865 percentile of a normal distribution, the percentile at plus 3 sigma for a non-normal distribution depends on the shape of that distribution. If you consider typical data from a highly skewed process bounded at zero (such as cycle times, wait times, TIR, flatness, etc.), you’ll find that the minus-three-sigma value is often negative. That’s a clue that the plus-three-sigma level is unlikely to be the 99.865 percentile. In capability analysis, it is the percentiles we care about, not whether they occur at plus and minus three sigma.
Does that make more sense?

May 15, 2006 at 11:57 pm #137642
Robert Butler
Assuming you have done all of the things you should have done before trying to compute your process capability, and assuming that after all of this effort your data really is non-normal, then you don’t have to subgroup and you don’t have to transform. What you need to do is the following:
Take your data and plot it on normal probability paper. Identify the 0.135 and 99.865 percentile values (Z = ±3). The difference between these two values is the span of the middle 99.73% of the process output; this span is the equivalent 6-sigma spread. Use this estimate for the 6-sigma term in the process capability calculations. The capability goal is for this spread to equal 0.75 × the tolerance.
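A sketch of this percentile calculation (my own, assuming NumPy; the skewed sample and spec limits are invented):

```python
# Sketch: equivalent 6-sigma spread from the 0.135 and 99.865
# percentiles of the individual observations.
import numpy as np

rng = np.random.default_rng(7)
data = rng.lognormal(mean=0.0, sigma=0.5, size=2000)  # skewed sample

p_low, p_high = np.percentile(data, [0.135, 99.865])
equivalent_6sigma = p_high - p_low                    # middle 99.73% span

usl, lsl = 6.0, 0.0                                   # hypothetical specs
pp_equivalent = (usl - lsl) / equivalent_6sigma
print(f"equivalent 6-sigma spread = {equivalent_6sigma:.3f}, "
      f"Pp = {pp_equivalent:.3f}")
```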
This is the method recommended in Measuring Process Capability by Bothe. For additional details read Chapter 8, “Measuring Capability for Non-Normal Variable Data”.

May 16, 2006 at 2:59 am #137644
Jonathon Andell
First of all, plot the data over time to see if they are statistically stable (“common cause”). Special-cause variation can create the appearance of non-normality.
If the process appears stable, and if negative numbers are feasible, you may want to offset the data: add the same number to every single data point so that the entire set is positive. Then I’d suggest Minitab’s distribution ID utility. Box-Cox can make data “look” normal, but identifying the distribution is a source of new knowledge.

December 27, 2006 at 9:39 pm #149575
Probably the best lambda is evaluated to 0 by Minitab, and in that case the Box-Cox transformation is simply the logarithm, which is not defined for the negative values in the sample. This is why Minitab is complaining.
December 27, 2006 at 11:41 pm #149578
One of Six Sigma’s many flaws is its obsession with normal distributions. This appears to stem from Six Sigma tables which require data to be normal.
There is no need for data transforms. Control charts don’t need them, nor do histograms. Transformed data loses its meaning, and transforms also require inverse transforms that become too complex for practical purposes.

December 28, 2006 at 2:33 am #149581
Sorry Hal, the first paragraph of your response is wayyyyy offffff. SS doesn’t have an obsession with normality. T-tests, ANOVA, capability, regression and a host of other statistical tools have varying assumptions about normality. You can’t lay that on SS because it makes use of the stat tools; those assumptions were there way before SS came into being.
December 28, 2006 at 4:01 pm #149598
Chris Seider
Diego,
Remember that Minitab iterates through a range of lambda values to decide what power to raise your data to. To avoid imaginary numbers, powers such as 0.5 (square root) and 0 (logarithm) are not mathematically possible if your data is negative.
Two things to try. First, make sure your max is at least twice your min, which helps. Second, add a fixed value to all of your data so the data is positive. Remember this shift is now part of your transformation, which you must also apply to your specs, etc.

December 28, 2006 at 4:19 pm #149599
Chris Seider
Hal,
It seems you have had a bad experience, or are taking a peering-over-the-wall perspective.
I don’t think the very good practitioners of Six Sigma have ever insisted on having a normal process before they can improve the process. The better ones really understand the assumptions and wouldn’t want to misapply a t-test or F-test and potentially reach erroneous conclusions.
I’d like to think of the need to apply normal distributions appropriately as like using a Phillips screwdriver on the appropriate screw head. I’d hate to use a slotted screwdriver on the wrong distribution of screws.

December 28, 2006 at 9:37 pm #149618
With reference to processes, and not surveys of people and other such enumerative studies that are frequently quoted here, how many processes do you actually think are perfectly normally distributed?
Have you ever used a Pearson lack-of-fit test?

December 29, 2006 at 1:37 am #149628
Chris Seider
I have read some interesting material arguing that few, if any, processes are actually normal.
I find such material more academic than useful. If the bell curve fits closely, using my favorite test, the Anderson-Darling statistic, I will assume the data is normal. Any deviation from a perfect normal curve could be due to any number of reasons (measurement error, not enough samples to accurately represent the population, etc.) and is not, in my opinion, enough of a reason to stop driving business or personal results using the Six Sigma methodology.
I’m curious where you are driving with your last question. Thanks.
The forum ‘General’ is closed to new topics and replies.