Normal Distributions – why does it matter?
 This topic has 15 replies, 12 voices, and was last updated 17 years, 3 months ago by Tierradentro.


November 30, 2001 at 3:01 pm #28303
Anonymous (Participant)
I am not entirely sure why normal distributions are important. During my BB training, the Master BB sometimes says that if you collect data and it is not normal, “you need to collect more data”. What does nonnormal data actually say about the process in question? Without any other information, can anything actually be inferred? What do you do if the data is just not normal, other than a Box-Cox transformation? And why can’t we simply analyze the data as it truly is?
November 30, 2001 at 3:50 pm #70275
Actually, normality is vastly overrated. :)
The basic math that drives T tests, and several other tests, was derived from an assumption of a normal distribution. So texts, especially older ones, sometimes insist that you check for normality before you run your test. As it turns out, tests of means are usually very robust to this assumption, and you can almost “violate it with impunity” according to one of the best authors I read.
When you do a 2^K, you make no assumption of normality when you compute sums of squares, degrees of freedom, mean square error, and an F ratio. The assumption of normality comes in when you assign a T score and a P. If you have an F ratio (signal to noise ratio) that is high, you don’t need the T and P. So, if you’re willing to just go with the F ratio, and ignore T and P, you don’t need normality.
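As a rough illustration of the point above (not part of the original post), the F ratio can be computed from sums of squares and degrees of freedom alone, with no distribution assumed anywhere in the arithmetic. The sketch below uses a simple one-way layout rather than a full 2^K design, and the three groups of numbers are made-up values.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F ratio: mean square between / mean square within."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared offset of each group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical responses at three treatment levels.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(f_ratio(example_groups))
```

Normality only enters if this ratio is then looked up against an F distribution to get a P value.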
Where you are vulnerable is with small samples, and when you run a one-tailed test. If you have a large sample size, you are much less vulnerable to error from nonnormality, hence the encouragement to “take more data”.
If your samples are small, you usually can’t tell whether they are nonnormal. A sample of 5 observations from a chi-square distribution with 2 d.f. will fail to test as nonnormal in a high percentage of cases.
Anyway, if you take enough real world data, it will eventually test nonnormal. As your sample size increases, you can detect progressively smaller and less important deviations from normal. And, you can never show that the data are normal in the first place… you can only say that you failed to show that it was nonnormal.
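A small simulation can make the sample-size effect concrete. This is only a sketch under stated assumptions: the "real world" process is stood in for by a gamma distribution with shape 50 (mildly skewed, my choice), and the normality test is Jarque-Bera, picked because its P value has a closed form with 2 degrees of freedom and needs nothing beyond the standard library. The asymptotic P value is rough for small samples, so the small-sample rate is indicative only.

```python
import math
import random


def jarque_bera_p(sample):
    """Asymptotic Jarque-Bera P value (chi-square with 2 d.f.)."""
    n = len(sample)
    m = sum(sample) / n
    m2 = sum((x - m) ** 2 for x in sample) / n
    m3 = sum((x - m) ** 3 for x in sample) / n
    m4 = sum((x - m) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of chi-square with 2 d.f. is exp(-x / 2).
    return math.exp(-jb / 2.0)


def rejection_rate(n, trials, rng, alpha=0.05):
    """Fraction of simulated samples of size n that test as nonnormal."""
    rejected = 0
    for _ in range(trials):
        sample = [rng.gammavariate(50.0, 1.0) for _ in range(n)]
        if jarque_bera_p(sample) < alpha:
            rejected += 1
    return rejected / trials


rng = random.Random(1)
rate_small = rejection_rate(30, 100, rng)     # same process, small samples
rate_large = rejection_rate(2000, 100, rng)   # same process, large samples
print(rate_small, rate_large)
```

The process is identical in both runs; only the amount of data changes, which is the point being made about detecting ever smaller deviations.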
There are places nonnormality is very important, but testing means usually isn’t one of them.
November 30, 2001 at 4:42 pm #70278
Your MBB does not have a clue. Go to your management and tell them you need good training.
Analysis of nonnormality can be very enlightening. There are some things that should not be expected to be normal (time and one-sided bounded specs, for example). There are nonnormal, not naturally occurring distributions that tell a lot. My training at AlliedSignal Automotive spent considerable time teaching us how to look at this, and I consider it one of the most valuable tools I learned.
November 30, 2001 at 9:38 pm #70300
The data, in and of itself, need not be normal. As has been mentioned, there are plenty of situations that generate data that does not conform to normality. However, with that said, we do take great effort to force it to take on normal aspects (e.g. averaging data so that we can use it in control charting). The F test for the equality of means (for a fixed effects model) is robust to departures from normality, and robust to unequal variances provided that there aren’t large differences in sample size.
In this case, I believe you’re talking more about residual analysis. The output that we expect from a DOE is a regression relationship. In this case, normality of the errors and constancy of the error variance are important factors to consider. The primary reason is that these residuals are our sanity check that the model we’re creating, Y=b0+b1x1+e, is the correct model. If we run a DOE and develop our Y=f(x) equation without looking at these, we may cause people to control factors which truly are not important, and we will most likely miss variables that truly should be deemed KPIVs. Of the two, unequal variance should be considered the more important. Remember, they let us get away with a “fat pencil” approach in regards to a linear NPP and hence normality.
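To show what the residual sanity check looks like in practice, here is a minimal sketch that fits Y=b0+b1x1 by ordinary least squares and prints the sign of each residual. The data are hypothetical: a deliberately curved process (y = x squared) is used so that the wrong model leaves a visible pattern. The function names are mine, not from any particular package.

```python
def fit_line(xs, ys):
    """Ordinary least squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(xs, ys):
    """Observed minus fitted values for the straight-line model."""
    b0, b1 = fit_line(xs, ys)
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical curved process fitted with a straight line.
xs = list(range(11))
ys = [x ** 2 for x in xs]
res = residuals(xs, ys)
print("".join("+" if r > 0 else "-" for r in res))  # prints ++-------++
```

Residuals from an adequate model should scatter without structure; a run of one sign in the middle and the other at both ends, as here, suggests the model form is wrong rather than the data being "bad".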
In regards to the other part of your question, if we run transformations and we still cannot stabilize the variance, then we may need to use nonparametric approaches to analyze the data. A recommendation here would be to use the Rank F test.
November 30, 2001 at 9:46 pm #70301
Denton & Stan are right on both counts.
(WARNING – long answer ahead)
For many common statistical tests, such as z & t tests and F-tests in ANOVAs, the assumption is that the means are normally distributed. As Denton mentioned, the raw data don’t really have to be normally distributed (I had posted a question related to this some time ago).
The Central Limit Theorem (although it sounds “theoretical” it is important and pretty simple – I usually teach it to people) is the thing that makes these tests work regardless of the distribution of the raw data. It basically says that regardless of the distribution of the raw data, the mean of the data will tend toward a normal distribution as the sample size for the mean gets larger. For symmetric distributions this can happen with sample sizes as small as 5. For wilder, nonsymmetric distributions even a sample size as small as 50 will still result in quite normally distributed means.
Many textbooks DO say that the raw data have to be normally distributed for t-tests, but to my knowledge that is not true. In my opinion you just need largish sample sizes, and the more nonnormal the distribution of the raw data is, the larger the sample sizes you’ll need.
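The Central Limit Theorem is easy to see in a quick simulation. The sketch below is only an illustration with assumed settings: the raw data come from an exponential distribution (strongly skewed, my choice), and skewness of the resulting means is used as a simple yardstick for how far they still are from a symmetric, normal-looking shape.

```python
import random


def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skew_of_means(sample_size, n_means, rng):
    """Skewness of n_means averages, each taken over sample_size raw values."""
    means = [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(n_means)
    ]
    return skewness(means)


rng = random.Random(2001)
skew_by_n = {n: skew_of_means(n, 4000, rng) for n in (1, 5, 50)}
for n, s in skew_by_n.items():
    print(n, round(s, 2))
```

For exponential raw data the skewness of the mean falls off as 2 divided by the square root of the sample size, so the simulated values should shrink steadily from the raw data (n = 1) to n = 50.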
Xbar control charts work the same way, so are much less affected by nonnormality. Individual charts, on the other hand, DO require normality of the raw data.
Other tests, such as tests of variances & standard deviations, tend to require a much larger sample size. In the past I’ve found that accurate estimation of the standard deviation can require sample sizes of, say, 100 or higher.
Other methods, such as process capability indices, rely on the distribution of the raw data rather than the means, so the Central Limit Theorem won’t help there. When working with process capability you really do need to assess the normality of the distribution. As Stan said, some characteristics are just not normally distributed – increased sample sizes do nothing to help that. That is where nonparametric or nonnormal methods come into play. For example, MINITAB provides a Weibull distribution-based capability tool (the Weibull is pretty flexible and can model many symmetric and nonsymmetric distributions). Other software uses Pearson Curves, which are a family of distributions that also is very flexible.
If you have large samples you can also use nonparametric percentiles to calculate capability indices, but that typically requires VERY big samples (certainly 10,000 and more).
Another place that distribution assumptions become important is when doing reliability analyses. Typically you’ll try to find a distribution (such as Weibull or Lognormal) and its parameter estimates that accurately model your failure data. One of the tasks is finding a distribution that models the data well.
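For what it’s worth, here is a minimal sketch of the percentile approach. It assumes one common percentile-based definition of the index (median in place of the mean, and the 0.135th and 99.865th percentiles in place of the minus and plus 3 sigma points); other definitions exist, and the function names and spec limits are made up for illustration.

```python
def percentile(sorted_data, p):
    """Linear-interpolation percentile for p in [0, 1] on pre-sorted data."""
    pos = p * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] + frac * (sorted_data[hi] - sorted_data[lo])


def percentile_ppk(data, lsl, usl):
    """Capability index from empirical percentiles, no distribution assumed."""
    s = sorted(data)
    p_low = percentile(s, 0.00135)    # stands in for mean - 3 sigma
    p_mid = percentile(s, 0.5)        # median stands in for the mean
    p_high = percentile(s, 0.99865)   # stands in for mean + 3 sigma
    upper = (usl - p_mid) / (p_high - p_mid)
    lower = (p_mid - lsl) / (p_mid - p_low)
    return min(upper, lower)


def points_beyond_tail(n):
    """Expected number of observations beyond one extreme percentile."""
    return n * 0.00135


# Hypothetical measurements and spec limits.
measurements = list(range(0, 10001))
print(percentile_ppk(measurements, lsl=-1000, usl=11000))
print(points_beyond_tail(10000))
```

The second function hints at why such big samples are needed: even with 10,000 readings only about 13 or 14 fall beyond each extreme percentile, so the tail estimates rest on very few points.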
Sorry this was so long – it’s late on Friday PM and I’m just having fun.
November 30, 2001 at 10:09 pm #70303
This is mostly for Ken… just sharing one of the fun insights that came my way.
Wheeler did an investigation into whether IMR charts need normality in order to work. His conclusion: Nope. He ran simulations using 1143 different nonnormal data sets, and found that they worked reasonably well, regardless of distribution shape. Yes, they do work just a little better if the data are normal, but not much. He wrote up his results in “Normality and the Process Behavior Chart”. Since I was originally taught you need normal data, it came as kind of a shock, but I read his stuff and have to believe his result, since the data are there to support it.
Thought you might enjoy. Have a great weekend.
December 1, 2001 at 5:40 am #70314
Denton,
Curious about your response to the efficiency of the IMR chart and the original question tied to statistical tests. Do you consider evaluating process data using a control chart a statistical test? If so, how is the test formed and what is the statistic that is computed? Perhaps Wheeler’s intention in evaluating 1143 distributions of varying shapes was to illustrate how robust the control chart is at providing an accurate signal of process change despite the shape of the distribution. At least, that’s how I read his work.
Ken K.,
My experience is that some data are naturally nonnormal. Examples are the time-of-flight data used with catapult simulations, and many chemical analyses of organic and inorganic compounds. In many cases these data are best transformed by taking the natural log or log base 10 of the individual values. In ANOVA tests you are determining differences between means by comparing the ratio of differences between treatment groups to differences within a treatment group. This comparison allows you to estimate an F-value, a statistic that is sensitive to how well the errors within a given treatment group fit a normal distribution.
Most of my training has supported the above understanding for some time. However, I may be missing something here.
December 2, 2001 at 4:30 am #70322
Ken –
You know the stereotype of two Rabbis arguing over points of doctrine? My friend Bill and I used to do that on points of statistics…. miss him now that he has left Iomega. This is about as close as I’ve come… sure enjoy the interchange.
Don and I have discussed his work at length, and his intent is to show that normality almost does not matter in an IMR chart. The limits were set with an assumption of normality, but it turns out that they work almost as well with any distribution you can imagine. Bill and I both had to change our thinking on that one. Bill came close to getting into a debate with Don in front of a class over that one…
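For anyone who wants to try this on their own numbers, here is a minimal sketch of the usual individuals chart limits (mean plus or minus 2.66 times the average moving range, where 2.66 is 3 divided by d2 = 1.128), followed by a quick count of points falling outside the limits for normal and for exponential data. The simulation settings are my own assumptions, not Wheeler’s study, and the function names are hypothetical.

```python
import random


def imr_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(values) / len(values)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of two points.
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def fraction_outside(values):
    """Share of points beyond the chart's own limits."""
    lcl, _, ucl = imr_limits(values)
    return sum(1 for v in values if v < lcl or v > ucl) / len(values)


rng = random.Random(7)
normal_data = [rng.gauss(10.0, 1.0) for _ in range(5000)]
skewed_data = [rng.expovariate(1.0) for _ in range(5000)]
frac_normal = fraction_outside(normal_data)
frac_skewed = fraction_outside(skewed_data)
print(frac_normal, frac_skewed)
```

Both processes here are stable by construction, so every point outside the limits is a false alarm. The skewed data should show more of them than the normal data, but still only a few percent; whether that counts as "almost as well" is the judgment call being debated in this thread.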
The ANOVA case is also interesting. When you assign a T and a P value, you are making an assumption of normality. When you compute an F ratio, you are not. The F ratio you get in ANOVA is purely signal to noise, and is independent of distribution. You are comparing Mean Square Within to Mean Square between, and those are defined for all distributions. Now if you want to perform an F test on the F ratio, you have a different kettle of fish.
Yes, I consider IMR charts a statistical procedure. Your statistics are the mean and the limits, and your result is to know whether your process conforms to the rules or not. I think it qualifies.
December 2, 2001 at 7:00 pm #70327
Ken Myers (Participant)
Denton,
You and I have had discussions on various points in past postings on this site. I too have enjoyed the interchange. You sound like a very knowledgeable guy, and your guidance is usually accurate. However, between the two of us a midcourse correction concerning the points raised in this thread might be helpful. I’m not too fluent in religious doctrine, nor am I familiar with the two Rabbis arguing over such. So I’ll try to stick to the logic of your answer to the original question posed by “anonymous” on the need for normally distributed data in performing a statistical analysis or test.
Your answer on 30 Nov attempted to use an IMR chart and Wheeler’s work to illustrate that normality is not a requirement for many statistical tests. At least that was my understanding from your example. First, I completely agree that the effectiveness of IMR charts is NOT based on the distribution of the data. This has been a clear understanding of mine for over 20 years, and is not disputed. However, I take some exception to the inductive method you used to show that normality of the data and/or the errors is not needed, on the strength of Wheeler’s work with 1143 data sets showing the detection efficiency of IMR charts. This reasoning and the basis for its conclusion appear to be flawed on two points:
1) Control charts are NOT formal tests of statistics. References: “Out of the Crisis”, Deming, pp. 334-335; “Advanced Topics in Statistical Process Control”, Wheeler, pp. 16-17.
2) It draws a general understanding of statistical theory from a specific observation (the use of IMR charts) through the inductive process. This is not the usual way of reaching a logical conclusion on general observations.
Conclusion: Control charting of process data is not a statistical test.
As such, it is not appropriate to use it to show any relationship between statistical test methodology and the assumptions and requirements on the distribution of the population and/or experimental errors of the tested data.
On the question of whether an ANOVA requires normally distributed data: a fine point here, as I tried to allude to in my earlier post, is that the errors need to be normally distributed. In fact, the general assumptions for Analysis of Variance (paraphrased) are as follows:
1) Input effects are independent from each other.
2) Outputs are independent from each other.
3) Means and variances are additive.
4) Experimental errors are independent.
5) Homogeneity of variances for all input effects.
6) Distribution of experimental errors is normal.
Reference: “Statistics Manual”, E.L. Crow, F.A. Davis, and M.W. Maxfield, pp. 119-120.
Item six above was my earlier point. While the populations may not need to be normally distributed, possibly your point, the errors do need to be independent and normally distributed. This means that to effectively use ANOVA to identify the KPIVs in a system (Type II test) or make comparisons between means (Type I test), a formal residuals evaluation of the errors is necessary. This, again, is a fine point, but nonetheless an important requirement of the test.
Concerning an F-test of variances, the following assumptions apply:
1) The populations have normal distributions.
2) The samples are random samples drawn independently from the two populations.
Same reference as above, pp. 74-75.
Item 1 above is the key point.
Conclusion: The F-test of variances DOES require normally distributed data. Therefore, to effectively perform this test, a mathematical transformation of the individuals to ensure the data are close to normally distributed may be necessary.
No, I’m not a statistician! No, I’m not one of these anal retentive guys who needs to be right all the time… I’m just a guy who works hard to ensure my guidance is as accurate as it can be. In my past work, I’ve gotten myself and others in more trouble than I can say using the same inductive reasoning above. This is as much a human endeavor as it is an intellectual one. I sure hope I did not offend those Rabbis! Take care.
Ken Myers
December 2, 2001 at 10:16 pm #70330
Ken, I’m sitting here a little cross-eyed from lack of sleep… don’t know if I have fully assimilated all that you have said.
The big distinction I draw is between doing an F test, where you assign a P value, and creating an F ratio. My statement is that the F ratio is signal to noise, with or without the assumption of normality. Wheeler develops that idea in Understanding Industrial Experimentation, page 96. You can’t assign a P value for an F ratio of 100:1 without assuming a distribution, but if you’re happy with 100:1 and no P value, you don’t have to make that assumption.
My original point on normality was a little simpler than all that… there are times, such as in IMR charts, signal/noise ratios, and T tests, that normality doesn’t matter much at all. Of course, there are other times that it matters a lot. I was simply saying that there is no need for us to get totally tied up in our socks by insisting on normality in all cases… especially since practically all real-world distributions can be shown to be nonnormal, if you take enough data.
Now if Wheeler is saying that Control Charts aren’t statistical tests, that’s interesting…. I hadn’t thought of them that way, which is the fun part.
Normal residuals: I have generally taken them to be evidence of having met the other assumptions, rather than as a starting requirement…. have to think about that again. My view is that if your model sucks all the information out of the data, all that’s left is random noise, which tends to be normally distributed.
The reference to the Rabbis was just me laughing at myself and Bill… neither of us is Jewish, but there is this stereotype of Rabbis recreationally arguing very fine points of doctrine. One day, I realized that he and I were doing the same thing with statistics, and sort of enjoyed the mental image.
January 25, 2002 at 2:51 am #71522
Stan Alekman (Member)
Control charts for individuals do not have the Central Limit Theorem to provide normality, because individual values rather than averages of small groups are used. Must individuals control charts have an underlying normal distribution of the charted variable?
The formulas for calculating control limits are based on between-group variance and assume that within-group variance is zero or very small. Are these formulas based on distributional assumptions? I am not aware of the derivation of these formulas. Are there good references for this?
Stan Alekman
January 25, 2002 at 2:53 am #71523
Stan Alekman (Member)
Can you provide a reference to Wheeler’s article on his findings regarding normality requirements?
thank you
stan alekman
January 25, 2002 at 1:06 pm #71533
Dave Strouse (Participant)
Stan –
You had posted another message earlier this week where you made the same assertions: that the control chart is based on between-group variation and an assumption of within-group variance being zero.
Please stop propagating this error. You are incorrect in these statements. If you have references that say this, please provide them so that we can all learn this wonderful new way to look at the world.
All control charts are based on capturing the within subgroup variation as common cause. This is the concept of rational subgrouping. Once this is established, any between groups variation outside of that expected based on the normal within group variation is seen as special cause variation.
Please take the time to learn a little about SPC so that you can be more knowledgeable. Here is a suggested reading list:
The best book I have yet seen on practical SPC is “Understanding Statistical Process Control” Wheeler and Chambers. Chapter nine has a section on where the control chart factors come from.
“Economic Control of Quality of Manufactured Product” W. Shewhart. This is the original work by the master. I don’t have a personal copy, but I know he goes into detail as to how the limit factors were derived.
“Process Quality Control – Troubleshooting and Interpretation of Data” Ott, Schilling & Neubauer
Grant and Leavenworth and Douglas Montgomery also have standard works on this subject. The AIAG manual on SPC is also helpful.
June 4, 2003 at 5:35 pm #86667
John Carder (Participant)
Hmm. Not sure if this discussion is still alive, but it’s on the front page, and an important topic. Also, I’m not sure if this discussion is supposed to be the ‘definitive’ guide to the question (plenty of resources on the web for discussions on normality, including here), but, for what it’s worth, here are some thoughts on the messages I read.
With regard to the last posts, it’s a bit confusing. One person seems to be talking about control charts for ‘individuals’ (single data points) and another about ‘within subgroup variation’. By definition, there can be no variation within a point represented on an ‘individuals’ chart, since it represents just that – a single measure. What we are looking for on an individuals chart is:
1. Is the data distributed normally? (Use Minitab or your favourite tool to check, plus also just take a look at the raw data – a dotplot is a good place to start. Check for any extreme points (‘outliers’) and make sure that they are not, for example, measurement errors. You did do R&R before measuring, right?)
2. If the data is ‘normal’, (and plenty of times it is not, do some reading), then three things interest us:
(i) Where the process is centered – that’s what you will get, on average. (But be careful, on some control charts the centre line is the median, i.e. with 50% of the values measured on one side, 50% on the other. Otherwise the middle line is the average, a good guide to ‘where your process is aiming’).
(ii) The next thing to look at is not, I suggest, for signals that the process is ‘out of control’ (sometimes called ‘special causes’), but rather, what are the values of the ‘control limits’, (the red lines if you are using Minitab, which a lot of people seem to…including me). How were they calculated? What does that mean for your process? How do they compare to customer spec.? (But let’s not get into process capability here…)
(iii) Finally I get back to the topic – normality. Data which is not distributed according to Gauss’s (or Laplace’s, if you are a Francophile) famous curve will give BOTH a false calculation of your control limits (on an individuals chart) and false ‘signals’ of special cause variation. Hence, need to check for normality when doing individuals charts… These control limits show what is assumed to be the range of ‘usual’ variation in your process. Of course, since these are based (on individuals charts) on normal theory, the closer the point is to the control limit, the less likely it is to have been created by your standard ‘under control’ process. Hence once you get a point, say, more than three sigma away from your centre line, it is assumed to be highly improbable that the point was created by your standard process. Again, if your data is not ‘normal’, then this test just does not work… (By the way, please do not forget the other tests for ‘special’ causes, 14 points up and down, etc. Note that Minitab does not activate this by default). Also, for highly critical applications, consider using 2 sigma control limits. You’ll get more ‘false positives’, but in, say, the medical world, you’ll also get fewer dead patients…
As for ‘between group’ and ‘within group’ variation, I agree with the last post (musical reference?), in that, when measuring high-volume processes, or where sampling is costly, it certainly is worthwhile sampling in rational subgroups. But whilst it is true that we hope that the variation within each subgroup represents all the ‘common cause’ variation, and none of the ‘special cause’, please do not take this as fact. Look at the two types of variation – does one group have a significantly higher variation between the, say, five points measured? If so, why? Go check – measurement error, transcription error, or enemy action? If all OK, then, and only then, read the ‘upper’ chart showing ‘between group’ variation. Read as per an individuals chart, really.
Finally, as to the point about normal distribution of the residuals: nonnormal residuals most often seem to arise when people are trying to fit a straight (linear) regression line to data where in fact the process is better explained by a nonlinear model.
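To put rough numbers on how much the three-sigma rule depends on shape, here is a small sketch that works out, for a stable process, the share of individual values lying more than k sigma from the mean. The three distributions are my own picks for illustration, and the limits are taken at the true mean and sigma rather than estimated from a chart.

```python
import math


def normal_beyond(k):
    """P(|X - mean| > k sigma) for a normal distribution."""
    return math.erfc(k / math.sqrt(2.0))


def uniform_beyond(k):
    """Same probability for a uniform distribution on [-a, a]."""
    # Sigma is a / sqrt(3), so the whole range lies within sqrt(3) sigma.
    half_width_in_sigmas = math.sqrt(3.0)
    if k >= half_width_in_sigmas:
        return 0.0
    return 1.0 - k / half_width_in_sigmas


def exponential_beyond(k):
    """Same probability for an exponential distribution (mean 1, sigma 1)."""
    upper = math.exp(-(1.0 + k))                            # beyond mean + k sigma
    lower = 1.0 - math.exp(-(1.0 - k)) if k < 1.0 else 0.0  # below mean - k sigma
    return upper + lower


for name, func in (("normal", normal_beyond),
                   ("uniform", uniform_beyond),
                   ("exponential", exponential_beyond)):
    print(name, func(3.0))
```

So the chance of a point beyond three sigma can be smaller than the normal value as well as larger, depending on the shape; the chart does not fail in one direction only.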
John
June 4, 2003 at 9:18 pm #86672
Gabriel (Participant)
Some comments:
1. I am not very sure that we should always care too much about normality when using individuals charts. After all, R, S and P are clearly not normal, but when we chart them we use ±3 sigma limits as if they were normal. If we should care for individuals, we should care for them too. And, furthermore, we should care for Xbar too. The central limit theorem does not state that the distribution of the averages is normal, but that it tends to be as the sample size increases. And subgroups of 2 or 3 may not be enough if the individuals are pretty nonnormal. Also, d2 (used for the Xbar control limits calculation) assumes a normally distributed process.
2. (i) The average of the medians of the subgroups is much closer to the process average than to the process median. So the median’s average is an estimator of the process average, not the process median. But, if the process is normal as you assumed for this point, then all this is useless since the median of a normal distribution equals its average.
(iii) “Data which is not distributed according to Gauss’s (…) curve will give BOTH a false calculation of your control limits (on an individuals chart) and false ‘signals’ of special cause variation. Hence, need to check for normality when doing individuals charts…” Please, tell us what a “true” control limit is, so we can derive what a false control limit is. We know what a false signal is. It’s a signal due to chance in a stable process (not due to variation from a special cause). All charts will give you false signals from time to time even if the process is normal. Furthermore, there are several distributions that have no individuals beyond ±3 sigma, and for them you will get no false “point beyond control limits”. The uniform and triangular are just two examples of those distributions. The normal is not one of them (0.27% of the individuals are beyond those limits).
June 4, 2003 at 10:13 pm #86673
Tierradentro (Participant)
Hi Gabriel
So that’s two of us with too much time on our hands, eh?
Thanks for your reply – do not disagree with anything, see it more as a clarification. That’s fine, since my post was intended to be more of a base than a stats class… Reading some of the posts, seemed to me that that is what is appropriate. For ex. discussion of median vs. mean – most of the people I coach don’t even look!
As for ‘correct’ control limits, well, as you know, nothing in stats is certain; as Box said, ‘all models are wrong, but some are useful’. There are different ways of calculating limits (based on R tilde and R bar, or different ways of calculating sigma for simpler models based on just +/- 3 sigma). What’s the ‘right’ one, you ask? Well, for me, the one where the chance of detecting a true process deviation (shift) with such limits is greater than the chance of detecting a false positive… and as you point out, that depends upon your data distribution. Still, a topic for another forum, methinks… normal distributions, why does it matter? Because it’s the start point for 99% of the people we have to try and help.