# Confidence Intervals for Non-Normal data


- This topic has 53 replies, 18 voices, and was last updated 11 years, 11 months ago by SixSigmaGuy.

- February 5, 2008 at 7:14 pm #49282

Bob Hubbard, Participant (@Bob-Hubbard)

I am working with time data relating to the Time Required to Resolve Problems. Most problems are solved in less than an hour, skewing the distribution to the left. Some incidents require many hours to repair, which is why this data is not well modeled by a Normal distribution. (Even after accounting for Special Cause, I am left with Non-Normal data.)

How do I determine Confidence Intervals on this Non-Normal data without relying on the Central Limit Theorem?

Thanks for your input!

Bob [email protected]

February 5, 2008 at 7:30 pm #168213

SixSigmaGuy, Participant (@SixSigmaGuy)

Do you mean the confidence interval around the mean? I, too, frequently have the same problem and am really glad you asked this question. When I plot a histogram of my data, it typically looks like a Chi-Square distribution with a low number of degrees of freedom. But I don’t know how to calculate a confidence interval around the mean for a Chi-Square distribution. Hopefully, someone here does and can solve this problem for us.

February 6, 2008 at 1:37 pm #168254

BH,

Have you tried the following?

Since your data do not seem to be Normal, have you tried using the median instead of the mean, as they do in the housing industry?

Have you looked at converting your Non-Normal data using a Box-Cox transformation?

Scooter

February 6, 2008 at 2:36 pm #168256

I’m no expert, but have you considered nonparametric tests, e.g. Mann-Whitney? This basically compares the cumulative distributions with each other, regardless of their fit to any standard distribution.

February 6, 2008 at 2:41 pm #168257

Disregard last post: wrong topic!

February 6, 2008 at 3:25 pm #168260

Dennis Craggs, Participant (@Dennis-Craggs)

What distribution describes the data? Have you tried Weibull or LogNormal?

February 6, 2008 at 5:19 pm #168263

A minor point, but this type of data is referred to as skewed right, not skewed left. The direction refers to the long tail.

If your data is continuous and relatively smooth, you can discover the correct distribution using Minitab allowing you to adequately describe the confidence interval.

If it is continuous and relatively smooth you can also transform it successfully if you just have a need to talk of it in terms of a normal distribution (NVA).

February 6, 2008 at 6:27 pm #168265

SixSigmaGuy, Participant (@SixSigmaGuy)

Wow!! Live and learn. I’ve been doing this for many years and have always interpreted Skewed Left and Skewed Right based on the peak of the distribution, not the tail. When I first read your post, I thought, what is wrong with this person? But then I looked it up in a few statistics books and, OMG, you are right. I’ve even taught it wrong in classes; how humiliating. :-( A more precise definition (as I found in one of my stat books) is: if the mean and median are to the left of the mode, the distribution is Skewed Left (also called “Negatively Skewed”), and if the mean and median are to the right of the mode, the distribution is Skewed Right (also called “Positively Skewed”).

Thank you for pointing out the error in my ways. :-)

February 6, 2008 at 6:34 pm #168267

Mike Carnell, Participant (@Mike-Carnell)

SSG,

Simple way to remember:

Make a fist with your thumb sticking out. Have your thumb point in the direction of the tail. If your thumb is pointing right, it is skewed right. I’ll let you figure out what it is if your thumb points left.

Good luck

February 6, 2008 at 6:51 pm #168270

SixSigmaGuy, Participant (@SixSigmaGuy)

Not sure what Bob’s requirements are, but in our case, we are looking for something more rigorous than the Mann-Whitney test (or its equivalent, the Wilcoxon Rank-Sum test); something that will increase our likelihood of rejection.

February 6, 2008 at 6:53 pm #168271

SixSigmaGuy, Participant (@SixSigmaGuy)

:-)

February 6, 2008 at 6:56 pm #168272

SixSigmaGuy, Participant (@SixSigmaGuy)

Not sure what Bob’s distribution is, but from his description it sounds the same as mine. Mine looks most like a Chi-Square distribution when the DF is low.

February 6, 2008 at 10:49 pm #168286

I’m looking for a way to say, “we can say with 95% confidence that our 2007 numbers are a statistical improvement over our 2006 numbers”. I haven’t tried Mann-Whitney yet, because I’m looking for something simple to execute and something I can explain to non-statisticians.

February 6, 2008 at 11:01 pm #168287

Thanks for the clarification on the direction of my skew. ;-)

My data is continuous, but there are tons of data points below the 60 minute mark. (Probably because there is an informal Service Level Agreement at the 60 minute mark.)

Most of these problems are solved in an acceptable amount of time (the mean is around 120 minutes over 2,500+ events), but at some point, everyone agrees that we should take “as long as it takes” to fix the problem. My gut tells me that this becomes a different process at this point, but I’m reluctant to toss the data points.

If I make some broad assumptions and remove 20 or 30 data points, my data fits Weibull and Log normal distributions.

February 7, 2008 at 12:39 am #168288

SixSigmaGuy is correct, my distribution looks roughly like a Chi Square distribution with 9 or so degrees of freedom.

February 7, 2008 at 12:43 am #168289

I did do a Box Cox transformation, but it looked a little too “scrubbed”. I like the idea of using the median, rather than the mean. How do I calculate the CIs, and is hypothesis testing the same with the median as with the mean?

February 7, 2008 at 12:46 am #168290

Love the fist analogy. Will definitely use that in my next Green Belt class, THANKS!

February 7, 2008 at 2:51 am #168292

HF Chris, Participant (@HF-Chris)

Why would you remove 20 to 30 points to make things fit better? Can you really say you have 95% confidence with that process? My suggestion is to run a goodness of fit test with multiple types of distributions. If no fit, re-examine your process. There are transformations that may work for a fit test, but what is that really going to do for you if your goal is to make the statistics look simple to your audience? Show your people the real picture and build a business plan to go fix it.

Chris

February 7, 2008 at 4:53 am #168293

SixSigmaGuy, Participant (@SixSigmaGuy)

I’ve been doing some reading on the subject, and have an idea I’d like to propose as a solution to this problem.

Assuming that the data does fit a Chi-Square distribution and I want to build a 95% confidence interval around the mean of my sample. Using both tails of the Chi-Square distribution, I can calculate the (1.0 – .95)/2 = 0.025 critical values separately for each tail; the same way I calculate the critical values when I’m trying to estimate the standard deviation of a normal distribution. I’ll call the left critical value a(L) and the right critical value a(R).

The properties of a Chi-Square distribution are (from http://stattrek.com/Lesson3/ChiSquare.aspx?Tutorial=AP):

m = mean = degrees of freedom
s = standard deviation = SQRT(2 * m)

Thus, using the same procedure as used for the z and t distributions, I can calculate the margin of error as two values:

E(L) = a(L) * s = a(L) * SQRT(2 * m)
E(R) = a(R) * s = a(R) * SQRT(2 * m)

Finally, the confidence interval would be [x-bar – E(L)] to [x-bar + E(R)], where x-bar is calculated the same way as for a normal distribution.

Does that sound like a valid approach? Does my math make sense?

February 7, 2008 at 1:55 pm #168306

I’m confused as to why you would consider a chi-square when working with continuous cycle time data. I’ve only ever used chi-square to handle proportion data.

I have done several cycle time analyses before, and have found that Mood’s Median test is the hypothesis test to use. No need to transform, etc.

February 7, 2008 at 3:31 pm #168309

Dennis Craggs, Participant (@Dennis-Craggs)

Actually the type of distribution is very important. Since the data is skewed it suggests that Weibull, LogNormal, or Exponential will provide a reasonable fit to the data. Once a distribution is fit to the data, then confidence limits on the parameters of the distribution can be established. Also, the population % that will fall into intervals can be predicted. The distribution may allow the data (and specifications) to be normalized. This will allow the calculation of indices like Ppk or Cpk.
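Editor's illustration (not part of the original post): a minimal sketch of the "fit a distribution, then put confidence limits on its parameters" idea, assuming a LogNormal model is adequate. Under that assumption the logs of the times are roughly Normal, so a large-sample interval on the log-mean translates into an interval on the LogNormal median, exp(mu). The function name `lognormal_fit_ci` and the simulated repair times are hypothetical, and the z-based interval is an approximation that presumes a large sample.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def lognormal_fit_ci(times, confidence=0.95):
    """Fit a LogNormal by taking logs of the data.

    Returns (mu, sigma, (lo, hi)), where mu and sigma are the mean and
    standard deviation of the logged data and (lo, hi) is a large-sample
    confidence interval for the LogNormal median exp(mu).
    """
    logs = [math.log(t) for t in times]  # requires every time > 0
    n = len(logs)
    mu = fmean(logs)
    sigma = stdev(logs)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    return mu, sigma, (math.exp(mu - half_width), math.exp(mu + half_width))


# Simulated repair times in minutes (hypothetical, not the poster's data).
rng = random.Random(0)
times = [rng.lognormvariate(4.0, 1.0) for _ in range(2500)]

mu, sigma, (lo, hi) = lognormal_fit_ci(times)
print(f"log-mean={mu:.3f}, log-sd={sigma:.3f}")
print(f"95% CI for the median: {lo:.1f} to {hi:.1f} minutes")
```

Note that this interval is for the median of the fitted LogNormal, not for the arithmetic mean that later posts in the thread ask about; whether the LogNormal fits at all should be checked with a goodness of fit test first.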

Minitab can be used to establish the type of distribution. Weibull and LogNormal were referenced since they usually provide reasonable fits to the data.

February 7, 2008 at 8:17 pm #168327

SixSigmaGuy, Participant (@SixSigmaGuy)

If I understand Bob’s project correctly, the goal of the project is to lower the mean; thus, he should be performing hypothesis tests around the mean, not the median. At least that’s the case with my project and it sounds similar to Bob’s.

The reason for considering the chi-square distribution, in my case, is because the histogram for my data resembles a chi-square distribution. Also:

- My distribution is not symmetric, but skewed right.
- There are no negative values in my data; thus the values are bounded by zero.

These are both characteristics of the Chi-Square distribution.

It’s true that my distribution might not have the same shape as the Chi-Square distribution, but, if it doesn’t, I need to determine what distribution it does match. Then, I’ll have the same problem of how to calculate a confidence interval for that distribution.

February 7, 2008 at 10:07 pm #168332

Is your objective simply to determine a reliable estimate of central tendency and corresponding CI for a given data set that doesn’t approximate normal?

February 7, 2008 at 10:11 pm #168333

SixSigmaGuy, Participant (@SixSigmaGuy)

Yes, that’s a good way to put it, although the central tendency metric needs to be the mean, x-bar.

February 7, 2008 at 10:15 pm #168334

Forgive my ignorance, but what is so important about using the mean?

February 7, 2008 at 10:52 pm #168335

SixSigmaGuy, Participant (@SixSigmaGuy)

It’s the metric we are trying to improve; that’s the only reason. We have a metric named MTTC, Mean Time To Close, that we use for tracking support calls. Currently our MTTC is too high and we want to reduce it. Thus, we are doing a Six Sigma project to accomplish this. I could use other metrics, such as Median or Variance, but the results wouldn’t mean much to our sponsors because they don’t tell them that the mean has been reduced.

February 7, 2008 at 11:58 pm #168338

HF Chris, Participant (@HF-Chris)

Reminds me of the joke about the two statisticians who went hunting. The first hunter shot 2 inches to the left of the deer. The second hunter shot 2 inches to the right of the deer, and both of them were happy. Why, you ask? Because by averaging the distance of both shots they got the deer.

Point is, if your managers only understand changes in the mean they don’t understand their true process problem.

HF Chris

February 8, 2008 at 12:12 am #168339

Brandon, Participant (@Brandon)

HF, that’s a lot like what we used to call the difference between a stats guy and an engineer…

If a naked lady stood against a wall and you could only move towards her 50% of the distance with each step, the stats guy said, “Forget it…you’ll never get there!” The engineer said, “I’ll get close enough…move over!”

February 8, 2008 at 12:17 am #168340

Taylor, Participant (@Chad-Vader)

Brandon, someone once said close only counts in horseshoes and hand grenades, but I think you added to the list.

February 8, 2008 at 12:39 am #168341

SixSigmaGuy, Participant (@SixSigmaGuy)

Whoever said my managers only understand changes in the mean? My managers understand statistics very well. The problem is the metric we are trying to improve is the mean. If we were trying to improve the median or the variability, we’d be after a different metric; but, in this case, we are trying to improve the mean.

No one yet has tried to answer my question. All I’ve heard are alternative approaches and/or workarounds. I’m not looking for alternative approaches or workarounds; I’m looking for a way to calculate the CI around the mean of a non-normal distribution. Doesn’t anyone know how to do this? I would have expected it to be a very common problem, especially for Six Sigma practitioners who are trying to reduce cycle times. Hasn’t anyone had to deal with this problem before?

February 8, 2008 at 12:48 am #168342

Short answer: use Minitab.

February 8, 2008 at 12:55 am #168343

SixSigmaGuy, Participant (@SixSigmaGuy)

Minitab won’t help if I don’t know what method to use. As far as I can tell, Minitab doesn’t deal with my issue. I would love it if someone could point me to something in Minitab that will help me with my problem.

February 8, 2008 at 1:48 am #168345

Just use Graphical Analysis: it will take your data and give you confidence intervals for mean, median and std dev.

February 8, 2008 at 2:17 am #168347

Forgive me if this sounds pedantic, but I am simply trying to explain what I think is the proper response to your dilemma.

Any stable process will approximate a known distribution. Based on earlier comments, is it possible that your data set(s) are not stable, hence the difficulty in determining distribution type (i.e., throwing out data points, etc.)? Have you checked? You realize that if your data set indicates instability, then a CI for any descriptor (e.g., the mean) is not valid?

And distribution type is relevant in that it allows the investigator to choose the descriptor that best represents their data set (in this case, central tendency). Project Ys that focus on time are generally bounded at zero and thus skewed to the right…this is what you would expect. Hence, the median is a better descriptor for this data set, and a nonparametric test is an option here.

If your client is using a time-based performance metric from a stable process that plots as expected (positively skewed), then using the mean and not the median is simply not representative of the process. Verify this statement through your statistical SME, bring it to the attention of your client, and have them explain or defend this practice…you might even increase your credibility with them.

Now the good news…Practically, if what you require is a test of means (e.g., ANOVA, paired t, one-sample t/Z, two-sample t/Z, etc.), these tests are largely robust to a departure from normality, and using them would be appropriate for what you have described (Bob H). There is a difference between detecting non-normality (you have) and being sensitive to it (tests of means are not).

In addition, you can invoke normality in a data set (but ask yourself why) through subgrouping (i.e., exploiting the CLT) or transformation once the distribution type and the resulting optimal lambda are determined.

Hope it helps. Trust but verify!

February 8, 2008 at 4:04 am #168349

HF Chris, Participant (@HF-Chris)

Sorry if this posts twice….my first one seems to have got lost in space.

SixSigmaGuy,

In an effort not to aggravate Stevo with “in my last company”, here it goes. I was in your position to reduce cycle time. I did deal with managers who reduced the product production schedule by 18 days to meet new part orders. I did have managers whose metric goal was to reduce the mean despite the spread and variation. In other words, the current “late” process was in controlled “chaos”. The boss’s only concern was to remove 18 days on average with confidence.

Yes, you can run an ANOVA with conservative measures that take into account your “variation”. When you run an ANOVA you need to make sure you understand the assumptions (found in a basic stats book). Yes, you can see possible improvement in the mean….But I hope you’re not missing my point. There are rules on how to handle outlier points (removing them completely is not the way). Also, what happens to the products that don’t “fit” your model? Do they also get removed from service? If your average metric gets better, will the customer be any happier if their product is not delivered at the prescribed time?

Run a goodness of fit test and look for the best transformation of your original data. Not knowing your data, a basic stats book has plots to guide your decision. If comparing two data sets, use the same transformation (apples compared to apples). Keep in mind that your products will continue to run with variation, which I assume has an impact on another part of your schedule.

HF Chris

February 8, 2008 at 4:24 am #168350

Severino, Participant (@Jsev607)

Let me start by saying that I am no statistician, so you may call me an idiot by the time you’re done reading this. My ego can accept that. While I find the question intriguing and am also interested in the answer, is it possible that you are too caught up in the minutiae of the calculations instead of focusing on the improvement?

If your goal is to reduce your mean time to close and to show that you have made a significant, robust, sustainable change to your process why not just begin by generating a control chart? Their whole purpose is to detect shifts in the mean and due to Chebyshev’s inequality they are relatively insensitive to the actual underlying distribution when three sigma limits are chosen.

Although it sounds strange, you might be able to use the Winsorized mean to calculate your control limits so that they are sufficiently narrow and then use those limits to monitor your actual mean. If you can get your mean to fall below the LCL then you know you’ve made a significant improvement in your process.

Keep in mind also that as my understanding goes, using the mean as a measurement of central tendency for a distribution that is skewed can be dangerous. For example, the way to gain the most significant shift in the mean is to manage that data which shows up in the long tail, but by doing that you would be improving your process for only a very small proportion of the population rather than making a significant improvement for a majority of the population. Therefore, in my opinion it would be very very advisable to disqualify the outliers.

Note that I said disqualify rather than ignore. For example, if this is a call center you can develop criteria along with management for not including the data from a call in the metric. If this is done beforehand and you actually develop a new baseline with such a system in place, you may actually find your data more closely approximated by a distribution whose properties are well understood. You could appease the management team by developing a separate metric for those data points which were disqualified so that they may be improved at a later date.

Although none of this is as sexy as being able to calculate a confidence interval and perform a hypothesis test to show that you have made a significant shift in the mean, it does allow you to focus more of your efforts on the improvement activity rather than getting caught up in the details of calculation. Anyway, it’s probably a stupid suggestion…

February 8, 2008 at 12:42 pm #168356

SixSigmaGuy,

Get a copy of “Statistical Intervals: A Guide for Practitioners” (Wiley Series in Probability and Statistics) by Hahn and Meeker. They most likely cover this. I don’t have my copy here, but I recall they do discuss this.

Another way is to bootstrap it. Resampling techniques like the bootstrap and jackknife are distribution independent.
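Editor's illustration (not part of the original post): a minimal sketch of a bootstrap confidence interval for the mean, using the simple percentile method, which is only one of several bootstrap variants. The function name `bootstrap_mean_ci`, the resample count, and the simulated right-skewed repair times are all assumptions made for the example.

```python
import random
from statistics import fmean


def bootstrap_mean_ci(data, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, recording the mean of each resample.
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(n_resamples))
    alpha = 1 - confidence
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Simulated right-skewed repair times in minutes (hypothetical data).
rng = random.Random(42)
repair_minutes = [rng.expovariate(1 / 120) for _ in range(300)]

lo, hi = bootstrap_mean_ci(repair_minutes)
print(f"sample mean: {fmean(repair_minutes):.1f} minutes")
print(f"95% bootstrap CI for the mean: {lo:.1f} to {hi:.1f} minutes")
```

The interval comes straight from the resampled means, so no distribution has to be named, although with heavy tails and small samples the percentile method can still be somewhat optimistic.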

February 8, 2008 at 2:32 pm #168363

Great example of statistical wrong thinking. I’d like to add that the key to getting the deer is not statistical, but management of relevant Xs, including people factors.

People-wise, the hunter must manage his emotions (by avoiding buck fever) and cultivate a passion for both continuous improvement and for bagging the deer.

Technically, the hunter needs to maintain consistent technique in aiming and shooting, estimate windage, range, and deer movement, know what his rifle and ammo can and cannot do, protect himself from the environment so he can stay in the field long enough to get opportunities, minimize scent and noise, read sign and apply this intelligence, skillfully choose a worthy target in the first place, and be capable of taking a fast corrective second shot if needed. Sounds a lot like SS.

February 8, 2008 at 2:57 pm #168370

Bob Hubbard, Participant (@Bob-Hubbard)

Dude, thanks for the insight. It was most helpful.

I have been digging deeper since my original posting, and here’s what I’ve found. The data points far out in the tail seem to fall into three categories:

1. The problem repair team was not aware that something was broken, so could not begin fixing it.
2. The problem repair team and the client agreed to limp along until the problem could be fixed.
3. Special cause variation is to blame.

Armed with this information, I’m confident I can address some of the data points most distant from the mean. At that point I will be more confident that this process is stable.

We considered using the median, but in this case, the key performance metric is the average time to fix these problems, so we’d really like to have a way to attack this from that standpoint.

I’m plowing through the sub-grouping now, and I really, really appreciate your suggested solutions!

Bob H

February 8, 2008 at 5:34 pm #168377

Brandon, Participant (@Brandon)

Or, as Ron White says, just hit the deer with your pickup at 55 mph. A lot easier…..and you don’t even have to set your beer down.

February 8, 2008 at 8:55 pm #168382

An Electrical Engineering joke:

If your thumb is pointing in the direction of the current, and the fingers wrap around in the direction of the magnetic field, then you know it’s your right hand.

February 8, 2008 at 10:35 pm #168385

SixSigmaGuy, Participant (@SixSigmaGuy)

But those metrics will be calculated assuming the data is normally distributed; my data is far from normally distributed. Check out this paper on non-normal distributions here on ISixSigma: https://www.isixsigma.com/library/content/c020121a.asp.

February 8, 2008 at 10:41 pm #168387

SixSigmaGuy, Participant (@SixSigmaGuy)

I don’t want to do a transformation. The whole point of this thread is to find out how to calculate CIs without having to do any sort of transformation.

February 8, 2008 at 10:44 pm #168388

SixSigmaGuy, Participant (@SixSigmaGuy)

Thanks! I’ll check out that book.

Bootstrapping was one of the first things I tried. Still failed to reject. Nonetheless, the purpose of my question in this thread is to find out if there is a way to calculate confidence intervals for non-normal data directly from the data, similar to how it’s done for Normal data. Hopefully the book will answer the question for me.

February 8, 2008 at 11:02 pm #168389

Chebychev and Empirical Rules

I have used Chebychev’s rule to conservatively estimate a CI for data that is not normal. See the info below, copied and pasted from http://www.stat.tamu.edu/stat30x/notes/node33.html

Hope this helps a little. bob

Knowing the mean and standard deviation of a sample or a population gives us a good idea of where most of the data values are because of the following two rules:

Chebychev’s Rule: The proportion of observations within k standard deviations of the mean, where $k \geq 1$, is at least $1 - 1/k^2$, i.e., at least 75%, 89%, and 94% of the data are within 2, 3, and 4 standard deviations of the mean, respectively.

Empirical Rule: If data follow a bell-shaped curve, then approximately 68%, 95%, and 99.7% of the data are within 1, 2, and 3 standard deviations of the mean, respectively.

EXAMPLE: A pharmaceutical company manufactures vitamin pills which contain an average of 507 grams of vitamin C with a standard deviation of 3 grams. Using Chebychev’s rule, we know that at least $1 - 1/2^2 = 0.75$, or 75%, of the vitamin pills are within k=2 standard deviations of the mean. That is, at least 75% of the vitamin pills will have between 501 and 513 grams of vitamin C, i.e., [507-2(3),507+2(3)]=[501,513].

EXAMPLE: If the distribution of vitamin C amounts in the previous example is bell shaped, then we can get even more precise results by using the empirical rule. Under these conditions, approximately 68% of the vitamin pills have a vitamin C content in the interval [507-3,507+3]=[504,510], 95% are in the interval [507-2(3),507+2(3)]=[501,513], and 99.7% are in the interval [507-3(3),507+3(3)]=[498,516].
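Editor's illustration (not part of the quoted notes): a short sketch that reproduces the Chebychev arithmetic for the vitamin pill example. The function name `chebyshev_min_fraction` is hypothetical; the mean of 507 and standard deviation of 3 come from the example above.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k standard deviations
    of the mean, for any distribution with a finite variance."""
    if k <= 1:
        # At k = 1 the bound is 0, and below 1 it is negative, so it says nothing.
        raise ValueError("Chebychev's bound is only informative for k > 1")
    return 1 - 1 / k**2


mean, sd = 507, 3  # values from the vitamin pill example
for k in (2, 3, 4):
    fraction = chebyshev_min_fraction(k)
    low, high = mean - k * sd, mean + k * sd
    print(f"k={k}: at least {fraction:.1%} of pills within [{low}, {high}]")
```

Because the bound holds for any distribution, the resulting intervals are deliberately wide; they describe where individual observations fall, not the uncertainty in the mean.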

NOTE: Chebychev’s rule gives only a minimum proportion of observations which lie within k standard deviations of the mean.

February 8, 2008 at 11:19 pm #168390

You bet! Best of luck!

February 8, 2008 at 11:33 pm #168391

SixSigmaGuy, Participant (@SixSigmaGuy)

Thanks for sharing! I gave it a try, but unfortunately, it widens the CI over what we get with a t-distribution. So, the likelihood of rejection is reduced.

I assume that for k=1, the result comes out to 0%, so it’s a pretty safe bet that the % is greater than 0%. :-)

This will certainly be handy in other situations I deal with, though. Thanks!

February 8, 2008 at 11:42 pm #168392

SixSigmaGuy, Participant (@SixSigmaGuy)

I do, very much, appreciate your help. But my goal here is to find a way to calculate a confidence interval on non-normal data, regardless of whether it’s the right approach or not. I recall doing this in a statistics course I took in graduate school, so I know it can be done; I just don’t remember how and I can’t find anything that tells me how.

The data is stable according to general inspection and any statistical analysis I’ve done to verify that the process is in control.

That said, your post has some very good information that I will find helpful on future projects.

Thanks!

February 8, 2008 at 11:47 pm #168393

Roger that. Good luck!

February 8, 2008 at 11:47 pm #168394

SixSigmaGuy, Participant (@SixSigmaGuy)

Glad you’re back Bob H. Sounds like our problems are very similar. If you come up with a solution that’s not posted here, would you mind sharing it? I got your email address from your first post; I’ll send you an email and we can discuss offline.

February 9, 2008 at 1:08 am #168395

HF Chris, Participant (@HF-Chris)

SixSigmaGuy,

Here is a walk-through example using chi square. I don’t use this often because of my own personal bias to: 1. group data per their homogeneity if possible and compare like samples; 2. transform data if close; 3. understand what in the system is causing this variance, because although all data may fit in their upper and lower limits, the process for business is unacceptable. However, in response to your request, look at this link for non-normal data and the calculations.

/resourcesquality/wqachapter10.pdf

February 10, 2008 at 9:52 pm #168419

Actually, k must be greater than 1 for this to apply. bob

February 11, 2008 at 2:38 pm #168430

I also seem to remember that in order to use the t-distribution, one of the assumptions for small sample size is that the parent population is normal, so be careful there. bob

February 12, 2008 at 2:49 am #168463

SixSigmaGuy, Participant (@SixSigmaGuy)

Yes, that’s right. My statistics reference states that if the sample size is <= 30, then the population distribution has to be normal. Of course, as the sample size increases, so do the degrees of freedom; and as they increase, the distribution widens.


The forum ‘General’ is closed to new topics and replies.