Did you ever finish the table you were working on? I have a problem that involves unequal variances and skewed data. I am limited by the software we are using, so a nonparametric test is not an option, and I need to know whether ANOVA is still OK to use.

Really hope you get this,

William

Happy to help. I will make contact off-line to better understand the issues.

regards

Robin

I too have been struggling to apply statistical tools to data analysis with Six Sigma rigour.

I would appreciate any help in finding good references on a practical approach to statistical data analysis.

I also do not wish to be a bookworm, just reading books and gaining no experience. I have completed Six Sigma training but have not yet been able to use the tools appropriately.

Any help in clearing up my confusion would be highly appreciated.

Regards.

Thanks for the link, good article.

In the end I have written a small training course for our Black Belts that I presented at our lunch and learn sessions. This has helped them understand and deal with non-normal data in an environment where non-normality is the norm.

Regards

Robin

Handling Non-Normal Data By Shree Padnis https://www.isixsigma.com/library/content/c020121a.asp

It helped me understand my data more. You may want to set a lower spec limit or upper spec limit. In practical terms, my data can’t be above or below this. It’s a good way to eliminate outliers which may have been wrong from the get-go.

The article also gives you the options of transformations which will make your data normal. If you use Minitab, the Individual Distribution Identification is great!
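As a rough illustration of the transformation idea (this is not taken from the article), here is a minimal sketch that measures skewness before and after a log transform. The simulated lognormal data, the seed, and the helper name `sample_skewness` are all assumptions made for the example; real process data may need a different transformation.

```python
import math
import random

def sample_skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5 (0 for symmetric data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data: lognormal draws (an assumption for illustration).
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 0.8) for _ in range(500)]

# A log transform turns lognormal data into normal data, so skewness should drop.
logged = [math.log(x) for x in raw]

print(f"skewness before log transform: {sample_skewness(raw):.2f}")
print(f"skewness after log transform:  {sample_skewness(logged):.2f}")
```

The log works here only because the data were generated as lognormal. For other shapes a different transformation may be needed, which is where a tool like Minitab's distribution identification helps.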

Good luck!

Regards, Robin

                  Skewness
Kurtosis       0         1       2

  -1.5       5.36

   0       **5.00**     5.1     5.2

   2         4.52       4.62    4.72

I tried retyping it above. If this doesn’t work go over to the discussion forum and under search discussion forum type in "robert butler non normal" and sort by date. The message is #6 and is titled Re: ANOVA. I’m sorry for cluttering up your comments section like this.

The best short discussion of this question of which I’m aware can be found on pages 48-56 of The Design and Analysis of Industrial Experiments by Davies, 2nd edition.

The text is too long to quote here, but the short version is this: if we use skewness and kurtosis to define what we mean by departures from normality (remembering that an ideal normal distribution has skewness and kurtosis = 0), and if we ask how much these departures affect our level of significance (that is, we wish to test for P < .05, but what are we really testing when the data aren’t normal?), then Table 2A.3 on p. 56 gives the following.

For an Analysis of Variance of 5 groups of 5 observations each, the true percentage probabilities associated with 5% normal-theory significance are:

                Skewness
Kurtosis      0       1       2

  -1.5      5.36

   0        5.00    5.1     5.2

   2        4.52    4.62    4.72

The 5.00 entry (skewness 0, kurtosis 0) is the nominal P < .05.

So, worst case (that is worst case in terms of overall shape) with a 2,2 instead of a 0,0 for skewness and kurtosis your P < .05 would actually be a P < .0472. In short, even extreme non-Normality has little se
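For anyone who wants to check this kind of robustness claim themselves, here is a minimal simulation sketch. It does not reproduce the Davies table. It draws 5 groups of 5 observations from the same distribution, so every rejection is a false alarm, and counts how often the one-way ANOVA F statistic exceeds the 5% critical value. The assumptions are mine: the exponential distribution stands in for "skewed data" (its skewness and kurtosis differ from the table's 2,2 case), the critical value 2.866 for F(4, 20) is taken from a standard F table, and the function names are made up for the example.

```python
import random

# Upper 5% point of F(4, 20) from a standard F table (5 groups, 5 observations each).
F_CRIT_4_20 = 2.866

def f_statistic(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def rejection_rate(draw, n_sims=5000, k=5, n=5, seed=1):
    """Share of simulated experiments with F above the 5% critical value.

    All groups come from the same distribution, so this estimates the
    true Type I error rate of a nominal 5% test.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        groups = [[draw(rng) for _ in range(n)] for _ in range(k)]
        if f_statistic(groups) > F_CRIT_4_20:
            rejections += 1
    return rejections / n_sims

normal_rate = rejection_rate(lambda rng: rng.gauss(0.0, 1.0))
skewed_rate = rejection_rate(lambda rng: rng.expovariate(1.0))
print(f"normal data:      {normal_rate:.3f}")
print(f"exponential data: {skewed_rate:.3f}")
```

Both rates are simulation estimates, so they will wobble with the seed and the number of runs. The point is only to see whether the skewed case lands anywhere far from 5%.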

Thanks for the feedback. I couldn’t figure out how to access the message ID. It seemed too far back to access? Could you send the link please?

Ron

Good example. The problem I have is translating medians back to business-speak. Customers tend to work with averages and their eyes start to shut at the mention of medians. But I focus on the outliers and talk about how eliminating these will have the biggest impact on improving the average.

Agree with the do-all-tests approach. I like to be absolutely sure (95% confidence?) on results because of the danger of making Type I and Type II errors. Calling something and being wrong is embarrassing!

What I am doing is pushing the academic side and I have already got a few wins. For example, I found an immediate application for the Spearman rank correlation coefficient.

Please publish the link to the blog when you write it.

Here is my take on it. If we go back to the basics… like why we generally prefer the median over the mean with non-normal data, I always come back to the "house price" example.

In most American subdivisions we come across a few very expensive homes… these homes can skew the "mean home price" to the right while the "median home price" is a little more left (and more realistic).

This is at the heart, in my opinion, of why we need to be careful when testing non-normal data. The "expensive" homes can skew the results, moving us away from reality.
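To put made-up numbers on the house price example, here is a tiny sketch. The prices (in thousands of dollars) are invented for illustration: six ordinary homes and one very expensive one.

```python
from statistics import mean, median

# Hypothetical subdivision prices in $1000s; the last home is the outlier.
prices = [210, 225, 230, 240, 255, 260, 1900]

mean_price = mean(prices)
median_price = median(prices)

print(f"mean price:   {mean_price:.0f}")
print(f"median price: {median_price:.0f}")

# Dropping the one expensive home barely moves the median but moves the mean a lot.
typical = prices[:-1]
print(f"mean without the outlier:   {mean(typical):.0f}")
print(f"median without the outlier: {median(typical):.1f}")
```

The single expensive home pulls the mean well above what any of the other six homes cost, while the median stays in the middle of the ordinary homes.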

Another option is this… don’t worry too much about it and do both a parametric AND a nonparametric test when faced with non-normal data. It is just a few more clicks in Minitab and you can share the results of both tests, proving how diligent you are! I find in most cases the results are the same anyhow.

Of course you won’t read this in some stuffy stats book… but this less dogmatic approach has served me well.

Plus, ask 4 different statisticians this question and you will likely hear 4 different things. And since many of them never leave the classroom (or hotel conference room where they teach others) let’s go to the gemba and learn for ourselves. That’s my take on it at least.

Perhaps I will blog about this in the future.

To your table: look up the two-sample t-test with unequal sample sizes and unequal variances. As for the issues surrounding non-normality, look up message ID 110215 over in the discussion forum.
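For reference, the two-sample t-test with unequal sample sizes and unequal variances is usually called Welch's t-test. Here is a minimal sketch of the statistic and its approximate (Welch-Satterthwaite) degrees of freedom. The function name `welch_t` and the sample data are made up for the example, and the p-value lookup against a t table is left out.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test.

    t  = (mean_a - mean_b) / sqrt(s_a^2/n_a + s_b^2/n_b)
    df = Welch-Satterthwaite approximation (usually not an integer).
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical samples with different sizes and clearly different spreads.
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
print(f"t = {t:.3f}, df = {df:.2f}")
```

Compare |t| with the critical value from a t table at the computed df (rounded down to be conservative) to decide significance.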
