# Non-Normal Data

Viewing 10 posts - 1 through 10 (of 10 total)
• #53491

guinand
Member

Hello, I’ve been reading some topics about non-normal data, but I still have some doubts…

1) If the distribution is non-normal, is it always necessary to transform the data to get a normal distribution?

2) If the answer is no, can I continue applying the methodology with non-normal data? What does that depend on?

Thank you

#190368

Robert Butler
Participant

Your question is far too general. The answer is yes, no, maybe. If you can tell us more about what you are doing and what tests/methods you are interested in using then perhaps I or someone else could offer some suggestions.

Just to give you some understanding of why it is too general consider the following:

1. Data: for regression, neither X nor Y has to be normal. However, for the tests of significance to be accurate, the residuals must be approximately normally distributed.

2. The t-test and ANOVA are robust with respect to non-normality. However, you can have situations where the distributions are so extreme that these tests will declare non-significance when, in fact, the reverse might be true. How non-normal? Good question, with no really good answers. The best defense: when in doubt, run the Wilcoxon-Mann-Whitney or the Kruskal-Wallis test.

etc.
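As a rough illustration of point 2, here is a minimal sketch of the Wilcoxon-Mann-Whitney test in plain Python. It uses the large-sample normal approximation with a tie correction and no continuity correction, so it is only approximate for small samples. The function name `mann_whitney_u` and the sample values are made up for this example and do not come from the thread.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Wilcoxon-Mann-Whitney test (normal approximation).

    Returns (U statistic for sample x, approximate two-sided p-value).
    Assumes the pooled data are not all identical.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p


# Hypothetical data: sample x contains one extreme value.
x = [1.1, 2.3, 1.9, 2.8, 50.0]
y = [3.5, 4.1, 3.9, 5.2, 4.8]
u, p = mann_whitney_u(x, y)
print(f"U = {u}, approximate p = {p:.3f}")
```

The extreme value in `x` inflates the mean and variance that a t-test relies on, while the rank-based test only sees it as the largest rank. For real work a tested library routine would be the safer choice.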

#190369

guinand
Member

Robert, thank you for your help. You have answered part of my doubts.

At the moment I don’t have this kind of problem, but I have seen it in other projects. For example, when you use a regression, as you mentioned above, the most important thing is that the residuals have to be normal.

I think this kind of explanation is important for practitioners like me, because when we find non-normal data we usually want to transform it immediately, and there are some cases, as you explained above, where that is not necessary.

I have another question… Maybe it is too general.

When you are in the Analyze stage and doing hypothesis testing, it is important to explain the data with a test (ANOVA, regression, t-test, etc.).

The question is:
Do you have to demonstrate or show conclusions with a statistical analysis?

I ask because I have seen people explain some kinds of data using only a Pareto. Is that correct?

Thank you

#190370

Robert Butler
Participant

Statistical analysis is not limited to formal statistical tests. Properly executed graphics and/or summary tables are a form of analysis and may be all that you need.

For example, if you are trying to explain your choice of problems to address, then the Pareto is one of the tools of choice. It is a clean graphic and it allows everyone to see your data in some kind of rank order of importance, and it is a statistical summary.

You could also use a Pareto to illustrate the effects of your actions. In this case it would be two Paretos: a before and an after.
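The numbers behind a Pareto chart are simple to produce. The sketch below builds the ranked table (category, count, cumulative percent) from a list of observed categories; the function name `pareto_table` and the defect labels are invented for this example.

```python
from collections import Counter


def pareto_table(categories):
    """Return (category, count, cumulative percent) rows, largest count first."""
    counts = Counter(categories).most_common()
    total = sum(count for _, count in counts)
    rows = []
    running = 0
    for name, count in counts:
        running += count
        rows.append((name, count, round(100 * running / total, 1)))
    return rows


# Hypothetical defect logs before and after an improvement action.
before = ["scratch"] * 5 + ["dent"] * 3 + ["stain"] * 2
after = ["scratch"] * 1 + ["dent"] * 3 + ["stain"] * 2

for label, data in (("before", before), ("after", after)):
    print(label)
    for name, count, cum_pct in pareto_table(data):
        print(f"  {name:<8} {count:>3} {cum_pct:>6}%")
```

Printing the before and after tables side by side gives the two-Pareto comparison described above; the bars and cumulative line of the chart are just a drawing of these rows.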

If we consider the world of graphs in general, there are cases on record where the graph was the primary analysis and was used, in conjunction with summary tables, to solve a problem. A good example of this is Snow’s graphical summary of the cholera epidemic in the Broad Street area of London back in 1854. His graph, coupled with his summary of failed attempts to find any other meaningful relationship between those who died and those who survived in that area, forced all of the governments of Europe to take steps to ensure water quality (remember, this was 30+ years before the germ theory of disease transmission was even put forward).

If you want to explore this a bit more, I’d recommend reading the three books by Tufte:

1. The Visual Display of Quantitative Information
2. Visual Explanations
3. Envisioning Information

#190371

guinand
Member

Robert, thank you for your explanation. You have clarified all my doubts.

#190375

Jones
Participant

Several months ago, Breyfogle and Pyzdek had quite an interesting debate on this very subject on Quality Digest’s website. I suggest you search the term “transform data” at: http://www.qualitydigest.com/

#190376

guinand
Member

OK, thank you. I will check.

#190377

guinand
Member

JHJ, interesting article.
In your opinion, what would you do: transform or not?

#190381

Mikel
Member

To Zackaris on your original question.

1) The answer is no. It is anything but necessary to transform the data. There are very limited instances where transforming the data is even appropriate. Why transform the data when there is an expectation of a different distribution and that is what you find? If you are smart enough to know how to answer that question, then you are smart enough to apply the assumptions of that distribution and make a decision.
2) Yes, you can apply the method, which is DMAIC, not a specific tool. It doesn’t depend on anything. DMAIC is basically a thought process of understanding the behavior of a process before trying to fix it. Tools are just an aid in this. Use your brain and don’t just check boxes.

By the way, that was Wheeler and Breyfogle, and Wheeler is right. Breyfogle is just an academic practitioner who does not actually touch processes. Breyfogle is mumbo jumbo; Wheeler is pretty straight.

#190386

MBBinWI
Participant

Robert: As always, spot on. You may also wish to plot a run chart, or a control chart with 2 (or more) groups, to graphically show changes. And Tufte is the grand-master of visual data representation. Back in school when we were studying the Napoleonic campaign to Russia, the graph Tufte features (Charles Minard’s map of the campaign) was a very powerful graphic. Those reading this who have no idea of what I’m talking about should Google Tufte and Napoleon. Truly amazing.
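For the control chart with two groups, a minimal sketch of individuals (XmR) chart limits computed separately for a before group and an after group could look like the following. It uses the conventional 2.66 multiplier on the average moving range; the function name `individuals_limits` and the measurements are hypothetical.

```python
from statistics import mean


def individuals_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart.

    Limits are center +/- 2.66 * average moving range, the usual XmR rule.
    Needs at least two values, in time order.
    """
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


# Hypothetical measurements before and after a process change.
before = [10, 12, 11, 13, 12, 14, 11, 12]
after = [8, 9, 8, 10, 9, 8, 9, 9]

for label, data in (("before", before), ("after", after)):
    lcl, center, ucl = individuals_limits(data)
    print(f"{label}: LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
```

Plotting each group against its own limits on one time axis shows a shift in level or spread directly. Note that these limits are computed from the moving ranges and do not require transforming the data first.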
