iSixSigma

Anderson-Darling Test


Viewing 17 posts - 1 through 17 (of 17 total)
  • #44991

    Brit
    Participant

    I’ve noticed that when posts here discuss testing for normality, we tend to use the Anderson-Darling test.  I wondered why the Shapiro-Wilk test or KSL isn’t mentioned as much. Can any of you stat guys/gals give me a reason, if one exists, why we should use one or the other?  I know KSL is better for sample sizes greater than 2000, but was wondering specifically about Shapiro vs Anderson.
    Thanks!

    #145536

    Eric Maass
    Participant

    Hi Brit,
    I think these various normality tests have been discussed several times before in this forum, so you might want to do a quick search on that.
    I was under the impression that Shapiro-Wilk is more popular / more widely used than Anderson-Darling, but both are considered among the more powerful tests for normality.  I’ve sometimes joked that the Anderson-Darling test sounds like the married name for Wendy from Peter Pan, and KSL sounds like a merger of two vodkas.
    I think the Kolmogorov-Smirnov test is of historical interest, but not as powerful as Anderson-Darling, Shapiro-Wilk, or the Ryan-Joiner test (which sounds vaguely similar to the names of those who developed Minitab).

    #145537

    Brit
    Participant

    Thanks Eric.
    I was using Shapiro mostly and was just wondering if I was making a mistake – or more importantly, informing others to do the same.  Again – thanks for the reply.

    #145541

    Robert Butler
    Participant

     The main difference between the two tests is that Anderson-Darling emphasizes the lack of normality in the tails of a distribution whereas Shapiro-Wilk focuses on the lack of symmetry (essentially it focuses on the center of the distribution). 
      As far as I’m concerned, the biggest problem with either of these tests is just that – they are mathematical tests and when used without any other kind of effort on the part of the investigator they can generate concern where none is needed.
      At the risk of sounding like a broken CD (records are so 20th Century :-) ), before you try to use any test for normality you need to go out and calibrate your eyeballs with respect to plots of normal data.  Every stat package out there has a plethora of random number generators.  Pick the one that will generate random numbers from a normal distribution and then generate samples of 10, 20, 30, 60, and 120 and PLOT THE DATA ON NORMAL PROBABILITY PAPER!!!!! 
      After you have plotted the data, run the Anderson-Darling and the Shapiro-Wilk and whatever other test you fancy on each of those subsets and note the results on the respective normal probability plots. Paste those plots up on the walls in your cubicle and really look at them.  If you do this you will have a direct visual reference to what perfectly normal data looks like for these various sample sizes, and you will have some idea of how non-normal they look and how badly they fail or how well they pass your favorite test. 
     With your cubicle papered thus you will now be in the position of making sound judgments with respect to the next block of real data shipped your way.
      When it arrives, first plot it on normal probability paper and compare its visual appearance to what you have on your walls.  Find the plot that most closely approximates what you have in front of you and note the corresponding normality test results.  Then run your test of choice and see how well those numbers compare with what you have on your wall.  Between the graphs and the tests you should be able to make a good judgment call about the seriousness (or lack thereof) of the non-normal nature of your data.
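
    A minimal sketch of the calibration exercise described above, using only the Python standard library. Everything specific here is an assumption made for illustration rather than something stated in the post: the seed, the Blom plotting positions (i - 0.375)/(n + 0.25) for the normal scores, the small-sample adjustment 1 + 0.75/n + 2.25/n^2, and the commonly tabulated 5% critical value of 0.752 for the adjusted Anderson-Darling statistic when mean and standard deviation are estimated from the sample. Shapiro-Wilk is left out because its coefficients are not in the standard library; a stat package would supply it.

```python
import math
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist()
AD_CRITICAL_5PCT = 0.752  # assumed table value, parameters estimated from data


def anderson_darling(sample):
    """Adjusted Anderson-Darling statistic for normality (mean, sd estimated)."""
    x = sorted(sample)
    n = len(x)
    m, s = mean(x), stdev(x)
    cdf = [STD_NORMAL.cdf((v - m) / s) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample adjustment (assumed form, see lead-in).
    return a2 * (1 + 0.75 / n + 2.25 / n**2)


def normal_plot_points(sample):
    """(normal score, observed value) pairs for a normal probability plot."""
    x = sorted(sample)
    n = len(x)
    scores = [STD_NORMAL.inv_cdf((i + 1 - 0.375) / (n + 0.25)) for i in range(n)]
    return list(zip(scores, x))


rng = random.Random(2024)  # arbitrary seed so the run is repeatable
for n in (10, 20, 30, 60, 120):
    sample = [rng.gauss(0, 1) for _ in range(n)]
    a2 = anderson_darling(sample)
    verdict = "reject" if a2 > AD_CRITICAL_5PCT else "do not reject"
    points = normal_plot_points(sample)  # feed these to any plotting tool
    print(f"n={n:3d}  adjusted A2={a2:.3f}  {verdict} normality at 5%")
```

    Plotting the `points` pairs for each sample size and writing the statistic on each plot gives the wall of reference pictures the post has in mind.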
     

    #145542

    Savage
    Participant

    Robert,
    I’ve been meaning to ask you this for a while now.  You have many times recommended plotting data on normal probability paper.  Why?  How is that different from a probability test in a stat program?  I’m still learning so any insight is appreciated.
    Matt

    #145546

    Robert Butler
    Participant

     The short answer – a picture is worth a thousand words. 
      The long answer: Most stat packages I’ve seen have the normal probability plot as an option.  When you plot the data you can see what the test is testing, since the normal probability plot will present your data relative to the diagonal line which represents perfect normality.  Real data and random samples from a random number generator with an underlying normal distribution will never fall exactly on the line. If the points do fall exactly on the diagonal you know the data was cooked (this, by the way, is how they outed Sir Cyril Burt – the world famous abuser of the I.Q. concept) 
    If you run multiple random draws of 10 samples from a normal and multiple 20 samples etc. and plot them you can see just how non-normal these plots actually look (that is how far and where they stray from the diagonal line).  By looking at numerous examples of this type you will calibrate your vision and you will be able to judge the results of a normality test on more than just a pre-chosen level of significance.
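
    One way to put a number on "how far they stray from the diagonal" is the correlation between the sorted data and their normal scores: 1.0 means a perfectly straight plot. The sketch below is an illustration under assumptions of my own (200 draws per sample size, an arbitrary seed, Blom plotting positions for the normal scores); it is similar in spirit to a correlation-based normality check, not a reproduction of any package's test.

```python
import random
from statistics import NormalDist, mean, median


def plot_correlation(sample):
    """Correlation between sorted data and normal scores (1.0 = straight line)."""
    x = sorted(sample)
    n = len(x)
    std_normal = NormalDist()
    scores = [std_normal.inv_cdf((i + 1 - 0.375) / (n + 0.25)) for i in range(n)]
    mx, ms = mean(x), mean(scores)
    sxy = sum((a - mx) * (b - ms) for a, b in zip(x, scores))
    sxx = sum((a - mx) ** 2 for a in x)
    sss = sum((b - ms) ** 2 for b in scores)
    return sxy / (sxx * sss) ** 0.5


def straightness_summary(n, draws=200, seed=1):
    """Min, median and max plot correlation over repeated normal samples of size n."""
    rng = random.Random(seed)
    r = [plot_correlation([rng.gauss(0, 1) for _ in range(n)]) for _ in range(draws)]
    return min(r), median(r), max(r)


for n in (10, 20, 60, 120):
    lo, mid, hi = straightness_summary(n)
    print(f"n={n:3d}  min r={lo:.3f}  median r={mid:.3f}  max r={hi:.3f}")
```

    The spread between the minimum and the median for the small sample sizes shows how crooked a plot of genuinely normal data can look, which is the point of calibrating your eye before trusting a single p-value.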

    #145579

    Darth
    Participant

    And the famous utterance of Roberto Duran when he threw in the towel, “No Maass,” sounds vaguely familiar to the famous isicksigma.com poster.

    #145581

    Brit
    Participant

    Wonderful suggestion.  Thanks for the advice.

    #145628

    Eric Maass
    Participant

    LOL!
    Thanks, Darth!
    Best regards,
    Eric “No” Maass
    P.S. – I wonder whether George Lucas intended to have the term “Darth” mean the prefix “in”:
    Darth Vader -> In-Vader
    Darth Sidious -> In-Sidious
    (…or, perhaps he just was thinking about that famous country singer, Darth Brooks?)

    #145629

    The Force
    Member

    Shapiro-Wilk or Ryan-Joiner can test the normality of a smaller data set, unlike Anderson-Darling, where the data set should be large enough to test for normality accurately.

    #145631

    Darth
    Participant

    Opens up a whole new field of inquiry, doesn’t it?  Love the Darth Brooks idea….I have some great visuals in my mind.  In fact, maybe Dr. Phil would agree to produce and promote a road show and country western concert featuring all the cleverly named Posters.

    #145632

    Eric Maass
    Participant

    The mind literally boggles!
    So, of course I did a quick internet search – and – they already have a t-shirt for Darth Brooks!
    http://www.threadless.com/submission/55045/Darth_Brooks
     

    #146413

    Jonathon Andell
    Participant

    Very well said! I often teach my students that no computer can turn data into information as well as the combination of the human eye and the human brain. I also insist that the data be displayed on a control chart or another time-sequence display. The two images (time dependent and distribution model) provide information that no test statistic can venture near.

    #146418

    Savage
    Participant

    I understand what Robert is saying – fully agree with a picture of your data.
    What I don’t understand is why do it by hand on paper vs. having my software give me the same picture?  Is this more a personal preference or is it about “paying the price” to understand what the data is telling me? 
    Just curious.

    #146420

    Jonathon Andell
    Participant

    I do mine using Minitab. I cannot see a reason to plot them manually. That reminds me, I have to dust my abacus.

    #146432

    Robert Butler
    Participant

      Sorry, “plotting on paper” is just a figure of speech – a throwback to those dark days of the 1980s when we actually had to do it that way. I do all of mine using Statistica and then I actually use an HP Color printer to put them on hardcopy for all to see.  :-)

    #146443

    Barry U
    Participant

    Why do you want to test for normality?

