iSixSigma

Normality : To Be or Not To Be ?


Viewing 42 posts - 1 through 42 (of 42 total)
  • Author
    Posts
  • #53149

    bbusa
    Participant

    Does Control Chart data need to be Normal? Over the last few weeks, we witnessed some shadow boxing on this subject. I am sure the last word is yet to be said on it. Various views & opinions have been presented by both the Defense & the Prosecution! The common practitioner (like me) has been left more confused than ever before. To relieve us of the pain & suffering, can we have the Final Verdict from the jury? A ‘Yes’ OR ‘No’ answer.
    bbusa

    0
    #188448

    Darth
    Participant

    We call as our witnesses Dr. Walter A. Shewhart and Dr. Donald Wheeler, both of whom stated, under oath, that the data does not need to be normal.

    0
    #188449

    MBBinUSA
    Participant

    Mr. Darth,
    Everybody knows the right thing to do is just transform everything.
    Dr. Shewhart didn’t have a nice program to make it easy and correct. What kind of BB training did you have anyway?

    0
    #188450

    Allattar
    Participant

    As Darth says, no, control chart data does not need to be normal.
    Just to expand on a few things though.
    With an Xbar chart, you rely on the central limit theorem.
    With the Range chart or S chart, ranges or standard deviations aren’t expected to be normal. You will notice that not all tests are applied to these charts.
    However, the I-chart is the interesting one.
    Let me elaborate. Suppose we have a set of data that is Weibull distributed with a shape of 1 and a scale of 4. This has a mean of 4 and a standard deviation of 4, so compare it with a normal distribution with the same mean and standard deviation.
    Now +/- 3sd around the mean is -8 to +16.
    A normal distribution would have 0.00135 above +3sd (16).
    The Weibull has 0.0183 above 16.
    The normal has 0.00135 below -3sd, or -8.
    The Weibull sees 0 below -8.
    The mean of 4 for a normal distribution is at the 50th percentile. 4 for the Weibull distribution is at the 63.2 percentile.
    9 points in a row below the line on a normal distribution has a probability of 0.00195, above the same. So 9 points in a row either side of the line has a probability of 0.0039.
    With the Weibull distribution, 0.632^9 is 0.016 for 9 in a row below the line, and 0.368^9 for 9 in a row above is very small, about 1.2*10^-4. So the chance of 9 points in a row for the Weibull here is about 0.016.
    The chance of breaking those two tests, for this Weibull of shape 1, scale of 4, is higher than in a normal distribution.
    Moral of the story… understand why your data looks like it does, and then understand the implications for you :)
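The arithmetic in this post can be checked with a short script. This is only a sketch using the Python standard library; the variable names are illustrative, and it relies on the fact that a Weibull with shape 1 is an exponential distribution whose mean and standard deviation both equal the scale.

```python
import math
from statistics import NormalDist

# Weibull with shape 1 and scale 4 is an exponential distribution:
# mean = 4 and standard deviation = 4.
scale = 4.0
mean, sd = scale, scale
upper, lower = mean + 3 * sd, mean - 3 * sd  # 16 and -8


def weibull_shape1_cdf(x):
    """CDF of a Weibull(shape=1, scale=4); zero for x <= 0."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


# Tail areas outside the 3 sd limits.
normal_tail = 1.0 - NormalDist(mean, sd).cdf(upper)   # same in each tail
weibull_upper_tail = 1.0 - weibull_shape1_cdf(upper)
weibull_lower_tail = weibull_shape1_cdf(lower)

# Chance that 9 consecutive points fall on the same side of the mean.
p_below = weibull_shape1_cdf(mean)                    # percentile of the mean
normal_run9 = 2 * 0.5 ** 9
weibull_run9 = p_below ** 9 + (1.0 - p_below) ** 9

print(f"normal tail beyond 3 sd:   {normal_tail:.5f}")
print(f"Weibull above {upper:g}:        {weibull_upper_tail:.5f}")
print(f"Weibull below {lower:g}:        {weibull_lower_tail:.5f}")
print(f"Weibull percentile of mean: {p_below:.3f}")
print(f"9 in a row, normal:        {normal_run9:.5f}")
print(f"9 in a row, Weibull:       {weibull_run9:.5f}")
```

The run probabilities assume independent points and a center line placed at the true mean.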

    0
    #188451

    Allattar
    Participant

    Well, you shouldn’t just transform the data.
    You should ask, what shape should this data really follow?
    Is the data stable over time?
    Are there any reasons why it is not normally distributed?
    Plenty of mistakes are made just by going, oh it isn’t normal, quick, we must transform it.
    My favourite being that the data is roughly normal, but the measurements appear very discrete. It looks normal but fails a normality test, no transform helps there, and you usually have people running round wondering what to do.
    Till you point out the measurements are all 5.1, 5.2, 5.3, 5.4 etc., and you get told yes, because that’s the resolution of the measurement device.

    0
    #188452

    Allattar
    Participant

    Should

    0
    #188453

    Allattar
    Participant

    grrr… Forums winning.
    Anyway was trying to add, what I missed earlier.
    That non-normal data is only really something to consider for I-charts. See my other post.

    0
    #188456

    Robert Butler
    Participant

    For more details see the post below and the related thread.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=141008

    0
    #188459

    Darth
    Participant

    We could also dredge up the recent thread/battle between Wheeler and Breyfogle. Bottom line:
    1. Charts are robust to non-normality.
    2. Has nothing to do with the Central Limit Theorem.
    3. I/MR with severe departures from normality (e.g., cycle) may need to be transformed with great care.
    4. Read Shewhart and Wheeler’s books.
    I recently reviewed some BB curriculum which stated that the data must be normal and the process in control before using a control chart. ARGGGGGGGGGG.

    0
    #188460

    Allattar
    Participant

    An Xbar chart has everything to do with the central limit theorem.
    You are plotting averages over time, with control limits based on the standard deviation divided by the square root of the number of samples in the subgroup.
    How can it not have anything to do with the CLT? It’s practically a demonstration of it.
    I know it’s being picky, but you cannot dissociate the ideas of the CLT from an Xbar chart.

    0
    #188461

    Mikel
    Member

    Wrong, you don’t know what you are talking about.

    0
    #188462

    Darth
    Participant

    I will try to be kinder than my dear friend Stan. First, please consider how the control limits are calculated. Second is a quote from Dr. Wheeler’s book:
    Wheeler did an experiment whereby he computed the three-sigma coverage for 1,143 different non-normal distributions. He demonstrated that the three-sigma limits achieved very high coverage. There, he says, “These wide regions where three-sigma limits will filter out the bulk of the routine variation are the reason why we do not need to define a “reference distribution” in order to use a process behavior chart. They are also the reason why we do not need to test our data for normality before placing them on a chart. And this blanket of 99% or better coverage is why we do not need to invoke the blessing of the central limit theorem by averaging several values together before placing the result on a chart. Three-sigma limits bracket the bulk of the routine variation by brute force. Therefore, without knowing which probability model might approximate your process, you can still be reasonably sure that any point that falls outside your three-sigma limits is more likely to be due to a dominant assignable cause than it is to be part of the routine variation coming from the lesser causes…. The objective is to take the right action, rather than to find limits that correspond to a particular probability with high precision.”
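To get a feel for the coverage idea in the quote, the fraction of a distribution lying within mean +/- 3 standard deviations can be computed directly for a few textbook shapes. This is a sketch only: the four distributions below are hand-picked for illustration and are not the 1,143 models from Wheeler's experiment, and the function names are made up here.

```python
import math
from statistics import NormalDist


def coverage_normal():
    """Standard normal: area between -3 and +3."""
    z = NormalDist()
    return z.cdf(3) - z.cdf(-3)


def coverage_uniform():
    """Uniform(0, 1): mean 0.5, sd 1/sqrt(12); the limits lie outside [0, 1]."""
    mean, sd = 0.5, 1 / math.sqrt(12)
    low, high = max(0.0, mean - 3 * sd), min(1.0, mean + 3 * sd)
    return high - low


def coverage_exponential():
    """Exponential with mean 1 and sd 1: limits are -2 and 4, so only the upper tail escapes."""
    return 1 - math.exp(-4)


def coverage_laplace():
    """Laplace with scale 1: sd is sqrt(2), so P(|X| < 3*sqrt(2)) = 1 - exp(-3*sqrt(2))."""
    return 1 - math.exp(-3 * math.sqrt(2))


for name, fn in [("normal", coverage_normal), ("uniform", coverage_uniform),
                 ("exponential", coverage_exponential), ("Laplace", coverage_laplace)]:
    print(f"{name:12s} {fn():.4f}")
```

For these examples the coverage runs from roughly 98% for the strongly skewed exponential up to 100% for the uniform, which is in the spirit of the "bulk of the routine variation" argument even where a particular model lands a little under 99%.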

    0
    #188473

    MBBinWI
    Participant

    or, to simplify: Set up your control charts as if the data were normally distributed, with control limits at +/- 3 std dev’s. This will filter out the mundane COMMON cause variation for most all actual distributions and indicate when the SPECIAL causes are affecting the output.
    Therefore, the answer to the original question is NO, but set up your control charts as if they were normally distributed (yes).

    0
    #188500

    Allattar
    Participant

    Sorry but I know better than you.
    Of course you can demonstrate most results will fit within +/- 3 standard deviations, and of course you can do that for means, by the very nature of them being means.
    It’s a little absurd to say, I took these means and demonstrated that they all fall within here, so it means we don’t need to invoke the CLT.
    The statement you quoted, Darth, only implies we don’t need to invoke the CLT as a justification. It doesn’t mean it isn’t still a present effect on your results, which you could demonstrate by plotting the probability plot of your means.
    It’s almost like saying you threw a ball at the ground, therefore we don’t need gravity.
    I was agreeing, but also stating that the implications of a very non-normal distribution are often overlooked on individual data. The essence of the post was to say, understand your data before panicking.
    You lot do make me laugh.

    0
    #188501

    Mikel
    Member

    You know better? Wrong.

    0
    #188502

    Darth
    Participant

    Well, you certainly have established your credibility and credentials enough for me to totally discard the work of Shewhart and Wheeler. Or you have proven yourself to be a complete boor and moron. After a scientifically conducted poll we have established the latter. And yes, Stan was consulted in the design of the poll questions, so it has validity beyond which anyone can refute or dispute.

    0
    #188503

    Darth
    Participant

    This will reinforce the Boor’s contention:
    http://manufacture-engineering.suite101.com/article.cfm/central_limit_theorem
    This from Wheeler himself:
    Myth Two: Control charts work because of the central limit theorem.
    The central limit theorem applies to subgroup averages (e.g., as the subgroup size increases, the histogram of the subgroup averages will, in the limit, become more “normal,” regardless of how the individual measurements are distributed). Because many statistical techniques utilize the central limit theorem, it’s only natural to assume that it’s the basis of the control chart. However, this isn’t the case. The central limit theorem describes the behavior of subgroup averages, but it doesn’t describe the behavior of the measures of dispersion. Moreover, there isn’t a need for the finesse of the central limit theorem when working with Shewhart’s charts, where three-sigma limits filter out 99 percent to 100 percent of the probable noise, leaving only the potential signals outside the limits. Because of the conservative nature of the three-sigma limits, the central limit theorem is irrelevant to Shewhart’s charts.
    Undoubtedly, this myth has been one of the greatest barriers to the effective use of control charts with management and process-industry data. When data are obtained one-value-per-time-period, it’s logical to use subgroups with a size of one. However, if you believe this myth to be true, you’ll feel compelled to average something to make use of the central limit theorem. But the rationality of the data analysis will be sacrificed to superstition.
    From the Deming Network:
    http://deming-network.org/archive/98.02/msg00028.html

    0
    #188506

    MBBinWI
    Participant

    don’t forget our friend Robert Butler.

    0
    #188507

    Darth
    Participant

    Dr. Butler’s response led us to a previous thread regarding the need for normality of the data. Would like to hear what our esteemed colleague has to say about the relevance of the Central Limit Theorem and the claim that it is the foundation of the Shewhart chart.

    0
    #188508

    Robert Butler
    Participant

    I guess I don’t see where you and I disagree, Darth. True, there were individuals on that thread who were insisting on normality and CLT, but the specific post I referenced is almost identical in content to the posts you have been making to this one. The only reason for mentioning the entire thread was because I thought some of the posters to this one might want to see that the arguments for and against had already been made.

    0
    #188509

    Darth
    Participant

    No, we did not disagree but MBBinWI brought up your name and, as you know, we all value your input. Hope you have the trailer packed and will be heading south in a few days.

    0
    #188510

    Allattar
    Participant

    I’m doing badly at making my point.
    So I ran a simulation, 100,000 individual results, or 100,000 subgroups. Then counted the tests broken, finally displayed for you here as a percentage.
    Data as,
    Normal distribution, mean 0, sd 1, individual results
    Normal distribution, mean 0, sd 1, subgroup size 5
    Weibull distribution, shape 1, scale 4, individual results
    Weibull distribution, shape 1, scale 4, subgroup size 5
    Weibull distribution, shape 1, scale 4, subgroup size 10.
    Test 2 is 9 points in a row, test 3 is 6 points, just for reference
    (Not hopeful this will format correctly)
    Test   I       Xbar    W I     W Xbar  Weib sub 10
    1      0.28%   0.27%   2.58%   0.95%   0.67%
    2      0.37%   0.42%   1.62%   0.56%   0.47%
    3      0.04%   0.05%   0.03%   0.03%   0.04%
    4      0.29%   0.29%   0.30%   0.25%   0.35%
    5      0.21%   0.18%   0.77%   0.31%   0.29%
    6      0.50%   0.52%   0.25%   0.36%   0.43%
    7      0.29%   0.35%   1.04%   0.47%   0.37%
    8      0.01%   0.01%   0.00%   0.01%   0.01%
    Compare the individuals chart for normal and Weibull here. There is a big difference in the percentages that break tests 1, 2, 5 and 7.
    As you subgroup the data with more data points, the differences between the tests drop. That is the effect of the central limit theorem on averages.
    That’s my point here: it is evident in the data you collect. Now clearly with only a subgroup size of 5 or 10 it won’t make the distribution of averages normal, but it’s pushing it towards it.
    There is a very real difference, though, between tests being broken on individual data between normal and this very skewed Weibull distribution. However, the point being made is that for practical purposes, if it breaks a test, there’s a good chance it’s a special cause.
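A rough re-creation of this kind of simulation for tests 1 and 2 is sketched below. It is not the poster's actual procedure: the limits here use the known theoretical mean and standard deviation rather than estimates from the data, the sample count `N` and the seed are arbitrary choices, and only the two simplest tests are counted, so the percentages should be similar in pattern rather than identical.

```python
import math
import random


def rate_test1(points, center, sigma):
    """Fraction of points more than 3 sigma from the center line (test 1)."""
    return sum(abs(x - center) > 3 * sigma for x in points) / len(points)


def rate_test2(points, center, run=9):
    """Fraction of points that complete `run` or more in a row on one side (test 2)."""
    hits, streak, side = 0, 0, 0
    for x in points:
        s = 1 if x > center else -1
        streak = streak + 1 if s == side else 1
        side = s
        if streak >= run:
            hits += 1
    return hits / len(points)


def subgroup_means(draw, n_groups, size):
    """Average `size` draws per subgroup; size 1 gives individual values."""
    return [sum(draw() for _ in range(size)) / size for _ in range(n_groups)]


rng = random.Random(12345)   # arbitrary seed, for repeatability only
N = 50_000                   # points (or subgroups) per case
SCALE = 4.0                  # Weibull shape 1, scale 4 == exponential, mean 4, sd 4

# (label, sampler, subgroup size, true mean, true sd of an individual value)
cases = [
    ("normal I", lambda: rng.gauss(0.0, 1.0), 1, 0.0, 1.0),
    ("normal Xbar n=5", lambda: rng.gauss(0.0, 1.0), 5, 0.0, 1.0),
    ("Weibull I", lambda: rng.expovariate(1.0 / SCALE), 1, SCALE, SCALE),
    ("Weibull Xbar n=5", lambda: rng.expovariate(1.0 / SCALE), 5, SCALE, SCALE),
    ("Weibull Xbar n=10", lambda: rng.expovariate(1.0 / SCALE), 10, SCALE, SCALE),
]

results = {}
for name, draw, size, center, sigma in cases:
    points = subgroup_means(draw, N, size)
    sigma_of_mean = sigma / math.sqrt(size)   # standard error of a subgroup mean
    results[name] = (rate_test1(points, center, sigma_of_mean),
                     rate_test2(points, center))

for name, (t1, t2) in results.items():
    print(f"{name:18s} test 1: {t1:.2%}   test 2: {t2:.2%}")
```

The expected pattern is the one described above: the skewed individuals data trips both tests more often than normal data, and averaging into subgroups pulls the rates back toward the normal figures.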

    0
    #188511

    Allattar
    Participant

    Oh I do!

    0
    #188512

    Mikel
    Member

    Probabilities going all the way out to 2.5%. Oh My God! How can we live life with having a signal that shows up 1 in 40! And as we all know, I’m not smart enough to look at something like a histogram to make sense of what I am seeing! The sky is falling, the sky is falling!

    0
    #188513

    Allattar
    Participant

    Interesting, I’m agreeing with you and you’re flaming?
    Well, everyone needs a hobby I guess.

    0
    #188514

    Mikel
    Member

    Honey,
    You must not recognize sarcasm when you see it. I’m not agreeing with you. Your argument is much ado about nothing. Those that overemphasize normality and waste time on transforms miss the opportunity to understand and improve.

    0
    #188521

    MBBinWI
    Participant

    I bet you’d look at a data set of 10,000 items each of A and B showing a statistically valid difference of 0.01 on a mean of 100 and shout the p-value from the rooftops.
    Why don’t you try to understand what you have been told – by some of the most experienced practitioners in the industry – and learn something. This lesson is probably the least expensive you’ll ever get.

    0
    #188551

    Jonathon Andell
    Participant

    It’s always entertaining reading this forum, especially when the technical debate derails and name-calling kicks in.
    For what it’s worth, I have found Darth and Robert Butler to have an excellent grasp of what works and what doesn’t.
    I admit that my statistical credentials fall below some on this list, but I also have been in the game for 20+ years. Here’s how I would approach a data set:
    1. Plot the raw data on a control chart.
    2. Use the best computer we ever will possess – the combination of our eyes and our brains – to determine whether blatant special cause exists.
    2a. If special cause is detected, the first course of action would be to stabilize the process. Debating about transforming unstable process data ranks high among the most profoundly wasteful discussions this forum has entertained.
    3. Once the data are stable, you have the option of using a histogram, probability plot, Anderson-Darling, or other ways to decide whether the data appear to follow normality.
    4. If the data appear non-normal, use something akin to Minitab’s distribution ID utility to get an idea of what distribution might be a good model for the data.
    5. For extra credit, read James King’s book “Probability Charts for Decision Making.” It discusses kinds of natural phenomena that can give rise to various distributions – which in turn can lead to insights about the process. I hope I don’t need to remind folks that these insights are what we are after.
    Side comment: I prefer distribution ID over Box-Cox for a few reasons:
    – More distributions from which to select
    – The aforementioned potential insight into the process
    – Once we select a distribution model, Minitab has a sweet utility called “Capability non-normal.” It uses the chosen distribution model to estimate the probabilities and computes an “equivalent” Z value. Best of all, it makes a chart with the real data in real units of measure – no need to back-transform anything – and displays the non-normal distribution curve. I find that students and managers both appreciate this display.
    Once you’ve done all that, you still have the option of calling people morons.
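Steps 1 and 2 of the approach above (plot the raw data, then look for blatant special cause) can be sketched for an individuals chart as follows. This is an illustrative sketch, not Minitab's implementation: the function names and the sample data are hypothetical, and the limits use the standard moving-range formula, center +/- 2.66 times the average moving range (2.66 = 3 / 1.128).

```python
def individuals_chart_limits(values):
    """Return (LCL, center, UCL) for an individuals chart from the average moving range."""
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    spread = 2.66 * mr_bar  # 3 / d2, with d2 = 1.128 for ranges of two points
    return center - spread, center, center + spread


def points_outside(values):
    """Indexes of points beyond the limits: candidates for special-cause investigation."""
    lcl, _, ucl = individuals_chart_limits(values)
    return [i for i, x in enumerate(values) if x < lcl or x > ucl]


# Hypothetical measurements with one suspicious jump.
data = [5.1, 5.3, 5.2, 5.4, 5.2, 5.3, 5.1, 7.9, 5.2, 5.3]
lcl, center, ucl = individuals_chart_limits(data)
print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
print("points to investigate:", points_outside(data))
```

Only the first detection rule is applied here; any flagged point is a prompt to investigate and stabilize the process before worrying about the shape of the distribution.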

    0
    #188552

    Darth
    Participant

    Jonathon,
    Always good to hear from you. Thanks for adding to the thread.

    0
    #188553

    Jonathon Andell
    Participant

    It just isn’t a complete thread without people calling each other morons. Reminds me of some of the playground arguments from my first childhood.
    The Breyfogle-Wheeler debate was a lot of fun to watch. An odd mix of contradictory dogmas, selective simulations, and occasional bursts of violent agreement. Reminds me of a decades-old issue of “Quality Engineering” with a debate over Taguchi methods.

    0
    #188554

    Darth
    Participant

    Taguchi???? Anyone using his approach has to be a moron….. OK, feel at home now? You coming to SoBe for the conference?

    0
    #188555

    Jonathon Andell
    Participant

    Travel on my own nickel isn’t a very good option these days.
    I do like a lot of what Taguchi advocates prior to selecting a matrix, but I rarely use any of his matrices. Did you ever read Schmidt & Launsby’s book on DOE? Interesting outlook.
    Can I be a moron, too?

    0
    #188558

    GB
    Participant

    Shameless, Darth…driving up your post count like that! It’s only an MVP award!
    ;-P

    0
    #188560

    Darth
    Participant

    And your point is…??????? Looking forward to a fun week. I decided to take one of the Master Workshops on Monday and do a site tour on Thursday. Might as well take advantage of the conference plus it gives me time to sign autographs for my adoring fans :-).

    0
    #188562

    MBBinWI
    Participant

    Both of them? Shouldn’t take long.
    (I just couldn’t resist)

    0
    #188563

    Darth
    Participant

    Yes, you and HeeBee. Stan is jealous of my fan base so he won’t even talk to me. Carnell wanted to write a long winded autograph so he is out.

    0
    #188564

    GB
    Participant

    Fingers crossed that my injury stays dormant… I want to do the boat tour, but my flight conflicts :-(

    0
    #188566

    hbgb2
    Participant

    Now that was funny! (Yours too, MBBGLENNFIDDITCHDREKFANINWI.) Actually, my vote for post count is for Stevo.

    0
    #188568

    Mikel
    Member

    Darth?
    Hey, just because you live in the desert doesn’t mean you should eat the peyote buttons.

    0
    #188569

    Mikel
    Member

    Your whole fan base is at 13th St on SoBe, if you catch my drift.

    0
    #188570

    Jonathon Andell
    Participant

    Doesn’t mean I shouldn’t…

    0
    #188637

    Forrest W. Breyfogle III
    Member

    If you are not willing to keep an open mind to a paradigm shift, please read no further.
    I will be providing links to articles that demonstrate how some of the rules that we have been told, by either individuals or within classes, have issues.
    A transformation that MAKES GOOD PHYSICAL SENSE can be very important, especially when we are trying to describe whether a process is capable of producing a desired response or not, and are also concerned about overreacting to common-cause variability as though it were special cause.
    The reason for this statement is described in detail, with simulated and real data, in the following Quality Digest articles (again, if you are not willing to take the time to really read the articles with an open mind, do not waste your time opening the links):
    – Non-normal data: To Transform or Not to Transform http://www.qualitydigest.com/inside/quality-insider-column/individuals-control-chart-and-data-normality.html
    – NOT Transforming the Data Can Be Fatal to Your Analysis: A case study, with real data, describes the need for data transformation. http://www.qualitydigest.com/inside/six-sigma-column/not-transforming-data-can-be-fatal-your-analysis.html
    – Predictive Performance Measurements: Going Beyond Red-Yellow-Green Scorecards http://www.qualitydigest.com/inside/quality-insider-column/predictive-performance-measurements.html
    – Are Your Business Metrics Measuring the Right Thing? Don’t base your metrics on your organizational chart http://www.qualitydigest.com/inside/quality-insider-article/are-your-business-metrics-looking-right-thing.html
    The point was made that an x-bar and R chart is a means to get around this issue because of the central limit theorem. This may be true if, for the specific situation, an x-bar and R chart could be used instead of an individuals chart. However, x-bar and R charts have some fundamental issues too, as described in the published article available through the link http://www.smartersolutions.com/pdfs/online_database/asset.php?documentid=16
    Whenever someone suggests, as I have done above, that there are issues with past methodologies, there will be initial resistance. The implications of these points are far greater than just whether to consider transformations or not; e.g., this has a business system measurement potential that resolves the shortcomings of red-yellow-green scorecards. If anyone wants to discuss these important issues one-on-one, let me know, [email protected].

    0

The forum ‘General’ is closed to new topics and replies.