iSixSigma

Dealing with Non-normal Data


Viewing 18 posts - 1 through 18 (of 18 total)
  • #55680

    balasubramanian
    Guest

    Hi,
    I have collected data from one of my company’s processes. The normality test gives a p-value of 0.002, so the data is non-normal. How should I deal with this? I tried a Box-Cox transformation, but the transformed data was also not normal. When I ran the data through the StatAssist software, it indicated that the data approximately follows a Burr distribution.

    The data is the hardness of a powder metallurgy part, which has excessive within-part variation because of pores in the part; this is natural. Can anyone please suggest how to handle this?

    Regards,
    Bala
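For readers who want to reproduce the two steps described in this post (a normality test, then a Box-Cox transformation), here is a minimal sketch. It assumes NumPy and SciPy are available and uses the Shapiro-Wilk test, which may not be the test the poster ran. The synthetic sample and the function name `check_normality_with_boxcox` are illustrative assumptions, not the poster's data or tooling.

```python
import numpy as np
from scipy import stats


def check_normality_with_boxcox(values, alpha=0.05):
    """Test normality before and after a Box-Cox transformation."""
    x = np.asarray(values, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    _, p_raw = stats.shapiro(x)
    # With no lambda supplied, boxcox picks it by maximum likelihood.
    transformed, lam = stats.boxcox(x)
    _, p_boxcox = stats.shapiro(transformed)
    return {
        "p_raw": float(p_raw),
        "lambda": float(lam),
        "p_boxcox": float(p_boxcox),
        "normal_raw": bool(p_raw > alpha),
        "normal_boxcox": bool(p_boxcox > alpha),
    }


# Synthetic right-skewed sample (an assumption, not real hardness data).
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=6.4, sigma=0.5, size=500)
result = check_normality_with_boxcox(sample)
print(result)
```

A transformation that still fails the test, as in the post, usually means the shape problem is not simple skewness (for example, two modes or a bounded process), which is where the replies below pick up.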

    #201141

    Robert Butler
    Participant

    You will have to tell us what it is that you are trying to do with your data before anyone can offer much in the way of advice.

    Control charting? – non-normality is not an issue.

    Process capability? – there are methods (Chapter 8 in Measuring Process Capability, Bothe) to handle this situation.

    t-test or ANOVA? – no problem – the t-test is robust to non-normal data, as is ANOVA.

    Regression? – no issues here – approximate normality is an issue with residuals only.

    Concern that the process is out of control? – Maybe, maybe not – hardness data will be non-normal even when everything is in control because it is from a process that is bounded. Besides, there is no connection between whether data is normally distributed and whether the process is in control.
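One commonly used way to compute capability without assuming normality is the percentile method, sketched below. This is not necessarily the method in the chapter cited above. The indices replace the usual 6-sigma spread with the distance between the 0.135th and 99.865th percentiles and use the median as the center. The spec limits, the function name `percentile_capability`, and the demo data are all assumptions. With only about 100 readings the empirical tail percentiles are unreliable, so in practice they would normally come from a fitted distribution.

```python
import numpy as np


def percentile_capability(values, lsl, usl):
    """Percentile-based Pp and Ppk (no normality assumption)."""
    x = np.asarray(values, dtype=float)
    # Percentiles that bracket 99.73% of the data, plus the median.
    p_low, median, p_high = np.percentile(x, [0.135, 50.0, 99.865])
    pp = (usl - lsl) / (p_high - p_low)
    ppu = (usl - median) / (p_high - median)
    ppl = (median - lsl) / (median - p_low)
    return {"Pp": float(pp), "Ppk": float(min(ppu, ppl))}


# Sanity check on synthetic standard-normal data with limits at +/-3:
# both indices should come out close to 1.
rng = np.random.default_rng(1)
demo = rng.normal(loc=0.0, scale=1.0, size=100_000)
indices = percentile_capability(demo, lsl=-3.0, usl=3.0)
print(indices)
```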

    #201142

    MBBinWI
    Participant

    Can you post a picture of the histogram for the data?

    #201143

    b1a5l9a2
    Participant

    Hi ,

    I have attached the histogram. I am trying to do a process capability study on this process.
    Could you please advise me on the methodology for this?

    #201144

    Robert Butler
    Participant

    What kind of a sample size are we looking at? If that is a small sample then the fact that there is a gap between what looks to be about 625-650 probably isn’t much to worry about and the data looks to be normal enough to just press on with a capability measure calculation.

    On the other hand, if that histogram represents hundreds of samples then there are some interesting possibilities. You could have a bi-modal process with some curious process behavior around 625.
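The sample-size point can be illustrated with a small simulation: draw many samples from a truly normal process and count how often the histogram shows at least one empty bin. This is a rough sketch assuming NumPy. The bin count, trial count, and function name `gap_frequency` are arbitrary choices for illustration.

```python
import numpy as np


def gap_frequency(sample_size, n_bins=10, n_trials=2000, seed=0):
    """Fraction of normal samples whose histogram has an empty bin."""
    rng = np.random.default_rng(seed)
    with_gap = 0
    for _ in range(n_trials):
        sample = rng.normal(size=sample_size)
        # Bins span min..max, so the end bins are never empty;
        # any empty bin is a gap inside the histogram.
        counts, _ = np.histogram(sample, bins=n_bins)
        if np.any(counts == 0):
            with_gap += 1
    return with_gap / n_trials


small = gap_frequency(30)
large = gap_frequency(500)
print(f"n=30: {small:.2f}   n=500: {large:.2f}")
```

Gaps should turn up far more often in the small samples, which is why a gap in a large sample deserves a closer look.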

    #201145

    Chris Seider
    Participant

    Nice gathering of data…the first step to success. :)

    #201146

    MBBinWI
    Participant

    Concur with @rbutler.

    #201147

    MBBinWI
    Participant

    Also, if you are using Minitab, when you do the normality evaluation, do you get a plot like that below? If so, can you post that as well?

    #201148

    b1a5l9a2
    Participant

    Hi ,

    Please find the attached normal probability plot. I have taken almost 100 samples:
    one sample randomly selected in each shift on each day for one month.

    #201149

    MBBinWI
    Participant

    @b1a5l9a2 – OK. We’re getting closer. Can you observe, or ask the workers, if there is adjustment going on to bring the value back to nominal? It looks to me that the lower side is happening randomly, but when the values get to the upper side, an adjustment is made to get the value back to target.

    If this is the case, then a fundamental premise of evaluating normality is being violated: that the data is free of outside adjustment.

    Looking at your probability plot, we use something called the “fat pencil test.” Back when these graphs were created by hand, one would take the pencil used and lay it over the data. If the pencil covered the data points, you could be fairly confident of normality. Now, with statistical tests able to calculate probabilities, we tend to rely on them. However, the statistics are susceptible to individual points which can influence the statistics that visual examination would call “close enough.”

    As @rbutler states, the question as to normality depends on the use of the data. Many statistical tests are robust to non-normality, particularly when the data is similar to what you have presented.

    If I were mentoring you as one of my belts, I would have you check on the adjustment. If that’s happening, then I would go on and accept normality based on the histogram and prob plot. If not, then I would check on the sensitivity of the stat test that I’m looking to apply and see if it is robust to non-normality, and if so, then proceed. If it is sensitive to normality, then I would take some more data to ensure I have a full and complete picture. Even at 100 data points, you may have only captured one side of the distribution and over more time/data it may fill out.

    Hope this helps.

    #201158

    Chris Seider
    Participant

    Don’t forget: are the data points giving enough precision? Use an MSA to check.

    However, use your process map and gather data on X’s in the process and see if there’s an explanation/confirmation of the second grouping of data on the right.
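Before a full MSA, a quick screen for the precision question is to compare the recording increment with the spread of the data. The sketch below applies a common rule of thumb (the increment should be no more than about one tenth of the 6-sigma spread). It uses the smallest gap between distinct readings as a proxy for the increment, which is an assumption, and it does not replace a gage study. The readings are made up.

```python
from statistics import stdev


def resolution_check(values):
    """Compare the apparent recording increment with the process spread."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct readings")
    # Smallest gap between distinct readings, used as a proxy
    # for the measurement increment.
    increment = min(b - a for a, b in zip(distinct, distinct[1:]))
    limit = 6 * stdev(values) / 10  # one tenth of the 6-sigma spread
    return {
        "distinct_values": len(distinct),
        "increment": increment,
        "limit": limit,
        "adequate": increment <= limit,
    }


# Hypothetical readings: one set rounded to the nearest 5, one finer.
coarse = [600, 605, 610, 605, 600, 610, 605, 615, 605, 610]
fine = [600.1, 604.7, 609.8, 605.2, 600.4, 610.3, 605.1, 614.9, 604.8, 610.2]
coarse_result = resolution_check(coarse)
fine_result = resolution_check(fine)
print(coarse_result)
print(fine_result)
```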

    #201168

    Nik
    Participant

    When looking at non-normal data, rule out a few things before trying to calculate the process capability (and these are good general guidelines which can be easily forgotten if the data just so happens to be normally distributed):

    1. Are you looking at different levels of an X being captured? Run a dot plot, an SPC chart, and a few other graphs, and research the process. With 100 samples, and the data looking bi-modal, you’ll want to rule this out first. I almost got “burned” by this once. I just wanted the probability of exceeding the specification limit and was so eager for “just the answer” that I initially missed that there were two separate behaviors going on in my data: normal conditions and when there were special events going on at the company.

    2. Perhaps the process is unstable? Fitting a distribution assumes a stable process. An SPC chart can help you determine whether you are seeing random noise or potentially special-cause variation (remember, the data needs to be in time sequence; if not, you can only use Test 1, a point beyond the control limits).

    3. Data is not truly continuous. Histograms are a tough tool to use to detect this specific measurement issue. Run a dot plot. If the data stacks in nice bins, then this may be some form of attribute data.

    4. Perhaps the data is just naturally non-normally distributed? But I would check on the other 3 conditions first to be safe. If you still feel it is naturally non-normal data, there is a wide variety of distributions besides the normal, as well as transformations, that may be suitable models.
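The stability check in point 2 can be sketched as an individuals chart: limits at the mean plus or minus 2.66 times the average moving range, with points outside the limits flagged. This is a minimal standard-library sketch. The readings are made up, and only the "point beyond the limits" rule is applied.

```python
from statistics import mean


def individuals_chart(values):
    """Individuals-chart limits and indices of points outside them."""
    center = mean(values)
    # Moving range: absolute difference between consecutive readings,
    # so the data must be in time order.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of size 2.
    ucl = center + 2.66 * mr_bar
    lcl = center - 2.66 * mr_bar
    flagged = [i for i, v in enumerate(values) if v > ucl or v < lcl]
    return center, lcl, ucl, flagged


# Hypothetical time-ordered readings with one obvious excursion.
readings = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 25, 11, 10]
center, lcl, ucl, flagged = individuals_chart(readings)
print(f"center={center:.2f} LCL={lcl:.2f} UCL={ucl:.2f} flagged={flagged}")
```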

    As a last resort, I’ve seen some people convert the data to pass/fail data and treat it as a binomial process capability. This is not my favorite and there are arguments for and against this approach. But, I’d rule the other conditions out and truly understand why the data was non-normally distributed before considering this strategy.
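The pass/fail fallback can be sketched as follows: count readings outside the spec limits, convert the failure proportion to a Z value with the inverse normal CDF, and divide by 3 for a Ppk-like figure. The Z/3 conversion is a common convention rather than something stated in this thread, and the limits, data, and function name are assumptions. With zero observed failures the Z value is undefined, so the sketch returns `None` in that case.

```python
from statistics import NormalDist


def pass_fail_capability(values, lsl, usl):
    """Binomial-style capability from the share of out-of-spec readings."""
    n = len(values)
    failures = sum(1 for v in values if v < lsl or v > usl)
    p_fail = failures / n
    if failures == 0 or failures == n:
        # inv_cdf is undefined at 0 and 1; more data would be needed.
        return {"p_fail": p_fail, "z": None, "ppk_equivalent": None}
    z = NormalDist().inv_cdf(1.0 - p_fail)
    return {"p_fail": p_fail, "z": z, "ppk_equivalent": z / 3.0}


# Hypothetical data: 25 of 1000 readings above the upper limit.
demo = [5.0] * 975 + [11.0] * 25
summary = pass_fail_capability(demo, lsl=0.0, usl=10.0)
print(summary)
```

Note how much information is lost: 100 readings with no failures give no capability number at all, which is one of the arguments against this approach.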

    #201362

    Jim Frost
    Guest

    If you’re using Minitab, I’d recommend trying the following tool.
    Stat > Quality Tools > Individual distribution identification

    This tool will check your data against 14 distributions and 2 transformations. It might provide some insights.
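Outside Minitab, the same idea can be approximated with SciPy: fit several candidate distributions and rank them by the Anderson-Darling statistic (lower is better). This is a sketch under assumptions, not a reimplementation of the Minitab tool. The candidate list is arbitrary, the positive-only distributions are fitted with their location fixed at 0, and raw statistics are compared without the adjustments or p-values a statistics package would report.

```python
import numpy as np
from scipy import stats


def anderson_darling(sorted_x, cdf):
    """Anderson-Darling statistic of sorted data against a fitted CDF."""
    n = len(sorted_x)
    f = np.clip(cdf(sorted_x), 1e-12, 1 - 1e-12)  # avoid log(0)
    i = np.arange(1, n + 1)
    # A^2 = -n - (1/n) * sum((2i - 1) * [ln F(x_i) + ln(1 - F(x_(n+1-i)))])
    return -n - np.mean((2 * i - 1) * (np.log(f) + np.log(1 - f[::-1])))


def rank_distributions(values):
    """Fit candidate distributions and sort them by AD statistic."""
    x = np.sort(np.asarray(values, dtype=float))
    candidates = {
        "norm": (stats.norm, {}),
        # The next three assume strictly positive data (location fixed at 0).
        "lognorm": (stats.lognorm, {"floc": 0}),
        "weibull_min": (stats.weibull_min, {"floc": 0}),
        "gamma": (stats.gamma, {"floc": 0}),
    }
    scores = {}
    for name, (dist, fit_kwargs) in candidates.items():
        params = dist.fit(x, **fit_kwargs)
        scores[name] = float(anderson_darling(x, dist(*params).cdf))
    return dict(sorted(scores.items(), key=lambda kv: kv[1]))


# Synthetic skewed data (an assumption, not the hardness readings).
rng = np.random.default_rng(2)
demo = rng.lognormal(mean=6.4, sigma=0.8, size=300)
ranked = rank_distributions(demo)
for name, score in ranked.items():
    print(f"{name:12s} AD = {score:.3f}")
```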

    #201390

    Chris Seider
    Participant

    It’s a great tool! I still remember when it showed up and I was like… how did people live without this cool tool?

    #201411

    MBBinWI
    Participant

    You can also use a distribution fitting tool in Crystal Ball.

    #201514

    But sometimes this tool is not sufficient and no distribution or transformation has a p-value > alpha.
    In this case you need to discretize the data and do the calculation by hand, but luckily you can find Excel tables for this on the internet.
    And if more than one distribution or transformation has a p-value > alpha, use the one with the lowest AD (Anderson-Darling statistic).

    #201515

    But this tool is used only for capability analysis of non-normal data.

    #202305

    Dave Franco
    Guest

    Please check the plot below and let us know what you’re trying to achieve and what data you have.

    Normal Plot
