SATURDAY, NOVEMBER 25, 2017

Dealing with Non-normal Data


This topic contains 16 replies, has 6 voices, and was last updated by  Fos 5 months, 4 weeks ago.

Viewing 17 posts - 1 through 17 (of 17 total)
  • #571510 Reply

    balasubramanian
    Reputation - 27
    Rank - Aluminum

    Hi,
    I have collected data from one of my company's processes. The normality test gives a p-value of 0.002, so the data is non-normal. How should I deal with this? I tried a Box-Cox transformation, but the transformed data was also not normal. When I ran the data through the software StatAssist, it indicated that the data approximately follows a Burr distribution.

    The data is the hardness of a powder metallurgy part, where there is excessive within-part variation because of pores in the part, which is natural. Can anyone please suggest how to handle this?

    Regards,
    Bala

    #571576 Reply

    You will have to tell us what it is that you are trying to do with your data before anyone can offer much in the way of advice.

    Control charting? – non-normality is not an issue.

    Process capability? – there are methods (Chapter 8 in Measuring Process Capability- Bothe)
    to handle this situation.

    t-test or ANOVA? – no problem – the t-test is robust to non-normal data, as is ANOVA.

    Regression? – no issues here – approximate normality is an issue with residuals only.

    Concern that the process is out of control? – Maybe, maybe not – hardness data will be non-normal even when everything is in control because it is from a process that is bounded. Besides, there is no connection between whether data is normally distributed and whether or not the process is in control.
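For the process capability case mentioned above, one commonly used approach for non-normal data is the percentile method: replace the usual 6-sigma spread with the distance between the 0.135th and 99.865th percentiles of a fitted distribution. Whether this matches the treatment in the book cited above is an assumption. The sketch below uses hypothetical spec limits and hypothetical fitted models.

```python
from scipy import stats

def percentile_capability(dist, lsl, usl):
    """Pp and Ppk from the 0.135th, 50th and 99.865th percentiles of a fitted model."""
    p_low, median, p_high = dist.ppf([0.00135, 0.5, 0.99865])
    pp = (usl - lsl) / (p_high - p_low)
    ppk = min((usl - median) / (p_high - median),
              (median - lsl) / (median - p_low))
    return float(pp), float(ppk)

# Sanity check: for a normal model the percentile spread is about 6 sigma,
# so the result matches the ordinary formula (here 24 / 12 = 2).
normal_model = stats.norm(loc=100.0, scale=2.0)
pp, ppk = percentile_capability(normal_model, lsl=88.0, usl=112.0)

# Skewed example: a lognormal model with hypothetical limits.
skewed_model = stats.lognorm(s=0.4, scale=100.0)
pp_skew, ppk_skew = percentile_capability(skewed_model, lsl=40.0, usl=300.0)
print(pp, ppk, pp_skew, ppk_skew)
```

The result is only as good as the fitted model in the tails, so the choice of distribution matters more here than it does for a t-test or a control chart.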

    #571584 Reply

    Can you post a picture of the histogram for the data?

    #571685 Reply

    Hi ,

    I have attached the histogram. I am trying to carry out a process capability study on this process.
    Can you please advise me on the methodology for this?

    Attachments:
    #571701 Reply

    What kind of a sample size are we looking at? If that is a small sample then the fact that there is a gap between what looks to be about 625-650 probably isn’t much to worry about and the data looks to be normal enough to just press on with a capability measure calculation.

    On the other hand, if that histogram represents hundreds of samples then there are some interesting possibilities. You could have a bi-modal process with some curious process behavior around 625.

    #571793 Reply

    Nice gathering of data…the first step to success. :)

    #571962 Reply

    concur with @rbutler.

    #571963 Reply

    Also, if you are using Minitab, when you do the normality evaluation, do you get a plot like that below? If so, can you post that as well?

    Attachments:
    #571980 Reply

    Hi ,

    Please find the attached normal probability plot. I have taken almost 100 samples:
    one sample selected at random in each shift on each day for one month.

    Attachments:
    #572070 Reply

    @b1a5l9a2 – OK. We’re getting closer. Can you observe, or ask the workers, if there is adjustment going on to bring the value back to nominal? It looks to me that the lower side is happening randomly, but when the values get to the upper side, an adjustment is made to get the value back to target.

    If this is the case, then a fundamental premise of evaluating normality is being violated: that there is no outside adjustment of the data.

    Looking at your probability plot, we use something called the “fat pencil test.” Back when these graphs were created by hand, one would take the pencil used and lay it over the data. If the pencil covered the data points, you could be fairly confident of normality. Now, with statistical tests able to calculate probabilities, we tend to rely on them. However, the statistics are susceptible to individual points which can influence the statistics that visual examination would call “close enough.”

    As @rbutler states, the question as to normality depends on the use of the data. Many statistical tests are robust to non-normality, particularly when the data is similar to what you have presented.

    If I were mentoring you as one of my belts, I would have you check on the adjustment. If that’s happening, then I would go on and accept normality based on the histogram and prob plot. If not, then I would check on the sensitivity of the stat test that I’m looking to apply and see if it is robust to non-normality, and if so, then proceed. If it is sensitive to normality, then I would take some more data to ensure I have a full and complete picture. Even at 100 data points, you may have only captured one side of the distribution and over more time/data it may fill out.

    Hope this helps.
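The "fat pencil test" described above is a visual judgement. A rough numeric stand-in is the correlation between the ordered data and the normal quantiles, which `scipy.stats.probplot` returns along with the fitted line. The sketch below is an illustration on synthetic samples (assumed values, not the poster's data), and the correlation is not a formal test.

```python
import numpy as np
from scipy import stats

def prob_plot_fit(x):
    """Probability-plot correlation and the worst deviation from the fitted line."""
    (theoretical, ordered), (slope, intercept, r) = stats.probplot(x, dist="norm")
    deviation = np.abs(ordered - (slope * theoretical + intercept))
    return float(r), float(deviation.max())

# ASSUMPTION: synthetic samples chosen only to contrast the two cases.
rng = np.random.default_rng(1)
roughly_normal = rng.normal(620, 15, 100)
skewed = 600 + rng.exponential(15, 100)

r_normal, worst_normal = prob_plot_fit(roughly_normal)
r_skewed, worst_skewed = prob_plot_fit(skewed)
print(r_normal, worst_normal)
print(r_skewed, worst_skewed)
```

A correlation close to 1 with a small worst-case deviation corresponds to points the pencil would cover. A few stray points in the tails lower the correlation only slightly, which is the "close enough" situation the post describes.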

    #572390 Reply

    Don’t forget: do the data points have enough precision? Use an MSA to check.

    However, use your process map and gather data on X’s in the process and see if there’s an explanation/confirmation of the second grouping of data on the right.

    #572514 Reply

    When looking at non-normal data, rule out a few things before trying to calculate the process capability (and these are good general guidelines which can be easily forgotten if the data just so happens to be normally distributed):

    1. Are you looking at different levels of an X being captured? Run a dot plot, and SPC, and a few other graphs, research the process. With 100 samples, and the data looking bi-modal, you’ll want to rule this out first. I almost got “burned” by this once. I just wanted the probability of exceeding the specification limit and was so eager for “just the answer” that I initially missed that there were two separate behaviors going on in my data: normal conditions and when there were special events going on at the company.

    2. Perhaps the process is unstable? Stability is a requirement of most distributions. An SPC chart can help you determine if you are seeing random noise or potentially special cause noise (remember, the data needs to be in time sequence, if not you can only use Test 1).

    3. Data is not truly continuous. Histograms are a tough tool to use to detect this specific measurement issue. Run a dot plot. If the data stacks in nice bins, then this may be some form of attribute data.

    4. Perhaps the data is just naturally non-normally distributed? But I would check on the other 3 conditions first to be safe. If you still feel it is naturally non-normally occurring data, there are a wide variety of other distributions besides the normal distribution and transformations, that may be suitable models.

    As a last resort, I’ve seen some people convert the data to pass/fail data and treat it as a binomial process capability. This is not my favorite and there are arguments for and against this approach. But, I’d rule the other conditions out and truly understand why the data was non-normally distributed before considering this strategy.

    #593836 Reply

    Jim Frost

    If you’re using Minitab, I’d recommend trying the following tool.
    Stat > Quality Tools > Individual distribution identification

    This tool will check your data against 14 distributions and 2 transformations. It might provide some insights.

    #601932 Reply

    it’s a great tool! I still remember when it showed up and I’m like….how did people live without this cool tool

    #602176 Reply

    You can also use a distribution fitting tool in CrystalBall.

    #611917 Reply

    Fos

    But sometimes this tool is not sufficient, and no distribution or transformation has a p-value > alpha.
    In this case you need to discretize the data and calculate by hand, but luckily you can find Excel tables on the internet.
    And if more than one distribution or transformation has a p-value > alpha, use the one with the lowest AD (Anderson-Darling statistic).
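The selection rule above (fit several candidate distributions and prefer the one with the lowest Anderson-Darling statistic) can be sketched with SciPy. This is not Minitab's procedure: the four candidates, the zero location for the skewed families, and the synthetic sample are all assumptions, and no p-values are computed, only the statistic used for ranking.

```python
import numpy as np
from scipy import stats

def ad_statistic(x, dist, params):
    """Anderson-Darling statistic of x against a fully specified distribution."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    cdf = np.clip(dist.cdf(x, *params), 1e-12, 1 - 1e-12)  # avoid log(0)
    i = np.arange(1, n + 1)
    return float(-n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1]))))

def rank_by_ad(x):
    """Fit each candidate by maximum likelihood; return (name, AD) sorted ascending."""
    candidates = {
        "norm": (stats.norm, {}),
        "lognorm": (stats.lognorm, {"floc": 0}),
        "weibull_min": (stats.weibull_min, {"floc": 0}),
        "gamma": (stats.gamma, {"floc": 0}),
    }
    scores = {}
    for name, (dist, fit_kwargs) in candidates.items():
        params = dist.fit(x, **fit_kwargs)
        scores[name] = ad_statistic(x, dist, params)
    return sorted(scores.items(), key=lambda item: item[1])

# ASSUMPTION: synthetic right-skewed sample, used only to exercise the ranking.
rng = np.random.default_rng(2)
sample = rng.lognormal(mean=3.0, sigma=0.5, size=200)
ranking = rank_by_ad(sample)
scores = dict(ranking)
print(ranking)
```

A low AD value only says a candidate fits better than the others tried. It is still worth checking the probability plot for the winner before using it for capability.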

    #611919 Reply

    Fos

    But this tool is only used for capability analysis of non-normal data.
