Easy way to estimate if a distribution is normal?

Six Sigma – iSixSigma Forums Old Forums General Easy way to estimate if a distribution is normal?

Viewing 8 posts - 1 through 8 (of 8 total)


    Hello everyone.
    Is there anyone out there in “Stat-land” who has an easy way to determine whether a distribution is normal or not when you don’t have access to any stat software?
    The reason is that I am writing an instruction (to be used by “normal” people ;-)) on Cp indices for normally distributed data. For that I need a good, easy way to estimate whether it is OK to use the indices or not. So far it is: take fairly good samples, plot a histogram, and judge by eye (not too skewed, no dual peaks, not too peaked or flat, no extreme outliers, etc.).
    Does anyone have a good “shoot from the hip” method that works?



    If you have Excel or a similar spreadsheet program (even the free Open Office Calc), you can easily generate a normal probability plot that would result in a straight line if the data came from a normal distribution.  Judging a straight line with this approach would be a definite step forward compared to judging a histogram shape.
    See for example.
    This approach does not require any “stat software” or add-ins, but does require a spreadsheet.  If you already have a histogram, perhaps a spreadsheet is available to you.



    Contact me at 6sigmaguru(at)gmail(dot)com. I will send you an Excel spreadsheet that does the normal probability plot and the Anderson-Darling test. Cheers, Alastair


    Forrest W. Breyfogle III

    It is also important to remember that, when making process capability assessments, the data need to come from a stable process.
    I suggest that you also check out the following articles, which have links about the importance of selecting a transformation that makes physical sense and about how to enhance process capability assessments/statements:
    – Resolving Common Issues with Performance Indices
    – “Predictive Performance Measurements”



    I use Excel to run the Anderson-Darling test.  Send me your email address and I’ll send you a copy of the template.
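    For reference, the Anderson-Darling statistic mentioned in this thread can also be computed directly; a stdlib-only Python sketch of the textbook formula (mean and standard deviation estimated from the sample, no small-sample correction applied):

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(data):
    """A^2 statistic for normality: -n - (1/n) * sum over i of
    (2i - 1) * (ln F(y_i) + ln(1 - F(y_{n+1-i}))), F fitted to the sample."""
    xs = sorted(data)
    n = len(xs)
    nd = NormalDist(mean(xs), stdev(xs))
    # CDF values of the ordered sample, clamped away from 0 and 1
    f = [min(max(nd.cdf(x), 1e-12), 1 - 1e-12) for x in xs]
    s = sum((2 * i - 1) * (math.log(f[i - 1]) + math.log(1 - f[n - i]))
            for i in range(1, n + 1))
    return -n - s / n

# An artificial "as normal as possible" sample (exact normal quantiles)
# should give a small A^2, well below the usual ~0.75 critical value.
ideal = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
a2 = anderson_darling_normal(ideal)
```

    The same formula is straightforward to lay out column by column in a spreadsheet, which is presumably what the template above does.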



    A normally distributed data set will not necessarily produce a nice straight line.  In the past, someone on this forum suggested that I generate “a bunch” of data sets that are normally distributed, each with about the number of data points I expected for the application I was tackling, and then let Excel plot each of them.  Doing that gives you a feel for how much deviation to expect from a normally distributed data set.
    I did it and found it very instructive.



    I absolutely agree that the process should be stable so you don’t fool yourself.
    What I am trying to achieve is a general internal SOP (Standard Operating Procedure) for estimating the process capability of a given process. The scope is that it shall require no additional software (beyond perhaps Excel or equivalent) and be usable by people with little or no experience of statistics. It shall also recommend sample sizes appropriate to what they are trying to achieve.
    My basic idea is to use estimates of the confidence interval (CI) for the indices based on sample size, using a target of 1.33 or 1.66 and the lowest acceptable CI limit to determine the sample size.
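    One way to sketch that idea in Python, using Bissell’s approximate confidence bound for Cpk, Ĉpk ± z·√(1/(9n) + Ĉpk²/(2(n − 1))) — a common choice, but my assumption rather than something from this thread:

```python
import math
from statistics import NormalDist

def cpk_lower_bound(cpk_hat, n, conf=0.95):
    """One-sided lower confidence bound for Cpk (Bissell's approximation)."""
    z = NormalDist().inv_cdf(conf)
    return cpk_hat - z * math.sqrt(1 / (9 * n) + cpk_hat ** 2 / (2 * (n - 1)))

def n_for_lower_bound(cpk_hat, required_lower, conf=0.95):
    """Smallest n (found by scanning) whose lower bound meets the target."""
    for n in range(5, 10001):
        if cpk_lower_bound(cpk_hat, n, conf) >= required_lower:
            return n
    return None

# Example: if the observed Cpk is 1.66, how many samples before the 95%
# lower bound clears 1.33?
n_needed = n_for_lower_bound(1.66, 1.33)
```

    The bound tightens roughly as 1/√n, so doubling the sample size narrows the interval by a factor of about 1/√2, not 1/2.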
    As you mention in your book “Implementing Six Sigma: Smarter Solutions Using Statistical Methods” (standing firmly on my bookshelf, in true 5S spirit, as an easy tool to reach), Ch. 11, “Process capability and process performance metrics,” about the Cp and Pp indices: how the sampling criteria are formed is crucial (Ch. 11.23, page 297).
    My initial thought is, as a first step, to break down the metric in focus into plausible factors that can have an effect, using a CTQ breakdown, an Ishikawa diagram and/or other tools, and from that to score the top x’s.
    From that, make a sampling plan that takes the x’s of (estimated) significance into consideration.
    Say, for example, that raw material is assumed to be a critical x. Sampling should then be done over # lots of material, calculating Cpk for each lot. After all samples are done, calculate Ppk and compare Cpk and Ppk.
    If they all check out as normal (all Cpk and Ppk values), with a reasonable difference between them, and Ppk meets the given limit, the process gets the go-ahead.
    In case of non-normal data or too large a shift in the process, the analysis is taken to the next level using transformations, non-normal distributions, lurking factors, etc. (not included in this SOP).
    I don’t know how big the pitfalls in this approach are (glad for comments), but it sure is a struggle (and a challenge) to write an SOP on this topic in layman’s terms. =)
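    The lot-wise Cpk versus overall Ppk comparison described above can be sketched like this (hypothetical spec limits and measurements; the per-lot sample standard deviation stands in for the within-subgroup sigma):

```python
from statistics import mean, stdev

LSL, USL = 9.0, 11.0  # made-up spec limits

def capability(values, lsl=LSL, usl=USL):
    """min(USL - mean, mean - LSL) / (3 * s): Cpk when fed one lot's own
    data, Ppk when fed all lots pooled together."""
    m, s = mean(values), stdev(values)
    return min(usl - m, m - lsl) / (3 * s)

lots = [  # three hypothetical raw-material lots, five samples each
    [10.1, 9.9, 10.0, 10.2, 9.8],
    [10.3, 10.1, 10.2, 10.4, 10.0],
    [9.7, 9.9, 9.8, 10.0, 9.6],
]
cpk_per_lot = [capability(lot) for lot in lots]
ppk = capability([x for lot in lots for x in lot])
# A Ppk clearly below every per-lot Cpk hints at lot-to-lot shifts.
```

    Because the pooled spread includes the lot-to-lot shifts, Ppk can only fall short of the per-lot Cpk values when the lot means wander.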



    Into the “Excel lab bench” for me, then.
    Nothing like hands-on experience.

