I have a set of data obtained from an automated process that will not fit any distribution. When I do a probability plot I get an S shape, which means there are extreme outliers. I have tried using individual distribution identification, but that does not work.
Just lump all the observations together and calculate the capability indices. If you are lucky, they will be high enough that you won’t need to worry about making an out-of-specification product.
By the way, if you take enough observations on any process, the output will not fit any distribution. You should probably look for the reason you have the outliers and try to control the process.
July 14, 2008 at 12:00 am #173794
“…if you take enough observations on any process, the output will not fit any distribution.”
Now, that’s a pretty ignorant statement.
Where did you learn statistics? Where do you work? GE?
Your advise is very poor.
If you have an S in an NPP, you must have outliers, or your distribution is unstable and shifts with time, or there are mixtures.
However, your answer implies, he collected too much data, thus that’s the problem.
Hello, Michael Mead. Please just ask questions, don’t try to answer them.
I have said it before, I don’t make these statements up. It came from…this one I can’t quote at the moment. Actually, I think it was Deming. However, the idea is simple: the distributions we use are models, they are not “real”. We can do tests to see if our data is significantly different from one model or another, but it is never certain.
The word is “advice”. I only give responses to problems I have faced in my years of experience. I would not imply that he has collected too much data, only that he should consider using statistics from a distribution type if it fits reasonably well and there is no theoretical reason that it will not fit. It is like applying the Camp-Meidell requirements to call a distribution “normal.”
I attended 12 universities and taught at 5. I have never worked at GE. I have worked mostly in factories, from 10 people to 17,000.
I don’t criticize other people, but apparently I am not part of the “Good Old Boys” club here, which encompasses all the knowledge there is about Six Sigma and quality improvement. Excuse me.
July 14, 2008 at 5:33 am #173799
Michael Mead,
Do not insist; what you said makes no sense at all. You are suggesting that he increase his sample sizes and run the capability studies as if the data were normal. And if he is “lucky, they will be high enough that you won’t need to worry about making an out-of-specification product”.
Luck should not be invoked here. If you use non-normal data to run capability studies as if they were normal, you will end up with a Cpk and a Ppk that are misleading. How about suggesting that he review his processes to make sure that they are stable and in control before thinking about conducting a capability study?
Hello Joe,
What I said makes sense. Can you show me a set of samples where every point will form the theoretical Gaussian distribution? Now you make me think I should find the source of that statement.
It is kind of strange: there are some people here who insist I don’t know what I am talking about because all my knowledge is “strictly textbook”, and others who insist my comments are too simplistic and not based on any theory. I am confused and don’t know which I am.
To the original question. There are two easy ways to do this, and one hard one.
1. If the data meets the Camp-Meidell requirements, then you can estimate the mean and standard deviation. Then calculate the number of standard deviations, S, between the sample mean and each specification limit, just as for CpL and CpU. Using the smaller of those, the proportion outside the specification limits cannot be more than 1/(2.25·S²).
Thus, where the nearest limit is 3s away (Cpk = 1), the normal distribution would say the outside proportion is 0.27%. Using the Camp-Meidell theorem, the proportion would be no more than about 4.94%.
2. If your data does not allow you to use the Camp-Meidell calculation, use Tchebycheff’s theorem, which is valid for ANY distribution (with a finite mean and variance).
Perform the calculations above to find the distance between the sample mean and the nearest specification limit in terms of the standard deviation. The maximum proportion outside the limits will be no more than 1/S².
Using Tchebycheff’s theorem where Cpk = 1, the proportion would be no more than about 11.1%.
3. Dr. A. J. Duncan (1974) presented a method where the quality characteristic can be described by a Pearson type III distribution. I believe this is used by many statistical packages which fit a Pearson curve to the data, such as Datalyzer Spectrum by Stephen Computer Services. Regarding this procedure, I have seen the results, but I don’t know the formula or calculations.
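For anyone who wants to check the arithmetic in items 1 and 2, here is a minimal Python sketch using only the standard library. The function names `sigma_distance` and `tail_bounds` are made up for illustration, and the Camp-Meidell figure only applies if the data really meets the Camp-Meidell requirements mentioned above.

```python
import statistics


def sigma_distance(data, lsl, usl):
    """Distance from the sample mean to the nearest spec limit, in sample standard deviations."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    if s == 0:
        raise ValueError("standard deviation is zero; the distance is undefined")
    return min(mean - lsl, usl - mean) / s


def tail_bounds(t):
    """Upper bounds on the proportion outside +/- t standard deviations.

    Returns (camp_meidell, chebyshev). Both are capped at 1.0 because a
    proportion cannot exceed 100%.
    """
    if t <= 0:
        raise ValueError("t must be positive (the mean must lie inside the limits)")
    chebyshev = min(1.0, 1.0 / t**2)
    camp_meidell = min(1.0, 1.0 / (2.25 * t**2))
    return camp_meidell, chebyshev


if __name__ == "__main__":
    cm, ch = tail_bounds(3.0)  # nearest limit 3 standard deviations away (Cpk = 1)
    print(f"Camp-Meidell bound: {cm:.2%}")
    print(f"Tchebycheff bound:  {ch:.2%}")
```

With the nearest limit 3 standard deviations away this gives roughly 4.94% and 11.1%, against 0.27% under the normal model. These are worst-case bounds, not estimates of the actual fraction defective.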
Continued good luck.
July 14, 2008 at 7:20 am #173803
1. Try a Box Cow transform and see whether it suggests a lambda value for transforming the data to normal. Or
2. Plot a run chart first to see the data in the large picture, and see whether a few outliers contribute to the non-normal distribution. Or
3. How do you get the data? Check the measurement system with different tools (gauge R&R).
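The transform meant in point 1 is the Box-Cox power transform. As a rough sketch of how a lambda value gets chosen, the Python below searches a grid of lambda values for the one that maximizes the usual Box-Cox profile log-likelihood. The function names and the grid from -2 to 2 are my own assumptions; a statistics package such as Minitab will do its own search and may report a slightly different value. The data must be strictly positive.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform: log(x) when lambda is 0, else (x**lambda - 1) / lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox needs strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v**lam - 1.0) / lam for v in values]


def box_cox_loglik(values, lam):
    """Profile log-likelihood of lambda (up to an additive constant)."""
    n = len(values)
    y = box_cox(values, lam)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    return -0.5 * n * math.log(var_y) + (lam - 1.0) * sum(math.log(v) for v in values)


def best_lambda(values, grid=None):
    """Pick the grid value of lambda with the highest profile log-likelihood."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]  # assumed search range: -2.0 to 2.0
    return max(grid, key=lambda lam: box_cox_loglik(values, lam))


if __name__ == "__main__":
    # Illustrative data only: exponentials of symmetric values, so a log
    # transform (lambda = 0) should make them symmetric again.
    data = [math.exp(z) for z in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)]
    print("suggested lambda:", best_lambda(data))
```

A transform will not rescue data from an unstable process or a mixture, so it is still worth looking at the run chart first.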
Ms,
I am going to try to simplify this a little bit. Perform Cpk and Ppk calculations. Is your process in control?
Perform a Run Chart. What can you tell from the run Chart? Does your process meet the rules for a process that is in control? If not do you have assignable cause for the out of control points?
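As a rough illustration of one such rule, the Python sketch below computes individuals-chart limits from the average moving range (sigma estimated as MR-bar / 1.128, the standard constant for moving ranges of size 2) and flags points beyond 3 sigma. It checks only that single rule, not the run rules, and the function names and sample numbers are invented for the example.

```python
def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    center = sum(values) / n
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    sigma_hat = mr_bar / 1.128  # d2 constant for subgroups of size 2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control_points(values):
    """Indices of points outside the 3-sigma limits (rule 1 only)."""
    lcl, _, ucl = individuals_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


if __name__ == "__main__":
    # Invented readings with one obvious excursion at index 9.
    readings = [10, 11, 10, 9, 10, 11, 10, 9, 10, 25, 10, 11]
    print("limits:", individuals_limits(readings))
    print("flagged indices:", out_of_control_points(readings))
```

Any flagged point is a prompt to go and look for an assignable cause, not a licence to delete the reading.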
You must determine the source/cause of your “Extreme Outliers”.
For the sake of analyzing the data, remove these extreme outliers and perform the probability plot again. What are the results? Do you have normal data, or just another variation of an S shape?
My suggestion from here is to perform a Box-Cox transformation and then a normality test on the transformed data. I suspect the data will still not be normal due to a process that is completely out of control.
July 14, 2008 at 1:46 pm #173807
Your assumption that the S shape means extreme outliers isn’t necessarily right.
For example, a uniform distribution makes a very nice S shape. So does a distribution that is too peaked (too much in the middle). Try putting the data into Excel, if only to get values for skewness and kurtosis (Minitab probably does this too).
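If you would rather not use Excel, a few lines of Python give the same kind of numbers. This sketch uses plain population moments, so the values will differ slightly from Excel's SKEW and KURT, which apply small-sample corrections. The function name is made up, and the evenly spaced "uniform" data is only there to illustrate the point about flat distributions.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (normal data gives about 0 and 0)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2**1.5, m4 / m2**2 - 3.0


if __name__ == "__main__":
    # Evenly spaced points stand in for a uniform distribution.
    uniform_like = [i / 100 for i in range(101)]
    skew, kurt = skew_and_excess_kurtosis(uniform_like)
    print(f"skewness: {skew:.3f}, excess kurtosis: {kurt:.3f}")
```

Flat data like this has skewness near 0 and excess kurtosis near -1.2, with no outliers at all, yet it still bends a normal probability plot into an S.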
Take a look at a histogram and tell us what it looks like.
July 14, 2008 at 1:49 pm #173808
Just a hint: when you try to impress with your education, you’ve already lost the argument, unless of course you are in the military and went to one of the academies (ring knockers).
12 universities? Did you keep flunking out?
July 14, 2008 at 1:50 pm #173809
Can you tell me how to do one of those Box Cows?
I don’t try to impress anybody. The last guy asked where I learned statistics, and it is what it is.
I did not usually flunk out, just got tired of it or moved to another job.
By the way, I probably should not have used “s” as the variable in my last post. It might be confused with the sample standard deviation. Apparently the convention is to use “t” to stand for the distance between the sample mean and the specification limit.
July 14, 2008 at 5:35 pm #173815
I do not understand why everyone is so unhappy with Mr Mead. I do think, sir, that it is better to be clear. I am not certain of the particulars, but I hope all advice is given in good spirits.
July 14, 2008 at 5:46 pm #173816
Is that anything like stump breaking a mare?
Rather than getting hung up on what type of distribution or capability analysis you have, why not start collecting info on the X’s in your process at the same time, so you can begin to narrow down the reason for your variation?
If you insist on a number, record the ppm defective from the actual results; then, once you have found and addressed the causes, measure the new ppm defective and see whether you improved.
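The ppm comparison needs no distribution at all, just a count against the spec limits. A minimal Python sketch, with a made-up function name and invented before/after readings:

```python
def ppm_defective(values, lsl, usl):
    """Observed parts per million outside the specification limits."""
    if not values:
        raise ValueError("no observations")
    defects = sum(1 for v in values if v < lsl or v > usl)
    return 1_000_000 * defects / len(values)


if __name__ == "__main__":
    # Invented readings against hypothetical limits of 9.0 and 11.0.
    before = [9.8, 10.1, 10.0, 12.5, 9.9, 7.0, 10.2, 10.0]
    after = [9.8, 10.1, 10.0, 10.3, 9.9, 9.7, 10.2, 10.0]
    print("ppm before:", ppm_defective(before, 9.0, 11.0))
    print("ppm after: ", ppm_defective(after, 9.0, 11.0))
```

With small samples the observed ppm is very coarse (one defect in eight readings is already 125,000 ppm), so a real comparison needs a lot more data than this.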
Good luck.
July 15, 2008 at 12:05 am #173822
Excellent advice! You are starting to sound a lot like Mike!
Do you mean Stan or Mike?
Please do not compare me to Stan. I have my own motive.
Stan is a breed apart. He has profound knowledge, but it gives me the impression that he has run out of patience to read ignorance.
In my case, the motive is different. I believe it is a disservice to the seekers of knowledge to be misled with poor advise from people that display arrogance while disclosing their ignorance. If you type away advise with sheer ignorance, then you should suffer the consequences of being corrected by someone more experienced and knowledgeable than you. If you don’t like it, then be more humble in your reply.
Well, Mike Carnell does tend to surround himself with a well-balanced team…..
Of course, the few times we don’t agree, it’s Stan’s fault, since Stan got to me first and wised me up from grasshopper to MBB…. Stan doesn’t take all the credit; a few others had to straighten me out.
I advise you to check how to spell “advice”.
:) ….. on pain medication, so I’m feeling humorous for the first time in a few days…
July 15, 2008 at 4:51 pm #173840
Rothchild gets the “Pancake bunny” award for the day…dufus.
