Need Clarifications about Normal Data
Six Sigma – iSixSigma › Forums › Old Forums › General › Need Clarifications about Normal Data
This topic has 14 replies, 11 voices, and was last updated 18 years, 8 months ago by Vivek Shrivastava.
October 22, 2003 at 10:25 am #33659
Linga Reddy
Hi,
I am doing a GB project to reduce data-loading process time from 61 minutes to 10 minutes (I have defined everything in my Define and Measure phases).
Actual data is normal, but in the Analyze phase (Step 4) I found that the data is not normal (with the help of a run chart and a normality test: the p-value is less than 0.05; I am getting p-value = 0). If the data is not normal, how do I proceed to identify the initial process capability?
Thanks in advance.
Thanks & Regards, Linga

October 22, 2003 at 10:34 am #91333
Use subgrouping to solve the normality issue.
If you are reducing a process average, why is normality important?
You may have a one-sided distribution, in which case normality is impossible.
Actual Data is Normal but Data is not. How is that possible?
October 22, 2003 at 10:51 am #91334
Linga Reddy
Hi Dave, thanks for your reply.
As you said, my data has a one-sided distribution.
As per my operational definition, I am considering all the activities as one operation.
Operational Definition: Time spent on data load process = Time Spent on Logging into OFA + Time Spent on Running Delete Data Loader program + Time Spent on Running Data Loader Program + Time Spent on Running Solves + Time Spent on Distribute Structures/Data + Time Spent on Email to Intimate the users about Data load.
Now, how can I use subgrouping to resolve the normality issue?
Thanks & Regards, Linga

October 22, 2003 at 11:35 am #91335
Linga, I would not be concerned with normalizing the data. To determine process capability, assuming that 10 minutes is your upper spec limit (USL), look at the data. Anything greater than 10 minutes is a defect. Divide defects by total data points, then multiply the result by 1,000,000 to get Defects Per Million Opportunities (DPMO). Then you can compute the process capability. Does that help?
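[Editor's note: the DPMO arithmetic described in the reply above can be sketched in a few lines of Python. The timing values below are invented purely for illustration.]

```python
# Hypothetical data-load times in minutes (made-up values for illustration).
load_times = [61, 58, 47, 72, 9, 55, 63, 8, 50, 66]

USL = 10  # upper spec limit: anything over 10 minutes is a defect

defects = sum(1 for t in load_times if t > USL)
dpmo = defects / len(load_times) * 1_000_000  # defects per million opportunities

print(defects, dpmo)  # 8 of 10 runs exceed the USL -> 800000.0 DPMO
```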
October 22, 2003 at 1:14 pm #91341
1) Dan's advice is good.
2) A one-sided distribution is naturally non-normal, so there is no reason to normalize it. To analyze process variability you need special statistical tools, one of which is a Pearson analysis; certain software packages can do this for you. However, I don't think that's necessary here; just follow Dan's advice.
3) Remember that when X = A + B + C + …, the VARIANCE of X equals the sum of the variances of A, B, C, etc. (assuming the steps are independent). Variance is sigma squared, so Sigma(X) is the square root of that sum. In other words, you can't add sigmas in a series, but you can add variances.

October 27, 2003 at 6:33 am #91641
Hi Linga,
I agree with Dan. There are ways to normalize continuous data by sub-grouping, and there are tools to handle non-normal data, but in your case you can simply convert the data to discrete: calculate the defects and the DPMO, and thus the capability.
Regards,
Taran
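[Editor's note: the variance-addition rule mentioned in an earlier reply (variances of independent step times add; sigmas do not) can be checked numerically. This is a sketch with made-up step-time distributions.]

```python
import random

random.seed(42)
N = 100_000

# Simulated independent step times in minutes; the distributions are invented.
a = [random.gauss(5, 1) for _ in range(N)]   # e.g. logging in
b = [random.gauss(20, 3) for _ in range(N)]  # e.g. running the data loader
c = [random.gauss(30, 4) for _ in range(N)]  # e.g. running solves

def var(xs):
    """Population variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

total = [x + y + z for x, y, z in zip(a, b, c)]

# Var(total) ~= Var(a) + Var(b) + Var(c) = 1 + 9 + 16 = 26,
# whereas the sigmas (1 + 3 + 4 = 8) do NOT add: sigma(total) ~= sqrt(26).
print(var(total), var(a) + var(b) + var(c))
```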
October 28, 2003 at 7:20 am #91698
Alpaslan Terekli
The question for me is why the actual data is normal but the data is not. Did you check it with a control chart? Is there any shift in the process, or some outlier (like maintenance or a breakdown)?
October 28, 2003 at 7:47 am #91700
Soumish Dev
Would you please explain the difference between Actual data and Data from your project's perspective?
October 28, 2003 at 11:42 am #91707
Rick Pastor
More clarification!
All the steps in your process appear to be the times required to complete a process step. That does not produce a one-sided distribution. The time needed to complete an event is equivalent to the measurement of a physical dimension: both measurements, assuming random error, will end up normally distributed. With time you have a one-sided specification limit. You might want a two-sided specification limit if errors are generated when someone works too fast.
To the question of why your data is not normal: perhaps you do not have enough data points. How many data points are in the population? Did you use the same person to take all the data? Are the subgroups normal, etc.? You may find that answering these questions adds insight into how to shorten the process time. For example, if several people were involved in the study, you need to examine the data for each person. Perhaps one person needs training!

October 28, 2003 at 12:06 pm #91708
There is no difference. I just wanted a clarification of this statement:
“Actual data is Normal, But in Analyze phase (Step 4), I found that Data is Not Normal”
from this post: https://www.isixsigma.com/forum/showmessage.asp?messageID=34755

October 29, 2003 at 5:39 pm #91776
Jim Grizzard
I have run into the same issue evaluating efficiency problems. Typically, the distributions will be skewed one way due to machine constraints. Bimodal distributions or distributions with outliers often point toward method or man issues. Some of the lean tools, such as process mapping, pace studies, and work sampling, often provide good insight into actual capabilities.
October 30, 2003 at 6:53 am #91794
Hemanth
Hi Reddy,
I agree with Jim: your data must be skewed, and hence failing the normality test. What intrigues me is that you said your actual data is normal, but in the Analyze phase your data is not normal…
If I understand this correctly, you had historical data (a huge data bank) which you initially checked for normality and found it passed the test, but when you tested the data collected during the Measure phase, it failed. If so, that is possible because you are checking only a subset of the whole population: the population may behave normally while the subset (the sample) does not. I tried this in MINITAB by generating samples of size 50 and 100 from a random normal distribution, and I found that not all of them passed the normality test.
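[Editor's note: the MINITAB experiment described above can be reproduced in Python, assuming numpy and scipy are available; the Shapiro-Wilk test stands in for MINITAB's normality test. By construction, roughly 5% of samples drawn from a truly normal population will fail at alpha = 0.05 purely by chance.]

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2003)
alpha = 0.05
trials = 200
failures = 0

for _ in range(trials):
    # Draw a sample of size 50 from a truly normal population.
    sample = rng.normal(loc=60, scale=5, size=50)
    _, p = stats.shapiro(sample)  # Shapiro-Wilk normality test
    if p < alpha:
        failures += 1

print(f"{failures}/{trials} samples flagged non-normal")  # roughly 5% expected
```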
Now, coming to your query: I would suggest you don't get bogged down calculating capability, as your target is to shift the mean from 60 to 10. In the end, all you need to prove is whether your improved process gives you an average processing time of 10 or not (hypothesis testing…).
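[Editor's note: the hypothesis test alluded to above might look like the following one-sample t-test, with invented post-improvement timings and scipy assumed available.]

```python
import numpy as np
from scipy import stats

# Hypothetical post-improvement load times in minutes; the numbers are invented.
improved = np.array([9.5, 10.2, 9.8, 10.1, 9.9, 10.4, 9.7, 10.0, 9.6, 10.3])

# H0: mean processing time = 10 minutes; H1: it differs.
t_stat, p_value = stats.ttest_1samp(improved, popmean=10.0)

print(t_stat, p_value)
# A large p-value means no evidence the mean differs from the 10-minute target.
```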
Hope this was helpful.

October 30, 2003 at 6:56 am #91795
Hemanth
And if at all you wish to calculate limits for your average, Chebyshev's inequality might be of help.
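[Editor's note: Chebyshev's inequality gives distribution-free limits: for any distribution, at least 1 − 1/k² of the values lie within k standard deviations of the mean. A small sketch with illustrative numbers:]

```python
import math

# Chebyshev's inequality: for ANY distribution, at least 1 - 1/k**2 of the
# values lie within k standard deviations of the mean -- no normality needed.
mean, sigma = 61.0, 8.0  # illustrative values, not taken from the thread

for coverage in (0.75, 0.89, 0.95):
    k = 1 / math.sqrt(1 - coverage)
    lo, hi = mean - k * sigma, mean + k * sigma
    print(f"at least {coverage:.0%} of values in {lo:.1f}..{hi:.1f} minutes (k = {k:.2f})")
```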
October 30, 2003 at 10:07 am #91802
Bad advice!
Time data is typically skewed right (any data which is naturally bounded will skew away from the bound). The closer you get to your objective (the natural bound), the more the left side will bunch up. There are distributions known to model time data well, the Poisson and exponential for example. Look to queueing theory for advice.

November 1, 2003 at 4:29 am #91920
Vivek Shrivastava
Hi Linga,
Seeing your operational definition of the CTQ and your goal statement, I'll say that though you have a one-sided specification, your distribution should be two-tailed and normal, not one-tailed.
Secondly, one of the reasons you are not getting a normal distribution may be the discrimination (resolution) you are choosing to measure the time. If you are choosing minutes and your actual process width is not more than 10 minutes, take seconds; I expect you'll get a normal distribution. Further, I can comment on the potential causes of your not getting normality only on seeing the histogram and the normality graph.
I hope this gives you a direction to think.

Vivek
The forum ‘General’ is closed to new topics and replies.