Minimum Sample size
 This topic has 22 replies, 13 voices, and was last updated 18 years, 2 months ago by Tim Folkerts.


February 19, 2004 at 6:00 pm #34649
I have a process that has historically run at 20% nonconformance. When I run trials, what is the minimum sample size I need? The product is very expensive, so running the standard 30 samples is not feasible. Is the binomial condition NP > 5 sufficient?
Can the forum help me decide?
Thanks

February 19, 2004 at 9:35 pm #95809
Maint. Guy (Participant)
If you don’t care about your confidence interval, just guess!
Personally, I would go to the Poisson table and run a zero-fail test at 95% confidence. I think you’ll find that you must have an absolute minimum of 15 pieces of data.

February 20, 2004 at 2:33 am #95815
John H. (Participant)
PA
What is your Acceptable Quality Level (AQL) for the product? What is your producer’s risk, alpha (Type I error)? What is your Rejectable Quality Level (RQL) for the product, and what is your consumer’s risk, beta (Type II error)? The producer’s risk is the probability that the producer will reject a lot when it is good. The consumer’s risk is the probability that the consumer will accept a lot when it is bad. If I have that information, I will answer your question.
John H.

February 20, 2004 at 3:28 am #95817
John, I am running a test, so I do not know what my AQL will be. If you can elaborate, I may be able to give you the data.

February 20, 2004 at 4:21 am #95823
John H. (Participant)
PA
Re: AQL based on historical results
Example: Based on your requirements, suppose you accept and ship lots that are 10% or less defective. Then AQL = p1 = 0.10. If you reject lots that are 30% or more defective, then RQL = p2 = 0.30.

February 20, 2004 at 5:33 pm #95854
It really helps to know what you mean by ‘trials’. Are you trying to run single-factor trials (hypothesis-type tests) or multiple-factor trials?
Are you dealing with true attributes data, or can you use variables data? For example, color is often categorized as attributes data or even categorical data, but it can be measured as variables data; it’s just more time-consuming and expensive. The same can be said for many test setups: they measure a variable characteristic but report a pass or fail result. Can you get the variables data? There is also the possibility of using a scale system to grade your characteristic: horrible, bad, OK, good, great categories. This provides more measurement resolution than simple good/bad and will allow you to use smaller sample sizes.
It would also be helpful if you could elaborate on any patterns in the nonconformance rate; e.g., is it truly random? Does it vary by vendor lot? Time of day? Etc.
What is the severity of a defect? What is the consequence of implementing a corrective action that doesn’t work? If it’s cheap, easy and quick to implement, and the severity of not getting it right is low, then you can tolerate more false positives (a higher significance level) and use smaller sample sizes.
Each of the above can help in getting to a point where you can use small sample sizes. Otherwise the suggestion to use the Poisson is the best we could do, and you probably need replicates of the trials, or you could go to the multinomial distribution.
I know you aren’t getting a succinct answer, but we need a more detailed question to provide more useful advice.

February 20, 2004 at 6:08 pm #95858
Heebeegeebee BB (Participant)
AQL is anathema to 6S.

February 20, 2004 at 6:09 pm #95859
Bev
It is a single-factor trial, comparing the existing quality parameter (in this case leaks, measured as percent leaking) with a suggested modified process, which is the trial. The question is how many samples we need to run to prove that the modified process makes a difference. The present leak rate is 30%.
What I want to do is run a modified process, but I don’t know how many samples to run to prove statistically that it made a difference.
I hope I have made myself clear. Please let me know; I need some help.
Thanks

February 20, 2004 at 8:32 pm #95870
Tim Folkerts (Member)
Let’s look at some actual calculations. With your numbers (15 trials, 20% failure) the normal approximation doesn’t really work well (we could run those numbers just to see how bad it is). You can use Excel to calculate some tables using the true binomial distribution. For example, with p = 0.2 = probability of failure, we can create a table with # of trials along the top and # of failures along the side. The numbers in the table are the odds of getting that many or fewer failures. The mean number of failures in each column is 0.2 x the number of trials (3, 4, 6, 10 and 20 respectively).
Failures      15      20      30      50     100
0             4%      1%      0%      0%      0%
1            17%      7%      1%      0%      0%
2            40%     21%      4%      0%      0%
3            65%     41%     12%      1%      0%
4            84%     63%     26%      2%      0%
5            94%     80%     43%      5%      0%
6            98%     91%     61%     10%      0%
7           100%     97%     76%     19%      0%
8           100%     99%     87%     31%      0%
9           100%    100%     94%     44%      0%
10          100%    100%     97%     58%      1%
11          100%    100%     99%     71%      1%
12          100%    100%    100%     81%      3%
13          100%    100%    100%     89%      5%
14          100%    100%    100%     94%      8%
15          100%    100%    100%     97%     13%
16             -    100%    100%     99%     19%
17             -    100%    100%     99%     27%
18             -    100%    100%    100%     36%
19             -    100%    100%    100%     46%
20             -    100%    100%    100%     56%
So if you did 15 trials, then the odds of getting 0 defects are 4%. Or, put the other way around, if you see 0 defects, you can be fairly sure they came from a process with less than 20% failures and you have a real improvement.
If you go to 30 trials, then the odds of getting 2 or fewer defects are again 4%. Now, even if you see 2/30 = 6.7% observed defects, you can be pretty sure it is a real improvement.
Go all the way to 100 trials, and the odds of getting 12 or fewer defects are about 3%. Now, even with 12/100 = 12% observed defects, you can be pretty sure it is a real improvement. With 9% observed defects, you can be basically 100% sure it is a real improvement.
Ultimately, you have to decide how sure you want to be and how big of an effect you want to observe. Try different numbers of trials in the spreadsheet to see the effect. Balance this against the cost of the testing and go for it.
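For anyone who would rather script this than build the spreadsheet, here is a minimal sketch of the same cumulative binomial calculation in Python. The function names `binom_cdf` and `cdf_table` are made up for this illustration, and the trial counts are simply the ones used in the table in this post; treat it as one possible way to do it, not as the original spreadsheet.

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of k or fewer failures in n trials, each failing with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def cdf_table(p, trial_counts, max_failures):
    """Rows are failure counts 0..max_failures, columns follow trial_counts.

    A cell is None when the failure count exceeds the number of trials.
    """
    return [
        [binom_cdf(k, n, p) if k <= n else None for n in trial_counts]
        for k in range(max_failures + 1)
    ]


if __name__ == "__main__":
    trials = [15, 20, 30, 50, 100]
    print("Failures " + "".join(f"{n:>8}" for n in trials))
    for k, row in enumerate(cdf_table(0.2, trials, 20)):
        cells = "".join(f"{'-':>8}" if v is None else f"{v:>8.0%}" for v in row)
        print(f"{k:<9}" + cells)
```

Changing `trials` or the 0.2 failure rate gives the "try different numbers" exercise suggested above.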
Tim

February 20, 2004 at 9:22 pm #95872
Hello PA,
While we typically don’t like discrete data for many reasons, not the least of which is a higher required sample size for the same confidence level, the reality for many of us is that’s all we have.
A formula that works for me is:
[(1.96 / required level of precision) squared] x p x (1 - p)
Where:
1.96 = constant representing the 95% confidence level
required level of precision = the precision desired from the sample, in units of proportion
p = proportion defective
Thus, with a 95% confidence level and 20% typical defective, assume that our desired precision is 5%. (If you end up with 15% defective, there may or may not be a real change. If you end up with less than 15%, then you are 95% confident that a change has occurred, but you only know the new level to within +/- 5%, again with 95% confidence.)
[(1.96 / 0.05) squared] x 0.2 x 0.8 = 245.9, thus you would need 246 samples.
If you are willing to sacrifice either confidence or precision, you can reduce the sample size. At 95% confidence and 0.10 precision you need only 62 samples (actually it’s 61.4656 pieces, but I doubt you have only a fraction of a part to test ;) ).
Incidentally, it’s interesting to note that the highest value of p x (1 - p) is 0.25 (where p = 0.50).
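The formula in this post is easy to wrap in a few lines of Python. This is only a sketch: the function name `sample_size_for_proportion` is hypothetical, and the default of 1.96 assumes the 95% confidence level used above.

```python
from math import ceil


def sample_size_for_proportion(p, precision, z=1.96):
    """Samples needed to estimate a proportion p to within +/- precision.

    z is the normal-distribution constant for the chosen confidence level
    (1.96 for 95%). The result is rounded up to a whole part.
    """
    return ceil((z / precision) ** 2 * p * (1 - p))


if __name__ == "__main__":
    for precision in (0.05, 0.10):
        n = sample_size_for_proportion(0.2, precision)
        print(f"p = 0.2, precision = {precision}: {n} samples")
```

Because p x (1 - p) peaks at p = 0.50, calling the function with p = 0.5 gives the most conservative sample size for a given precision.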
aj

February 21, 2004 at 12:25 am #95875
John H. (Participant)
Heebee
First you learn to walk then you learn to run.
John H.

February 21, 2004 at 1:28 am #95876
Hi PA,
The quotes from your posts:
1) I have a process in which historically we have 20 % nonconformance.
2) The present leak rate presently is 30 %.
Some initial questions for you:
Is your process getting worse?
How do you define a failure in your test?
Thanks,
Mario

February 21, 2004 at 5:20 am #95878
PA,
You have received some interesting advice on your question. I am not sure if you feel better or worse about your situation, so let me see if I can help. Actually, Tim Folkerts’ strategy was on the right track, so I am just going to build on what he was doing.
What you want to do is test the minimum number of parts that will tell you that you have reduced the number of leaks. There is no reason to continue to test if the number of failures in the test reaches a level that will force you to an excessively large sample size. Also, you do not want to continue to test if no or few failures have occurred. Therefore, you will want to take a sequential approach.
First, the hypothesis you are testing is Ho: P = Pnull vs. Ha: P < Pnull
I will also assume that you will accept a .05 alpha risk.
You must establish the current proportion leaking (this is Pnull). In one of your posts you stated that it was 20%, in another you said it was 30%. The current level of leaks will determine the sample size; the smaller this proportion, the more samples that will be required.
Below is a table of alpha and beta risks for various sample sizes, failure rates, and null proportions. Basically what this table is doing is determining the sample size required to control the alpha risk level at about 5% when failures of zero to five are detected in the test. The beta risk is the risk of not detecting a 50% improvement when 1 to 4 more failures are detected. In other words, if 1 failure is detected in 14 samples with p.null at 0.2 and you determined that there was no improvement, then you have a 41.5% chance of not detecting an improvement of 50%. I will explain how to use this table below:
P.null  Failures  Sample  Alpha   +1 Beta  +2 Beta  +3 Beta  +4 Beta
0.2     0         14      0.0440  0.415    0.158    0.044    0.009
0.2     1         22      0.0480  0.380    0.172    0.062    0.018
0.2     2         30      0.0442  0.353    0.175    0.073    0.026
0.2     3         37      0.0450  0.309    0.160    0.071    0.027
0.2     4         44      0.0440  0.274    0.146    0.068    0.028
0.2     5         50      0.0480  0.230    0.122    0.058    0.025
0.3     0         9       0.0404  0.401    0.141    0.034    0.006
0.3     1         14      0.0475  0.352    0.147    0.047    0.012
0.3     2         19      0.0462  0.316    0.144    0.054    0.016
0.3     3         24      0.0424  0.287    0.139    0.057    0.020
0.3     4         28      0.0474  0.235    0.115    0.049    0.018
0.3     5         33      0.0414  0.217    0.111    0.049    0.019
Procedure: (I will assume that P.null is .30)
1. Begin the test
- If 9 parts are tested without a failure, conclude that there has been a significant improvement at an alpha = .0404
- If 1 failure is detected, continue testing up to 14 parts. If there are no more failures, conclude that there has been a significant improvement at an alpha = .0475
- If 2 failures are detected before you reach 9 (beta risk is .141), this is a judgment call. You can stop testing if the beta risk is acceptable and conclude that you could not detect an improvement, or continue to test up to 19 parts. However, this does not allow any more failures before the end of the test. If no more failures occur, conclude that there has been a significant improvement at an alpha = .0462
- If 1 failure occurs before you reach 9 and one after you have tested 9 but before you reach 14, continue testing up to 19 parts. If there are no more failures, conclude that there has been a significant improvement at an alpha = .0462
- If 3 failures are detected before you reach 9 (beta risk is .034), stop the test and conclude that no improvement could be detected.
- If 3 failures are detected before you reach 14 (beta risk is .147), this is, once again, a judgment call. You can stop testing if the beta risk is acceptable and conclude that you could not detect an improvement, or continue to test up to 24 parts. However, this does not allow any more failures before the end of the test. If no more failures occur, conclude that there has been a significant improvement at an alpha = .0424
- If 1 failure occurs before you reach 9, one after you have tested 9 but before you reach 14, and one after 14 but before you reach 19, continue testing up to 24 parts. If there are no more failures, conclude that there has been a significant improvement at an alpha = .0424
- I would recommend stopping if the number of failures reaches 4, unless the 4th occurs after 19. In this case, you will want to continue to 28, since the beta risk is still 28.7% with 4 at 24.
The same type of strategy can be employed if the P.null is .20.
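The entries in the table earlier in this post can be spot-checked with a short Python sketch. The post does not spell out the formulas, so the reading used here is an assumption: alpha is taken as the chance of seeing that many or fewer failures at P.null, and the "+k Beta" column as the chance of seeing more than (failures + k) failures when the true rate is half of P.null. That reading matches the entries checked in the comments, but it is not confirmed by the author. The names `binom_cdf`, `alpha_risk` and `beta_risk` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of k or fewer failures in n trials, each failing with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def alpha_risk(failures, n, p_null):
    """Chance of seeing `failures` or fewer in n parts if nothing has improved."""
    return binom_cdf(failures, n, p_null)


def beta_risk(failures, extra, n, p_null, improvement=0.5):
    """Assumed reading of the '+extra Beta' column.

    Chance of seeing more than (failures + extra) failures in n parts
    when the true rate has dropped by `improvement` (50% in the post).
    """
    p_improved = p_null * (1 - improvement)
    return 1 - binom_cdf(failures + extra, n, p_improved)


if __name__ == "__main__":
    # Row "0.3, 0 failures, 9 samples": table lists alpha 0.0404 and +1 Beta 0.401
    print(round(alpha_risk(0, 9, 0.3), 4), round(beta_risk(0, 1, 9, 0.3), 3))
    # Row "0.2, 0 failures, 14 samples": table lists +1 Beta 0.415 and +2 Beta 0.158
    print(round(beta_risk(0, 1, 14, 0.2), 3), round(beta_risk(0, 2, 14, 0.2), 3))
```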
I hope this helps. It is better than testing the roughly 250 samples that someone has suggested.
Statman

February 21, 2004 at 3:03 pm #95885
Statman
Very interesting.
Can you please elaborate on the concept a little more? I did understand what Tim did, and that is what I was deliberating on. However, I did not understand your approach.
Also, I apologize to the forum for the typo. The leak rate is 20%.
Thanks

February 21, 2004 at 8:04 pm #95888
Tim Folkerts (Member)
It took me a bit to work through it too, but I like Statman’s answer. I hope he will agree with the following explanation. He is looking for the mathematically most efficient method to see if there is any improvement. He is doing the same type of calculation I was, but quitting as soon as possible to avoid the expense of extra trials. Basically it comes down to:
1) Test a single unit.
2) Are you sure there is an improvement?
If yes, then stop testing.
3) Are you sure there is little or no improvement?
If yes, then stop.
4) Otherwise, go back to (1).
You need to decide how much improvement you are looking for and how sure you want to be. Statman chose to look at the case where you are 95% certain there is some change, but you are not necessarily sure how much change.
Since you seemed to understand my table, let me post a variation. Again, the number across is the number of trials, the number down is the number of failures, and the odds of any given unit failing are 20%.
Failures      13      14      21      22      29      30
0           5.5%    4.4%    0.9%    0.7%    0.2%    0.1%
1          23.4%   19.8%    5.8%    4.8%    1.3%    1.1%
2          50.2%   44.8%   17.9%   15.4%    5.2%    4.4%
3          74.7%   69.8%   37.0%   33.2%   14.0%   12.3%
4          90.1%   87.0%   58.6%   54.3%   28.4%   25.5%
5          97.0%   95.6%   76.9%   73.3%   46.3%   42.8%
6          99.3%   98.8%   89.1%   86.7%   64.3%   60.7%
7          99.9%   99.8%   95.7%   94.4%   79.0%   76.1%
8         100.0%  100.0%   98.6%   98.0%   89.2%   87.1%
9         100.0%  100.0%   99.6%   99.4%   95.1%   93.9%
10        100.0%  100.0%   99.9%   99.8%   98.0%   97.4%
If you run 13 trials and get 0 failures, there is still a 5.5% chance that the process still has a 20% failure rate and that getting 0 was due to dumb luck. Thus you are only 94.5% sure there is an improvement. Thirteen trials or fewer, even with no failures, is not enough.
BUT, if you run one more and still have no failures, then there is only a 4.4% chance of doing that well by dumb luck. Hence you are pretty sure there is some improvement. If you get here, you can quit and be (pretty) sure there was an improvement.
But suppose you got a failure in these 14 runs. Getting 1 (or fewer) failures would occur 19.8% of the time, so now you have to continue doing more trials. At 21 trials and 1 failure, there is still a 5.8% chance that it is dumb luck, so you continue. But with 22 trials and only 1 failure, you are down to 4.8% and you can quit.
The “beta” columns are there to decide when you can be sure the process has definitely not improved to a 10% failure rate.
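The stopping points described here (14 trials with 0 failures, 22 with 1, and so on) can be found by a simple search. The following Python sketch assumes a 5% cutoff, as in the discussion above; the names `binom_cdf` and `min_trials` and the `max_n` search limit are invented for the illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """Probability of k or fewer failures in n trials, each failing with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))


def min_trials(failures, p_null, alpha=0.05, max_n=1000):
    """Smallest number of trials at which seeing `failures` or fewer

    would happen by luck no more than `alpha` of the time if the true
    failure rate were still p_null. Returns None if max_n is not enough.
    """
    for n in range(max(failures, 1), max_n + 1):
        if binom_cdf(failures, n, p_null) <= alpha:
            return n
    return None


if __name__ == "__main__":
    for p_null in (0.2, 0.3):
        stops = [min_trials(f, p_null) for f in range(3)]
        print(f"P.null = {p_null}: stop at {stops} trials for 0, 1, 2 failures")
```

The search only covers the "sure there is an improvement" side of the procedure; deciding when to give up still depends on the beta columns and on how large an improvement matters.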
Summary
PROS:
most efficient
CONS:
requires a good grasp of stats to develop the procedure.
requires a well-trained technician to follow the procedure.
Tim

February 22, 2004 at 5:10 pm #95901
Thanks to you all. I think this is a good approach to follow.

February 23, 2004 at 8:34 am #95918
Hi Statman and Tim,
You guys did a great job regarding this question. Let me add a minor suggestion to this discussion.
With your testing approach, you want to prove that the new process has a nonconformance rate that is merely lower than the current one. That may not be enough for PA, because he needs to prove that the new process has a nonconformance rate no greater than the level he thinks is appropriate for his project. If the current P = 0.30, he certainly needs a new process with P = 0.15, 0.20 or something like that in order to claim any practical importance for his improvement.
If PA tests 9 units and gets zero failures, for example, he may claim that he significantly improved the process, although a new process with P = 0.29 (a marginal improvement) will give him this result with alpha = 0.0458.
Thanks,
Mario

March 5, 2004 at 5:50 am #96451
The minimum sample size can also be calculated as follows:
Delta = Difference between Baseline and Goal
Sigma = Std Deviation
Power = 0.9
Using Minitab:
Follow the instructions: Stat > Power and Sample Size > 2-sample t-test
Enter the above values in the respective fields and the required sample size will be displayed.

March 6, 2004 at 3:32 am #96522
Jonathon L. Andell (Participant)
I think the majority of the discussion describes quite well how to squeeze the most knowledge out of a single sampling event. If you must do so, I prefer AOQL over AQP, because the former protects the customer and the latter protects the supplier.
At the risk of pointing out the obvious: consider taking repeated samples of the process, and display the results on an attribute control chart. That way you can get an idea of how closely the process follows a distribution, as well as whether it is changing over time. With luck, you may be able to extract that data from historical records…

March 6, 2004 at 5:08 pm #96528
Deelip Wasadikar (Participant)
Dear PA,
The statistically based B versus C test from DOE methodology can be highly useful, and it provides information with 95% confidence from 3 good and 3 bad samples.
Thanks
Deelip

March 7, 2004 at 2:30 am #96533
I would not waste my time and money on a single trial that will only tell me whether there is an improvement or not, especially if you need over 16 runs to determine the results.
You would be much better off with a designed experiment exploring the factors, and the interactions between factors, that cause your leaks to occur.

March 11, 2004 at 10:34 pm #96785
Deelip,
can you expand on this more?
PA

March 11, 2004 at 11:35 pm #96788
Tim Folkerts (Member)
Suppose you throw 6 pieces of paper into a hat: 3 labeled “B” and 3 labeled “C”. If you draw them out randomly, what are the odds that you will draw out the 3 B’s first?
On the 1st draw, there is a 3/6 chance it is B.
On the 2nd draw, there is a 2/5 chance it is B.
On the 3rd draw, there is a 1/4 chance it is B.
The odds that the first three are B (assuming it is random) are
(3/6) * (2/5) * (1/4) = 1/20 = 5%
If the B’s and C’s are the same, there is a 5% chance of getting all of the B’s first. Thus if you really do get the three B’s first, you might reasonably assume that there was some reason for the B’s to come out first.
Same thing if you have two processes. If the two processes are identical, the odds that the 3 from the B process are better (by whatever your metric is) are 5%. So if the B’s DO come out ranked 1, 2, 3, then presumably the B’s really ARE better.
Tim F
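The hat arithmetic in the post above can be double-checked in Python, both with the counting shortcut (one favorable arrangement out of "6 choose 3") and by brute-force listing of every draw order. This is just an illustrative sketch; the function names `prob_all_b_first` and `prob_by_enumeration` are made up here.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def prob_all_b_first(n_b, n_c):
    """Chance that all B's rank ahead of all C's when the order is purely random."""
    return Fraction(1, comb(n_b + n_c, n_b))


def prob_by_enumeration(n_b, n_c):
    """Same probability, found by listing every possible draw order."""
    labels = "B" * n_b + "C" * n_c
    orders = list(permutations(labels))
    hits = sum(1 for order in orders if all(x == "B" for x in order[:n_b]))
    return Fraction(hits, len(orders))


if __name__ == "__main__":
    print(prob_all_b_first(3, 3), float(prob_all_b_first(3, 3)))
    print(prob_by_enumeration(3, 3))
```

Calling `prob_all_b_first` with other counts shows how quickly the chance of a clean separation by luck shrinks as more B and C samples are compared.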