Determining Sample Size
 This topic has 10 replies, 9 voices, and was last updated 20 years, 1 month ago by Robert Goldhammer.


July 8, 2002 at 9:29 am #29814
I’m quite aware of the Power and Sample size Tool of Minitab, however, I am finding it hard to understand the relation of that tool to my problem. This is my problem: I have 5,602 samples for the year 2001 with standard deviation of 54.02. What I want is to get the number of samples that is powerful enough (say 95%) to give me the correct representation of the 5,602. Does anyone have any idea? Is there a formula for this? Though I need to check again Minitab since I might have missed something, but please if anyone got ideas, please fill me in. Thanks.
July 8, 2002 at 12:40 pm #77025
Since you are not performing a hypothesis test, there is no need to use the Power and Sample Size tool in Minitab. The general rule of thumb is that you’d need 40 samples out of 5,602 to describe 2001.
July 8, 2002 at 1:01 pm #77028
Schultz (@Charlie)
Abasu,
How did you determine “The general rule of thumb is that you’d need 40 samples out of 5602 to describe 2001”? Clarification would be much appreciated.
Charlie
July 8, 2002 at 2:34 pm #77034
Robert Butler (@rbutler)
I think you will have to be more specific and explain what you consider to be a “correct representation” of your population. If this is a question concerning the representation of the population mean, and if you have checked to make sure that the population is normal, then you could recast the question in the following manner:
Given that the population is normal, what kind of a sample should I take in order to be 95% certain that the allowable error in the sample mean is L?
For a normal population the confidence limits of the mean are
mean ± 2S/sqrt(n)
Thus if you put L = 2S/sqrt(n) you can compute the sample size (n) that will give you this allowable error.
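As a rough illustration of solving L = 2S/sqrt(n) for n, here is a minimal Python sketch. The function name and the allowable error of 10 are assumptions made for the example; only the standard deviation of 54.02 comes from the original question.

```python
import math

def sample_size_for_allowable_error(s, allowable_error, multiplier=2.0):
    """Solve L = multiplier * s / sqrt(n) for n and round up.

    s               -- standard deviation (S in the post)
    allowable_error -- half-width L of the interval around the mean
    multiplier      -- the "2" in the post, roughly 95% for a normal population
    """
    if s <= 0 or allowable_error <= 0:
        raise ValueError("s and allowable_error must be positive")
    n = (multiplier * s / allowable_error) ** 2
    return math.ceil(n)

# S = 54.02 is from the question; L = 10 is a hypothetical allowable error.
n_needed = sample_size_for_allowable_error(54.02, 10.0)
print(n_needed)
```

Because n grows with 1/L squared, halving the allowable error roughly quadruples the required sample size.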
If, on the other hand, you are interested in characterizing the standard error for simple random sampling, then the standard error of the mean with the finite population correction will be
[S/sqrt(n)]*sqrt(1 - phi) where phi = the sampling fraction = n/N.
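A small sketch of that corrected standard error, assuming simple random sampling without replacement. The function name and the sample size of 117 are hypothetical; S = 54.02 and N = 5,602 are taken from the original question.

```python
import math

def standard_error_of_mean(s, n, population_size=None):
    """Standard error of the sample mean, S/sqrt(n).

    If population_size (N) is given, apply the finite population
    correction sqrt(1 - phi), where phi = n/N is the sampling fraction.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    se = s / math.sqrt(n)
    if population_size is not None:
        if n > population_size:
            raise ValueError("n cannot exceed the population size")
        phi = n / population_size
        se *= math.sqrt(1.0 - phi)
    return se

# Hypothetical sample of 117 drawn from the 5,602 results.
se_plain = standard_error_of_mean(54.02, 117)
se_corrected = standard_error_of_mean(54.02, 117, population_size=5602)
print(se_plain, se_corrected)
```

With a sampling fraction of about 2%, the corrected value differs from S/sqrt(n) by only about 1%.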
An examination of these equations illustrates the point that the standard error of the mean depends mainly on the size of the sample and only to a minor extent on the fraction of the population sampled.
July 8, 2002 at 4:46 pm #77038
The recommendation of 40 comes from the fact that there is little difference in the density functions between a normal (or z) and a t distribution at sample sizes greater than 40.
The exact confidence intervals of the mean and standard deviations can be determined as described in Robert’s email.
July 8, 2002 at 4:53 pm #77039
Charlie’s point is that “abasu” is essentially pulling that 40 sample size recommendation out of thin air.
Robert Butler is taking you down the correct path.
How precisely do you wish to be able to specify the distribution’s mean? This should be given in terms of width around the mean.
July 11, 2002 at 5:06 am #77126
Be careful out there!
A run of 5,602 samples can tell you a lot about what went on in 2001. I bet this data ain’t Normal. I bet there are all sorts of trends and jumps in there – and they are important in determining what you want to do now. Have you looked at a control chart of this data?
Why did you collect so many samples in 2001? So many per day or per shift?
Why are you looking to drastically reduce the sample size now? What is different now about what you want to know than when the decision was made to collect so much data?
Robert’s advice is sound if you are sure that you have a reasonably stable distribution, but I sense that there is more going on – and I wouldn’t assume stability without having closely looked at the control chart.
I hope that these questions are useful.
July 11, 2002 at 6:25 am #77128
Thank you very much, everyone, for your answers. All of them gave me some guidance.
In answer to Tim’s question: the 5,602 samples I mentioned are actually results of our engineers’ work. I called them samples because we have not captured all of their work. Now I need to get some data out of the 5,602. I would like to take the data from a certain number of samples to represent the 5,602, hence my question of how to get a strong sample that would represent it.
I really hope this clears up my question. I apologize for any inconvenience this may have caused you.
July 11, 2002 at 7:34 am #77131
Hi rullen,
Sample size can be calculated from three parameters:
1. Confidence
2. Margin of error
3. Variance in the case of continuous data, or the proportion of failure or acceptance in the case of attribute data. The proportion can be a guess from past data; if you don’t have any guess, you can take it as 0.5 to get the maximum sample size.
There is a formula for calculating sample size based on the above three factors.
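The post does not spell the formula out. One common textbook form, shown here as an assumption rather than as what the presentation contains, is n = (z*sigma/E)^2 for continuous data and n = z^2*p*(1 - p)/E^2 for attribute data, where z is the normal quantile for the chosen confidence and E is the margin of error. The function names and margins below are hypothetical.

```python
import math
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided standard normal quantile, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

def sample_size_continuous(confidence, margin, sigma):
    """n = (z * sigma / E)^2 for estimating a mean, rounded up."""
    z = z_for_confidence(confidence)
    return math.ceil((z * sigma / margin) ** 2)

def sample_size_attribute(confidence, margin, p=0.5):
    """n = z^2 * p * (1 - p) / E^2 for estimating a proportion, rounded up.

    p = 0.5 maximizes p * (1 - p), so it gives the largest sample size.
    """
    z = z_for_confidence(confidence)
    return math.ceil(z ** 2 * p * (1.0 - p) / margin ** 2)

# sigma = 54.02 is from the question; the margins are hypothetical.
print(sample_size_continuous(0.95, margin=10.0, sigma=54.02))
print(sample_size_attribute(0.95, margin=0.05))
```

Neither formula includes a finite population correction, so for a population of 5,602 they slightly overstate the sample needed.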
I can send you a presentation on sampling methods and sample size, and an Excel template for calculating the sample size. If you are interested, contact me at
[email protected]
thanks
sridhar
July 11, 2002 at 12:11 pm #77141
Joe Knecht (@JoeKnecht)
Do you know the difference that is really important to you?
Minitab can calculate sample sizes to achieve acceptable power if you know the standard deviation and the difference you need to detect: Stat > Power and Sample Size > 1-Sample t
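For a feel of how the standard deviation, the difference, and the power interact, here is a sketch using the usual normal approximation, n = ((z_alpha/2 + z_power) * sigma / delta)^2. This is not Minitab's calculation: a 1-sample t calculation works with the t distribution and will typically ask for a slightly larger n. The function name and the difference of 20 are hypothetical.

```python
import math
from statistics import NormalDist

def approx_sample_size_for_power(sigma, difference, power=0.95, alpha=0.05):
    """Normal-approximation sample size for a two-sided one-sample test.

    sigma      -- assumed standard deviation
    difference -- smallest shift in the mean worth detecting
    power      -- desired probability of detecting that shift
    alpha      -- significance level of the test
    """
    if sigma <= 0 or difference <= 0:
        raise ValueError("sigma and difference must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) * sigma / difference) ** 2)

# sigma = 54.02 is from the question; a difference of 20 is hypothetical.
print(approx_sample_size_for_power(54.02, 20.0))
```

The smaller the difference you need to detect, the faster the required sample size grows.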
July 11, 2002 at 1:02 pm #77145
Robert Goldhammer (@RobertGoldhammer)
Try using the sample size calculator at the website below.
I have found it to be reliable.
https://www.isixsigma.com/offsite.asp?A=Fr&Url=http://www.surveyguy.com/SGcalc.htm