I have a population of 660 and I’m trying to test a sample for a number of things, all of which give me success/failure (yes/no) results. After testing 35 at random, my results show that most of the tests have a very low success rate, 1 or 2 out of 35. How do I know whether this sample is large enough to be representative of my population, and can I prove it (i.e. what equations are needed) with 95% confidence?
You have provided most of what is needed to calculate a valid sample size:
Population – 660
Expected defective rate – 2/35
Confidence Level – 95%
What is still needed is the desired precision: +/- some percentage, maybe 1, 2 or 5. Once you have that, you can use the standard discrete sample size calculation, but because of the small population don’t forget to apply the Finite Population Correction factor. Post the desired precision and I will be glad to run the numbers.
Thanks for getting back to me:
Assume I want +/- 5%. Are there equations for solving for this and if so could you send them or let me know where I can find them?
Also, I’ve come across similar calculators online and I was unsure whether to use the results because I’m not sure that my data is normally distributed. How can I check this given that my data is all yes/no responses?
Clearly I’m not a statistics expert so any help you could give me would be great. Thanks.
There are some simple discrete sample size calculators online. Just use a search engine to find them. The only problem is that most of them assume a large population, which you don’t have. You would then have to find the Finite Population Correction factor and adjust for it. I have a calculator with the correction built in. For a population of 660, a projected defective rate of 2/35, a confidence level of 95% and a precision of 5%, it would yield a sample size of
[Calculator table: inputs are the population size and the population defective rate p (p must be between 0 and 1); the computed sample size includes the finite population correction factor.]
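For anyone who wants the equations rather than a calculator, here is a minimal Python sketch of what I understand the standard calculation to be: the normal-approximation sample size for a proportion, n0 = z^2 * p * (1 - p) / E^2, followed by the finite population correction n = n0 / (1 + (n0 - 1) / N). The function name and the rounding up are my own choices, and this is not necessarily the exact formula behind the calculator above.

```python
import math
from statistics import NormalDist


def sample_size_proportion(population, p, precision, confidence=0.95):
    """Sample size needed to estimate a proportion within +/- precision.

    Assumes the normal approximation to the binomial. Returns a pair:
    (size ignoring population size, size with finite population correction).
    """
    # Two-sided z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Large-population sample size.
    n0 = z ** 2 * p * (1 - p) / precision ** 2
    # Finite population correction for a population of size N.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0), math.ceil(n)


# Inputs from this thread: N = 660, p = 2/35, precision 5%, confidence 95%.
n_uncorrected, n_corrected = sample_size_proportion(660, 2 / 35, 0.05)
print("without correction:", n_uncorrected)
print("with correction:", n_corrected)
```

Tightening the precision argument to 0.01 shows how quickly the required sample grows toward the full population when the estimated proportion is this small.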
In Minitab under “Power and Sample Size for 1 Proportion”, you have 4 entries. They are Sample Sizes, Alternative Values of P, Power, and Hypothesized P. I am getting different numbers than you do. For example, alpha = .01, for Hypothesized P = .06, for Power = .90, alternative p = .01, “not equal”,
Is this because of the power factor, which your spreadsheet does not have? In other words, is it the difference between the way Montgomery’s book calculates sample size and the way Minitab does? Thank you in advance.
I’m not a statistician, but I believe the difference is that the Power and Sample Size feature in Minitab is calculating the sample size needed for testing a hypothesis at given levels of significance, power of test, and minimum difference. This is not the same as the estimation the original poster is looking to do.
What the original poster is trying to do is determine the minimum sample size needed to be 95% confident that the point estimate for p will be in error either way by less than 0.05. This is what the table Darth posted is providing.
The Minitab function for determining sample size works in the context of hypothesis testing, so the formula uses different variables and assumptions. For estimation, which is what the poster inquired about, I used the standard discrete sample size calculation with the fpc built in. In retrospect, he indicated a desire for a precision level of .05 with an estimated proportion of .06. That precision might be a bit wide for so small a proportion. I would suggest possibly a precision of .01.
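To make the contrast concrete, here is a rough Python sketch of a hypothesis-testing sample size for one proportion, using the textbook normal-approximation formula n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))) / (p1 - p0))^2. This is an assumption on my part about the general approach, not Minitab's actual algorithm, so its output need not match Minitab's, and it applies no finite population correction. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_one_proportion_test(p0, p1, alpha=0.05, power=0.90, two_sided=True):
    """Approximate sample size to detect a shift from p0 to p1.

    Normal-approximation formula for a one-sample proportion test.
    Unlike the estimation formula, it depends on power and on the
    alternative proportion p1, not on a precision.
    """
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2) if two_sided else nd.inv_cdf(1 - alpha)
    z_beta = nd.inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1)))
    return math.ceil((numerator / (p1 - p0)) ** 2)


# Settings quoted earlier in the thread: alpha = .01, hypothesized p = .06,
# power = .90, alternative p = .01, two-sided ("not equal").
n_test = n_for_one_proportion_test(0.06, 0.01, alpha=0.01, power=0.90)
print("hypothesis-test sample size:", n_test)
```

The inputs are a significance level, a power and a difference to detect, so there is no reason for the result to agree with a sample size built from a confidence level and a precision.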
The only time I have found consistency between what the Minitab function computes and what the estimation formulas come up with is for continuous data. If you use the 1 sample Z with a power of .5 and plug in the s.d. and precision, you will get the same answer, but without the correction factor.
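The reason the two agree at a power of .5 can be seen in a small Python sketch. In the usual approximation for a 1 sample Z test, n = ((z_alpha/2 + z_beta) * sigma / difference)^2, and z_beta is 0 when power is .5, which leaves the estimation formula n = (z_alpha/2 * sigma / E)^2. The function names and the example values (sigma = 10, precision = 2) are hypothetical, and the test formula here is the common approximation that ignores the far tail, not necessarily what Minitab computes internally.

```python
from statistics import NormalDist


def n_estimate_mean(sigma, precision, confidence=0.95):
    """Sample size to estimate a mean within +/- precision (no correction)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (z * sigma / precision) ** 2


def n_one_sample_z(sigma, difference, alpha=0.05, power=0.5):
    """Approximate two-sided 1 sample Z test sample size."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    z_beta = nd.inv_cdf(power)  # equals 0 when power is 0.5
    return ((z_alpha + z_beta) * sigma / difference) ** 2


# Hypothetical inputs, chosen only for illustration.
n_est = n_estimate_mean(sigma=10, precision=2)
n_z = n_one_sample_z(sigma=10, difference=2, alpha=0.05, power=0.5)
print(n_est, n_z)
```

Raising the power above .5 makes z_beta positive, so the test-based sample size pulls away from the estimation one.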
Thanks for your help. Quick follow-up question. To get my confidence level of 95%, I’m assuming you use a z value of 1.96, which corresponds to an alpha of .05 and so an alpha/2 of .025. What do I do if one of my tests gives 0 successes out of 35? Would I use a one-tailed test and a z value corresponding to an alpha of .05? What should I use as a probability? Although none of the 35 returned a success, that doesn’t mean the same will be true for all 660 in the entire population.
Also, is this how, for example, they determine the number of people to poll regarding approval/disapproval for a politician or is that a different technique?
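On the 0 out of 35 question, one common approach (not something proposed elsewhere in this thread) is to skip the z value and compute a one-sided upper bound directly: find the largest p for which seeing 0 successes in n trials still has probability 1 - confidence, i.e. solve (1 - p)^n = 0.05. The Python sketch below does that and compares it with the "rule of three" shortcut 3/n. It assumes a binomial model and so ignores the finite population of 660, which makes it somewhat conservative; the function name is my own.

```python
def upper_bound_zero_successes(n, confidence=0.95):
    """One-sided upper confidence bound for p when 0 of n trials succeed.

    Solves (1 - p)^n = 1 - confidence for p (binomial model, no
    finite population adjustment).
    """
    return 1 - (1 - confidence) ** (1 / n)


exact_bound = upper_bound_zero_successes(35)
rule_of_three = 3 / 35  # quick approximation to the same bound
print(f"exact upper bound: {exact_bound:.3f}")
print(f"rule of three:     {rule_of_three:.3f}")
```

The statement that goes with it is one-sided: with 95% confidence the true success rate is no higher than the bound, rather than an estimate plus or minus a margin.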
Here is what I assume you are trying to do:
Given a population of 660, you would rather take a sample than investigate 100% of the population. You want to be 95% confident that you capture the true population proportion defective within +/- 1% or 5%. You have some previous estimate of the proportion defective that indicates it to be approximately 6%. You can now calculate the needed sample size. You will then randomly select that many items from your population and compute the overall proportion defective of the sample. Your statistical statement will be, “I am 95% confident that the true population proportion defective is X (calculated from your sample) plus or minus 5% (or 1%).”
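The final statement can be sketched in Python as well. This assumes the normal-approximation interval for a proportion with the finite population correction applied to the standard error, p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n) * sqrt((N - n) / (N - 1)). The function name is hypothetical, and with counts as small as 2 the normal approximation is rough, so treat the margin as indicative only.

```python
import math
from statistics import NormalDist


def proportion_interval(defectives, n, population, confidence=0.95):
    """Point estimate and margin of error for a proportion defective.

    Normal-approximation interval with the finite population
    correction applied to the standard error.
    """
    p_hat = defectives / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    fpc = math.sqrt((population - n) / (population - 1))
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n) * fpc
    return p_hat, margin


# The original sample from this thread: 2 of 35, population 660.
p_hat, margin = proportion_interval(2, 35, 660)
print(f"I am 95% confident that the true proportion defective is "
      f"{p_hat:.1%} plus or minus {margin:.1%}")
```

Running it on the original 35 items shows the precision that sample already achieves, which can be compared with the +/- 5% target.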