A survey is a valuable assessment tool in which a sample is selected and information from the sample can then be generalized to a larger population. Surveying has been likened to taste-testing soup – a few spoonfuls tell what the whole pot tastes like.
The key to the validity of any survey is randomness. Just as the soup must be stirred in order for the few spoonfuls to represent the whole pot, when sampling a population, the group must be stirred before respondents are selected. It is critical that respondents be chosen randomly so that the survey results can be generalized to the whole population.
How well the sample represents the population is gauged by two important statistics – the survey’s margin of error and confidence level. They tell us how well the spoonfuls represent the entire pot. For example, a survey may have a margin of error of plus or minus 3 percent at a 95 percent level of confidence. These terms simply mean that if the survey were conducted 100 times, the data would be within a certain number of percentage points above or below the percentage reported in 95 of the 100 surveys.
For example, suppose Company X surveys customers and finds that 50 percent of the respondents say its customer service is “very good.” The margin of error is cited as plus or minus 3 percent at a 95 percent level of confidence. This information means that if the survey were conducted 100 times, the percentage who say service is “very good” would range between 47 and 53 percent most (95 percent) of the time.
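The repeated-survey idea can be checked with a small simulation. The sketch below is illustrative only. It assumes the true share of “very good” answers among all customers is exactly 50 percent (something a real survey never knows) and that each survey draws 1,000 respondents at random. The function name and its parameters are hypothetical.

```python
import random


def share_of_surveys_within_margin(true_share=0.50, sample_size=1000,
                                   margin=0.03, repeats=2000, seed=1):
    """Simulate repeated random surveys and return the fraction whose
    reported percentage lands within `margin` of the true share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    within = 0
    for _ in range(repeats):
        # Each respondent independently answers "very good" with
        # probability `true_share` (an assumption of this sketch).
        very_good = sum(rng.random() < true_share for _ in range(sample_size))
        # Compare counts rather than fractions to avoid rounding trouble
        # right at the edge of the margin.
        if abs(very_good - true_share * sample_size) <= margin * sample_size + 1e-9:
            within += 1
    return within / repeats


fraction = share_of_surveys_within_margin()
print(f"{fraction:.1%} of simulated surveys fell between 47 and 53 percent")
```

Under these assumptions the printed fraction should come out close to 95 percent, which is what the confidence level describes.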
Survey Sample Size | Margin of Error (percent)* |
2,000 | 2 |
1,500 | 3 |
1,000 | 3 |
900 | 3 |
800 | 3 |
700 | 4 |
600 | 4 |
500 | 4 |
400 | 5 |
300 | 6 |
200 | 7 |
100 | 10 |
50 | 14 |
*Assumes a 95% level of confidence
Sample Size and the Margin of Error
Margin of error (the plus or minus 3 percentage points in the above example) decreases as the sample size increases, but only to a point. A very small sample, such as 50 respondents, has about a 14 percent margin of error, while a sample of 1,000 has a margin of error of 3 percent. The size of the population (the group being surveyed) does not matter, provided the population is much larger than the sample. There are, however, diminishing returns. By doubling the sample to 2,000, the margin of error only decreases from plus or minus 3 percent to plus or minus 2 percent.
Although a 95 percent level of confidence is an industry standard, a 90 percent level may suffice in some instances. A 90 percent level can be obtained with a smaller sample, which usually translates into a less expensive survey. Obtaining a 3 percent margin of error at a 90 percent level of confidence requires a sample size of about 750. For a 95 percent level of confidence, the sample size would be about 1,000.
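The sample sizes quoted above can be approximated with the standard formula for a proportion from a simple random sample. This is a minimal sketch, not the article's own calculator. It assumes the worst case of a 50/50 split and a population much larger than the sample, and `required_sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, proportion=0.5):
    """Smallest sample size whose margin of error is at most `margin`.

    `margin` and `proportion` are fractions (0.03 means 3 percentage points).
    proportion=0.5 is the worst case and gives the largest sample size.
    """
    # Two-sided critical value, about 1.96 for a 95% level of confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z ** 2 * proportion * (1 - proportion) / margin ** 2)


for level in (0.90, 0.95):
    print(f"{level:.0%} confidence, 3-point margin: "
          f"{required_sample_size(0.03, level)} respondents")
```

The 90 percent result lands near the 750 quoted above. The 95 percent result comes out somewhat above 1,000, because a sample of exactly 1,000 gives a margin slightly over 3 points before rounding.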
Determining the margin of error at various levels of confidence is easy. The statistical calculation is relatively simple (the most advanced math involved is a square root), but the margin of error can most easily be read from the chart above, which assumes a 95 percent level of confidence. A few websites also calculate the sample size needed to obtain a specific margin of error. Thus, if the researcher can only tolerate a margin of error of 3 percent, the calculator will say what the sample size should be.
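For readers who want to see the square-root calculation, here is a hedged sketch that reproduces the chart's rounded values. It assumes a simple random sample and a worst-case 50/50 split. The optional `population` argument applies the standard finite population correction, which the article does not mention and which is included only to show why population size rarely matters. `margin_of_error` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.95, proportion=0.5,
                    population=None):
    """Margin of error, in percentage points, for a simple random sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(proportion * (1 - proportion) / sample_size)
    if population is not None:
        # Finite population correction: only noticeable when the sample
        # is a sizeable fraction of the population.
        margin *= sqrt((population - sample_size) / (population - 1))
    return 100 * margin


for n in (2000, 1000, 500, 100, 50):
    print(f"sample of {n:>5}: +/- {margin_of_error(n):.1f} points")

# The same sample of 2,000 drawn from a small and a large population.
for size in (10_000, 1_000_000):
    print(f"population {size:>9,}: "
          f"+/- {margin_of_error(2000, population=size):.2f} points")
```

With a population of one million the correction is negligible. With a population of 10,000, a sample of 2,000 is a fifth of everyone, so the margin shrinks slightly.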
Calculating Margin of Error for Individual Questions
Margins of error typically are calculated for surveys overall but also should be calculated again when a subgroup of the sample is considered. Some surveys do not require every respondent to receive every question, and sometimes only certain demographic groups are analyzed. If only those who say customer service is “bad” or “very bad” are asked a follow-up question as to why, the margin of error for that follow-up question will increase because the number of respondents is smaller than the overall survey sample. Similarly, if results from only female respondents are analyzed, the margin of error will be higher because women make up only part of the sample.
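The effect of a smaller subgroup can be sketched with the same square-root formula. The respondent counts below are hypothetical and chosen only for illustration. The sketch assumes a 95 percent level of confidence and a worst-case 50/50 split.

```python
from math import sqrt

Z_95 = 1.96  # approximate critical value for a 95% level of confidence


def margin_of_error_points(respondents, z=Z_95, proportion=0.5):
    """Margin of error in percentage points for `respondents` answers."""
    return 100 * z * sqrt(proportion * (1 - proportion) / respondents)


# Hypothetical counts, for illustration only.
groups = {
    "all respondents": 1000,
    "female respondents": 480,
    "asked the follow-up question": 150,
}

for label, count in groups.items():
    print(f"{label:<30} n={count:>5}  "
          f"+/- {margin_of_error_points(count):.1f} points")
```

The margin is driven by how many people actually answered the question being reported, not by the size of the overall survey.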
Survey Data Is Imprecise
Margin of error reveals the imprecision inherent in survey data. Survey data provide a range, not a specific number. A researcher surveying customers every six months to understand whether customer service is improving may see the percentage of respondents who say it is “very good” go from 50 percent in one period to 47 percent in the next six-month period. Assuming a margin of error of plus or minus 3 percent, both results are consistent with each other because the change falls within the margin of error. The decrease is not statistically significant. On the other hand, if those percentages go from 50 percent to 54 percent, the change exceeds the margin of error, and the conclusion is that there is an increase in those who say service is “very good,” albeit a small one.
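The comparison above uses the single-survey margin of error as a rule of thumb. A stricter check, offered here as an assumption rather than the article's stated method, computes a margin of error for the difference between two independent samples. The sketch assumes a 95 percent level of confidence and equal-sized waves, and both function names are hypothetical.

```python
from math import sqrt

Z_95 = 1.96  # approximate critical value for a 95% level of confidence


def margin_of_difference(p1, n1, p2, n2, z=Z_95):
    """Margin of error (as a fraction) for the change p2 - p1 between
    two independent random samples of sizes n1 and n2."""
    return z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def change_is_significant(p1, n1, p2, n2, z=Z_95):
    """True when the observed change is larger than its margin of error."""
    return abs(p2 - p1) > margin_of_difference(p1, n1, p2, n2, z)


for before, after, n in ((0.50, 0.47, 1000),
                         (0.50, 0.54, 1000),
                         (0.50, 0.54, 2000)):
    margin = 100 * margin_of_difference(before, n, after, n)
    verdict = change_is_significant(before, n, after, n)
    print(f"{before:.0%} -> {after:.0%} with n={n} per wave: "
          f"margin of difference +/- {margin:.1f} points, significant={verdict}")
```

With 1,000 respondents per wave, this stricter check treats both the 3-point drop and the 4-point rise as within sampling noise. With 2,000 per wave, the 4-point rise clears the bar.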
The Dark Side of Confidence Levels
A 95 percent level of confidence means that 5 percent of the surveys will be off the wall, with numbers that fall outside the margin of error and do not make much sense. Therefore, if 100 surveys are conducted using the same customer service question, about five of them will provide results that are somewhat wacky. Normally researchers do not worry about this 5 percent because they are not repeating the same question over and over, so the odds are that they will obtain results among the 95 percent. However, if the same question is asked repeatedly, such as in a tracking study, then researchers should be aware that unexpected numbers that seem way out of line may come up. For example, suppose customers are asked the same question about customer service every week over a period of months, and “very good” is selected by 50 percent, then 54 percent, 52 percent, 49 percent, 50 percent, and so on. If 20 percent surfaces in another period and 48 percent follows in the next period, it is probably safe to assume the 20 percent is part of the “wacky” 5 percent, assuming proper methodology is followed.
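How quickly the “wacky” 5 percent becomes likely in a tracking study can be worked out directly. This is a back-of-the-envelope sketch. It assumes each wave is an independent survey with the same 95 percent level of confidence, and the function name is hypothetical.

```python
def chance_of_at_least_one_outlier(waves, confidence=0.95):
    """Probability that at least one of `waves` independent surveys
    falls outside its margin of error."""
    return 1 - confidence ** waves


for waves in (1, 5, 20, 52):
    chance = chance_of_at_least_one_outlier(waves)
    print(f"{waves:>2} waves: {chance:.0%} chance of at least one outlier")
```

Under these assumptions, a single survey has a 5 percent chance of being an outlier, but a tracking study with 20 waves is more likely than not to contain at least one.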
Thank you for putting statistics into layman’s terms. This is my first course in biostatistics, and I feel like I am learning a new language.
This is very easy to understand.
What would be the ideal sample size and margin of error for a population of 481?
Thanks. This is very useful and easy to understand too.
Hi! Clear explanations – well done! But a question: what if I achieved a high response rate and that my survey sample is close to the overall population size? or when populations are small as well (e.g., people with a disability)? Thanks f
Great explanation, clearly written and well appreciated. Nice to see someone explain a concept simply without trying to write a scientific paper. Plain English. What a wonderful concept.
Just an FYI, this sentence isn’t really accurate:
“These terms simply mean that if the survey were conducted 100 times, the data would be within a certain number of percentage points above or below the percentage reported in 95 of the 100 surveys.”
This implies that the sample percentages would be within the margin of error 95% of the time. That’s not quite right. It should be:
“These terms simply mean that if the survey were conducted 100 times, the actual percentages of the larger population would be within a certain number of percentage points above or below the percentage reported in 95 of the 100 surveys.”
Meaning that the actual population percentages should fall within the margin of error 95% of the time.
I don’t understand how the margin of error calculation doesn’t take the population size into consideration. I mean if I took a sample of 1000 from a population of 2000 I would think the results would have a smaller margin of error than if I took a sample of 1000 from a population of seven billion. Right? But that doesn’t seem to be the case and I can’t get my head around why that is so.
You actually wrote this in a way easy to understand. Thank you!
Hi,
I am conducting a household survey in Nepal. I am a bit confused about confidence level and confidence interval. Must the confidence level and the confidence interval add up to 100%? If one is 95%, is the other 5%? Can we determine a sample size with a 99% confidence level and a 5% confidence interval?
Regards
Deepak
I feel like I am missing something. If the total population is either 10,000 or 1 million, will a sample of 2000 yield 2% margin @ 95% confidence in both cases?