How to Avoid The Evils Within Customer Satisfaction Surveys

When the Ritz-Carlton Hotel Company won the Malcolm Baldrige National Quality Award for the second time in 1999, companies across many industries began trying to achieve the same level of outstanding customer satisfaction. This was a good thing, of course, as CEOs and executives began incorporating customer satisfaction into their company goals while also communicating to their managers and employees about the importance of making customers happy.

When Six Sigma and other metrics-based systems began to spread through these companies, it became apparent that customer satisfaction needed to be measured using the same type of data-driven rigor that other performance metrics (processing time, defect levels, financials, etc.) used. After all, if customer satisfaction was to be put at the forefront of a company’s improvement efforts, then a sound means for measuring this quality would be required.

Enter the customer satisfaction survey. What better way to measure customer satisfaction than asking the customers themselves? Companies jumped on the survey bandwagon – using mail, phone, email, web and other survey platforms. Point systems were used (e.g., ratings on a 1-to-10 scale) which produced numerical data and allowed for a host of quantitative analyses. The use of the net promoter score (NPS) to gauge customer loyalty became a standard metric. Customer satisfaction could be broken down by business unit, department and individual employee. Satisfaction levels could be monitored over time to determine upward or downward trends; mathematical comparisons could be made between customer segments as well as product or service types. This was a CEO’s dream – and it seemed there was no limit to the customer-produced information that could help transform a company into the “Ritz-Carlton” of its industry.

In reality, there was no limit to the misunderstanding, abuse, wrong interpretations, wasted resources, poor management and employee dissatisfaction that would result from these surveys. Although some companies were savvy enough to understand and properly interpret their survey results, the majority of companies did not. This remains the case today.

What could go wrong with the use of customer satisfaction surveys? After all, surveys are pretty straightforward tools that have likely been used since the times of the Egyptians (pharaoh satisfaction levels with pyramid quality, etc.). Survey data, however, has a lot of potential issues and limitations that make it different from other “hard” data that companies utilize. It is critical to recognize these issues when interpreting survey results – otherwise what seems like a great source of information can cause a company to do many bad things.

Survey Biases and Limitations

Customer satisfaction surveys are everywhere; customers are bombarded with email and online survey offers from companies that want to know what customers think about their products and services. In the web-based world, results from these electronic surveys can be immediately stored in databases and analyzed in a thousand different ways. In nearly all of these instances, however, the results are fraught with limitations and flaws. The most common survey problems include several types of bias, variations in customer interpretations of scales and lack of statistical significance. These issues must be considered if sound conclusions are to be drawn from survey results.

Non-response Bias

Anyone who has called a credit card company or bank is likely to have been asked to stay on the line after their call is complete in order to take a customer satisfaction survey. How many people stay on the line to take that survey? The vast majority of people hang up as soon as the call is complete. But what if the service that a customer received on the phone call was terrible and the agent was rude? It is more likely that the customer would stay on the call and complete the survey at the end of the call. And that is a perfect example of the non-response bias at work.

Although surveys are typically offered to a random sample of customers, the recipient’s decision whether or not to respond to the survey is not random. Once a survey response rate dips below 80 percent or so, the inherent non-response bias will begin to affect the results. The lower the response rate, the greater the non-response bias. The reason for this is fairly obvious: the group of people who choose to answer a survey is not necessarily representative of the customer population as a whole. The survey responders are more motivated to take the time to answer the survey than the non-responders; therefore, this group tends to contain a higher proportion of people who have had either very good, or more often, very bad experiences. Changes in response rates will have a significant effect on the survey results. Typically, lower response rates will produce more negative results, even if there is no actual change in the satisfaction level of the population.
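
A small arithmetic sketch may make the mechanism concrete. The population split and the response propensities below are hypothetical numbers chosen only for illustration; they do not come from any real survey. The sketch assumes dissatisfied customers are more likely to respond than satisfied ones and shows how the observed score drifts away from the true one as the response rate falls.

```python
# Hypothetical population: 90% satisfied, 10% dissatisfied.
# Hypothetical response propensities: dissatisfied customers are
# assumed to be more likely to answer the survey.

def observed_satisfaction(true_satisfied, p_respond_sat, p_respond_dissat):
    """Return (response_rate, satisfied share among responders)."""
    sat_responders = true_satisfied * p_respond_sat
    dissat_responders = (1 - true_satisfied) * p_respond_dissat
    response_rate = sat_responders + dissat_responders
    return response_rate, sat_responders / response_rate

for p_sat, p_dissat in [(0.85, 0.90), (0.20, 0.60), (0.03, 0.30)]:
    rate, observed = observed_satisfaction(0.90, p_sat, p_dissat)
    print(f"response rate {rate:5.1%} -> observed satisfaction {observed:5.1%} (true: 90.0%)")
```

Nothing about the population changes between the three cases; only who chooses to answer does.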

Survey Methodology Bias

The manner in which a customer satisfaction survey is administered can also affect the results. Surveys that are administered in person or by phone tend to result in higher scores than identical surveys distributed by email, snail mail or on the Internet. This is due to people’s natural social tendency to be more positive when there is another person directly receiving feedback (even if the recipient is an independent surveyor). Most people do not like to give another individual direct criticism, so responses tend to be more favorable about a product (or service, etc.) when speaking in person or by phone. Email or mail surveys have no direct human interaction and, therefore, the survey taker often feels more freedom to share negative feedback – criticisms are more likely to fly.

In addition, the manner in which a question is asked can have a significant effect on the results. Small changes in wording can affect the apparent tone of a question, which in turn can impact the responses and the overall results. For example, asking “How successful were we at fulfilling your service needs?” may produce a different result than “How would you rate our service?” although they are similar questions in essence. Even the process by which a survey is presented to the recipient can alter the results – surveys that are offered as a means of improving products or services to the customer by a “caring” company will yield different outcomes than surveys administered solely as data collection exercises or surveys given out with no explanation at all.

Regional Biases

Another well-known source of bias that exists within many survey results is regional bias. People from different geographical regions, states, countries, urban vs. suburban or rural locations, etc. tend to show systematic differences in their interpretations of point scales and their tendencies to give higher or lower scores. Corporations that have business units across diverse locations have historically misinterpreted their survey results this way. They will assume that a lower score from one business unit indicates lesser performance, when in fact that score may simply reflect a regional bias compared to the locations of other business units.

Variation in Customer Interpretation and Repeatability of the Rating Scale

Imagine that your job is to measure the length of each identical widget that your company produces to make sure that the quality and consistency of your product is satisfactory. But instead of having a single calibrated ruler with which to make all measurements, you must make each measurement with a different ruler. This is not a problem if all the rulers are identical, but you notice that each ruler has its own calibration. What measures as one inch for one ruler measures 1¼ inches for another ruler, ¾ of an inch for a third ruler, etc. How well could you evaluate the consistency of the widget lengths with this measurement system if you need to determine lengths to the nearest 1/16 of an inch? Welcome to the world of customer satisfaction surveys.

Unlike the scale of a ruler or other instrument which remains constant for all measurements (assuming its calibration remains intact), the interpretation of a survey rating scale varies for each responder. In other words, the people who complete the survey have their own “calibrations” for the scale. Some people tend to be more positive in their assessments; other people are inherently more negative. On a scale of 1 to 10, the same level of satisfaction might elicit a 10 from one person but only a 7 or 8 from another.

In addition, most surveys exhibit poor repeatability. When survey recipients are given the exact same survey questions multiple times, there are often differences in their responses. Surveys rarely pass a basic gage R&R (repeatability and reproducibility) assessment. Because of these factors, surveys should be considered noisy (and biased) measurement systems – their results cannot be interpreted with the same precision and discernment as data that is produced by a physical measurement gauge.

Statistical Significance

Surveys are, by their very nature, a statistical undertaking and thus it is essential to take the statistical sampling error into account when interpreting survey data. Sample size is part of the calculation for this sampling error: if a survey result shows a 50 percent satisfaction rating, does that represent 2 positive responses out of 4 surveys or 500 positives out of 1,000 surveys? Clearly the margin of error will be different for those two cases.
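
The two cases can be compared with a rough calculation. The sketch below uses the common normal-approximation formula for the 95 percent margin of error of a proportion; the function name is illustrative, and the approximation is crude for a sample as small as 4, where an exact method would be preferable.

```python
import math

Z_95 = 1.96  # normal critical value for a 95% confidence level

def proportion_margin(positives, n):
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    p = positives / n
    return Z_95 * math.sqrt(p * (1 - p) / n)

for positives, n in [(2, 4), (500, 1000)]:
    moe = proportion_margin(positives, n)
    print(f"{positives}/{n}: 50% satisfied, margin of error about +/-{moe:.1%}")
```

Both samples report 50 percent, but the first is compatible with almost any true satisfaction level while the second pins it down to within a few points.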

There are undoubtedly thousands of examples of companies failing to take margin of error into account when interpreting survey results. A well-known financial institution routinely punished or rewarded its call center personnel based on monthly survey results – a 2 percent drop in customer satisfaction would prompt calls from executives to their managers demanding to know why the performance level of their call center was decreasing. Never mind that the results were calculated from 40 survey results with a corresponding margin of error of ±13 percent, making the 2 percent drop statistically meaningless.

An optical company set up quarterly employee performance bonuses based on individual customer satisfaction scores. By achieving an average score between 4.5 and 4.6 (based on a 1-to-5 scale), an employee would get a minimum bonus; if they achieved an average score between 4.6 and 4.7, they would get an additional bonus; and if their average score was above 4.7, they would receive the maximum possible bonus. As it turned out, each employee’s score was calculated from an average of fewer than 15 surveys – the margin of error for those average scores was ±0.5. All of the employees had average scores within this margin of error and, thus, there was no distinction between any of the employees. Differences of 0.1 points were purely statistical noise with no basis in actual performance levels.
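
A back-of-the-envelope check shows how a margin of roughly ±0.5 can arise. The standard deviation of 1.0 used below is an assumption made for illustration (the example does not state it), and the normal critical value 1.96 is used for simplicity even though a t-based value would be somewhat larger at this sample size.

```python
import math

def mean_margin(std_dev, n, z=1.96):
    """Approximate 95% margin of error for an average score."""
    return z * std_dev / math.sqrt(n)

# Assumptions for illustration: 15 surveys per employee and a standard
# deviation of 1.0 on the 1-to-5 scale (not given in the example).
moe = mean_margin(std_dev=1.0, n=15)
tier_width = 0.1  # width of each bonus band (4.5-4.6, 4.6-4.7)

print(f"margin of error: +/-{moe:.2f}")
print(f"bonus tier width: {tier_width}")
print("tier wider than margin of error:", tier_width > moe)
```

Under these assumptions the bonus bands are about five times narrower than the uncertainty in each employee's score.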

When companies fail to take margin of error into account, they wind up making decisions, rewarding or punishing people, and taking actions based purely on random chance. As statistician W. Edwards Deming shared 50 years ago, one of the fastest ways to completely discourage people and create an intolerable work environment is to evaluate people based on things that are out of their control.

Proper Use of Surveys

What can be done? Is there a way to extract useful information about surveys without misusing them? Or should customer satisfaction surveys be abandoned as a means of measuring performance?

It is better not to use surveys at all than to misuse and misinterpret them. The harm that can be done when biases and margin of error are not understood outweighs any benefit the misleading information might offer. If the information from surveys can be properly understood and interpreted within their limitations, however, then surveys can help guide companies in making their customers happy. The following are some ways that can be accomplished.

Determine the Drivers of Customer Satisfaction and Measure Them

Customers generally are not pleased or displeased with companies by chance – there are drivers that influence their level of satisfaction. Use surveys to determine what those key drivers are and then put performance metrics on those drivers, not on the survey results themselves. Ask customers for the reasons why they are satisfied or dissatisfied, then affinitize those responses (group them by common theme) and put them on a Pareto chart. This information will be more valuable than a satisfaction score, as it will identify root causes of customer happiness or unhappiness on which measurements and metrics can then be developed.
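
The tallying behind a Pareto chart is simple enough to sketch. The reason labels below are hypothetical examples of affinitized comments, not data from any survey; the function just counts each category, sorts from most to least frequent and accumulates the share.

```python
from collections import Counter

# Hypothetical affinitized reasons taken from open-ended survey comments.
reasons = [
    "slow response", "slow response", "billing error", "slow response",
    "rude agent", "billing error", "slow response", "product defect",
    "slow response", "billing error",
]

def pareto_table(items):
    """Return (category, count, cumulative share) rows, largest first."""
    counts = Counter(items).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for category, count in counts:
        running += count
        rows.append((category, count, running / total))
    return rows

for category, count, cumulative in pareto_table(reasons):
    print(f"{category:15s} {count:2d}  {cumulative:6.1%}")
```

In this made-up data, two categories account for 80 percent of the comments, which is where hard metrics would be worth defining first.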

For example, if it can be established that responsiveness is a key driver in customer satisfaction then start measuring the time between when a customer contacts the company and when the company responds. That is a hard measurement and is more reliable than a satisfaction score. The more that a company focuses on improving the metrics that are important to the customer, the more likely that company will improve real customer satisfaction (which is not always reflected in biased and small-sample survey results).

Improve Your Response Rate

If survey results should reflect the general customer population (and not a biased subset of customers) then there must be a high response rate to minimize the non-response bias. Again, the goal should be at least an 80-percent response rate. One way to achieve this is to send out fewer surveys but send them to a targeted group that has been contacted ahead of time. Incentives for completing the survey along with reminder messages can help increase the response rate significantly.

Making the surveys short, fast and painless to complete can go a long way toward improving response rates. As tempting as it may be to ask numerous and detailed questions to squeeze every ounce of information possible out of the customer, a company is likely to have survey abandonment when customers realize the survey is going to take longer than a few minutes to complete. A company is better off using a concise survey that is quick and easy for the customers to complete. Ask a few key questions and let the customers move on to whatever else they need to attend to; the company will end up with a higher response rate.

Do Not Make Comparisons When Biases Are Present

A lot of companies use customer survey results to try to score and compare their employees, business units, departments, and so on. These types of comparisons must be taken with a grain of salt, as there are too many potential biases that can produce erroneous results. Do not try to compare across geographic regions (especially across different countries for international companies), as the geographic bias may lead to the wrong conclusions. If the business is a national or international company and wishes to sample across a large customer base, use stratified random sampling so that the customers are sampled in the same geographic proportion that is representative of the general customer population.
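
Proportional stratified sampling can be sketched in a few lines. The region names, customer counts and the `stratified_sample` helper below are all hypothetical; the idea is only that each region contributes to the sample in the same proportion that it contributes to the customer base.

```python
import random

def stratified_sample(customers_by_region, sample_size, seed=0):
    """Draw a proportionally allocated random sample from each region."""
    rng = random.Random(seed)
    total = sum(len(group) for group in customers_by_region.values())
    sample = {}
    for region, group in customers_by_region.items():
        # Each region's share of the sample matches its share of customers.
        k = round(sample_size * len(group) / total)
        sample[region] = rng.sample(group, k)
    return sample

# Hypothetical customer base: region names and sizes are made up.
customers = {
    "Northeast": [f"NE-{i}" for i in range(500)],
    "Midwest": [f"MW-{i}" for i in range(300)],
    "West": [f"W-{i}" for i in range(200)],
}

picked = stratified_sample(customers, sample_size=100)
for region, group in picked.items():
    print(region, len(group))
```

With less convenient proportions, rounding can leave the total a survey or two off the requested size, so a real implementation would need to reconcile that.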

Also, do not compare results from surveys that were administered differently (phone versus mail, email, etc.) – even if the survey questions were identical. The survey methodology can have a significant influence on the results. Be sure that the surveys are identical and are administered to customers using the exact same process.

Surveys are rarely capable of passing a basic gage R&R study. They represent a measurement system that is noisy and flawed; using survey results to make fine discernments, therefore, is usually not possible.

Always Account for Statistical Significance in Survey Results

This is the root of the majority of survey abuse – where management makes decisions based on random chance rather than on significant results. In these situations, Six Sigma tools can be a significant asset, because it is critical to educate management on the importance of proper statistical interpretation of survey results (as with any type of data).

Set a strict rule that no survey result can be presented without including the corresponding margin of error (i.e., the 95 percent confidence intervals). For survey results based on average scores, the margin of error will be roughly

$$\pm 1.96 \, \frac{\sigma}{\sqrt{n}}$$

where σ is the standard deviation of the scores and n is the sample size. (Note: For sample sizes <30, the more precise t-distribution formula should be used.) If the survey results are based on percentages rather than average scores, then the margin of error can be expressed as

$$\pm 1.96 \, \sqrt{\frac{p(1-p)}{n}}$$

where p is the resulting overall proportion (note that the Clopper-Pearson exact formula should be used if np < 5 or n(1-p) < 5). Mandating that a margin of error be included with all survey results helps frame results for management, and will go a long way in getting people to understand the distinction between significant differences and random sampling variation.

Also, be sure to use proper hypothesis testing when making survey result comparisons between groups. Use the following tools as appropriate for the specific scenario:

  • For comparing average or median scores, there are t-tests, analysis of variance, or Mood’s Median tests (among others).
  • For results based on percentages or counts there are proportions tests or chi-squared analysis.
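
As one illustration of a proportions test, the sketch below runs a two-sided, two-proportion z-test using only the Python standard library. The monthly counts are hypothetical (they loosely mirror a small month-to-month drop on 40 surveys), and the normal approximation is a simplification; an exact test may be more appropriate for samples this small.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical months: 32 of 40 satisfied, then 31 of 40 satisfied.
z, p_value = two_proportion_z_test(32, 40, 31, 40)
print(f"z = {z:.2f}, p-value = {p_value:.2f}")
```

A p-value far above 0.05, as here, means the apparent drop is well within what random sampling alone would produce.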

If comparing a large number of groups or looking for trends that may be occurring over time, the data should be placed on the appropriate control chart. Average scores should be displayed on an X-bar and R, or X-bar and S chart, while scores based on percentages should be shown on a P chart. For surveys with large sample sizes, an I and MR chart may be more appropriate to account for variations in the survey process that are not purely statistical (such as biases changing from sample to sample, which is common). Control charts go a long way in preventing management overreaction to differences or changes that are statistically insignificant.
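
The control limits for a P chart follow directly from the proportion formula. The sketch below computes the center line and three-sigma limits for a set of hypothetical monthly results; the data and the `p_chart_limits` name are illustrative only.

```python
import math

def p_chart_limits(positives, sample_sizes):
    """Return center line and per-sample 3-sigma limits for a P chart."""
    p_bar = sum(positives) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        lower = max(0.0, p_bar - 3 * sigma)
        upper = min(1.0, p_bar + 3 * sigma)
        limits.append((lower, upper))
    return p_bar, limits

# Hypothetical monthly data: satisfied responses and surveys returned.
satisfied = [33, 31, 35, 30, 34, 32]
surveys = [40, 40, 40, 40, 40, 40]

p_bar, limits = p_chart_limits(satisfied, surveys)
for month, (x, n, (lower, upper)) in enumerate(zip(satisfied, surveys, limits), 1):
    p = x / n
    flag = "signal" if not lower <= p <= upper else "common-cause variation"
    print(f"month {month}: {p:.1%} (limits {lower:.1%} to {upper:.1%}) {flag}")
```

With 40 surveys a month, scores anywhere from the low 60s to the high 90s fall inside the limits, so none of the month-to-month movement in this example calls for a reaction.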

Finally, if goals or targets are being set based on customer satisfaction scores, make sure those target levels are statistically distinguishable based on margin of error. Otherwise, people are rewarded or punished based purely on chance. In general, it is always better to set goals based on the drivers of customer satisfaction (the hard metrics) rather than on satisfaction scores themselves. Regardless, the goals must be set as statistically significantly different from the current level of performance.
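
One way to sanity-check a proposed target is to ask how many surveys would be needed before a difference of that size could be distinguished at all. The sketch below inverts the average-score margin-of-error formula, n = (1.96σ/E)², for a desired margin E; the standard deviation of 1.0 is an assumed value for a 1-to-5 scale, not a figure from the text.

```python
import math

def surveys_needed(std_dev, margin, z=1.96):
    """Sample size at which the 95% margin of error on a mean equals `margin`."""
    return math.ceil((z * std_dev / margin) ** 2)

# Assumed standard deviation of 1.0 on a 1-to-5 scale (hypothetical).
for margin in (0.5, 0.2, 0.1, 0.05):
    print(f"margin +/-{margin}: about {surveys_needed(1.0, margin)} surveys")
```

Under that assumption, resolving differences of 0.1 points takes several hundred surveys per group, far more than a typical per-employee sample.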


Customer satisfaction surveys are bad, evil things. OK, that’s not necessarily true, but surveys do have a number of pitfalls that can lead to bad decisions, wasted resources and unnecessary angst at a company. The key is to understand survey limitations and to not treat survey data as if it were precise numerical information coming from a sound, calibrated measurement device. The best application of customer surveys is to use them to obtain the drivers of customer happiness or unhappiness, then create the corresponding metrics and track those drivers instead of survey scores. Create simple surveys and strive for high response rates to assure that the customer population is being represented appropriately. Do not use surveys to make comparisons where potential biases may lie, and be sure to include margin of error and proper statistical tools in any analysis of results.

Used properly, customer satisfaction surveys can be valuable tools in helping companies understand their strengths and weaknesses, and in helping to identify areas of emphasis and focus in order to make customers happier. Used improperly, problems ensue. Make sure your company follows the right path.


Mike K.

Just from experience – I was recently victim of a survey which had the gall to not only expose me to over 50 questions: One of them was a multiple choice question with over 1000(!) entries, sorted neither thematically nor alphabetically!

And then, the multiple choice answers (which I picked more or less random) were used to create a comparison matrix, resulting in a 20×5 matrix where the surveyors seriously expected me to fill a Likert Value (1-5) in each of these 100 records.

While I have no idea who on earth actually takes the time to complete this survey, I would be very, very, very interested how they attribute statistical significance to the results they obtain.

I agree with your article that the described steps MUST be taken into account when evaluating surveys – but I dare go even one step further: Survey Design itself must plan for these factors prior to bombarding the customer with the survey.

The best surveys are minimally intrusive on the customer while providing maximum levels of statistical significance – the worst surveys are highly intrusive with no significance at all.



Simply brilliant. Thank you for the contribution.

Vik SIdhu

Great primer on how to and not to conduct a survey.

Robert Ballard

Excellent article. Whenever I am on a call with customer service I’m often asked if I would like to take a follow-up survey, typically prior to the actual communication with the agent. I always indicate that I will take the survey thinking I may get better service with the thought that it’s possible that the agent knows that I’ll be participating in the survey. This is another example of bias as presented in the article. Great job.


Thanks for the article Rob. There are many in the healthcare industry that could benefit from your suggestion of analyzing and improving the driving factors of patient satisfaction, rather than spending so much energy on the results of the surveys themselves.


I really like this topic because I am working with so many people who are always biased and complain about our office.


Your article has made me realize that I have not considered some key issues when pushing for customer surveys. Many thanks for sharing your knowledge.

Chris Seider

Nicely written.


Can non response bias be estimated?

Rob Brogle

Actually, it can be estimated. You’ll need to take a random sampling of the non-responders and resend the survey to them, but this time use follow-ups, incentives, hassling techniques, threats, etc. to get them to respond (okay, don’t threaten them too much but you get the point). You’ll need to get about 80% response rate from this follow up survey, or else you will have non-response bias within your non-responders follow-up survey (ugh).

Then you can see what the delta (d) is between the two groups: for surveys measuring average scores (x-bar) the delta will be:

d = x-bar(non responders) – x-bar(responders)

For surveys measuring proportions (p) the delta will be:

d = p(non-responders) – p(responders)

Then you can actually adjust for the non response bias and estimate the true, population results. If R is the response rate of the original survey, then the population estimate will be:

Average scores: x-bar(population) = x-bar(responders) + (1-R)d
Proportions: p(population) = p(responders) + (1-R)d
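
The adjustment above is a one-line calculation, sketched below. The scores and response rate are hypothetical, and the function name is illustrative; the same function works for averages or proportions since the formula has the same form for both.

```python
def adjust_for_nonresponse(responders, non_responders, response_rate):
    """Population estimate = responders + (1 - R) * d, with d = non-responders - responders."""
    d = non_responders - responders
    return responders + (1 - response_rate) * d

# Hypothetical numbers: responders averaged 3.8, the follow-up sample of
# non-responders averaged 4.3, and the original response rate was 20%.
estimate = adjust_for_nonresponse(3.8, 4.3, 0.20)
print(f"estimated population average: {estimate:.2f}")
```

The result is simply the average of the two groups weighted by R and 1-R.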

Obviously, there will be a significant effort involved in getting the response rate of the follow-up survey to the 80% level, particularly given the fact that this group of people already have an inclination not to respond to surveys. And of course there’s the possibility that by hassling or incentivizing these folks, you may introduce additional bias into this follow-up survey.

So of course the best thing to do is to try and get a high response rate in the initial survey so that you don’t have to worry about this stuff (and then spend a lot of effort trying to account for it).

Kaizen TQM

Yes, you are absolutely right. I completely agree with you. The customer needs very prominent service to be satisfied. If we provide the proper attention, the customer will be satisfied. For that we need to make some changes or apply some new techniques such as Lean, Kaizen, Kaizen TQM, etc.


Survey, HA HA HA! It is ridiculous. Customer service is abused by the “customer.” It is very sad how employees work hard under pressure when the pay is only pennies. RATE CUSTOMER ABUSE AND HORRIBLY POOR COMMUNICATION!

Mike Morges

Hi Rob, very interesting article, thank you. I have 3 questions:

1. How would you measure the margin of error of a customer satisfaction metric that uses a mean score (e.g. 7.6, scale: 1 to 10). Do you apply the same formula?

2. Setting targets is always tricky. How would you set a target for a customer satisfaction metric? I assume you can’t just multiply your baseline by +3% or 10% (not very scientific…)? Instead, and from your article, would you set a target that is ‘significant’ to reach?

3. And finally how do you know when your metric is beating your ‘targets’ with a confidence level of 95%? (do you have to factor in 2 margin of errors, one for the target and one for actuals?)

Thank you

Rob Brogle

Hey Mike,

Thanks for the comment. My responses to your questions are below:

1. If your sample size is large (>30) then you can use the formula stated in the article. If your sample size is smaller than that, then you should use the t-distribution formula for calculating 95% confidence intervals of the mean. I just now tried to type it in here, but without a math font it’s pretty much unreadable. However, you can find it on Google quite easily if needed. Now these formulas assume that the data is more or less normally distributed. If, instead, your data is highly skewed (which is often the case for survey data), then it’s better to use median scores instead of means. In that case, use the confidence interval formula for medians (which you can also find using Google).

2. Be sure that your performance targets are outside the confidence intervals of your baseline data. This is important–if your targets are set within the confidence intervals then you can hit or miss them based purely on chance. I’ve seen many cases where the maximum value on the scale (e.g. 10 in a 1-10 scale) falls within the confidence intervals of the baseline data. This indicates that the sample size is too small to distinguish any improvement in customer satisfaction.

3. If you have hit your target, run a 2-sample t test on the “before” and “after” data to determine whether or not the improvement is statistically significant. If you are using median scores instead of means, then run a Mood’s Median test. A p-value less than 0.05 indicates that you can be more than 95% certain that the target was reached due to a real improvement in scores and not due to a statistical fluctuation of the data.
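
For reference, the t-distribution interval mentioned in answer 1 is usually written as follows, where x-bar is the sample mean, s the sample standard deviation, n the sample size, and t the critical value with n-1 degrees of freedom at the 95 percent level. This is the standard textbook form, not something quoted from the article.

```latex
\bar{x} \pm t_{0.975,\,n-1} \, \frac{s}{\sqrt{n}}
```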

Hope that helps–let me know if you have any additional questions…

— Rob

Rob Brogle

Just a quick addition about setting targets: I would highly recommend setting performance targets based on the drivers of customer satisfaction rather than the satisfaction scores themselves. Putting targets on things like quality of work, response time to customers, problem resolution time, etc., will be much more measurable and reliable and will go significantly farther in driving the kinds of behaviors that you are looking for to make your customers happier…


My company sends a satisfaction survey to customers who have used our services. Out of approximately 300 customers who I helped, 15 responded to the survey. With a sample of this size, would the outcome be statistically significant?

To take it a step further, the survey uses a top-box approach, with the goal of 89.5% of respondents rating their overall experience as a 5. So if 2 respondents out of 15 do not award a 5, the performance goal isn’t met. Any thoughts on the merit of a score given by 2 respondents out of a population of 300?

Rob Brogle

Actually, you have two issues here: (1) very small sample size and (2) very low response rate. Let’s look at the first issue:

At a sample size of 15, if 2 out of 15 respondents do not give a top-box rating then 13 out of 15 do, for a sample proportion of 87%, but the confidence interval for the “true” population proportion runs from about 60% to 98%. So there is a huge uncertainty there due to the sample size of only 15 (the 89.5% goal sits well inside that interval, so missing it by two responses says very little about true performance). Of course, this all assumes that those 15 responses actually represent the population (the 300 people that you serviced). Which leads us to the second issue.
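
An interval like the one quoted above can be computed with the exact (Clopper-Pearson) method. The sketch below is one standard-library way to do it, using bisection on the binomial distribution; it is not necessarily how the figures in this reply were produced, and the function names are illustrative. It is run for 2 of 15 and for the mirror-image case of 13 of 15.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion."""
    def bisect(f):
        # f is increasing in p and changes sign on (0, 1).
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k) equals alpha/2; upper bound: P(X <= k) equals alpha/2.
    lower = 0.0 if k == 0 else bisect(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    upper = 1.0 if k == n else bisect(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

for k in (2, 13):
    lower, upper = clopper_pearson(k, 15)
    print(f"{k}/15: {k / 15:.0%} observed, 95% interval {lower:.0%} to {upper:.0%}")
```

Either way the interval spans roughly 40 percentage points, which is the point: 15 responses cannot support a pass/fail judgment against a single threshold.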

The second issue to me is the bigger problem. At a response rate of 5%, there is a high likelihood of a non-response bias. Only 15 out of 300 people were motivated to answer the survey and so it’s unlikely that those responses are representative of the entire 300. If the motivated 15 had a higher proportion of unsatisfied folks than the “silent majority” of non-responders (often the responders have a negative bias compared to the non-responders), then you wind up getting penalized because of this bias.

Unless response rates are high (> 80%) and statistical uncertainty is taken into account, survey results can be very misleading and can lead to bad decisions, unfair evaluations, and all kinds of other nasty things. It is much, much better to evaluate people based on the concrete things that we know drive customer satisfaction and loyalty: fast responses, short problem resolution times, high quality of service (as defined by specific actions), etc. These attributes CAN be measured accurately and improving those attributes WILL make customers more happy (although this increased happiness may well be missed on a small-sample and/or low response-rate survey).

I think sometimes it’s easy for leadership at a company to throw out surveys as a means of evaluation without really thinking about what they’re doing (and how the misleading results are hurting the employees). This is unfortunate, but also very, very common.

Hope that helps…

Norbert Feher


Thank You.

Some very interesting points, and I also recommend that readers have a look at the 2nd and 3rd chapters of Darrell Huff’s book How to Lie with Statistics.

Thanks again!


Satisfaction data generally tends to be skewed, with most responses leaning towards the high end of the scale. We are finding this to be the case with a few satisfaction surveys (some of which are ongoing tracking surveys). Do you suggest log transformation before doing satisfaction driver analysis using regression? Multiple regression does assume normality. We have been doing it without any transformation. Please share your thoughts on this. Thanks.

Rob Brogle

As you pointed out, survey data is often skewed (non-normal) and also, strictly speaking, we are dealing with ordinal, discrete data and not continuous data which is required for the type of multiple regression analysis that you are talking about. If the survey responses are on a 10-point scale then it might be okay to “cheat” and consider the data to be continuous, but certainly for a 7, 5, or lower point scale then the assumption of continuous data is typically not a good one. Also, the different predictors (questions) on which you are trying to run the regression usually have significant multi-collinearity which again does not lend itself well to a multiple regression. And running a transformation just further complicates the interpretation of the results even if you are able to transform to “normal” data.

For those reasons, personally I would try to keep things much more simple and analyze the data using measures of association which are designed to find relationships between ordinal, discrete data. For example, running a Pearson correlation between your questions would determine which questions show statistically significant correlations (p-values) and will give the strength of the relationship (Pearson’s r). This is very straightforward in Minitab (Stat>Basic Statistics>Correlation) although I imagine that JMP or any other statistical software will have this built-in analysis as well.

As always, keep in mind that survey data is usually very noisy and also potentially biased (particularly if you have a low response rate). So even if you find the “perfect” analysis, your conclusions could be inaccurate. If you are trying to find the drivers of customer satisfaction, then rather than looking for correlations between different questions and the “overall satisfaction” rating, I would recommend asking your customers to rate the importance of your different performance aspects. In other words, ask them to rate the importance of, say, responsiveness, quality of service, reliability of product, ease of use, etc. This will at least give you a direct measure of what you are looking for instead of relying on correlations or regressions to find drivers of overall satisfaction. Again, this will be subject to the usual noise and biases, so you’ll need a high response rate to feel confident about the conclusions.

I think the bottom line is that no matter how “sophisticated” the analysis, you are typically dealing with noisy and biased data from surveys. That needs to be at the forefront of everyone’s mind as you try to draw conclusions. Hopefully, you’ll be able to draw some useful, general information about what makes people happy or unhappy but you need to be careful to not be misled. Thus if you have any other “hard” customer data such as complaint data, warranty data, attrition data, etc. then those should be analyzed very thoroughly so that you are not strictly relying on surveys to get information about what your customers are looking for.

Hope that helps…

clyde woodard

I have just concluded my business transaction and your help was valuable to my success.


Hey Rob – Under your section Survey Methodology Bias you state: “The manner in which a customer satisfaction survey is administered can also affect the results.”

Can you advise where I might be able to find more information about this? I am trying to figure out if surveys given to customers by their direct points of contact within a company might be biased versus surveys sent from a neutral generic company email. Intuitively I would say yes. But I wondered if I was right and if there is any research on this. Can you shed light on this?


Rob Brogle

Hey Mike,

There is a lot of information out there about constructing surveys, but I don’t know off the top of my head of any studies comparing surveys administered by direct points of contact versus a neutral third party. There has been plenty of research showing that people don’t like to let down folks with whom they have a relationship, and so “leading questions” that imply a desired response do, in fact, increase the percentage of desired responses. My guess would be the same as yours: that having familiar points of contact administer the survey would result in higher scores than having a neutral third party administer it. But I don’t have a reference with data to verify that assumption.


With NPS questions (how likely am I to recommend…) I always answer 0 or “not at all likely” simply because I don’t normally recommend Telcos or Skin Care products to anybody. This response almost always asks for further comment to explain my negative answer. I wonder what the +/- statistical impact is of people finding the survey tedious when all they want to know really is whether it is good or bad and why. I also wonder how many people would rate something honestly as a 10 or excellent.

Mike Bruss

I work at a company that uses a 3-6-12 month average customer service score, calling 22/1300 possible candidates a month, and requires that 85% of these customers rate the service 5/5, with a 4 or less counting as a negative score. They also call the same number of people from each location, regardless of transaction count, meaning that some locations only do 250-300 transactions but still have 22 samples that are used in their customer service survey. I am no statistics major, but I just don’t see that as an apples-to-apples comparison, nor do I believe that 22/1300 is an accurate representation of the customer base, especially when locations are in several different areas in regards to affluence. Also, the majority of locations missing the 85% mark are getting several 4/5’s as opposed to the 5/5’s required for the survey to elicit a positive score. I am curious to know whether or not this type of survey is an accurate representation of customer service or if this is an inaccurate system. Let me know if you need any more information.

Rob Brogle

Hey Mike,

As you suspect, there are a number of issues with this survey approach. The fact that the company uses the same sample size for regions having differing transaction totals is okay, but there are a number of other concerns with the survey system you have described:

1. Sample size. With only 22 samples from each region, there is enormous uncertainty about the population percentage of customers giving a “5” rating. For example, if a region has 19 out of the 22 surveys where the customer rated a “5” (which comes out to 86%), then the true population percentage of “5” ratings for that region could be anywhere between 65% and 97%. So even though your sample percentage was 86%, you can’t claim that the population percentage for that region is also 86%; it could be anywhere within that 65-97% confidence interval.

2. Only counting “5” ratings as successes. A lot of organizations do this: the customers giving a “5” are considered “promoters” who are likely to recommend your company to others. So companies focus only on increasing promoters and as a consequence they consider a “4” rating (or any other rating) to be a failure. However, when evaluating regions, your company is treating a region where 10 out of 22 surveys rated a “5” and the rest rated a “1” the same as a region where 10 out of 22 surveys rated a “5” and the rest rated a “4”. So on top of the statistical uncertainty there is also a loss of information about non-“5” raters, which can make regions seem similar that are, in fact, very different.

3. Response rate. If the response rate is low then the surveys may have significant non-response bias, meaning that the survey respondents are not necessarily representative of the population. So my question would be: how many customers were called in order to get 22 responses? If your company called, say, 25 people and got 22 responses, then you would have minimal non-response bias. However, if your company had to call hundreds of people to get 22 responses, then the non-response bias could be significant.

4. Population representation. If there are diverse subgroups of customers within each region, then the company must use a “stratified random” sampling approach in order to assure that the 22 samples fairly represent the population of that region. That requires sampling from each subgroup in proportion to the population size of that subgroup. If this is not done, then subgroups can be over-represented or under-represented in the survey results, which means that the results are NOT fairly representing the population of the region.
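The 65-97% range quoted in point 1 can be reproduced with an exact binomial confidence interval. The sketch below assumes the Clopper-Pearson method at 95% confidence, which matches the quoted figures when rounded, although the reply does not say which method was used. The function names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_decreasing(f, target, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(successes, n, confidence=0.95):
    """Clopper-Pearson interval for a binomial proportion."""
    alpha = 1 - confidence
    # Lower limit: the p at which P(X >= successes) equals alpha / 2.
    if successes == 0:
        lower = 0.0
    else:
        lower = solve_decreasing(
            lambda p: binom_cdf(successes - 1, n, p), 1 - alpha / 2
        )
    # Upper limit: the p at which P(X <= successes) equals alpha / 2.
    if successes == n:
        upper = 1.0
    else:
        upper = solve_decreasing(lambda p: binom_cdf(successes, n, p), alpha / 2)
    return lower, upper

# The example from point 1: 19 of 22 customers gave a "5".
low, high = exact_interval(19, 22)
print(f"Sample: {19 / 22:.0%}, 95% interval: {low:.0%} to {high:.0%}")
```

The interval treats the 22 calls as a simple random sample, so it reflects sampling uncertainty only; the non-response and representation concerns in points 3 and 4 come on top of it.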

Based on all of these issues (particularly the low sample size), I suspect that your regions are being rewarded or punished primarily due to chance, with the outcome having little to do with their “true” performance with the customers. Only very large differences (outside of the confidence intervals) should be considered significant, and the rest should be understood as sampling fluctuations. Most companies don’t understand this and inflict all kinds of bad things onto their staff as a result. W. Edwards Deming ran his famous “Red Bead Experiment” to address this particular type of management misuse of data. It’s far better to not do any customer surveys at all than to randomly reward or punish people as a result of not properly understanding and interpreting the results.
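To put a rough number on how much chance is involved, the sketch below computes how often a region would miss the 85% target in a given month purely through sampling. It assumes a hypothetical region whose customers truly give a “5” exactly 85% of the time, and that the 22 responses are independent random draws. Neither assumption comes from the survey described above.

```python
from math import ceil, comb

def prob_miss_target(n, true_rate, target):
    """Chance that fewer than target * n of n sampled customers give a top rating."""
    # Smallest count of top ratings that meets the target (19 of 22 for 85%).
    needed = ceil(target * n - 1e-9)
    # Sum the binomial probabilities of every count below that threshold.
    return sum(
        comb(n, k) * true_rate**k * (1 - true_rate) ** (n - k)
        for k in range(needed)
    )

# Hypothetical region sitting exactly at the target rate.
miss = prob_miss_target(n=22, true_rate=0.85, target=0.85)
print(f"Chance of missing the 85% target in one month: {miss:.0%}")
```

Under these assumptions, a region performing exactly at target would still be flagged as failing in a large share of months, which is the pattern the Red Bead Experiment is meant to expose.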


I sent this article to my store manager, who is in the process of threatening every shift to write everyone up based on the “no’s” we get from random surveys. I work at CVS, and have seen the disparaging way in which the company places too much emphasis on its surveys and has decided to punish its employees and base their performance on them. We have been accused of not doing our jobs and have been outright threatened by management that if we get a “no,” we will get written up and essentially fired. It’s amusing because after I sent some positive feedback to the rest of the store to break the negative trend our management is trying to set, our district manager suddenly appeared inside the store. I and other employees feel like we are seen as a threat by management, and that they will punish us for anything they don’t find acceptable. I feel like we are becoming divided, and are being told to discard our moral compass in favor of following tyrannical demands and threats. The workplace has become a stressful and unproductive environment. It’s terrible and disappointing to see how some people can be to their employees… I only hope businesses who really do care won’t repeat these mistakes. Keep power-hungry people from power; that’s the best filter we can have at the moment.

Peter OBrien

Just curious, when did you write this article? Very nicely done, by the way.


Rob Brogle

Thanks, Pete. This was written back in September of 2013.


You forgot to mention most of these surveys are used to find and fire people that are dissenters. Usually dissenters (unhappy employees) are silenced by fear of unemployment. So the exceptions can be fired easily enough. The surveys are never used to change anything in the company.
