How to Avoid The Evils Within Customer Satisfaction Surveys

When the Ritz-Carlton Hotel Company won the Malcolm Baldrige National Quality Award for the second time in 1999, companies across many industries began trying to achieve the same level of outstanding customer satisfaction. This was a good thing, of course, as CEOs and executives began incorporating customer satisfaction into their company goals while also communicating to their managers and employees about the importance of making customers happy.

When Six Sigma and other metrics-based systems began to spread through these companies, it became apparent that customer satisfaction needed to be measured using the same type of data-driven rigor that other performance metrics (processing time, defect levels, financials, etc.) used. After all, if customer satisfaction was to be put at the forefront of a company’s improvement efforts, then a sound means for measuring this quality would be required.

Enter the customer satisfaction survey. What better way to measure customer satisfaction than asking the customers themselves? Companies jumped on the survey bandwagon – using mail, phone, email, web and other survey platforms. Point systems were used (e.g., ratings on a 1-to-10 scale) which produced numerical data and allowed for a host of quantitative analyses. The use of the net promoter score (NPS) to gauge customer loyalty became a standard metric. Customer satisfaction could be broken down by business unit, department and individual employee. Satisfaction levels could be monitored over time to determine upward or downward trends; mathematical comparisons could be made between customer segments as well as product or service types. This was a CEO’s dream – and it seemed there was no limit to the customer-produced information that could help transform a company into the “Ritz-Carlton” of its industry.

In reality, there was no limit to the misunderstanding, abuse, wrong interpretations, wasted resources, poor management and employee dissatisfaction that would result from these surveys. Although some companies were savvy enough to understand and properly interpret their survey results, the majority of companies did not. This remains the case today.

What could go wrong with the use of customer satisfaction surveys? After all, surveys are pretty straightforward tools that have likely been used since the times of the Egyptians (pharaoh satisfaction levels with pyramid quality, etc.). Survey data, however, has a lot of potential issues and limitations that makes it different from other “hard” data that companies utilize. It is critical to recognize these issues when interpreting survey results – otherwise what seems like a great source of information can cause a company to do many bad things.

Survey Biases and Limitations

Customer satisfaction surveys are everywhere; customers are bombarded with email and online survey offers from companies who want to know what customers think about their products and services. In the web-based world, results from these electronic surveys can be immediately stored in databases and analyzed in a thousand different ways. In nearly all of these instances, however, the results are wrought with limitations and flaws. customer satisfactionThe most common survey problems include types of bias, variations in customer interpretations of scales and lack of statistical significance. These issues must be considered if sound conclusions are to be drawn from survey results.

Non-response Bias

Anyone who has called a credit card company or bank is likely to have been asked to stay on the line after their call is complete in order to take a customer satisfaction survey. How many people stay on the line to take that survey? The vast majority of people hang up as soon as the call is complete. But what if the service that a customer received on the phone call was terrible and the agent was rude? It is more likely that the customer would stay on the call and complete the survey at the end of the call. And that is a perfect example of the non-response bias at work.

Although surveys are typically offered to a random sample of customers, the recipient’s decision whether or not to respond to the survey is not random. Once a survey response rate dips below 80 percent or so, the inherent non-response bias will begin to affect the results. The lower the response rate, the greater the non-response bias. The reason for this is fairly obvious: the group of people who choose to answer a survey is not necessarily representative of the customer population as a whole. The survey responders are more motivated to take the time to answer the survey than the non-responders; therefore, this group tends to contain a higher proportion of people who have had either very good, or more often, very bad experiences. Changes in response rates will have a significant effect on the survey results. Typically, lower response rates will produce more negative results, even if there is no actual change in the satisfaction level of the population.

Handpicked Content:   Get the Bugs Out: VOC Analysis and Scenario Planning

Survey Methodology Bias

The manner in which a customer satisfaction survey is administered can also affect the results. Surveys that are administered in person or by phone tend to result in higher scores than identical surveys distributed by email, snail mail or on the Internet. This is due to people’s natural social tendency to be more positive when there is another person directly receiving feedback (even if the recipient is an independent surveyor). Most people do not like to give another individual direct criticism, so responses tend to be more favorable about a product (or service, etc.) when speaking in person or by phone. Email or mail surveys have no direct human interaction and, therefore, the survey taker often feels more freedom to share negative feedback – criticisms are more likely to fly.

In addition, the manner in which a question is asked can have a significant affect on the results. Small changes in wording can affect the apparent tone of a question, which in turn can impact the responses and the overall results. For example, asking “How successful were we at fulfilling your service needs” may produce a different result than “How would you rate our service?” although they are similar questions in essence. Even the process by which a survey is presented to the recipient can alter the results – surveys that are offered as a means of improving products or services to the customer by a “caring” company will yield different outcomes than surveys administered solely as data collection exercises or surveys given out with no explanation at all.

Regional Biases

Another well-known source of bias that exists within many survey results is regional bias. People from different geographical regions, states, countries, urban vs. suburban or rural locations, etc. tend to show systematic differences in their interpretations of point scales and their tendencies to give higher or lower scores. Corporations that have business units across diverse locations have historically misinterpreted their survey results this way. They will assume that a lower score from one business unit indicates lesser performance, when in fact that score may simply reflect a regional bias compared to the locations of other business units.

Variation in Customer Interpretation and Repeatability of the Rating Scale

Imagine that your job is to measure the length of each identical widget that your company produces to make sure that the quality and consistency of your product is satisfactory. But instead of having a single calibrated ruler with which to make all measurements, you must make each measurement with a different ruler. This is not a problem if all the rulers are identical, but you notice that each ruler has its own calibration. What measures as one inch for one ruler measures 1¼ inches for another ruler, ¾ of an inch for a third ruler, etc. How well could you evaluate the consistency of the widget lengths with this measurement system if you need to determine lengths to the nearest 1/16 of an inch? Welcome to the world of customer satisfaction surveys.

Unlike the scale of a ruler or other instrument which remains constant for all measurements (assuming its calibration remains intact), the interpretation of a survey rating scale varies for each responder. In other words, the people who complete the survey have their own “calibrations” for the scale. Some people tend to be more positive in their assessments; other people are inherently more negative. On a scale of 1 to 10, the same level of satisfaction might solicit a 10 from one person but only a 7 or 8 from another.

In addition, most surveys exhibit poor repeatability. When survey recipients are given the exact same survey questions multiple times, there are often differences in their responses. Surveys rarely pass a basic gage R&R (repeatability and reproducibility) assessment. Because of these factors, surveys should be considered noisy (and biased) measurement systems – their results cannot be interpreted with the same precision and discernment as data that is produced by a physical measurement gauge.

Statistical Significance

Surveys are, by their very nature, a statistical undertaking and thus it is essential to take the statistical sampling error into account when interpreting survey data. Sample size is part of the calculation for this sampling error: if a survey result shows a 50 percent satisfaction rating, does that represent 2 positive responses out of 4 surveys or 500 positives out of 1,000 surveys? Clearly the margin of error will be different for those two cases.

There are undoubtedly thousands of examples of companies failing to take margin of error into account when interpreting survey results. A well-known financial institution routinely punished or rewarded its call center personnel based on monthly survey results – a 2 percent drop in customer satisfaction would solicit calls from executives to their managers demanding to know why the performance level of their call center was decreasing. Never mind that the results were calculated from 40 survey results with a corresponding margin of error of ±13 percent, making the 2 percent drop statistically meaningless.

Handpicked Content:   Defining CTQ Outputs: A Key Step in the Design Process

An optical company set up quarterly employee performance bonuses based on individual customer satisfaction scores. By achieving an average score between 4.5 and 4.6 (based on a 1-to-5 scale), an employee would get a minimum bonus; if they achieved an average score between 4.6 and 4.7, they would get an additional bonus; and if their average score was above 4.7, they would receive the maximum possible bonus. As it turned out, each employee’s score was calculated from an average of less than 15 surveys – the margin of error for those average scores was ±0.5. All of the employees had average scores within this margin of error and, thus, there was no distinction between any of the employees. Differences of 0.1 points were purely statistical noise with no basis in actual performance levels.

When companies fail to take margin of error into account, they wind up making decisions, rewarding or punishing people, and taking actions based purely on random chance. As statistician W. Edwards Deming shared 50 years ago, one of the fastest ways to completely discourage people and create an intolerable work environment is to evaluate people based on things that are out of their control.

Proper Use of Surveys

What can be done? Is there a way to extract useful information about surveys without misusing them? Or should customer satisfaction surveys be abandoned as a means of measuring performance?

It is better not to use surveys at all then to misuse and misinterpret them. The harm that can be done when biases and margin of error are not understood is worse than the benefit of having misleading information. If the information from surveys can be properly understood and interpreted within their limitations, however, then surveys can help guide companies in making their customers happy. The following are some ways that can be accomplished.

Determine the Drivers of Customer Satisfaction and Measure Them

Customers generally are not pleased or displeased with companies by chance – there are drivers that influence their level of satisfaction. Use surveys to determine what those key drivers are and then put performance metrics on those drivers, not on the survey results themselves. Ask customers for the reasons why they are satisfied or dissatisfied, then affinitize those responses and put them on a pareto chart. This information will be more valuable than a satisfaction score, as it will identify root causes of customer happiness or unhappiness on which measurements and metrics can then be developed.

For example, if it can be established that responsiveness is a key driver in customer satisfaction then start measuring the time between when a customer contacts the company and when the company responds. That is a hard measurement and is more reliable than a satisfaction score. The more that a company focuses on improving the metrics that are important to the customer, the more likely that company will improve real customer satisfaction (which is not always reflected in biased and small-sample survey results).

Improve Your Response Rate

If survey results should reflect the general customer population (and not a biased subset of customers) then there must be a high response rate to minimize the non-response bias. Again, the goal should be at least an 80-percent response rate. One way to achieve this is to send out fewer surveys but send them to a targeted group that has been contacted ahead of time. Incentives for completing the survey along with reminder messages can help increase the response rate significantly.

Making the surveys short, fast and painless to complete can go a long way toward improving response rates. As tempting as it may be to ask numerous and detailed questions to squeeze every ounce of information possible out of the customer, a company is likely to have survey abandonment when customers realize the survey is going to take longer than a few minutes to complete. A company is better off using a concise survey that is quick and easy for the customers to complete. Ask a few key questions and let the customers move on to whatever else they need to attend to; the company will end up with a higher response rate.

Do Not Make Comparisons When Biases Are Present

A lot of companies use customer survey results to try to score and compare their employees, business units, departments, and so on. These types of comparisons must be taken with a grain of salt, as there are too many potential biases that can produce erroneous results. Do not try to compare across geographic regions (especially across different countries for international companies), as the geographic bias may lead to the wrong conclusions. If the business is a national or international company and wishes to sample across a large customer base, use stratified random sampling so that the customers are sampled in the same geographic proportion that is representative of the general customer population.

Handpicked Content:   Achieving Customer Delight Via VOC and Data Analysis

Also, do not compare results from surveys that were administered differently (phone versus mail, email, etc.) – even if the survey questions were identical. The survey methodology can have a significant influence on the results. Be sure that the surveys are identical and are administered to customers using the exact same process.

Surveys are rarely capable of passing a basic gage R&R study. They represent a measurement system that is noisy and flawed; using survey results to make fine discernments, therefore, is usually not possible.

Always Account for Statistical Significance in Survey Results

This is the root of the majority of survey abuse – where management makes decisions based on random chance rather than on significant results. In these situations Six Sigma tools can be a significant asset as it is critical to educate management on the importance of proper statistical interpretation of survey results (as with any type of data).

Set a strict rule that no survey result can be presented without including the corresponding margin of error (i.e., the 95 percent confidence intervals). For survey results based on average scores, the margin of error will be roughly



where ? is the standard deviation of the scores and n is the sample size. (Note: For sample sizes <30, the more precise t-distribution formula should be used.) If the survey results are based on percentages rather than average scores, then the margin of error can be expressed as



where p is the resulting overall proportion (note that the Clopper-Pearson exact formula should be used if np < 5 or (1-np) < 5). Mandating that a margin of error be included with all survey results helps frame results for management, and will go a long way in getting people to understand the distinction between significant differences and random sampling variation.

Also, be sure to use proper hypothesis testing when making survey result comparisons between groups. Use the following tools as appropriate for the specific scenario:

  • For comparing average or median scores, there are t-tests, analysis of variance, or Mood’s Median tests (among others).
  • For results based on percentages or counts there are proportions tests or chi-squared analysis.

If comparing a large number of groups or looking for trends that may be occurring over time, the data should be placed on the appropriate control chart. Average scores should be displayed on an X-bar and R, or X-bar and S chart, while scores based on percentages should be shown on a P chart. For surveys with large sample sizes, an I and MR chart may be more appropriate to account for variations in the survey process that are not purely statistical (such as biases changing from sample to sample, which is common). Control charts go a long way in preventing management overreaction to differences or changes that are statistically insignificant.

Finally, make sure that if there goal or targets are being set based on customer satisfaction scores, those target levels must be statistically distinguishable based on margin of error. Otherwise, people are rewarded or punished based purely on chance. In general, it is always better to set goals based on the drivers of customer satisfaction (the hard metrics) rather than on satisfaction scores themselves. Regardless, the goals must be set as statistically significantly different from the current level of performance.


Customer satisfaction surveys are bad, evil things. OK, that’s not necessarily true, but surveys do have a number of pitfalls that can lead to bad decisions, wasted resources and unnecessary angst at a company. The key is to understand survey limitations and to not treat survey data as if it were precise numerical information coming from a sound, calibrated measurement device. The best application of customer surveys is to use them to obtain the drivers of customer happiness or unhappiness, then create the corresponding metrics and track those drivers instead of survey scores. Create simple surveys and strive for high response rates to assure that the customer population is being represented appropriately. Do not use surveys to make comparisons where potential biases may lie, and be sure to include margin of error and proper statistical tools in any analysis of results.

Used properly, customer satisfaction surveys can be valuable tools in helping companies understand their strengths and weaknesses, and in helping to identify areas of emphasis and focus in order to make customers happier. Used improperly, problems ensue. Make sure your company follows the right path.

Comments 30

  1. Mike K.

    Just from experience – I was recently victim of a survey which had the gall to not only expose me to over 50 questions: One of them was a multiple choice question with over 1000(!) entries, sorted neither thematically nor alphabetically!

    And then, the multiple choice answers (which I picked more or less random) were used to create a comparison matrix, resulting in a 20×5 matrix where the surveyors seriously expected me to fill a Likert Value (1-5) in each of these 100 records.

    While I have no idea who on earth actually takes the time to complete this survey, I would be very, very, very interested how they attribute statistical significance to the results they obtain.

    I agree with your article that the described steps MUST be taken into account when evaluating surveys – but I dare go even one step further: Survey Design itself must plan for these factors prior to bombarding the customer with the survey.

    The best surveys are minimally intrusive on the customer while providing maximum levels of statistical significance – the worst surveys are highly intrusive with no significance at all.


  2. gary

    Simply brilliant. Thank you for the contribution.

  3. Vik SIdhu

    Great primer on how to and not to conduct a survery.

  4. Robert Ballard

    Excellent article. Whenever I am on a call with customer service I’m often asked if I would like to take a follow-up survey., typically prior to the actual communication with the agent. I always indicate that I will take the survey thinking I may get better service with the thought that it’s possible that the agent knows that I’ll be participating in the survey. This is another example of bias as presented in the article. Great job.

  5. Chris

    Thanks for the article Rob. There are many in the healthcare industry that could benefit from your suggestion of analyzing and improving the driving factors of patient satisfaction, rather than spending so much energy on the results of the surveys themselves.

  6. sarah

    Your article has made me realize that I have not considered some key issues when pushing for customer surveys. Many thanks for sharing your knowledge.

  7. SEC

    Can non response bias be estimated?

  8. Kaizen TQM

    Yes you are absolutely right.I am Completely agree with you.The Customer needs a very prominent service to get satisfied.if we provide the Proper attention the customer get satisfied.for that we need to make some changes or we need to apply some new Techniques such as Lean, Kaizen, Kaizen TQM etc

  9. Mary

    Survey HA HA HA ! It is ridiculous Customer Services is abuse from “customer”. It is very sad how employees worked hard under pressure in the pay is only penny’s. RATE A CUSTOMER ABUSE IN HORRIBLE POOR COMUNICATION!

  10. Mike Morges

    Hi Rob, very interesting article, thank you. I have 3 questions:

    1. How would you measure the margin of error of a customer satisfaction metric that uses a mean score (e.g. 7.6, scale: 1 to 10). Do you apply the same formula?

    2. Setting targets is always tricky. How would you set a target for a customer satisfaction metric? I assume you can’t just multiple your baseline by +3% or 10% (not very scientific…)? Instead, and from your article, would you set a target that is ‘significant’ to reach?

    3. And finally how do you know when your metric is beating your ‘targets’ with a confidence level of 95%? (do you have to factor in 2 margin of errors, one for the target and one for actuals?)

    Thank you

  11. Martha

    My company sends a satisfaction survey to customers who have used our services. Out of approximately 300 customers who I helped, 15 responded to the survey. With a sample of this size, would the outcome be statistically significant?

    To take it a step further, the survey uses a top-box approach, with the goal of 89.5 of respondents rating their overall experience as a 5. So if 2 respondents out of 15 do not award a 5, the performance goal isn’t met. Any thoughts on the merit of a score given by 2 respondents out of a population of 300?

  12. Norbert Feher


    Thank You.

    Some very interesting points and I also recommend the readers to have a look at the 2nd and 3rd chapter in Darrel Huff’s book How to lie with statistics.

    Thanks again!

  13. Ram

    Satisfaction data generally tends to be skewed, with most responses leaning towards the high end of the scale. We are finding this to be the case with a few satisfaction surveys (some of which are ongoing tracking surveys). Do you suggest log transformation before doing satisfaction driver analysis using regression? Multiple regression does assume normality. We have been doing it without any transformation. Please share your thoughts on this. Thanks.

  14. clyde woodard

    I have just concluded my business transaction and your help was valuable to my success.

  15. Mike

    Hey Rob – Under your section Survey Methodology Bias you state: “The manner in which a customer satisfaction survey is administered can also affect the results.”

    Can you advise where I might be able to find more information about this? I am trying to figure out if surveys given to customers by their direct points of contact within a company might be biased versus surveys sent from a neutral generic company email. Intuitively I would say yes. But I wondered if I was right and if there is any research on this. Can you shed light on this?


  16. David

    With NPS questions (how likely am I to recommend…) I always answer 0 or “not at all likely” simply because I don’t normally recommend Telcos or Skin Care products to anybody. This response almost always asks for further comment to explain my negative answer. I wonder what the +/- statistical impact is of people finding the survey tedious when all they want to know really is whether it is good or bad and why. I also wonder how many people would rate something honestly as a 10 or excellent.

  17. Mike Bruss

    I work at a company that uses a 3-6-12 month average customer service score calling 22/1300 possible candidates a month and requires that 85% of these customers are 5/5 in regards to service with a 4 or less counting as a negative score. They also call the same amount of people from each location, regardless of transaction count, meaning that some locations only do 250-300 transactions, but still have 22 samples that are used in their customer service survey. I am no statistics major but I just don’t see that as an apples to apples comparison, nor do I believe that 22/1300 is an accurate representation of the customer base especially when locations are located in several different areas in regards to affluence. Also, the majority of locations missing on the 85% mark are getting several 4/5’s as opposed to the 5/5’s required for the survey to elicit a positive score. I am curious to know whether or not this type of survey is a accurate representation of customer service or if this is an inaccurate system. Let me know if you need any more information.

  18. Celeste

    I sent this article to my store manager who is in the process of threatening every shift to write everyone up based on the “no’s” we get from random surveys. I work at CVS, and have seen the disparaging behavior in which the company takes too much emphasis on its surveys and has decided to punish its employees and base their performance on them. We have been accused of not doing our jobs and have been threatened down-right by management if we get a no, we will get written up and essentially fired. It’s amusing because after I sent some positive feedback to the rest of the store to break the negative trend our management is trying to set, our district manager suddenly appears inside the store. I, and other employees feel like a threat ourselves to management, that anything they don’t find acceptable they will punish us for. I feel like we are becoming divided, and are being told to discard our moral compass in favor of following tyrannical demands and threats. The workplace has become a stressful and non-conducive environment, it’s terrible and disappointing to see how some people can be to their employees… I only hope businesses who really do care won’t repeat these mistakes. Keep power hungry people from power, thats the best filter we can have at the moment.

  19. Peter OBrien

    Just curious, when did you write this article? Very nicely done, by the way.


  20. cyp

    You forgot to mention most of these surveys are used to find and fire people that are dissenters. Usually dissenters (unhappy employees) are silenced by fear of unemployment. So the exceptions can be fired easily enough. The surveys are never used to change anything in the company.

Leave a Reply