Analysis of Ordinal Data

This topic contains 7 replies, has 3 voices, and was last updated by Robert Butler 7 years, 1 month ago.

Viewing 8 posts - 1 through 8 (of 8 total)

I’d like your thoughts on tools for analysing some ordinal data, as I do not have much experience working with this type of data.

The scenario.

Customers are asked to complete a survey. We ask them for overall feedback and then for feedback on individual aspects of the product and service they received. The scores are between 1 and 10. What I want to know is which of the aspects influence the overall score, and which have the greatest impact.

Although this is ordinal data, I’ve been treating it as continuous and using correlation and regression analysis to get this information. Is this a valid way of doing this?

If it is, the P-values on all the correlations are 0.00, with correlations of 65% to 85%. Is there a way of seeing if there is a statistical difference between the correlations?

I also looked at doing a multiple regression on all the values and then paring back to find the significant ones. Is this a valid method to use?

Well, it is too late for this advice given that your data is already collected, but maybe next time…

Take each aspect for which you asked for a performance response and restate it to get importance feedback. For example, if you used a 1-10 scale from strongly disagree to strongly agree, then you would restate the aspect to fit a 1-10 scale from extremely unimportant to extremely important.

Then you can flip the performance scores (i.e. 10 becomes 1, 9 becomes 2, and so forth) and multiply them by the importance scores, which gives a result on a 1-100 scale. This method provides insight into the areas where your performance is lacking AND the aspect is important to the customer. Being good at some aspect of service/product features is irrelevant if it is not important to your customers.

These 1-100 scores can then be used in a regression analysis to see what may be driving the overall satisfaction/recommendation scores.
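The flip-and-multiply step described above can be sketched roughly as follows. This is only an illustration: the function name `priority_score`, the aspect names, and the response values are all made up, and it assumes both questions use a 1-10 scale.

```python
def priority_score(performance, importance):
    """Flip a 1-10 performance score and weight it by a 1-10 importance score."""
    flipped = 11 - performance   # 10 becomes 1, 9 becomes 2, ..., 1 becomes 10
    return flipped * importance  # result lies between 1 and 100


# Hypothetical responses from one customer: aspect -> (performance, importance)
responses = {
    "delivery speed": (9, 8),
    "support quality": (4, 9),
    "packaging": (3, 2),
}

scores = {aspect: priority_score(p, i) for aspect, (p, i) in responses.items()}

# Highest score first: poor performance on something the customer cares about
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranked)
```

A high score flags an aspect that is both rated poorly and considered important, which is the combination the method is meant to surface.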

We have been through the process of asking the customers what’s important to them before.

This is an ongoing survey about a specific transaction with us. This means that some customers will get the same questions several times a year, so asking this every time is not the best use of their time, and our return rate would probably go down.

One of the elements of the analysis I want to do is to be able to see if what the customer says drives their behaviour is actually what drives their behaviour.

My initial analysis showed that what they said and how they scored us didn’t match, which is why I wanted clarification on my method, to make sure the issue wasn’t there.

Also, at the risk of diverging from my first question, wouldn’t Kano-type questions and analysis, rather than just a single importance rating, give you a better understanding of their needs, e.g. which things they need and which are nice-to-haves?

Admittedly my understanding of the Kano model is basic, but I think that surveys only measure the satisfiers well (where there is an expected “linear” response between performance on the attribute and customer satisfaction). You have to use other techniques to get more info about the others. Not that you couldn’t or shouldn’t try to understand dissatisfiers and delighters, but just that you’d have to employ other techniques than a satisfaction survey.

It is also true that time changes each of these categories. What may have been a delighter last year may only be a satisfier this year. Similarly, what was a satisfier (i.e. linked directly to satisfaction) may become a dissatisfier (or expectation) over time.

If there is sufficient evidence that an aspect of your service that used to drive satisfaction no longer seems to, you may need to re-investigate what is driving satisfaction.

It may also be that there is a “lurking variable” present.

My mentors have also shared that in their experience, even with the best of efforts this can be a very fuzzy target with poor signal to noise and can lead to endless cycles…

Thank you for your feedback on the first questions, although answers to the actual question would have been good as well.

Is there anyone here who can give me an answer to those?

The analysis of ordinal data using parametric methods has proponents and arguments on both sides. At one extreme are those who absolutely refuse to consider any kind of parametric method for ordinal data; at the other are those who have no qualms about it.

Based on what I’ve read and what I’ve done, it has been my experience that if you have Likert-type scales with at least 4 categories, and if those categories can be viewed as a linear progression, then what you get using parametric methods doesn’t differ much from what you would get using non-parametric methods.

One paper I read offered the restriction that, instead of accepting P < .05 as significant with parametric methods, you should use P < .001. The paper did not give a formal proof of the reasoning. However, I have run a number of analyses on Likert-type data using both parametric tests and their non-parametric equivalents, and my results have agreed almost perfectly with this recommendation (that is, P < .001 with a parametric method almost always corresponded to P < .05 with the non-parametric method).
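As a rough illustration of running a parametric measure and its non-parametric counterpart side by side, the sketch below computes Pearson's r and Spearman's rank correlation on the same scores. The data are invented, the helper names are mine, and it compares only the coefficients; the P-values discussed above would come from a statistics package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation (parametric)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical 1-10 survey scores from ten customers
overall = [8, 6, 9, 3, 7, 5, 10, 4, 7, 6]
aspect = [7, 6, 9, 2, 8, 4, 9, 5, 6, 6]

r = pearson(overall, aspect)
rho = spearman(overall, aspect)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

If the two coefficients tell much the same story on your own data, that is in line with the experience described above; a large gap would be a reason to look more closely at the scale.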

As far as using the responses about individual aspects of the product and service in a multivariate analysis to predict the overall rating, the big issue is going to be collinearity among the answers to the individual aspects. If you have something like Minitab, you will only be able to check this using variance inflation factors (VIFs), as opposed to running an eigenvalue analysis and looking at condition indices. You will have to drop those responses with VIFs > 10 and then run the regression using whatever is left.

You would want to run both stepwise and backward elimination and see if the two approaches reduce to the same model. Once you have the model, you will want to run the usual assessment of residuals (graphs, lack of fit, etc.) to assess model adequacy. As noted above, you should try running with a cut point of .001 and then re-run with a cut point of .05 to see if there are any big differences in the final reduced model terms.

I don’t quite understand the last part of your initial post: “Is there a way of seeing if there is a statistical difference between the correlations?” If you could expand on this a bit more, perhaps I or someone else could offer some additional thoughts.
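For anyone without a package that reports VIFs, here is a minimal sketch of the check. It relies on the fact that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The aspect names and scores are invented, the helper functions are mine, and the threshold of 10 is the one suggested above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """Return {name: VIF} for a dict of predictor columns."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Hypothetical 1-10 scores for three survey aspects
aspects = {
    "speed": [7, 6, 9, 2, 8, 4, 9, 5, 6, 6],
    "support": [7, 5, 9, 3, 8, 4, 10, 5, 6, 7],
    "price": [3, 8, 5, 6, 2, 7, 4, 9, 5, 6],
}

result = vifs(aspects)
flagged = [name for name, v in result.items() if v > 10]
print(result)
print("VIF > 10:", flagged)
```

In this made-up data the "speed" and "support" answers move together, so both are flagged. Note that exactly collinear columns make the correlation matrix singular, and this sketch does not guard against that.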

Thank you for your answers. From what you’ve said, I did run the correct procedures, and I’ll re-check with the lower p-value. Nice to know I was going in the right direction.

On the last part of the question:

In running regressions between the overall score and the individual parts one at a time, I got a range of R-Sq values, from 0.65 to 0.85. What I was wondering is whether I could use these to rank the sections, or whether there is a “confidence level” type value for them, so that, say, a 0.65 might be no different from a 0.69 if more data were taken?

Does this make it clearer?

If you are running a series of one-at-a-time regressions and you want to compare them, then for each regression you should examine the residuals, graphically as well as with formal statistical tests, to see if there are any big differences in the patterns as you fit first one X and then another. If there are significant departures from the acceptable patterns for some X’s and not others, you could use the differences (good pattern vs. bad pattern) to identify a subgroup of X’s that give the best one-variable-at-a-time fit. Once you have those, you would want to compare the RMSE of each of the equations. Those X’s with significantly worse RMSE could be rejected on this basis. For the remainder, you would have to look at things like lack of fit to further reduce the list. At some point you would most likely have a small group of X’s, all of which pass the above tests. At that point, which one you choose as a single predictor would become a matter of personal choice or perhaps measurement convenience.
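The RMSE comparison step might look something like the sketch below, which fits each X against the overall score by ordinary least squares and sorts the fits by RMSE. The data and names are invented, and it covers only the RMSE part; the residual plots and lack-of-fit checks described above still need to be done separately.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def rmse(x, y):
    """Root mean square error of the one-variable fit of y on x."""
    slope, intercept = fit_line(x, y)
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    # Divide by n - 2 because two parameters were estimated
    return sqrt(sum(e * e for e in residuals) / (len(x) - 2))


# Hypothetical 1-10 scores: the overall rating and two individual aspects
overall = [8, 6, 9, 3, 7, 5, 10, 4, 7, 6]
aspects = {
    "speed": [7, 6, 9, 2, 8, 4, 9, 5, 6, 6],
    "price": [3, 8, 5, 6, 2, 7, 4, 9, 5, 6],
}

results = {name: rmse(xs, overall) for name, xs in aspects.items()}
ordered = sorted(results, key=results.get)  # smallest RMSE first
for name in ordered:
    print(f"{name}: RMSE = {results[name]:.3f}")
```

Whether a gap between two RMSE values is large enough to matter is still a judgment call, and this sketch does not test it formally.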
