# Correlation of Age and Attrition Reason

• #50437

Jeff
Participant

I’m trying to get the correlation coefficient for Age of Employee and Reasons for Attrition. My data sets are the ages of employees and the reason they left. I have 108 employees and 16 reasons total. I can tell from the data that the younger my employee, the more likely it is that he or she will quit. How can I calculate a correlation coefficient for this? I thought to assign a numeric value to each reason and run the numbers, but since some of the reasons have the same number of people, it seems like this would be erroneous. As usual I’m missing something simple, but I can’t seem to figure it out.
Jeff

#173379

GPaisley
Participant

Since your output (Y) metric (reason for leaving) is discrete, you cannot calculate a correlation coefficient. You will probably need to create some buckets of ages, either by 5- or 10-year ranges or by looking at a scatter plot of ages and making groupings based on their natural clusters.
Then, you could either do a chi-square test (although 16 reasons and just 108 responses will probably leave you with some cells that have an expected count of less than 5), or do a series of Pareto charts of reasons per age group. Of course, a Pareto chart will only give a visual representation, but it may help you group the reasons into buckets to make your chi-square test more powerful.
Chances are, your 16 reasons can be aligned into 4 or 5 groups as well, and so you might want to do that before running Chi-square.
Best regards,
GP
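
The bucketing and chi-square steps described above could be sketched in plain Python as follows. This is only an illustration under stated assumptions: the age bands, the reason labels, and the counts are made up, and `band_of` and `chi_square_independence` are hypothetical helper names, not anything from the thread. The function also reports the smallest expected cell count, so the "less than 5" concern is visible before the statistic is trusted.

```python
from collections import Counter

# Hypothetical age bands; choose edges that match the natural groups in the data.
BANDS = [(18, 25), (26, 32), (33, 45), (46, 99)]

def band_of(age):
    """Map an age to the label of the band that contains it."""
    for lo, hi in BANDS:
        if lo <= age <= hi:
            return f"{lo}-{hi}"
    raise ValueError(f"age {age} is outside every band")

def chi_square_independence(records):
    """records: iterable of (row_label, col_label) pairs.

    Returns (statistic, degrees_of_freedom, smallest_expected_count).
    """
    counts = Counter(records)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    n = sum(counts.values())
    row_tot = {r: sum(counts[(r, c)] for c in cols) for r in rows}
    col_tot = {c: sum(counts[(r, c)] for r in rows) for c in cols}
    stat = 0.0
    min_expected = float("inf")
    for r in rows:
        for c in cols:
            # Expected count under independence of rows and columns.
            expected = row_tot[r] * col_tot[c] / n
            min_expected = min(min_expected, expected)
            stat += (counts[(r, c)] - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return stat, dof, min_expected

# Made-up exit-survey records: (age band, grouped reason).
sample = (
    [(band_of(22), "Pay")] * 10 + [(band_of(22), "Manager")] * 20
    + [(band_of(40), "Pay")] * 20 + [(band_of(40), "Manager")] * 10
)
stat, dof, min_exp = chi_square_independence(sample)
print(f"chi-square = {stat:.3f}, df = {dof}, smallest expected count = {min_exp:.1f}")
if min_exp < 5:
    print("Some expected counts are below 5; consider merging groups.")
```

Merging the 16 reasons into 4 or 5 groups before building the records raises the expected counts per cell, which is the motivation for grouping first.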

#173424

Erik L
Participant

Jeff,
Another thought might be to regress age of the employee against time of service. If there wasn’t anything there, the grand average of the response would be as good a predictor as the independent variable hypothesized. If there is a relationship, the Pareto/Chi-square analysis suggested could point to likely culprits behind the relationship.
Regards,
Erik

#173431

Participant

You are working in an area called multivariate statistics. This is not commonly used in quality control or Six Sigma. I will dig out my textbooks from grad school. I think I saw a problem like this before.
GP is right, you should try to cluster your responses. It would make the problem easier, provided you don’t give up some important information by doing this.

#173445

Michael Schlueter
Participant

Hi Jeff,
I’m just wondering how you obtained your observation “the younger, the more likely to quit”? Did you recognize frequency, one way or the other? If so, you can try correlating (relative) frequencies with age or other factors.
What about employees who stay? Do they stay for one of the 16 reasons, too? I.e., can some of these reasons be insignificant?
I suppose such a subject has already been studied elsewhere. Did you search for other results, ideas, approaches?
Assuming success, i.e. an answer as you want to obtain: what will you do with it? What can you really change? Is there a need for a change at all? I.e. what is the benefit from frequent quitting?

#173447

Robin Fielder
Member

Hello
I worked on an attrition project and used Binary Logistic Regression to calculate the probability of leaving based on age (and also length of service, and some other continuous data factors).
‘Leave or not leave’ is discrete Y data, and age is continuous X data, so Binary Logistic Regression is the right model to use here.
Assign the value 1 or zero to leave or stay status, then have the ages in the next column of the worksheet.  When you run the test in Minitab in the dialogue box  put ‘response’ as the stay/leave status, ‘model’ as age, and select the ‘storage’ button and tick ‘event probability’, then run the test.
The event probability is added into the Minitab worksheet in a column called ‘EPR01’ which gives the probability of the stay or leave status depending on age.
You can then scatter plot age against the event probability to get a graphical representation of the relationship. (And of course check the P value of the test to see if it is significant anyway.)
This is a very helpful guide to Binary Logistic Regression:
http://europe.isixsigma.com/library/content/c071212b.asp
Rob
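
For readers without Minitab, the same idea (stay/leave coded as 0/1, age as the predictor, then a fitted event probability per employee) can be sketched in plain Python. This is a minimal illustration, not Minitab's procedure: the ages and outcomes are invented, and `fit_logistic` and `event_probability` are hypothetical helper names. The fit uses a basic Newton-Raphson iteration on the log-likelihood, and it needs both leavers and stayers in the data.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson.

    Returns (b0, b1) on the original scale of x.
    """
    mean_x = sum(x) / len(x)
    xc = [v - mean_x for v in x]  # centre x for numerical stability
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * v))) for v in xc]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * v for yi, pi, v in zip(y, p, xc))
        # Entries of the information matrix X'WX.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * v for wi, v in zip(w, xc))
        h11 = sum(wi * v * v for wi, v in zip(w, xc))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Undo the centring so the intercept applies to raw ages.
    return b0 - b1 * mean_x, b1

def event_probability(age, b0, b1):
    """Fitted probability of leaving at a given age."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * age)))

# Made-up data: 1 = left, 0 = stayed. Both outcomes overlap in age,
# which the fit needs (perfectly separated data would not converge).
ages = [19, 20, 21, 22, 23, 24, 25, 27, 29, 31, 34, 38, 42, 47, 53]
left = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

b0, b1 = fit_logistic(ages, left)
probs = [event_probability(a, b0, b1) for a in ages]
for a, pr in zip(ages, probs):
    print(f"age {a}: fitted probability of leaving = {pr:.2f}")
```

A negative slope `b1` means the fitted probability of leaving falls with age. Whether that slope is statistically significant is a separate question, answered by the P value that a statistics package reports for the coefficient.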

#173449

Jeff
Participant

Thanks, Robin. I’ll take a look at BLR and see what I get.
Michael,
Of our 4 age groups, the 18-25 group has an attrition rate of 72%. The next closest is age 26-32 with a rate of 23%. From a numbers perspective, the 18-25 group is costing us loads in time, training, money, etc. And in this case, about 65% of our workforce is in the 18-25 group. The 16 reasons were those noted on exit surveys. I think all but 5 are insignificant and I’ve since merged the data into 5 groups. I am working in a 3rd party collection environment and studies are hard to find. Collectors keep their cards close to the chest and getting information from them is like pulling teeth.
We see a need for change in that our training program runs 4 weeks as mandated by our client. We pay for training unless the person stays with us for 90 days. Our younger folks are leaving in a cluster around day 45, some 2 weeks after training. So we lose a boatload of cash. That is, we don’t get paid by the client for them. There is no benefit to us from frequent quitting, as our pay scale never rises to a level significant enough for us to have to look at ROI for a long-tenured employee.
As for what can be changed, I think our major issue is the front-line managers who get the trainees directly out of class. We have some data indicating that the managers play a large role in attrition and retention, depending on management style. I have a VP who is an old Quality guy and he wants good numbers on this, but this seems to be a bit out of my range. I’m game for any suggestions or comments, because I’m swimming just above the waves here and treading water is getting me nowhere.
Jeff

#173453

Jeff
Participant

I just ran a CHITEST in Excel and my result is 0.959487981. I used CHIINV to get a result of 6.93. I have a 6X4 table so my df=15. At p=0.05, df 15 the table value is 24.996. Since my result is less than the table value, I fail to reject the null. Am I analyzing my results correctly? This Chi Square thing is getting to me. Obviously I’m not a statistician, but my GB Handbook and what Excel help says seem to be at odds. Maybe I’m misinterpreting the CHITEST value above. Any thoughts?
Jeff
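
The three numbers in this post are linked. CHITEST returns a p-value, CHIINV turns a p-value back into a chi-square statistic, and the table value is the statistic whose upper-tail probability is 0.05. The sketch below reproduces that chain in plain Python so the two decision rules can be compared side by side. It is an illustration only: `chi2_sf` and `chi2_isf` are hypothetical helper names, the tail probability comes from the standard recurrence for integer degrees of freedom, and the inverse is found by simple bisection.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points for df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(k+2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def chi2_isf(p, df):
    """Statistic whose upper-tail probability equals p (bisection search)."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > p:
            lo = mid  # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2.0

df = 15            # (6 - 1) * (4 - 1) for a 6x4 table
alpha = 0.05
p_value = 0.959487981                  # the CHITEST result quoted above

statistic = chi2_isf(p_value, df)      # plays the role of CHIINV(p_value, df)
critical = chi2_isf(alpha, df)         # the table value at alpha = 0.05

print(f"statistic from p-value: {statistic:.2f}")
print(f"critical value at alpha={alpha}: {critical:.3f}")
print("reject by p-value rule:  ", p_value < alpha)
print("reject by statistic rule:", statistic > critical)
```

Both rules agree by construction: a p-value above 0.05 and a statistic below the table value are the same statement, so either way the null hypothesis of independence is not rejected.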

#173454

Michael Schlueter
Participant

Hi Jeff,
Thank you for sharing your situation.
I wonder what kind of solution may be interesting or possible for you? I assume that stricter contracts for new young employees are not what you want, are they? Or other contacts with your clients?
I once saw a study from Japan, where companies try to identify the talented and the non-talented programmers before they begin. This study needs data from both those who stay and those who leave. Would such an instrument be what you want, i.e. having a reliable early diagnosis of who will most likely stay after 90 days, and who won’t?
BTW, what happens between day 45 and day 90, and what happens within the 2 weeks after the training? Or vice versa, what is the reason to have the training so early? Why doesn’t the management style problem show up earlier? How could it be detected with the individual much earlier? Can’t you rearrange things in such a way that the problem occurs earlier, at lower cost and you still provide the result and quality in-time as your clients request?
Correlation, your initial post, and early-diagnosis can be seen as different views onto the same problem, the problem of measurement. You may want to review my reply at https://www.isixsigma.com/forum/showmessage.asp?messageID=143072, which shows generic ways to deal with measurement situations. In analogy:

How can you eliminate the need for correlation/measurement?
How can you replace correlation/measurement by (cheaper) detection (e.g. with less or even with no data)?
How can you use secondary effects, in your case of employee x management style interaction? (E.g. employees who leave on day 45 already did xxx on day 3.)
How can you detect or measure by-products of problematic employee x management interaction, which will show up only later?
Why can’t you invite a couple of employees who left, current employees and managers to discuss these issues, say 6 to 10 people? Usually, when the setting is right (open atmosphere, clear target and good guidance), and maybe with an experienced facilitator, the observations, views and ideas of these people are exactly what you need.

#173457

Jeff
Participant

Michael,
Here’s the rub: The center management feel that attrition is a function of age only. They do not feel that they are in any way responsible for attrition. Also, they feel that there is a correlation between the age of employees who leave and the reasons they leave. See my previous post on Excel CHITEST to see what my results showed regarding this. I think I found that there is no significant difference for age groups and reasons for attrition. My VP “knows” that the problem is largely due to the center management. My task has been to discount statistically the views of center management. The other problem is that we cannot be picky in those we hire. We need butts in seats and if they can speak and breathe, they’re in. We don’t know yet what happens between days 45 and 90, but I’d like to know. The client mandates the 4 week training prior to assignment to the collection floor. We have no say in that.
The focus groups I’ve done point to a strong distaste for the management style at the center. The problem is that my VP doesn’t feel that this is a justifiable argument to make changes. If, however, we can state with a high level of certainty that the problem isn’t just related to age of employees, we may be able to point them in the right direction. I hope.
Jeff

#173458

Michael Schlueter
Participant

Hi Jeff,
Thanks for your details. I’ll try to come back to you tomorrow with fresh views.
I glanced over the chitest results. Apart from the situation you are in, with a high demand for certain, perhaps contradicting, results there always is a generic problem: do you have reliable information carriers in your data, yes or no?
If yes, then concluding “there is no relationship” is reliable.
If no, then one should try to improve the input data. Which is often difficult, as the direction for improvement isn’t always obvious. Sometimes transforming data helps (but this requires a thinking model of the situation, which reflects important parts of the “mechanics”; just going to log-scales, for example, will most likely not be enough).
OK, maybe you will also find out some new things about the period between day 45 and 90. I think creating basic insight is the task ahead, be it fact based, intuition based (gut feeling) or similar. Then we can make the numbers talk, in a positive sense (no manipulation).
If you can’t be picky about new employees, what else may they need to know about the “horrors” ahead, which makes it more acceptable, perhaps even desirable, for them to stay? Can you retrieve anything on this from your focus-group results, experience, intuition?
Just a mental image, an analogy: sounds like you hire red-cross people, who get upset when being at an accident scene for the first time. All that blood and pain! So they come unprepared, are shocked and leave. What kind of preparation would have been necessary?
E.g. can you do something like mentorship, where your survivors of the 90 days mentor the newbies (part time or so)? This may be a change (for result) without a change (of current opinions and behaviors). Can you do a small-scale pilot run on this?
Michael

#173462

Vallee
Participant

Jeff,
I believe you have asked many HR related questions in the past but I believe you have stayed too focused on those that left. Step back a little and look at your value stream… attrition is just a lagging indicator for the people who decided to leave. Look at those working for you today:
Some Internal Stuff:
1. Were they competent when they started?
2. Were they trained pre, post, and with refresher training?
3. Do they have the skills and task ability to do the job? What is expected from them on day one?
4. Does training match the current required tasks or is it outdated?
5. Can management (skill wise) do the job of their employees? If not, management could be requesting employees to work in “gray” areas.
External Stuff:
1. What is the turnover rate in the industry in your area? How do you rate?
2. What is the turnover rate in the industry in areas outside your demographics? How do you rate?
Point is you only have the data for the sample that is leaving. You should also segregate and analyze those fired from the sample that left. Yes, management has an agenda and is looking through a different looking glass for “expected” output. Find out why people are staying and who may be on the verge of leaving. Hope this helps.
HF Chris Vallee

#173472

Jeff
Participant

Thanks Chris & Others,
I took the advice I’ve gotten and ran a simple Chi Square analysis for 4 age groups who Left and Stayed. My result was 3.51. At p=0.05, df 3, age and attrition are independent. Though we still have a higher percentage of people leaving in the 18-25 group, overall it seems age is not the primary driver for attrition.
For Internal Stuff: (1) Maybe (2) Only pre…little thereafter (3) Some do (4) The material matches, the methodology may not (5) Managers correct rather than act proactively to prevent errors.
For External Stuff: (1) Industry turnover in the area is around 55% and ours is over 100% annualized. Industry average is about 60%.
My data is for voluntary attrition only. Terminations are less than 10% of attrition, and most of those are what we call Technical Terminations; that is, it is mostly people who didn’t return from Leaves of Absence, pregnancy, etc. There is very little firing for cause, believe it or not.
I think discovering why people stay might be a good step forward. Training and Management appear to be primary drivers, though we have no hard numbers attesting to that.
Thanks,
Jeff

#173491

anon
Participant

Can you do a correlation between team leader/manager and attrition for those that leave quickly?
I’ve worked in a call centre collection environment for a while and this was an eye opener for our management when they saw it. From what you’re saying it may be similar here.

#173507

Jeff
Participant

Anon,
We have looked at the attrition/manager correlation and the numbers are spread pretty evenly across all the managers. Some of the feedback points to the center manager as being a primary factor. Her management style leaves no room for error. Her subordinate managers follow her lead, which might be why the attrition is spread so evenly among the managers.

#173519

howe
Participant

I am writing to let you know that the previous posts were not correct. Your data is right-censored: some of the employees quit after a certain time, but some are still employed, so their eventual time to quitting is unknown. This is a classic “survival analysis” problem. The analysis methods in the previous posts neglect much of the data. The use of survival analysis is common in Design for Six Sigma. Design for Reliability (DFR) makes use of survival analysis. The most useful distribution for analyzing these problems is the Weibull distribution. There are several software tools, including Minitab and another called Weibull++, that facilitate the analysis of censored data. So, in summary, when you are analyzing quitting (failure) as a function of time in concert with other factors, you need to treat the problem as a survival problem. The other techniques mentioned are not accurate for a censored data set such as the one that you are working with.
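
To make the censoring point concrete, here is a small sketch of a Kaplan-Meier estimate in plain Python. Note the swap: this is a non-parametric survival curve, not the Weibull regression recommended above, and it is shown only because it is the shortest way to see how still-employed people enter the calculation. The tenure figures are invented and `kaplan_meier` is a hypothetical helper name.

```python
def kaplan_meier(durations, left):
    """Kaplan-Meier survival estimate with right-censoring.

    durations: days employed so far (or until quitting).
    left: True if the person quit, False if still employed (censored).
    Returns [(day, estimated probability of still being employed), ...]
    with one entry per day on which at least one person quit.
    """
    data = sorted(zip(durations, left))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        quits = sum(1 for d, e in data if d == t and e)
        total_at_t = sum(1 for d, _ in data if d == t)
        if quits:
            # Only actual quits reduce the survival estimate.
            surv *= 1.0 - quits / at_risk
            curve.append((t, surv))
        # Quitters and censored people both leave the risk set after day t.
        at_risk -= total_at_t
        i += total_at_t
    return curve

# Made-up tenures in days; False marks employees who are still with us.
durations = [30, 45, 45, 60, 90, 90, 120, 120]
left = [True, True, True, False, True, False, False, False]

curve = kaplan_meier(durations, left)
for day, s in curve:
    print(f"day {day}: estimated share still employed = {s:.3f}")
```

The employee censored at day 60 still counts in the risk set for the quits at days 30 and 45, which is exactly the information that a plain leavers-only table throws away. Running the same estimate separately per age group would give curves that can be compared, with a parametric Weibull fit as the next step in a dedicated package.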

#173525

Jeff
Participant

Thanks Mike. I’ve never done a Weibull analysis before. I’m hoping you can help me set it up. I have a table listing age groups in the rows and reasons for attrition in the columns. I’d also like to compare the people who left with the people who stayed. So how do I set this up? I have a spreadsheet with all the data, but I don’t know how to set it up to do the calculation. I’m using the Weibull function in Excel if that helps. Anything that would give me the best data would be greatly appreciated.
Jeff

#173556

howe
Participant

My purpose in the post was to let you know that you were being misled by the others. The answer that you would get using the other suggestions would be wrong. As to your question re Weibull and Excel, there is no way to perform the survival analysis using Excel without a great deal of development work using matrix functions, etc. If you plan to pursue the analysis, you will need software designed to do regression on survival data. That analysis can be done using Minitab 14. If you have that, look at the example in the help screens; it should be sufficiently instructive. You need to have a background in multivariate regression techniques. If you do not have the software or the background, then you are at a dead end until you gain them. Unfortunately, that is the best that I can do for you.
