Binary Correlation

This topic contains 7 replies, has 3 voices, and was last updated by Kristen Hill 2 months, 3 weeks ago.

    #706571 Reply


    I have data around attendance at classes and I need to find out if there is a relationship between attendance at the most recent class and previous classes. I have coded 1 to mean attended and 0 did not attend.

    Name, Class 1, Class 2, Class 3
    A. 1, 1, 1
    B. 0, 1, 0

    I have tried correlating class 3 with classes 1 and 2 and also adding classes 1 and 2 together to see the number of previous classes attended and correlating this with class 3.

    All the correlations are low and I’m wondering if correlation is the wrong method as I’m working with mainly binary data. What do you think?

    #706572 Reply

    Based on your description of the problem I think your best bet would be to run a logistic regression with attendance at class 3 being the Y variable and attendance at classes 1 and 2 being the two X variables.

    The output would be in the form of odds ratios – that is, you would have one odds ratio relating attendance at class 1 to attendance at class 3 and another relating attendance at class 2 to attendance at class 3. One problem you are going to have with this approach is the question of whether attendance at class 1 is independent of attendance at class 2. If the two are too closely tied together you will get huge odds ratios with very wide Wald confidence limits and the analysis won’t be worth much.
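
    As a rough illustration of the logistic regression described above, here is a minimal sketch in plain Python (no statistics package). The counts are made up for the example and are not the poster's data, and the helper names `solve` and `fit_logistic` are hypothetical. It fits class 3 on classes 1 and 2 by Newton-Raphson and reports each odds ratio with a 95% Wald interval.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_logistic(X, y, max_iter=50, tol=1e-10):
    """Newton-Raphson logistic fit. Returns (coefficients, standard errors);
    element 0 of each list belongs to the intercept."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for r, yi in zip(rows, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, r))))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * r[i]
                for j in range(k):
                    info[i][j] += w * r[i] * r[j]
        step = solve(info, grad)
        if max(abs(s) for s in step) < tol:
            break
        beta = [b + s for b, s in zip(beta, step)]
    # Wald standard errors: square roots of the diagonal of the inverse information.
    se = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        se.append(math.sqrt(solve(info, unit)[i]))
    return beta, se

# Hypothetical counts, NOT the poster's data:
# (class 1, class 2, class 3) -> number of students with that pattern.
counts = {
    (1, 1, 1): 30, (1, 1, 0): 10,
    (1, 0, 1): 10, (1, 0, 0): 10,
    (0, 1, 1): 12, (0, 1, 0): 8,
    (0, 0, 1): 5,  (0, 0, 0): 15,
}
X, y = [], []
for (c1, c2, c3), n in counts.items():
    X += [(c1, c2)] * n
    y += [c3] * n

beta, se = fit_logistic(X, y)
results = []
for name, b, s in zip(["class 1", "class 2"], beta[1:], se[1:]):
    odds_ratio = math.exp(b)
    lower, upper = math.exp(b - 1.96 * s), math.exp(b + 1.96 * s)
    results.append((odds_ratio, lower, upper))
    print(f"{name}: OR = {odds_ratio:.2f}, 95% Wald CI = ({lower:.2f}, {upper:.2f})")
```

    An interval that excludes 1 suggests the odds ratio is significant at roughly the 5% level. If classes 1 and 2 were almost perfectly tied together, the information matrix would be close to singular and the intervals would become very wide, which is the problem described above. In practice a statistics package would normally do this fit for you.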

    One thing you need to remember with binary data – to see significance you need a lot more data than you would need for an analysis of continuous data so even if the measures exhibit sufficient independence you may not have enough data to detect a significant difference.

    #706573 Reply


    Thank you for replying so quickly. I have data for around 100 students which should help.

    I will try a logistic regression. Do you think doing a correlation would be meaningless?

    #706574 Reply

    I’m not really sure what you mean by correlation. Since you are on-line – post the data and I’ll take a look at it – just 3 columns, one for each class yes/no (1/0) and an indication as to which class is which.

    #706577 Reply


    Sorry I only just saw this message.

    The data is how you described it. Columns for each class containing 0s and 1s and I’m wondering if correlating class 3 with class 2 would give any meaningful information about whether there’s a relationship between attendance at points 2 and 3. (Standard Pearson/Spearman)

    #706578 Reply

    Not really. The usual significance test for the Pearson correlation coefficient assumes normality (or approximate normality) of the variables, and binary data is not normal.

    If it were just a case of looking at class attendance in 2 vs class attendance in 3 you could run a 2×2 chi-square analysis, which would tell you whether the two are associated. However, given your problem description I still think logistic regression would be the better bet. It would tell you the odds of attending class 3 given attendance (or lack thereof) in class 2 and it would also tell you whether or not the odds ratio was significant.
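
    For the two-class comparison, a minimal sketch of the 2×2 analysis in plain Python follows. The four counts are made up for the example and are not the poster's data, and the function names are hypothetical. It computes the Pearson chi-square statistic without a continuity correction, its p-value on 1 degree of freedom, the phi coefficient (which, for two 0/1 columns, is the same number a Pearson correlation would return), and the odds ratio with a 95% Wald interval.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (chi2, p_value, phi)."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    phi = (a * d - b * c) / math.sqrt(margins)
    return chi2, p_value, phi

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio with a Wald interval computed on the log-odds scale."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, NOT the poster's data:
#                     attended 3   missed 3
# attended class 2        a = 42     b = 18
# missed class 2          c = 15     d = 25
a, b, c, d = 42, 18, 15, 25

chi2, p_value, phi = chi_square_2x2(a, b, c, d)
odds_ratio, lower, upper = odds_ratio_wald(a, b, c, d)

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}, phi = {phi:.2f}")
print(f"odds ratio = {odds_ratio:.2f}, 95% Wald CI = ({lower:.2f}, {upper:.2f})")
```

    The Wald interval breaks down if any cell is zero, and with small expected counts an exact test is usually preferred over the chi-square approximation.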

    #706600 Reply


    OK that makes sense, thank you

    #706603 Reply

    You can run a chi-square very quickly to check for a relationship, but a logistic regression is correct as well. The chi-square gives you a quick and easy first look and can be done in Excel.
