iSixSigma

Regression Analysis – Mixed Variable Types


Viewing 7 posts - 1 through 7 (of 7 total)
  • Author
    Posts
  • #243608

    Zeus
    Participant

    I have a response variable of % accuracy.  I have 3 predictor variables and am looking to explain which, if any, significantly affect the response variable.  If everything were continuous it would be a relatively straightforward regression analysis.  However, 2 of the 3 predictors are binary and the 3rd predictor is nominal data.  So it’s a mish-mash of regression approaches.  Is there a way to do this?  Do I need to create dummy variables and contrasts?  That is a bit beyond anything I’ve done before…

    Thank you! – Mike

    #243631

    Robert Butler
    Participant

    You can run a regression with the variables mentioned.  The notion that regression variables (either X’s or Y’s) need to be continuous is just standard-issue boilerplate and is overly restrictive.  It is offered in many courses to make sure you don’t go wrong with great assurance when all you know about regression is what you learned in some two-week course.

    For the binary variables you can just code them as -1 and 1 and go from there.  For the nominal variable you will have to set up dummy variables.  This amounts to picking one of the nominal settings as the reference and then coding the others relative to it.  So, for example, if you had a 3-level nominal variable you would code the reference level as (d1 = 0, d2 = 0) and the other two as (d1 = 1, d2 = 0) and (d1 = 0, d2 = 1) respectively, and then run the regression using d1 and d2 in place of the nominal settings.
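As a rough illustration of the coding described above, here is a minimal Python sketch using only NumPy. The variable names (`shift`, `b1`, `b2`, `accuracy`), the helper `dummy_code`, and every number are made up for illustration and are not from the thread. The response is generated from known coefficients with no noise, purely so the fit can be checked by eye.

```python
import numpy as np

def dummy_code(levels, reference):
    """Return (names, 0/1 matrix) with one column per non-reference level."""
    others = [lv for lv in sorted(set(levels)) if lv != reference]
    cols = np.array([[1.0 if v == lv else 0.0 for lv in others] for v in levels])
    return others, cols

# Hypothetical predictors: one 3-level nominal, two binaries coded -1/1.
shift = ["A", "B", "C"] * 4
b1 = np.array([-1, -1, -1, 1, 1, 1, -1, -1, -1, 1, 1, 1], dtype=float)
b2 = np.array([-1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1], dtype=float)

# Level "A" is the reference: it is the row where both dummies are 0.
dummy_names, dummies = dummy_code(shift, reference="A")

# Design matrix: intercept, the two binaries, then d1 and d2.
X = np.column_stack([np.ones(len(shift)), b1, b2, dummies])
names = ["intercept", "b1", "b2"] + [f"shift_{lv}" for lv in dummy_names]

# Synthetic response built from known coefficients (assumption, no noise).
true_coef = np.array([90.0, 2.0, -1.0, 3.0, -4.0])
accuracy = X @ true_coef

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, accuracy, rcond=None)
for name, value in zip(names, coef):
    print(f"{name:10s} {value:8.3f}")
```

Each dummy coefficient is read as the shift in the response for that level relative to the reference level, with the other predictors held fixed.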

    If you need a more thorough discussion of the subject I would recommend getting a copy of Regression Analysis by Example by Chatterjee and Price through inter-library loan and reviewing the chapter on this subject.

    #243639

    Chuck White
    Participant

    @rbutler is correct, but a lot of modern statistical analysis software will do all of that for you. I am currently using Minitab 19, and its dialog box for regression includes inputs for both continuous and categorical variables. For the categorical variables Minitab does just what Robert described, but it’s all in the background. You can also use an ANOVA general linear model with the categorical inputs as factors and the continuous inputs as covariates — it will give you the same result. I would suggest digging into your software to see what options are available.

    #243640

    Robert Butler
    Participant

    @jazzchuck - that’s very interesting.  I guess I have mixed feelings about automating the dummy variable.  Question:  Does Minitab take the time to explain to you somewhere in the output what it has done for you and why?  If it doesn’t, and you then try running a regression using nominal variables in a package that doesn’t automatically generate a dummy variable matrix, you are really going to go wrong with great assurance.  This might also explain why I have the sense that more people are just dumping nominal variable levels into a regression package, generating stuff that doesn’t make any sense, and then telling me in front of a bunch of people how statistical analysis really won’t be of much value in terms of analyzing the problems “we” deal with.  :-)

     

    #243641

    Chuck White
    Participant

    @rbutler, Minitab doesn’t explicitly explain the process or refer to “dummy variables” in the output, but it shows the coefficients for k-1 levels for each categorical predictor (the first level of course is the reference, included in the regression constant). It also displays a separate regression equation for each combination of categorical predictors. For users who want to understand the nuts and bolts of the analysis, Minitab’s help includes a very thorough Methods and Formulas section which explains what happens behind the curtain.
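To see how k-1 coefficients per categorical predictor turn into one equation per combination of levels, here is a small Python sketch. The coefficient values are hypothetical (they are not taken from the attached output), and the sketch assumes a model with no interaction terms, so only the intercept changes between combinations.

```python
from itertools import product

# Hypothetical fitted values, for illustration only.
const = 10.0                                  # regression constant
slope_xc = 0.5                                # slope for the continuous predictor xc
xn_effect = {"1": 0.0, "2": 1.5, "3": -2.0}   # nominal xn: level "1" is the reference
xb_effect = {"0": 0.0, "1": 0.75}             # binary xb: level "0" is the reference

# The reference levels contribute 0, so they are absorbed into the constant.
equations = {}
for n, b in product(xn_effect, xb_effect):
    intercept = const + xn_effect[n] + xb_effect[b]
    equations[(n, b)] = f"Y = {intercept:.2f} + {slope_xc:.2f} xc"

for (n, b), eq in equations.items():
    print(f"xn={n}, xb={b}:  {eq}")
```

With 3 nominal levels and 2 binary levels this gives 6 equations that share a slope and differ only in the intercept.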

    I attached sample output of a regression analysis I ran on some random data (which explains why nothing is significant). The dependent variable is Y and the predictors are xc, xn, and xb (c=continuous, n=nominal with 3 levels, b=binary).

    Attachments:
    1. Minitab-Regression-Sample.pdf
    #243648

    Zeus
    Participant

    Robert and Chuck, This is outstanding.  Just what I was looking for and confirms the direction I was going to take.  Thank you.

    I am still on Minitab 18 as my company hasn’t upgraded yet, but will see if it does what you mentioned above.  The additional insights and the example are also appreciated.

    I also agree on the implied misuse of tools/statistics.  I am smart enough to know what I don’t know and when I get that feeling on something will reach out to someone who does.  Thanks for being there to help.

     

    #243649

    Chuck White
    Participant

    Minitab 18 has the same functionality as 19 for both regression and ANOVA. The output may look a little different, but all of the content should be there.

    To Robert’s point, just make sure you always put categorical predictors in the correct window — if they are text values, Minitab will let you select them as categorical predictors without coding them, but you can’t select them as continuous predictors. But if they are coded as numbers, it’s up to you to make sure the model is correct.
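The risk with number-coded categories can be shown with a small Python sketch (NumPy only). The data are invented for illustration: a nominal predictor coded 1, 2, 3 whose level means are 10, 20, and 12. The helper name `sse` is also just an assumption of this sketch. Treating the code as a continuous predictor forces a straight line through the three levels, while dummy coding lets each level have its own mean.

```python
import numpy as np

# Hypothetical nominal predictor stored as numeric codes 1, 2, 3.
code = np.array([1, 2, 3] * 4, dtype=float)

# Made-up response: level means 10, 20, 12, with no noise.
y = np.where(code == 1, 10.0, np.where(code == 2, 20.0, 12.0))

# Wrong: the numeric code entered as if it were a continuous predictor.
X_cont = np.column_stack([np.ones_like(code), code])

# Right: level 1 as the reference, dummy columns for levels 2 and 3.
X_dummy = np.column_stack([np.ones_like(code), code == 2, code == 3]).astype(float)

def sse(X, y):
    """Residual sum of squares from an ordinary least squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

sse_cont = sse(X_cont, y)
sse_dummy = sse(X_dummy, y)
print(f"SSE, code treated as continuous: {sse_cont:.3f}")
print(f"SSE, code treated as categorical: {sse_dummy:.3f}")
```

The continuous fit leaves a large residual sum of squares because the level means do not fall on a line, while the dummy-coded fit reproduces them.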

    Good luck!
