Multiple regression Analysis


Viewing 6 posts - 1 through 6 (of 6 total)

    Madhukar Sinha

    Dear All,
    Please help me if possible. In a multiple regression with 6 independent input parameters and one output parameter, I get an R-sq of 72.8%, and 3 of the parameters have p-values < 0.05. But when I run the multiple regression with only these 3 KPIs, the R-sq value falls drastically to 25.3% and the p-values of two of them rise above 0.05. What could be the possible reasons?
    Regards/ Madhukar



    Use the best subsets function in Minitab and you’ll see what is


    Eric Maass

    Madhukar, I don’t know if this is the problem in your specific case, but it is often difficult to get consistent results with Multiple Linear Regression when the “independent variables” are not independent, that is, when they are correlated. Can you get the matrix of correlation coefficients among the 6 “independent” input parameters? Best regards,
    Eric Maass



    Simply plotting the response against the predictors and the predictors against each other often provides clarity.  Good luck.


    Robert Butler

    Everything that Stan, Eric, and John have said is correct, and all of it should be taken into account when addressing your problem.
    Specifically, what is going on is that you are looking at correlations between your X’s and your Y. When you have multiple X’s, you are looking at the correlation between a particular X and the Y while controlling for the effects of the other X’s. This means the significance of the X’s can and will change as X’s are entered into or removed from the model.
    Stan’s recommendation of best subsets is one approach to identifying the final significant X’s. Another possibility is running backward elimination and stepwise regression on the full data set to see if they converge to the same model. The fact that your “full” model didn’t “blow up” (that is, the regression package “allowed” all 6 X’s into an initial model) indicates that the X’s are not perfectly correlated with one another, but it still doesn’t tell you whether their partial confounding is such that not all of them can be included in the model-building effort.
    If you have Minitab, it is my understanding that you can look at the VIFs (variance inflation factors) for each of the 6 X’s. This is preferable to a simple correlation matrix, but if you don’t have this capability then you should take Eric’s advice and look at the correlation matrix of the X’s. In either case you should plot the X’s against one another and look at the resulting plots, for exactly the reasons John mentioned.
    Another thing you should do is scale all of the X’s to a -1 to 1 range before running the regression. This avoids the problems that can arise in term selection when the ranges of the various X’s differ from one another (for example, X1 has a range of 0.1 to 0.6 and X2 has a range of 10 to 100).
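
    The scaling step can be sketched in a few lines of Python. The function name scale_to_unit_range and the sample values are illustrative only; the ranges mirror the X1 and X2 example above.

```python
def scale_to_unit_range(values):
    """Linearly map values so the minimum becomes -1 and the maximum becomes +1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a column whose values are all the same")
    center = (hi + lo) / 2      # midpoint of the observed range maps to 0
    half_range = (hi - lo) / 2  # half the observed range maps to 1
    return [(v - center) / half_range for v in values]

# Hypothetical columns with very different ranges.
x1 = [0.1, 0.25, 0.4, 0.6]
x2 = [10, 40, 55, 100]

x1_scaled = scale_to_unit_range(x1)
x2_scaled = scale_to_unit_range(x2)
print(x1_scaled)
print(x2_scaled)
```

    After scaling, both columns run from -1 to +1, so the coefficient sizes are on a comparable footing. To interpret the fitted coefficients in original units, keep the center and half-range of each X so the transformation can be reversed.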



    Sometimes when you add or remove variables in your regression analysis, the number of data points (sets) being used in the regression changes. This happens when your data set is not perfectly complete, i.e. you may have missing points for some variables and not others. Minitab tells you how many points are being used and how many have missing values in the output from the regression. Compare this information for each of your regression analyses.
    When the data set changes, you can get very different results from your regression analysis.
    Just a possibility.
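
    A quick way to check this outside Minitab is to count the complete rows available to each model. The sketch below uses made-up rows in which None marks a missing value; the column names and the helper complete_cases are hypothetical.

```python
# Hypothetical data set: None marks a missing measurement.
rows = [
    {"y": 10.2, "x1": 1.0, "x2": 5.0, "x3": None},
    {"y": 11.5, "x1": 2.0, "x2": None, "x3": 0.3},
    {"y": 9.8, "x1": 3.0, "x2": 4.0, "x3": 0.7},
    {"y": 12.1, "x1": 4.0, "x2": 6.0, "x3": 0.2},
    {"y": None, "x1": 5.0, "x2": 7.0, "x3": 0.9},
]

def complete_cases(rows, columns):
    """Keep only the rows that have a value for every listed column."""
    return [row for row in rows if all(row[c] is not None for c in columns)]

full_model = ["y", "x1", "x2", "x3"]  # all terms
reduced_model = ["y", "x1"]           # after dropping x2 and x3

n_full = len(complete_cases(rows, full_model))
n_reduced = len(complete_cases(rows, reduced_model))
print(f"full model uses {n_full} rows, reduced model uses {n_reduced} rows")
```

    If the two counts differ, the two regressions were fitted to different data, and part of the change in R-sq and p-values may come from that alone.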
