Multiple Regression Analysis
This topic has 5 replies, 6 voices, and was last updated 12 years, 6 months ago by Imawizard.
January 23, 2010 at 10:27 am #53166
Madhukar Sinha (@Madhukar-Sinha)
Dear All,
Please help me if possible. In a multiple regression with 6 independent input parameters for one output parameter, I get an R-sq of 72.8%, and 3 of the parameters have p-values < 0.05. But when I run the multiple regression with only those 3 KPIs, the R-sq value falls drastically to 25.3% and the p-values of two of them rise above 0.05. What could be the possible reasons?
Regards, Madhukar

January 23, 2010 at 12:23 pm #188611
Stan
Use the best subsets function in Minitab and you’ll see what is happening.
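For readers without Minitab, a minimal sketch of what a best-subsets search does, written here in Python with statsmodels (the file name and column names are hypothetical): fit every subset of the candidate predictors and compare adjusted R-sq.

```python
# Best-subsets sketch: fit every subset of candidate predictors and
# rank them by adjusted R-squared (file and column names are hypothetical).
from itertools import combinations

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("process_data.csv")          # hypothetical data file
y = df["output"]
predictors = ["x1", "x2", "x3", "x4", "x5", "x6"]

results = []
for k in range(1, len(predictors) + 1):
    for subset in combinations(predictors, k):
        X = sm.add_constant(df[list(subset)])
        fit = sm.OLS(y, X).fit()
        results.append((fit.rsquared_adj, fit.rsquared, subset))

# Show the best few subsets, largest adjusted R-squared first
for r2_adj, r2, subset in sorted(results, reverse=True)[:5]:
    print(f"{subset}: R-sq(adj) = {r2_adj:.3f}, R-sq = {r2:.3f}")
```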
January 23, 2010 at 1:48 pm #188612
Eric Maass (@poetengineer)
Madhukar - I don’t know if this is the problem in your specific case, but it is often difficult to get consistent results from multiple linear regression when the “independent variables” are not actually independent, that is, when they are correlated. Can you get the matrix of correlation coefficients among the 6 “independent” input parameters?
Best regards,
Eric Maass
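A quick way to get the correlation matrix Eric asks about, sketched with pandas (the file and column names are hypothetical):

```python
# Pairwise correlations among the six input parameters
# (file name and column names are hypothetical).
import pandas as pd

df = pd.read_csv("process_data.csv")
inputs = df[["x1", "x2", "x3", "x4", "x5", "x6"]]

print(inputs.corr().round(2))   # Pearson correlation matrix
```

As a rough rule of thumb, pairwise correlations much above about 0.7 in absolute value suggest the predictors carry overlapping information.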
January 23, 2010 at 3:33 pm #188614
Tierradentro (@john)
Simply plotting the response against the predictors and the predictors against each other often provides clarity. Good luck.
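All of those plots can be produced in one step with a scatterplot matrix; a sketch using pandas and matplotlib (same hypothetical file and column names as above):

```python
# Scatterplot matrix of the response and the six predictors
# (column names are hypothetical).
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("process_data.csv")
cols = ["output", "x1", "x2", "x3", "x4", "x5", "x6"]

pd.plotting.scatter_matrix(df[cols], figsize=(10, 10), diagonal="hist")
plt.show()
```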
January 25, 2010 at 12:53 pm #188646
Robert Butler (@rbutler)
Everything that Stan, Eric, and John have said is correct and should be taken into account when addressing your problem.
Specifically, what is going on is that you are looking at correlations between your X’s and your Y. When you have multiple X’s, you are looking at the correlation between a particular X and Y while controlling for the effects of the other X’s. What this means is that the significance of the X’s can and will change as X’s are entered into or removed from the model.
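To see why this happens, here is a small simulated example (not from the original thread, purely illustrative): two strongly correlated X’s where only one actually drives Y, so the apparent significance of the other depends entirely on which terms are in the model.

```python
# Illustration: with correlated predictors, a term's apparent significance
# depends on which other terms are in the model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)      # x2 is strongly correlated with x1
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)  # only x1 actually drives y

both = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
only_x2 = sm.OLS(y, sm.add_constant(x2)).fit()

print("x2 p-value with x1 in the model:", round(both.pvalues[2], 4))
print("x2 p-value with x1 dropped:     ", round(only_x2.pvalues[1], 4))
print("R-sq with both:", round(both.rsquared, 3),
      " R-sq with x2 only:", round(only_x2.rsquared, 3))
```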
Stan’s recommendation of best subsets is one approach to identifying the final significant X values. Another possibility is running backward elimination and stepwise regression on the full data set to see if they converge to the same model. The fact that your “full” model didn’t “blow up” – that is, the regression package “allowed” all 6 X’s into an initial model – indicates that the X’s are not perfectly correlated with one another, but it still doesn’t tell you whether their partial confounding is such that all of them cannot be included in the model-building effort.
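A rough sketch of p-value-based backward elimination, again written in Python with statsmodels rather than Minitab (the file and column names are hypothetical, and the 0.05 cutoff is just a common convention):

```python
# Backward elimination sketch: start with all candidate predictors and
# repeatedly drop the term with the largest p-value above the threshold.
import pandas as pd
import statsmodels.api as sm

ALPHA = 0.05
df = pd.read_csv("process_data.csv")          # hypothetical data file
y = df["output"]
kept = ["x1", "x2", "x3", "x4", "x5", "x6"]   # hypothetical column names

while kept:
    fit = sm.OLS(y, sm.add_constant(df[kept])).fit()
    pvals = fit.pvalues.drop("const")         # ignore the intercept
    if pvals.max() <= ALPHA:
        break                                 # every remaining term is significant
    kept.remove(pvals.idxmax())               # drop the least significant term

print("terms retained:", kept)
```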
If you have Minitab, it is my understanding that you have the ability to look at the VIFs for each of the 6 X’s. This is preferable to a simple correlation matrix, but if you don’t have this capability then you should take Eric’s advice and look at the correlation matrix of the X’s. In either case you should plot the X’s against one another and look at the resulting plots, for exactly the reasons John mentioned.
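If Minitab isn’t available, the VIFs can also be computed directly; a sketch with statsmodels (column names hypothetical; a common rule of thumb treats VIFs above roughly 5 to 10 as a sign of troublesome collinearity):

```python
# Variance inflation factors for the six predictors
# (column names are hypothetical).
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

df = pd.read_csv("process_data.csv")
X = sm.add_constant(df[["x1", "x2", "x3", "x4", "x5", "x6"]])

for i, name in enumerate(X.columns):
    if name == "const":
        continue
    print(f"VIF({name}) = {variance_inflation_factor(X.values, i):.2f}")
```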
Another thing you should do is scale all of the X’s to a -1 to 1 range before running the regression. This avoids the problems that can arise in term selection when the ranges of the various X’s are very different from one another (for example, X1 has a range of 0.1 to 0.6 and X2 has a range of 10 to 100).
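That -1 to 1 coding is just a linear rescale from each X’s observed minimum and maximum; a minimal sketch (hypothetical column names again):

```python
# Scale each predictor to the range [-1, 1] using its observed min and max
# (column names are hypothetical).
import pandas as pd

df = pd.read_csv("process_data.csv")
inputs = df[["x1", "x2", "x3", "x4", "x5", "x6"]]

lo, hi = inputs.min(), inputs.max()
coded = 2 * (inputs - lo) / (hi - lo) - 1      # -1 at the minimum, +1 at the maximum

print(coded.describe().loc[["min", "max"]])    # confirm each column now spans -1 to 1
```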
January 29, 2010 at 6:47 pm #188857
Imawizard (@Imawizard)
Sometimes when you add or remove variables in your regression analysis, the number of data points (rows) being used in the regression changes. This happens when your data set is not perfectly complete, i.e. you may have missing points for some variables and not others. Minitab tells you how many points are being used and how many have missing values in the regression output. Compare this information across your regression analyses.
When the data set changes, you can get very different results from your regression analysis.
Just a possibility.
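A quick way to check the point Imawizard raises, sketched with pandas (column names are hypothetical, and the three-column reduced model is only an example): count missing values per column and compare how many complete rows each candidate model actually uses.

```python
# How many rows does each model actually use once missing values are dropped?
# (Column names are hypothetical.)
import pandas as pd

df = pd.read_csv("process_data.csv")

print(df.isna().sum())                          # missing values per column

full = ["output", "x1", "x2", "x3", "x4", "x5", "x6"]
reduced = ["output", "x1", "x3", "x5"]          # e.g. the three significant KPIs

print("rows used, 6-X model:", len(df[full].dropna()))
print("rows used, 3-X model:", len(df[reduced].dropna()))
```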
The forum ‘General’ is closed to new topics and replies.