Correlation and regression
 This topic has 6 replies, 4 voices, and was last updated 14 years, 8 months ago by Sachin Dhavle.


March 31, 2008 at 4:58 am #49730
Sachin Dhavle (Member, @SachinDhavle)
Is it necessary for an independent variable (X) to be correlated with the dependent variable (Y)?
How do I find the correlation between (i) continuous data and categorical data, (ii) continuous data and binary data, and (iii) continuous data and discrete data?
What are the assumptions of a regression model? Which variables should we include in a regression model? What is the effect on the regression equation if we include an independent variable (X) that is uncorrelated with the response variable (Y)?
March 31, 2008 at 10:03 am #170340
Fake MBB (Participant, @FakeMBB)
Yes
March 31, 2008 at 12:42 pm #170353
Robert Butler (Participant, @rbutler)
With the exception of the question about an uncorrelated X, you posted this same question last week. As I said (see the post below), for a continuous Y the method of regression is ordinary least squares, and the ways of handling the various X’s in the model are as stated previously.
https://www.isixsigma.com/forum/showmessage.asp?messageID=138584
It isn’t necessary for an X to be correlated with a Y. You could include it if you wanted to, but the whole idea of building a regression equation is to construct a mathematical description of the relationships between a Y and those X’s that are statistically significantly correlated with the Y. The only effect of including a variable that is not significantly correlated with the Y would be a slight increase in the R² and a slight decrease in the residual sum of squares (and in the RMSE, unless the RMSE is adjusted for degrees of freedom, in which case it can rise slightly).
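The effect of adding an unrelated X can be checked numerically. The following is a minimal sketch on simulated data, assuming NumPy is available; the helper `fit_ols` and the variables `x1` and `noise_x` are illustrative names, not something from this thread. The RMSE here is computed as sqrt(SSE / (n - p)), so unlike R² it is not guaranteed to move in one direction.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y on X plus an intercept; returns (R^2, SSE, RMSE)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
    resid = y - A @ coef
    sse = float(resid @ resid)                    # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    p = A.shape[1]                                # number of fitted coefficients
    rmse = float(np.sqrt(sse / (n - p)))          # degrees-of-freedom adjusted
    return 1.0 - sse / sst, sse, rmse

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)                  # a predictor that really drives Y
noise_x = rng.normal(size=n)             # generated independently of Y
y = 2.0 + 3.0 * x1 + rng.normal(size=n)

r2_a, sse_a, rmse_a = fit_ols(x1.reshape(-1, 1), y)
r2_b, sse_b, rmse_b = fit_ols(np.column_stack([x1, noise_x]), y)

print(f"x1 only:      R2={r2_a:.4f}  SSE={sse_a:.3f}  RMSE={rmse_a:.4f}")
print(f"x1 + noise_x: R2={r2_b:.4f}  SSE={sse_b:.3f}  RMSE={rmse_b:.4f}")
```

With an intercept in both models, the second R² cannot be smaller than the first and the second SSE cannot be larger; how the adjusted RMSE moves depends on the particular sample.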
There are no assumptions concerning a regression model. Least squares regression is nothing more than a series of specified algebraic calculations. The assumptions occur when you want to examine the regression model and they are:
1. The error (that is, the difference between the fitted line and each of the values of Y in the data set under consideration) is a random variable with mean zero and an unknown variance, sigma squared.
2. The error values are uncorrelated.
3. The error is a normally distributed random variable. This last assumption implies that the error terms are not only uncorrelated but necessarily independent.
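Written in symbols, the three assumptions look roughly like this. The notation (k predictors, n observations, coefficients beta, error epsilon) is an assumption chosen for illustration and is not taken from the thread or from Draper and Smith:

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i,
\qquad i = 1, \dots, n

\begin{aligned}
&1.\; E[\varepsilon_i] = 0, \quad \operatorname{Var}(\varepsilon_i) = \sigma^2 \\
&2.\; \operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0 \quad (i \neq j) \\
&3.\; \varepsilon_i \sim N(0, \sigma^2)
\end{aligned}
```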
A good discussion of the above can be found on pp. 833 of Applied Regression Analysis, 2nd Edition – Draper and Smith.
March 31, 2008 at 1:14 pm #170355
Sachin Dhavle (Member, @SachinDhavle)
Robert, I understand about regression. But I also asked how to find the correlation between a continuous variable and a categorical variable.
March 31, 2008 at 1:31 pm #170356
Robert Butler (Participant, @rbutler)
Yes, you did, and in the first post I said that for a continuous Y and categorical X’s you build dummy variables for the X’s and run the regression of Y against the dummy variables.
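The dummy-variable approach can be sketched as follows, assuming NumPy and a small made-up data set; the `machine` labels and `y` values are hypothetical and chosen only to keep the arithmetic easy to follow. A categorical X with k levels becomes k - 1 zero/one columns, with the remaining level acting as the baseline.

```python
import numpy as np

# Hypothetical data: a continuous Y measured on three machines (categorical X).
machine = np.array(["A", "A", "B", "B", "C", "C"])
y = np.array([10.0, 12.0, 15.0, 17.0, 20.0, 22.0])

levels = np.unique(machine)                   # sorted levels: A, B, C
baseline, others = levels[0], levels[1:]      # A is the reference level

# One 0/1 dummy column for each non-baseline level.
dummies = np.column_stack([(machine == lv).astype(float) for lv in others])

# Regress Y on an intercept plus the dummy columns by least squares.
A = np.column_stack([np.ones(len(y)), dummies])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

for name, value in zip(["intercept", *others], coef):
    print(f"{name}: {value:.2f}")
```

In this setup the intercept estimates the mean of Y for the baseline level, and each dummy coefficient estimates how far that level's mean sits from the baseline mean.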
March 31, 2008 at 7:26 pm #170375
It sounds like you need an introduction to multiple linear regression:
http://www2.chass.ncsu.edu/garson/pa765/regress.htm
April 1, 2008 at 4:34 am #170395
Sachin Dhavle (Member, @SachinDhavle)
Thanks, Robert and Ryan, for your help.