# Correlation and regression


Viewing 7 posts - 1 through 7 (of 7 total)
• #49730

Sachin Dhavle
Member

Is it necessary that the independent variable (X) be correlated with the dependent variable (Y)?
How do I find the correlation between (i) continuous data and categorical data, (ii) continuous data and binary data, and (iii) continuous data and discrete data?
What are the assumptions for a regression model? Which variables should we include in a regression model? What is the effect on the regression equation if we include an independent variable (X) that is uncorrelated with the response variable (Y)?

#170340

Fake MBB
Participant

Yes

#170353

Robert Butler
Participant

With the exception of the question about an uncorrelated X, you posted this same question last week. As I said (see the post linked below), for a continuous Y the method of regression is ordinary least squares, and the ways of handling the various X’s in the model are as stated previously.
https://www.isixsigma.com/forum/showmessage.asp?messageID=138584
It isn’t necessary for an X to be correlated with a Y. You could include it if you wanted to, but the whole idea of building a regression equation is to construct a mathematical description of the relationships between a Y and those X’s that are statistically significantly correlated with the Y. The only effect of including a variable that is not significantly correlated with the Y would be a slight increase in the R² and a slight decrease in the residual sum of squares; the RMSE itself can move slightly in either direction, because one more degree of freedom is spent on the extra term.
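A minimal sketch of the effect of adding an unrelated X, in plain Python on simulated data. The sample size, coefficients, seed, and the helper names `solve` and `fit_ols` are assumptions made for illustration; none of them come from this thread. It fits Y on X1 alone and then on X1 plus an X2 generated independently of Y. The RMSE here divides by the residual degrees of freedom.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Least-squares fit of y on an intercept plus the predictor columns.

    Returns (coefficients, r_squared, sse, rmse).
    """
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    rmse = (sse / (len(y) - k)) ** 0.5  # residual degrees of freedom = n - k
    return beta, 1.0 - sse / sst, sse, rmse


# Simulated data (assumed values): Y depends on X1 only.
rng = random.Random(0)
n = 50
x1 = [rng.uniform(0, 10) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]  # generated independently of Y
y = [2.0 + 1.5 * a + rng.gauss(0, 1) for a in x1]

beta_small, r2_small, sse_small, rmse_small = fit_ols([x1], y)
beta_big, r2_big, sse_big, rmse_big = fit_ols([x1, x2], y)

print(f"Y ~ X1      : R2={r2_small:.5f}  SSE={sse_small:.4f}  RMSE={rmse_small:.4f}")
print(f"Y ~ X1 + X2 : R2={r2_big:.5f}  SSE={sse_big:.4f}  RMSE={rmse_big:.4f}")
```

With least squares, the R² of the larger model is never below that of the smaller one, and its residual sum of squares is never above it. Whether the RMSE goes up or down depends on whether the small drop in the residual sum of squares outweighs the lost degree of freedom.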
There are no assumptions concerning a regression model.  Least squares regression is nothing more than a series of specified algebraic calculations.  The assumptions occur when you want to examine the regression model and they are:
1. The error (that is, the difference between the fitted line and each of the values of Y in the data set under consideration) is a random variable with mean zero and an unknown variance σ².
2. The error values are uncorrelated.
3. The error is a normally distributed random variable. This last assumption implies that the error terms are not only uncorrelated but necessarily independent.
A good discussion of the above can be found on pp. 8-33 of Applied Regression Analysis, 2nd Edition – Draper and Smith.
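The three assumptions listed above are usually examined through the residuals of the fitted model. What follows is a rough sketch in plain Python, assuming simulated data and a single X; the data, seed, and variable names are illustrative only. It computes the Durbin-Watson statistic as one common check for correlated errors, and the skewness and excess kurtosis of the residuals as crude indicators of non-normality. In practice, plots of residuals against fitted values and a normal probability plot are the usual companions to these numbers.

```python
import random

# Simulated data (assumed values), listed in run order.
rng = random.Random(1)
n = 40
x = [float(i) for i in range(n)]
y = [3.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

# Simple linear regression by least squares.
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# With an intercept in the model the residuals average to zero by
# construction, so this number says nothing about assumption 1 by itself;
# look for patterns in residuals versus fitted values instead.
mean_resid = sum(residuals) / n

# Assumption 2: Durbin-Watson statistic. Values near 2 suggest little
# lag-1 correlation; it only makes sense when the data are in run order.
durbin_watson = sum(
    (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n)
) / sum(e * e for e in residuals)

# Assumption 3: sample skewness and excess kurtosis of the residuals.
# Both are near 0 for normally distributed errors.
centered = [e - mean_resid for e in residuals]
m2 = sum(c ** 2 for c in centered) / n
m3 = sum(c ** 3 for c in centered) / n
m4 = sum(c ** 4 for c in centered) / n
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3.0

print(f"mean residual   = {mean_resid:.2e}")
print(f"Durbin-Watson   = {durbin_watson:.3f}")
print(f"skewness        = {skewness:.3f}")
print(f"excess kurtosis = {excess_kurtosis:.3f}")
```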

#170355

Sachin Dhavle
Member

Robert, I understand about regression. But I also asked how to find the correlation between a continuous variable and a categorical variable.

#170356

Robert Butler
Participant

Yes, you did, and in the first post I said that for a continuous Y and categorical X’s you build dummy variables for the X’s and run the regression of Y against the dummy variables.
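A small sketch of the dummy-variable approach in plain Python. The data (a made-up cycle time for three machines), the choice of "A" as the reference level, and the helpers `solve` and `dummy_code` are all assumptions for illustration. A categorical X with k levels becomes k-1 columns of 0/1 values, and Y is then regressed on those columns; a binary X is the two-level case with a single dummy.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def dummy_code(labels, reference):
    """One 0/1 column per non-reference level (k levels -> k-1 dummies)."""
    levels = sorted(set(labels) - {reference})
    rows = [[1.0 if lab == lev else 0.0 for lev in levels] for lab in labels]
    return levels, rows


# Made-up data: continuous Y (cycle time) and categorical X (machine).
machine = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
cycle_time = [10.1, 9.8, 10.4, 12.0, 11.7, 12.3, 9.0, 9.4, 8.9]

levels, dummies = dummy_code(machine, reference="A")
design = [[1.0] + d for d in dummies]  # intercept column plus dummies
k = len(design[0])

# Normal equations: (X'X) beta = X'y
xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * y for r, y in zip(design, cycle_time)) for i in range(k)]
beta = solve(xtx, xty)


def group_mean(level):
    values = [y for lab, y in zip(machine, cycle_time) if lab == level]
    return sum(values) / len(values)


print("intercept (mean of reference level A):", round(beta[0], 4))
for lev, coef in zip(levels, beta[1:]):
    print(f"machine {lev} minus machine A:", round(coef, 4))
```

With a single categorical X coded this way, the intercept equals the mean of Y at the reference level and each dummy coefficient equals that level's mean minus the reference mean, which is why the regression on dummies answers the "is Y related to this category" question.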

#170375

Ryan
Member

It sounds like you need an introduction to multiple linear regression:
http://www2.chass.ncsu.edu/garson/pa765/regress.htm

#170395

Sachin Dhavle
Member

Thanks, Robert and Ryan, for your help.

