Does regression analysis require normal data?
 This topic has 5 replies, 4 voices, and was last updated 18 years, 6 months ago by Bart.


May 18, 2004 at 2:07 pm #35571
Hi,
Do we need normal data to carry out regression analysis?
Until now I thought we didn’t, but the other day one of my colleagues suggested that normal data is mandatory for regression analysis.
Please help shed some light on this.

May 18, 2004 at 2:30 pm #100418
Reva,
I am by no means a regression expert, but through all of my training as a BB and MBB and in my reference books I can find no mention that normal data is a requirement for regression analysis.
I live by the following rules when it comes to regression:
1. The regression model is based on the data set used. Be careful when using the model to predict outside the region it models.
2. Measurement systems have to be capable. A large amount of measurement error wreaks havoc on the model.
3. Historical data used to generate a model may not predict future performance.
4. Regression should not be used to eliminate potential variables from a DOE, BUT it can be used to add variables. An independent variable may not look important in a regression analysis, but if it is included in a DOE where we choose levels outside our normal operating range, it may turn out to be important.
Not sure if this helps you or not. My only word of caution would be to scrutinize the conclusions of the analysis – normal data may not be required, but whether or not your data is normal should be considered when drawing your conclusions.
Doug

May 18, 2004 at 2:38 pm #100421
Thanks for confirming Doug.

May 18, 2004 at 2:39 pm #100422
No problem Reva…although I think that confirming is a strong word to use. As I stated, I am no regression expert. I will be interested to see if Stan or Darth weigh in on your question…

May 18, 2004 at 2:51 pm #100424
Ken Feldman (@Darth)
Hey, watch those comments about my weight. I just got back from South Beach so I can try the diet now. In fact, I grew up on Miami Beach and never realized SoBe would be famous as the home of the Porky’s movies, Miami Vice and a diet.
With regard to the comments about normality, regression assumes that the residuals (observed minus predicted values) are normally distributed. Even though most tests (specifically the F-test) are quite robust to violations of this assumption, it is always a good idea, before drawing final conclusions, to review the distributions of the major variables of interest. You can produce histograms of the residuals as well as normal probability plots in order to inspect the distribution of the residual values.

May 18, 2004 at 2:56 pm #100425
The simple linear regression and related methods that most people are taught in school or BB training DO assume that the model’s RESIDUALS (the differences between the observed values and the predicted values) are normally distributed with a common variance. The raw data themselves will not necessarily follow a single distribution. That is why you should test the residuals for normality.
Techniques are available that can fit models using other distributions. When using regression with life data, for example, it is common to use distributions such as Weibull, Lognormal, and others.
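
For anyone who wants to try the residual check discussed in this thread, here is a minimal sketch in plain Python (standard library only). The data are made up, and the helper names `fit_line`, `residuals`, and `normal_plot_correlation` are hypothetical, written just for this illustration. It fits a straight line to a predictor that is deliberately non-normal (uniform), then measures how straight a normal probability plot would be for the raw predictor and for the residuals. The plotting positions used are one common convention among several.

```python
import random
from statistics import NormalDist, fmean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """Observed minus predicted values from the fitted line."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def normal_plot_correlation(values):
    """Correlation between the sorted values and standard normal quantiles.

    This summarizes how straight a normal probability plot would look:
    values near 1 mean the points fall close to a straight line.
    """
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (i + 0.5) / n: an assumed, commonly used choice.
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mo, mq = fmean(ordered), fmean(quantiles)
    cov = sum((o - mo) * (q - mq) for o, q in zip(ordered, quantiles))
    var_o = sum((o - mo) ** 2 for o in ordered)
    var_q = sum((q - mq) ** 2 for q in quantiles)
    return cov / (var_o * var_q) ** 0.5


# Made-up data: x is uniform (far from normal), the noise is normal.
rng = random.Random(0)
x = [rng.uniform(0, 100) for _ in range(200)]
y = [3.0 + 0.5 * xi + rng.gauss(0, 2) for xi in x]

res = residuals(x, y)
r_x = normal_plot_correlation(x)
r_res = normal_plot_correlation(res)
print(f"normal-plot correlation of raw x:     {r_x:.3f}")
print(f"normal-plot correlation of residuals: {r_res:.3f}")
```

The raw predictor is not normal, yet the residuals should still plot close to a straight line, which is the point made above: it is the residuals that need checking, not the raw data. This number is only a rough summary, so it is still worth looking at the histogram and the probability plot themselves.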