# Regression as a screening tool


• #52406

Lorax
Participant

Folks,
I’ve got a bunch (10) of “potential” Xs (they are potential because their variation has not yet been proven to influence the Y).
To identify which ones are the “critical few”, I’m regressing each one in turn against the Y. This gives me a couple of things. Firstly, it tells me whether that particular potential X is significant; it also tells me how much of the variation in the Y is attributable to that potential X.
My question is around the criteria for a regression model. Is it critical at this stage that my DF are greater than 4?
How critical is it that the data follows a normal distribution? Can I get away with “nearly normal”?
Comments on the approach are also very welcome.
Lorax

#184390

Remi
Participant

Hai Lorax,
Your method sounds sound, but it isn’t. There are 2 risks that can turn your conclusions into garbage.
Risk 1: the X’s are not independent.
Risk 2: the X’s have interactions (Y responds to X1 and X2 together differently than to each one separately, because they strengthen each other’s influence on Y).
The recommended method:
1: Check if the X’s are independent.
2: Make a regression model of Y as a function of all X’s together (including interactions).
How to:
1: Matrix plot of the X’s (this only shows 1-on-1 dependency), or use Principal Components Analysis (BB-level knowledge).
2: (In MINITAB) use GLM or analyse as a linear DoE; the p-values tell you which X’s or interactions to remove, and R-sq tells you how much you are still missing.

Good luck,
Remi

#184393

Lorax
Participant

Thanks Remi, some good thoughts.
You are right about the possible existence of interactions. I don’t know if a matrix plot is going to help get round this though.
What needs to be done is the reduction of a big list of potential Xs down to a smaller list which includes only the “juiciest” ones (those which have the biggest impact and are conceivably possible to control or manage).
Perhaps I should hold off analyzing anything until I get all the data and then do some sort of analysis of them all against the Y at that point (your GLM idea or a multiple regression).
Thanks again
Lorax

#184395

Cone
Participant

You are on the right track. The cautions by Remi are technically correct but overly cautious at this point. Instead of just regressing all X’s against the Y, do a matrix plot of all variables. You will see strong relations with the output if they exist. You will also see inputs that are playing together (lack independence). Anything that looks like a relationship demands the scrutiny demanded by Remi. Everything else can be dropped (screened, which is your objective). Make sure your measurements of both the inputs and outputs are

#184524

ROSS
Member

Hi:
I have found that using Stepwise Multiple Regression with all the X’s in the Model along with Two Factor Interactions of those X’s is a useful tool to begin the screen of multiple X’s (some of which may be co-linear). Stepwise Regression is a natural way to identify & remove co-linear X’s from the model. Hope this helps. Have a great day, Tony
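For readers without a stats package, the idea can be sketched in plain Python. Note this is a simplified stand-in: it does greedy forward selection on R-sq gain, not the p-value based enter/remove rules that stepwise procedures in statistical software typically use. The 2^3 coded design, the coefficients, the noise values, the `min_gain` cutoff, and all function names are assumptions for illustration.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def r_squared(columns, y):
    """R-sq of a least-squares fit of y on the given columns plus an intercept."""
    rows = [[1.0, *vals] for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def forward_select(terms, y, min_gain=0.02):
    """Greedy forward selection: keep adding the term with the largest R-sq gain."""
    chosen, best_r2 = [], 0.0
    remaining = list(terms)
    while remaining:
        trial = {
            name: r_squared([terms[c] for c in chosen] + [terms[name]], y)
            for name in remaining
        }
        name = max(trial, key=trial.get)
        if trial[name] - best_r2 < min_gain:
            break
        chosen.append(name)
        remaining.remove(name)
        best_r2 = trial[name]
    return chosen, best_r2

# Hypothetical 2^3 coded design with an assumed X1 effect and X1*X2 interaction.
x1 = [-1, 1, -1, 1, -1, 1, -1, 1]
x2 = [-1, -1, 1, 1, -1, -1, 1, 1]
x3 = [-1, -1, -1, -1, 1, 1, 1, 1]
noise = [0.1, -0.2, 0.15, -0.05, -0.1, 0.2, -0.15, 0.05]
y = [10 + 4 * a + 3 * a * b + e for a, b, e in zip(x1, x2, noise)]

# Candidate terms: main effects plus all two-factor interactions.
terms = {
    "X1": x1, "X2": x2, "X3": x3,
    "X1*X2": [a * b for a, b in zip(x1, x2)],
    "X1*X3": [a * b for a, b in zip(x1, x3)],
    "X2*X3": [a * b for a, b in zip(x2, x3)],
}

chosen, r2 = forward_select(terms, y)
print(chosen, round(r2, 3))
```

The solver here will fail on exactly collinear columns, so treat it as a teaching aid only; real stepwise routines handle that case and give you p-values as well.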

#184530

Lorax
Participant

Thanks folks,
I’m leaning more and more toward a GLM to give me the answer as to which Potential Xs are most influential.
Lorax
