# What is the right technique

• #39212

Kavita Sahu
Participant

If we have a mixture of continuous and discrete independent variables (X’s), which technique or methodology should be used to model the dependent variable (Y)?

#118830

indresh
Participant

Can you describe in detail exactly what you are trying to achieve?
Mail me at [email protected]
Regards

#118831

Robert Butler
Participant

It depends on what you are trying to do. If you wish to express the relationship as a regression equation, then code the discrete variables in the usual manner, normalize the continuous ones, run regression diagnostics to make sure your X matrix isn’t suffering from collinearity, and then run stepwise regression (both backward elimination and forward selection with replacement) to identify those X’s exhibiting significant correlation with the Y. Once you have the “final” model, run the usual diagnostics on the residuals to test model adequacy.
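As a rough illustration of the first steps described above, here is a minimal sketch that normalizes continuous X's and screens them for collinearity. It assumes "normalize" means z-scoring, and it uses pairwise correlation as the screen, which is only a crude stand-in for full regression diagnostics (it will not catch collinearity involving three or more variables). The function names, the 0.9 threshold, and the data are all made up for the example.

```python
from math import sqrt


def standardize(values):
    """Z-score a list: subtract the mean, divide by the sample std deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def correlation(a, b):
    """Pearson correlation computed from the two standardized columns."""
    za, zb = standardize(a), standardize(b)
    return sum(p * q for p, q in zip(za, zb)) / (len(a) - 1)


def flag_collinear(columns, threshold=0.9):
    """Return (name1, name2, r) for each pair with |r| >= threshold.

    The threshold is an arbitrary illustrative cutoff, not a standard value.
    """
    names = list(columns)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = correlation(columns[first], columns[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 3)))
    return flagged


# Made-up data: temp_f is just temp_c in different units, so the two
# columns carry the same information and should be flagged.
temp_c = [10.0, 20.0, 30.0, 40.0, 50.0]
columns = {
    "temp_c": temp_c,
    "temp_f": [1.8 * t + 32.0 for t in temp_c],
    "pressure": [3.0, 1.0, 4.0, 1.0, 5.0],
}
flagged = flag_collinear(columns)
```

Here only the `temp_c` / `temp_f` pair is flagged, so one of the two would be dropped before moving on to stepwise selection.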

#118833

Whitehurst
Participant

First of all, separate the X’s into continuous and discrete. If your Y is continuous and normally distributed and X is also continuous, you can use linear regression; if X is discrete, you can use a test for equality of variances, ANOVA, a t-test, or a z-test. If your Y is discrete and X is discrete, you can use a chi-square test; if X is continuous, you can use binary logistic regression. If your Y is not normal, you can use nonparametric tests (one-sample Wilcoxon, one-sample sign, Mann-Whitney, Kruskal-Wallis, Mood’s median test), which are the nonparametric counterparts of the tests mentioned above.
Hope this helps! If you need any further help, please feel free to mail me at [email protected]
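The rule of thumb in this post can be written down as a small lookup table. This is only a sketch of the mapping as stated here, not a complete decision procedure; the names `RULE_OF_THUMB` and `suggest` are made up for the example.

```python
# Keys are (type of Y, type of X); values are the techniques named in the post.
RULE_OF_THUMB = {
    ("continuous", "continuous"): ["linear regression"],
    ("continuous", "discrete"): [
        "equality-of-variances test", "ANOVA", "t-test", "z-test",
    ],
    ("discrete", "discrete"): ["chi-square test"],
    ("discrete", "continuous"): ["binary logistic regression"],
}


def suggest(y_type, x_type):
    """Look up the techniques the post lists for a given Y/X combination."""
    try:
        return RULE_OF_THUMB[(y_type, x_type)]
    except KeyError:
        raise ValueError(f"unknown combination: Y={y_type!r}, X={x_type!r}")


options = suggest("discrete", "continuous")
```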

#118835

Robert Butler
Participant

The post concerning the various methods of handling the X’s and the Y’s for discrete and continuous data is correct in that it is the “boilerplate” that, of necessity, must be taught if you are going to try to give people useful statistical tools in the two weeks of allotted BB training time. Unfortunately, it is an oversimplification that, if left unquestioned, can result in a belief in barriers and limitations where none exist.
Some of the oversimplifications are as follows:
1. Y does not have to be normal, and neither do the X’s. The only requirement for normality (and this only if you intend to use the usual tests for significance) applies to the residuals; see Draper and Smith, Applied Regression Analysis, 2nd Edition, pp. 22-23.
2. Y does not have to be continuous. In the extreme case when Y is binary, logistic regression addresses this issue.
3. When X is discrete or categorical, the usual procedure is to generate dummy variables for the X and run the regression against those.
For example, if X is a category with levels 1, 2, 3, then the coding is
if X = 1 then xdum1 = 1 and xdum2 = 0
if X = 2 then xdum1 = 0 and xdum2 = 1
if X = 3 then xdum1 = 0 and xdum2 = 0
The regression model is built with the terms xdum1 and xdum2 (Draper and Smith, p. 241, “The use of dummy variables in Multiple Regression”).
4. You can run a regression when both X and Y are discrete. The bigger question at this point would be: why would you want to do this?
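The dummy coding in point 3 can be sketched in a few lines of plain Python. The example below applies the same coding (level 3 as the reference) and fits an ordinary least squares model with one categorical and one continuous X by solving the normal equations. The helper names, the data, and the choice of solver are illustrative assumptions; in practice a statistics package would do the fitting.

```python
def dummy_code(x, reference=3, levels=(1, 2, 3)):
    """Return one 0/1 indicator per non-reference level, in level order."""
    if x not in levels:
        raise ValueError(f"unknown level: {x!r}")
    return [1 if x == level else 0 for level in levels if level != reference]


def solve(a, b):
    """Solve a @ beta = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: one categorical X (levels 1, 2, 3) and one continuous X.
category = [1, 1, 2, 2, 3, 3]
continuous = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
# Generated exactly as y = 10 + 2*xdum1 - 3*xdum2 + 0.5*continuous (no noise).
y = [12.0, 12.5, 7.0, 7.5, 10.0, 10.5]

# Design matrix columns: intercept, xdum1, xdum2, continuous.
design = [[1.0] + dummy_code(c) + [z] for c, z in zip(category, continuous)]
beta = fit_ols(design, y)
```

With this coding, the coefficients on xdum1 and xdum2 are the shifts of levels 1 and 2 relative to level 3, which is absorbed into the intercept. Because the data here were generated without noise, the fit recovers approximately 10, 2, -3, and 0.5.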
