Non Parametric – Which Factor Has Greatest Influence
This topic has 3 voices, contains 3 replies, and was last updated by
Robert Butler 288 days ago.
| Author | Posts |
|---|---|
| August 7, 2012 at 8:54 am #184549 | |
|
Chris Richardson @chrisrichardson1973 Reputation - 32 Rank - Aluminum
|
I am currently working on a problem where the results are expressed as % scrap, so the results are not normally distributed. We have 9 potential hypotheses that have shown a statistically significant difference between the groups when using Mood's median test. |
| August 7, 2012 at 10:04 am #184551 | |
|
Robert Butler @rbutler Reputation - 2150 Rank - Silver
|
When you say “we have 9 potential hypotheses that have shown a statistically significant difference between the groups,” what exactly do you mean?
a) It sounds like you are saying you have 9 possible variables which, in a univariate setting, have exhibited some kind of significant correlation with percent scrap.
b) If this is the issue, and if by “interdependencies between each of the factors” you mean that they are not perfectly orthogonal, then there are various ways to attack your problem.
One possibility: given that a) and b) above are true, you would first want to test the X matrix (your factor matrix) to see if the variables of interest are sufficiently independent of one another to permit inclusion in a multivariable regression. To this end you would want to look at VIFs and eigenvalues/condition indices. If the factors exhibit acceptable independence, then you could run a standard regression analysis (backward elimination and stepwise) to check for variable significance.
If, on the other hand, a) and/or b) are not true, then if you could provide additional details perhaps I or someone else could offer some additional thoughts. |
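As a rough illustration of the X-matrix check described in the reply above, here is a minimal Python sketch, assuming NumPy is available. The function names `vif` and `condition_indices` and the made-up columns `x1`, `x2`, `x3` are hypothetical and not from the thread. The VIFs are taken as the diagonal of the inverse correlation matrix, and the condition indices come from the singular values of the X matrix after scaling each column to unit length (no intercept column is included here, which is a simplification).

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    # VIF_j equals the j-th diagonal element of the inverse correlation matrix.
    return np.diag(np.linalg.inv(corr))

def condition_indices(X):
    """Condition indices: largest singular value divided by each singular value."""
    X = np.asarray(X, dtype=float)
    # Scale every column to unit length before taking singular values.
    Xs = X / np.linalg.norm(X, axis=0)
    s = np.linalg.svd(Xs, compute_uv=False)
    return s.max() / s

# Made-up data: x3 is almost an exact linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.05, size=200)
X = np.column_stack([x1, x2, x3])

print("VIF:", vif(X))
print("Condition indices:", condition_indices(X))
```

Large VIFs and a large maximum condition index both point to factors that are too entangled to separate in one multivariable model. Common rules of thumb put the warning level around a VIF of 10 or a condition index of 30, but those cutoffs are conventions, not hard limits.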
| August 8, 2012 at 3:30 am #184591 | |
|
Chris Richardson @chrisrichardson1973 Reputation - 32 Rank - Aluminum
|
Hi Robert, thanks for your reply. I have a % scrap value and 9 potential root causes, all of which are attribute-type values. The % scrap values are not normally distributed, so I have carried out Mood's median tests on each of the attributes, most of which return statistically significant differences between the groups. I’d like to understand which of the 9 attributes has the greatest impact on the % scrap figures, to identify where to start the improvements. I’d also like to understand whether there are any interdependencies between the 9 parameters. |
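For readers unfamiliar with the per-attribute test mentioned in the post above, the following is a minimal pure-Python sketch of Mood's median test for a single attribute. The function name `moods_median_test`, the attribute "shift", and the scrap numbers are all made up for illustration. Values equal to the grand median are counted as "at or below", which is one of several tie conventions.

```python
from statistics import median

def moods_median_test(groups):
    """Mood's median test for a dict {level: [values]}.

    Returns (grand_median, table, chi2, dof), where table[level] is
    (count above the grand median, count at or below it).
    """
    pooled = [v for values in groups.values() for v in values]
    grand = median(pooled)
    table = {}
    for level, values in groups.items():
        above = sum(1 for v in values if v > grand)
        table[level] = (above, len(values) - above)

    n = len(pooled)
    total_above = sum(a for a, _ in table.values())
    total_below = n - total_above
    if total_above == 0 or total_below == 0:
        raise ValueError("all values fall on one side of the grand median")

    # Pearson chi-square on the (levels x 2) table of above/below counts.
    chi2 = 0.0
    for above, below in table.values():
        size = above + below
        exp_above = size * total_above / n
        exp_below = size * total_below / n
        chi2 += (above - exp_above) ** 2 / exp_above
        chi2 += (below - exp_below) ** 2 / exp_below
    return grand, table, chi2, len(groups) - 1

# Hypothetical % scrap readings grouped by a made-up attribute "shift".
scrap_by_shift = {
    "A": [1.2, 0.8, 1.5, 0.9, 1.1],
    "B": [3.4, 2.9, 4.1, 3.8, 2.5],
    "C": [1.0, 1.3, 0.7, 2.6, 1.4],
}
grand, table, chi2, dof = moods_median_test(scrap_by_shift)
print(grand, table, chi2, dof)
```

The statistic is compared against a chi-square distribution with `dof` degrees of freedom, for example via `scipy.stats.chi2.sf(chi2, dof)` if SciPy is installed. SciPy also ships `scipy.stats.median_test`, which performs the whole test in one call. Note that this looks at one attribute at a time, so it says nothing about how the attributes relate to each other.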
| August 8, 2012 at 5:27 am #184593 | |
|
Robert Butler @rbutler Reputation - 2150 Rank - Silver
|
It would appear that what you are dealing with is an understanding of statistical methods at the intro-course boilerplate level. It’s OK as far as it goes, but your problem has pushed way past whatever you learned there.
Additional information:
1. When discussing regression, non-normality is not an issue as far as the Y’s or the X’s are concerned. In regression, the only place normality enters the discussion is when talking about the residuals.
2. In regression, X variables may be continuous, ordinal, or nominal (attribute).
The way to proceed is as follows:
1. Take your attribute measures and code them using dummy variables, e.g. poor: dummyvalue1 = 0, dummyvalue2 = 0, and the model we fit is response = fn(dummyvalue1, dummyvalue2).
If you code your attributes in this manner, what you will then need to do is assess the X matrix of the dummy variables in the manner outlined in my first post. Once you know what you can and cannot include in the multivariable analysis, you can run the regression analysis as mentioned before.
With dummy variables you cannot look for curvilinear terms (dummyvalue1*dummyvalue1), but you can look for interactions between the dummy value terms for DIFFERENT attributes in the model. Thus if you had dummy values for “value” and you had another variable “boss happiness”, you could look at the interactions of the dummy values for these two (assuming the test of the X matrix indicated the interactions were acceptably independent of the other terms in the multivariable study).
As for using R² to understand which of the factors might have the greatest influence on % scrap: don’t. Run a regression analysis (as opposed to just letting the machine run a regression) and assess the final results by examining the residuals and looking at prediction error.
Before trying any of this I’d recommend a little review of regression methods.
Get a copy of Regression Analysis by Example by Chatterjee and Price through inter-library loan and read the chapters on Qualitative Variables as Regressors (this is all about dummy variables and their construction), and Analysis of Collinear Data (this is about assessing the X matrix). Given that you have been away from the stats for 8 years, I’d also recommend reviewing the chapters on Simple Linear Regression and Detection and Correction of Model Violations: Simple Linear Regression. The book is well written and, because the examples include the data, you can put the data on a spreadsheet and run the analysis while following the discussion in the text.
|
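The dummy-variable approach in the last reply can be sketched as follows. This is only an illustration, assuming NumPy is available. The helper names `dummy_code` and `fit_ols`, the levels "fair" and "good", and the scrap numbers are hypothetical; only "poor" as the all-zeros reference level comes from the thread. The sketch codes one three-level attribute into two 0/1 columns, fits an ordinary least-squares model with an intercept, and returns the residuals for inspection.

```python
import numpy as np

def dummy_code(labels, reference):
    """Return (columns, level_names): one 0/1 column per level except `reference`."""
    levels = [lv for lv in sorted(set(labels)) if lv != reference]
    cols = np.array([[1.0 if lab == lv else 0.0 for lv in levels]
                     for lab in labels])
    return cols, levels

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, residuals)."""
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta

# Made-up data: attribute "value" with three levels and a % scrap response.
value = ["poor"] * 3 + ["fair"] * 3 + ["good"] * 3
scrap = [6.0, 7.0, 8.0, 4.0, 5.0, 3.0, 1.0, 2.0, 3.0]

# "poor" is the reference level, so it is coded as all zeros.
D, names = dummy_code(value, reference="poor")
beta, resid = fit_ols(D, scrap)

print("terms:", ["intercept"] + names)
print("coefficients:", beta)
print("residuals:", resid)
print("RMS residual:", np.sqrt(np.mean(resid ** 2)))
```

With a single attribute, the intercept is the mean response of the reference level and each dummy coefficient is that level's mean minus the reference mean. An interaction between two different attributes would be added as the elementwise product of one dummy column from each, and the combined X matrix should be checked for collinearity before fitting.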
© Copyright iSixSigma 2000-2013. Any reproduction or other use of content without the express written consent of iSixSigma is prohibited.