DoE with Skewness as a response


    Ovidiu Contras

    Hi everybody,
    The response for the DoE that I’m preparing is a distribution (the overall response is given by the values of the individual components, which are critical).
    What I am trying to find is a response that has no outliers (from experience, the data is always skewed, not normal).
    Can I use skewness as a response, and if so, how should it be interpreted: as attribute or as continuous data?
    Your feedback is much appreciated.
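
For illustration only, here is a minimal sketch of how one skewness value per run could be computed. Sample skewness is a real-valued statistic, so it would be analysed as continuous data. The function name, the run labels, and the measurements are all hypothetical, and the moment-based (biased) estimator is just one of several common definitions.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Made-up measurements for two DOE runs (not data from this thread).
runs = {
    "run_1": [1.0, 2.0, 3.0, 4.0, 5.0],   # symmetric sample
    "run_2": [1.0, 1.0, 1.0, 2.0, 10.0],  # long right tail
}

# One continuous response value per run.
skew_by_run = {name: sample_skewness(vals) for name, vals in runs.items()}
```

A symmetric sample gives a value near zero and a long right tail gives a positive value, so the response varies smoothly rather than falling into pass/fail categories.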


    Ø6 Sigma BB Coordinator

    It seems strange to me. However, you may be right.
    Why not try some other responses, such as S/N ratio, S, ln S, or log(S)?
    I also suggest my own new response: the sum of squares of (median - Xi).
    You do not necessarily have to believe me.
    Six Sigma Black Belt Coordinator
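
As a rough sketch, the alternative responses listed above could be computed per run as follows. The post does not say which S/N ratio is meant; the "nominal-the-best" form 10*log10(mean^2 / s^2) is used here purely as an assumption. The function name and the sample values are hypothetical.

```python
import math
import statistics

def dispersion_responses(values):
    """Candidate spread-type responses for one DOE run."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)       # sample standard deviation S
    med = statistics.median(values)
    return {
        "S": s,
        "ln_S": math.log(s),
        # Assumed S/N form (nominal-the-best), in decibels.
        "SN_nominal_db": 10 * math.log10(mean ** 2 / s ** 2),
        # Sum of squared deviations from the median.
        "SS_about_median": sum((x - med) ** 2 for x in values),
    }

# Made-up measurements for a single run.
resp = dispersion_responses([1.0, 1.0, 1.0, 2.0, 10.0])
```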


    Mike Carnell

    Have you looked at the residuals? It sounds like you are missing a factor. The residual analysis can help figure it out.
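
One simple way to look for a missing factor, sketched under assumptions: group the model residuals by the levels of a candidate variable that was not in the design and compare the group means. The residual values and the `batch` variable below are invented for illustration; a residual-versus-variable plot would serve the same purpose.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical residuals from a fitted DOE model, in run order.
residuals = [0.9, -1.1, 1.2, -0.8, 1.0, -1.0, 0.7, -0.9]
# Hypothetical unmodelled variable recorded for each run.
batch = ["A", "B", "A", "B", "A", "B", "A", "B"]

groups = defaultdict(list)
for r, b in zip(residuals, batch):
    groups[b].append(r)

# If the model were complete, each group mean should sit near zero.
mean_by_level = {level: mean(vals) for level, vals in groups.items()}
```

Group means that sit well away from zero with opposite signs suggest the grouping variable explains variation the model has left in the residuals.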


    Jim deVries

    Just a few added thoughts…
    Looking at the signal-to-noise ratio is a great approach, especially if you have already taken the data.  However, by including special causes in a model you may mask potential factors.  This approach may therefore not always yield representative process results, since you will never know whether your DOE runs covered all of the special cause and common cause permutations.  I suggest that you try to identify special causes and eliminate them before starting DOEs, because of the cost.  Since special causes normally cause more “pain” to the process, you would want to eliminate them first.
    In addition, technically speaking, a DOE (backwards ANOVA) should be framed as a common cause variation problem.  ANOVA assumes normal data with equal variances, which really addresses common cause variation, not special cause variation as you have mentioned.
    I would suggest identifying the special causes by looking at the normality of the raw data (or at the residuals, as was previously discussed).  I would try to split your project into two separate initiatives (special and common).  Let the DOE work on the common cause variation.  Special causes should be treated one by one.  By applying the typical analyze tools (box plots and normality tests), special causes can be identified and addressed.
    If your data follows a “naturally” skewed distribution (Weibull, Poisson, etc.), then you also have the option of transforming the output data prior to running it through the DOE tool.  This should help ensure that your analysis is correct.
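
A small sketch of the transformation idea, assuming a log transform (only valid for strictly positive data; other choices such as a square root or Box-Cox transform may suit a given distribution better). The data values and the helper name are made up for illustration.

```python
import math

def skewness(values):
    """Moment-based skewness m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed, strictly positive response values.
raw = [1.0, 2.0, 4.0, 8.0, 16.0]
logged = [math.log(v) for v in raw]  # transform before the DOE analysis

skew_raw = skewness(raw)     # positive: long right tail
skew_log = skewness(logged)  # close to zero for this sample
```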
    As stated previously, always check whether your model is satisfactory by looking at:
                Residuals:  homoscedasticity, special causes, patterns
                Practical significance:  less than 30% (SS error / SS total)
                VIF and correlation, which should also be checked for multicollinearity
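
The checks above can be sketched on a coded 2^3 full factorial with a main-effects model. Everything here is an illustrative assumption: the response values are invented, and the shortcut "coefficient = sum(x*y)/n" only holds because the coded design is orthogonal. When all pairwise correlations among the factor columns are zero, every VIF equals 1.

```python
from itertools import product

# Coded 2^3 full factorial (-1/+1); orthogonal by construction.
design = [list(levels) for levels in product((-1, 1), repeat=3)]
# Hypothetical responses, one per run.
y = [10.0, 12.0, 11.0, 13.5, 14.0, 16.5, 15.0, 18.0]

n = len(y)
y_mean = sum(y) / n

# Main-effect coefficients (valid shortcut for orthogonal coded designs).
coefs = [sum(row[j] * yi for row, yi in zip(design, y)) / n for j in range(3)]

fitted = [y_mean + sum(c * x for c, x in zip(coefs, row)) for row in design]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

ss_total = sum((yi - y_mean) ** 2 for yi in y)
ss_error = sum(r ** 2 for r in residuals)
error_fraction = ss_error / ss_total  # compare against the 30% guideline

def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / (va * vb) ** 0.5

# Pairwise correlations between factor columns (multicollinearity check).
columns = list(zip(*design))
max_abs_corr = max(
    abs(correlation(columns[i], columns[j]))
    for i in range(3) for j in range(i + 1, 3)
)
```

The residuals list is also what would be inspected for unequal spread, special causes, and patterns.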
    I hope that this helps.


    Ovidiu Contras

    Thanks for the feedback!
    What I probably failed to explain is that each measured individual response is a distribution (for a full factorial with 3 factors at 2 levels, I will have 8 responses to measure, i.e. 8 distributions). The overall individual response consists of smaller responses connected in series.
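
To make the layout concrete, here is a hedged sketch of a 2^3 full factorial in which each of the 8 runs yields a sample of measurements that is reduced to a single number before analysis. The factor names, the sample values, and the choice of the median as the summary statistic are all assumptions for illustration; any other statistic could be passed in instead.

```python
from itertools import product
from statistics import median

factors = ("A", "B", "C")  # hypothetical factor names
design = [dict(zip(factors, levels))
          for levels in product((-1, 1), repeat=len(factors))]

# Made-up measurements: one small sample (a "distribution") per run.
samples = [
    [1.0, 1.2, 3.5], [0.9, 1.1, 1.3], [1.4, 1.5, 4.0], [1.0, 1.0, 1.2],
    [2.0, 2.2, 6.1], [1.8, 2.0, 2.1], [2.4, 2.6, 7.0], [2.0, 2.1, 2.3],
]

def summarize(design, samples, statistic):
    """Attach one summary response per run, computed from that run's sample."""
    return [dict(run, response=statistic(vals))
            for run, vals in zip(design, samples)]

table = summarize(design, samples, median)
```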


    Jim deVries

    This is a little more complicated than I thought.  Without seeing the actual data, it is difficult for me to comment much further. 
    The only other suggestion is to make sure that you have created a Y = f(x) or CT Tree (a drill-down of how your different x’s are related to the 8 Y’s that you have indicated). 
    It depends if the Y’s are independent or dependent. 
    Once the above is known and established you could go through the steps outlined on my previous e-mail.
    Maybe some other folks have some suggestions. 
    I would need to see the data to better understand how to help.


    Robert Butler

      You can use anything you want for a response in a DOE.  The issue of normality only comes into play when you are developing your regression equations and wish to test the significance levels of the factors.  Then, the issue of normality focuses not on the responses or on the independent factors but on the normality of the residuals. Indeed,
     If the Y’s are not independent of one another this will be reflected in the terms entering your regression models.  You will discover (assuming that your X’s are indeed independent of each other) that Y’s that are not independent will tend to have the same terms in their correlation equations and the magnitudes and signs of the respective coefficients of the X’s (given that these have been normalized to a -1,1 range) will be similar.
      For a discussion of these issues I would recommend pages 22-33 of Applied Regression Analysis, Second Edition, by Draper and Smith.
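
As one possible way to check the normality of residuals numerically, the sketch below computes the correlation behind a normal probability plot: sorted residuals against theoretical normal quantiles. The function name, the Blom plotting positions, and both residual sets are assumptions chosen for illustration; a formal test or a visual plot could be used instead.

```python
from statistics import NormalDist, mean

def normal_plot_correlation(residuals):
    """Correlation between sorted residuals and normal quantiles."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Blom plotting positions (i - 0.375) / (n + 0.25), an assumed choice.
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
         for i in range(1, n + 1)]
    r_mean, z_mean = mean(ordered), mean(z)
    cov = sum((r - r_mean) * (q - z_mean) for r, q in zip(ordered, z))
    var_r = sum((r - r_mean) ** 2 for r in ordered)
    var_z = sum((q - z_mean) ** 2 for q in z)
    return cov / (var_r * var_z) ** 0.5

# Hypothetical residual sets: roughly symmetric vs. one large outlier.
symmetric = [-1.2, -0.6, -0.3, -0.1, 0.1, 0.3, 0.6, 1.2]
outlier = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 5.0]

r_symmetric = normal_plot_correlation(symmetric)
r_outlier = normal_plot_correlation(outlier)
```

A value close to 1 is consistent with normally distributed residuals, while a noticeably lower value points to skewness or outliers in the residuals, regardless of how the raw response is distributed.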
