DOE strategy


Viewing 5 posts - 1 through 5 (of 5 total)


    Has anyone investigated the effect of uncertain factor levels on P-values and interactions, such as when equipment is faulty, or when processes are inherently difficult to set?
    Further, is there a statistical term for the ability of a process to achieve a given set-point in terms of accuracy and precision? It seems to be kind of like a ‘process gauge capability.’
    I wonder what the effect of uncertain factor levels would be on interactions and P-values. If this is a potential problem, would averaging replicates improve the P-value and eliminate spurious interactions that do not exist in reality?


    Kim Niles

    Dear Andy:
    Since no one has commented I’ll give you my thoughts.  Take them with a grain of salt … :) 
    Regarding your first question: as I understand it, a DOE captures these issues as error and/or as how well the data fit the model, depending upon the type of problem.  When the R^2 value is low, your data did not fit the model in some way, such as when it is non-normal.  When error is high, the changes you made to the control factors you’re testing either had little effect relative to other (uncontrolled) changes taking place naturally, were not applied consistently, as you suggest, or weren’t tested over a large enough range.
    I have read about using process capability analysis (i.e. Cpk) on samples of Cpk values, which I think answers your second question, even though I’ve never done it.
    Regarding your third question: I believe that averaging replicates is the same thing as having no replicates and using an average value per run.  This moves the data toward normality in accordance with the Central Limit Theorem, and therefore might work as you suspect if the error in the data is affecting normality.
    I hope that helps. 



    Dear Kim,
    You are most kind …
    Recently, I had an opportunity to study aspects of a 3-D laser imaging system using a Box-Behnken design. My design started with eight replicates, and after final processing I wound up with about four per cell. (I don’t like using the cell average for a ‘missing value’ unless I have to …)
    After I analysed my results using ‘blocking,’ my R-squared was very low, with a high P-value. When I used Taguchi’s approach … averaging the replicates, the R-squared was greater than 80%.  (As you mention – no doubt due to the central limit theorem.)
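    A minimal Python sketch of why averaging raises R-squared (all numbers are invented; the 3x3 layout, the noise level and the variable names are assumptions for illustration, not the laser data). With a balanced design, fitting the cell averages gives the same coefficients as fitting the individual runs, but the within-cell scatter drops out of the total sum of squares, so R-squared cannot go down:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical balanced 3x3 factorial in coded units, 8 replicates per cell.
levels = (-1.0, 0.0, 1.0)
cells = np.array([(a, b) for a in levels for b in levels])
n_rep = 8
x_all = np.repeat(cells, n_rep, axis=0)  # each cell repeated n_rep times in a row

# Assumed truth: one real main effect (factor A) buried in large run-to-run noise.
y_all = 10.0 + 1.0 * x_all[:, 0] + rng.normal(scale=3.0, size=len(x_all))


def r_squared(x, y):
    """R^2 of an ordinary least-squares fit of y on intercept, A, B and A*B."""
    m = np.column_stack([np.ones(len(x)), x[:, 0], x[:, 1], x[:, 0] * x[:, 1]])
    beta, *_ = np.linalg.lstsq(m, y, rcond=None)
    resid = y - m @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


# Average the replicates within each cell, then fit the 9 cell means instead.
y_means = y_all.reshape(len(cells), n_rep).mean(axis=1)

r2_individual = r_squared(x_all, y_all)
r2_averaged = r_squared(cells, y_means)
print(f"R^2, individual runs: {r2_individual:.2f}")
print(f"R^2, cell averages:   {r2_averaged:.2f}")
```

    The higher value for the averaged data reflects the removal of within-cell scatter from the total, not any change in the fitted coefficients.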
    When we looked at our laser system in more detail, we realized that the z-stage might be faulty. (I wish we had performed some kind of ‘gauge capability’ study on the process system before running the experiment, so as to determine how accurately and precisely we could achieve the high and low set-points. As I’m sure you realised, multifactorial methods assume that the factor levels can be set exactly. This is why the residuals are measured parallel to the response axis.)
    Now that the z-stage has been replaced, my client’s process uniformity is much improved, and they can now target the process to achieve the correct resonant frequency at parametric test, although the excimer laser still suffers from poor availability. (The ability to target the resonant frequency is critical since the miniature microwave antenna can’t be adjusted if it falls outside a certain band.)
    What really surprised me was that none of the alignment registration interactions that we assumed were present in our original design, or those predicted by the ANOVA of the ‘blocked’ replicates, have materialised. In other words, it would appear that imprecise factor levels can induce spurious interactions! Now to a process engineer this is bad news indeed, because they are responsible for fixing the process, and the last thing one would want is to chase one’s tail looking for something that doesn’t exist!
    I thought the results were worth sharing with the members of this august forum because I am about to embark on a different career path – there seems to be little interest in Six Sigma in the UK and much of our manufacturing is relocating to China.
    Best wishes,


    Robert Butler

    Uncertainty in factor levels is very common.  There are many ways to check for and counteract the effects of noise in the X variables.  From the standpoint of regression you will have to look into the methods that are employed when there is error in both the X and Y variables.  
     In order to evaluate the impact that error in your X’s has on the terms that you may investigate you will have to do the following:
    1. Take the X variables as run and center and scale each of them to a -1 to 1 range.  The resultant matrix will no longer appear as a matrix of -1’s, 0’s, and 1’s but will have all manner of numbers between -1 and 1. 
    2. Generate the expressions for the curvilinear terms and interaction terms that you were going to investigate using this actual design matrix.
    3. Take any one of your responses and build the full model using the individual data points (do not average your results – this will only serve to make a bad situation worse) and examine the VIFs (variance inflation factors) for each of the model terms.  The VIFs will spot single collinearity relationships between the various terms in your model but they will not be able to identify multiple collinearities that may exist.  The general rule of thumb is that a VIF of 10 or more (some prefer to use 5 as the limit) means that the term cannot be included in your model building efforts.
    4. To really do an adequate job of examining your as-run design matrix you will need to do an eigenvector analysis of the matrix.  While I have used Minitab, I don’t know if it has this capability.  Statistica will give you the eigenvector matrix but it requires some interpretation for proper use.  The best program for assessing multiple collinearities that I know of is SAS.
    The coding is
    proc reg data = datasetname;
    model dependent_variable = independent_variables / vif collin;
    run;
    where the independent variables are the centered and scaled X’s of your as-run matrix.  You will be able to see from the eigenvectors, the condition indices, and the matrix associated with the analysis, which X’s are unacceptably multiply confounded with other X’s and which cannot therefore be used in your model building efforts.  If you run the stepwise regression analysis with the reduced set of X’s that are “independent enough” from one another, you will find the resultant models are of value for optimization and control.
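    For readers without SAS, steps 1 to 4 can be sketched roughly in Python with numpy. This is an illustration only: the two-factor design, the set-point noise (standard deviation 0.3 coded units) and every function name below are invented for the example, and the output is meant to resemble, not reproduce, what the vif and collin options report.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-factor, 3-level design, 4 replicates per cell (coded units).
levels = np.array([-1.0, 0.0, 1.0])
nominal = np.array([(a, b) for a in levels for b in levels] * 4)

# "As run" settings: nominal plus an assumed set-point error.
as_run = nominal + rng.normal(scale=0.3, size=nominal.shape)


def center_scale(x):
    """Step 1: map each column onto the range -1 to 1."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (2.0 * x - (hi + lo)) / (hi - lo)


def model_matrix(x):
    """Step 2: main effects, interaction and squared terms (no intercept)."""
    a, b = x[:, 0], x[:, 1]
    return np.column_stack([a, b, a * b, a**2, b**2])


def vifs(m):
    """Step 3: VIF of each term = diagonal of the inverse correlation matrix."""
    return np.diag(np.linalg.inv(np.corrcoef(m, rowvar=False)))


def condition_indices(m):
    """Step 4: largest singular value divided by each singular value, after
    adding an intercept column and scaling every column to unit length."""
    full = np.column_stack([np.ones(len(m)), m])
    full = full / np.linalg.norm(full, axis=0)
    s = np.linalg.svd(full, compute_uv=False)
    return s.max() / s


x_run = center_scale(as_run)
m_run = model_matrix(x_run)
print("Terms: A, B, A*B, A^2, B^2")
print("VIF, nominal design:", np.round(vifs(model_matrix(nominal)), 2))
print("VIF, as-run design: ", np.round(vifs(m_run), 2))
print("Condition indices, as-run:", np.round(condition_indices(m_run), 1))
```

    With exact settings every VIF in this balanced layout equals 1; any rise in the as-run VIFs or condition indices comes from the set-point noise alone.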
      The problem of “fuzzy design points” generally has the greatest impact on interaction terms and causes one or more of them to be confounded with other terms in the model and thus unusable in the model building effort.  If the error in the X values gets too extreme (noise such that there is a reasonable probability of a “low” setting overlapping a “high” setting) then you may find that some of your main effects are inadmissible as well.
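      One rough way to judge whether set-point error is approaching that extreme is to estimate the chance that a run set “low” actually lands above a run set “high”. The sketch below assumes independent, normally distributed set-point errors with a common standard deviation sigma in coded units; that assumption and the function name are illustrative only.

```python
import math


def overlap_probability(sigma, low=-1.0, high=1.0):
    """P(a 'low' run ends up above a 'high' run), assuming independent
    normal set-point errors with standard deviation sigma on each run."""
    # The difference of two independent errors has sd sigma * sqrt(2).
    z = (high - low) / (sigma * math.sqrt(2.0))
    # Upper-tail probability of a standard normal beyond z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


for sigma in (0.25, 0.5, 1.0):
    print(f"sigma = {sigma:4.2f} coded units -> P(overlap) = {overlap_probability(sigma):.2e}")
```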
      If you don’t take these steps you will very likely generate models that behave exactly as you described – models with significant terms (both main effects and interactions) which, when used for control or optimization, fail to do either.
    Robert Butler



    Thank you for your explanation. Fortunately, the problem has now been solved –  probably more through engineering judgement than statistical insight.

