Shainin and Classical DOE – A Comparison

Six Sigma – iSixSigma Forums Old Forums General Shainin and Classical DOE – A Comparison

This topic contains 7 replies, has 5 voices, and was last updated by  Mikel 10 years, 6 months ago.

Viewing 8 posts - 1 through 8 (of 8 total)

    Robert Butler

    In the time I’ve been reading and contributing to this forum there have been a number of threads and queries asking about the differences between Shainin methods and those of Experimental Design.  The on-line paper cited by Andy U in his post to this forum provides an opportunity to offer a comparison.  At the moment, the paper can be found at:

      In the event the link is severed, the title of the Paper is “Training For Shainin’s Approach to Experimental Design Using a Catapult” – Antony and Cheng – Journal of European Industrial Training 27/8 [2003] 405-412.
       The catapult problem described in the paper involves a search of seven (7) catapult variables in order to identify an optimum.
      The variables are:
                                  Worst      Best
       Ball Type:                 Light      Heavy
       Rubber Band Type:          Brown      Black
       Stop Position:             1          4
       Peg Height:                2          4
       Hook Position:             2          4
       Cup Position:              5          6
       Release Angle:             170        180
      The comparison will consist of summaries of statements in the paper concerning the Shainin approach followed by the approach/techniques used in traditional design.
    The Paper – Preliminary Work
       In Phase I of the process the investigators develop a list of process parameters to be tested.  The selection is based on meetings with people from design, QC, and various levels and departments in manufacturing. The variables are identified by brainstorming and by examining earlier process output. Once the factors are identified they are ranked in order of perceived importance, and two levels, +1 (the best) and -1 (the worst), are identified for each variable of interest.
      In the paragraphs following the opening comments about Phase I the authors describe the initial variable search method, which consists of running one experiment with all of the variables set at their best levels and another with all of the variables set at their worst levels.  Each of these two experiments is run three (3) times, with the six (6) runs performed in random order.
      At the end of this initial study the difference between the medians of the two experiments is computed and an estimate of the range of the output responses is derived.  In order to move on to Phase 2, the difference of the medians (DM) of the two sets of three experiments must meet the criterion DM > 1.25:1 – (pp.406) “If this condition is not satisfied, then either the list does not yet contain the critical or key process variables or else the settings of one or more of the variables must be reversed.”
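      To make the decision rule concrete, here is a minimal sketch using the common Shainin form of the criterion – the ratio of the difference of medians (D) to the average within-experiment range (d̄) must exceed 1.25:1. The response values are invented for illustration.

```python
import statistics

# Illustrative numbers only: responses (e.g. throw distance) for the two
# three-run experiments described above.
best_runs = [108.0, 112.0, 110.0]   # all variables at their "best" levels
worst_runs = [62.0, 58.0, 60.0]     # all variables at their "worst" levels

# D: difference between the medians of the two experiments
D = statistics.median(best_runs) - statistics.median(worst_runs)

# d_bar: average within-experiment range, the usual Shainin noise estimate
d_bar = ((max(best_runs) - min(best_runs)) +
         (max(worst_runs) - min(worst_runs))) / 2

ratio = D / d_bar
print(f"D = {D}, d_bar = {d_bar}, D:d = {ratio:.2f}")
if ratio > 1.25:
    print("criterion met - proceed to the swapping phase")
else:
    print("inconclusive - revisit the variable list or reverse settings")
```

      With these invented numbers D = 50 and d̄ = 4, so the ratio of 12.5 clears the 1.25 hurdle comfortably.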
    Experimental Design – Preliminary Work.
      A traditional design, like the Shainin approach, would take advantage of the results of a brainstorming session with everyone.  Since a process of this type is going to generate a lot of possible variables we would ask people to look at the final list and then independently rank them in terms of perceived importance.  The results of these independent polls would be tallied and the variables ranked accordingly. The final choice of the number of variables for screening would be driven by the cost/time constraints of the operation.  In keeping with the example of the paper we will assume the result of the brainstorming resulted in the seven preliminary variables as listed in Table 1 pp. 407.
      The position of traditional design is that it is rare that the best and worst settings are known.  Usually, what is known is a “nominal” setting (a setting at which the process makes useful product – it is this setting that would be the current focus of most process control efforts) and a “bad” setting – usually a setting known to have caused trouble in the past.  Rather than focusing on either of these, the usual approach is to ask about feasible extreme settings of the process variables.  Once these extremes are identified the questions shift to issues concerning the validity and feasibility of running the process at these levels.  
     With the extreme settings identified and agreed upon the next phase would be the construction of a screen design. 
      Of the variables of interest only Rubber Band is a true type variable.  A good “traditional” design would be a near-saturated 2 level, 8 point design for the six continuous variables with a single replication of one of the eight points. This 9 point design would be run for each of the two rubber band types (18 experiments). The design would be constructed and the settings of each experiment would be reviewed to make sure we hadn’t knowingly set up conditions that were either impossible to run or catastrophic in nature. Note: strictly from the standpoint of investigator comfort I would also add a 19th point – an experiment with all variables at their current “nominal” settings.
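      One standard way to build such a near-saturated screen is a 2^(6-3) resolution III fractional factorial – 8 runs in coded -1/+1 units with generators D = AB, E = AC, F = BC. The construction below is a textbook choice of mine, not something given in the paper, and the factor names are only placeholders.

```python
from itertools import product

# 2^(6-3) resolution III fractional factorial: three base columns (a, b, c)
# and three generated columns (ab, ac, bc) give 8 runs for six factors.
factors = ["BallWeight", "StopPos", "PegHeight", "HookPos", "CupPos", "RelAngle"]

design = [(a, b, c, a * b, a * c, b * c)
          for a, b, c in product((-1, 1), repeat=3)]

# Replicate one of the eight points to give the 9-point design; the full
# set would then be run once for each rubber-band type (18 experiments).
design.append(design[0])

for i, run in enumerate(design, 1):
    print(i, dict(zip(factors, run)))
```

      Each column of the 8-run base is balanced (equal numbers of -1 and +1 settings) and the main-effect columns are mutually orthogonal, which is what makes the later regression coefficients cleanly separable.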
    The Paper – Initial parameter investigation
      After running the six experiments, computing confidence limits on the medians using range estimates, and finding a significant difference between the minimum and maximum median, the investigation moves on to putting control limits (which they also call prediction intervals) around the median for the variables set at their best settings (MB) and the median for the variables set at their worst settings (MW). With these limits in place the next step is that of separating the critical variables from the unimportant ones.
      For this effort – called the “swapping phase” – a 7th experiment is run where the most important variable (as ranked by the team) is “kept at its worst level while all the other factors are kept at their best level (pp.406).”  This experiment is repeated with the roles reversed – now the important variable is put at its best level and the others are set at their worst. “If the output response values from the above two experiments fall within the median limits, then the influence of the swapped process variable can be disregarded (pp.406)” (by the median limits I believe they mean the limits surrounding either the best or the worst median). On the other hand, if the measured response falls outside these limits then the swapped variable, or some interaction involving it, can be viewed as significant to the process. This process is repeated for all of the variables of interest.
      After these runs have been made (for the 7 variables of interest this would imply an additional 14 runs) the Shainin approach recommends “capping runs” which constitute runs confirming variable significance.  For these the variables identified in the swapping phase as being most important are set at their best levels and the remainder at their worst.  As before, the settings are then reversed for the important and unimportant and another experiment is run.  If the output responses are within the median limits the runs are declared a success. The wording in the paper is such that I believe they mean if the results of the best settings lie within the limits of the best median and the worst settings lie within the limits of the worst median then they can proceed.
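      A minimal sketch of the swapping-phase decision rule as described in the paper (the reading of the limits is my interpretation, and all numbers are invented):

```python
def swap_verdict(y_factor_worst, y_factor_best, best_limits, worst_limits):
    """Judge one swapped factor.

    y_factor_worst: response with this factor at its worst level,
                    all other factors at their best.
    y_factor_best:  response with this factor at its best level,
                    all other factors at their worst.
    best_limits / worst_limits: (low, high) control limits around the
    best-settings and worst-settings medians.
    """
    def inside(y, limits):
        return limits[0] <= y <= limits[1]

    if inside(y_factor_worst, best_limits) and inside(y_factor_best, worst_limits):
        return "unimportant"          # the swap did not disturb either response
    return "significant or interacting"

# Invented numbers: swapping the factor barely moves either response...
print(swap_verdict(107.0, 61.0, (104.0, 116.0), (55.0, 65.0)))  # → unimportant
# ...whereas a response outside the limits implicates the factor.
print(swap_verdict(90.0, 61.0, (104.0, 116.0), (55.0, 65.0)))   # → significant or interacting
```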
    Experimental Design – Initial parameter investigation
      Once the 19 point design had been run it could be examined using standard regression methods. Over the ranges of the variables studied, the results of the analysis of the 19 experiments would identify those variables whose linear behavior significantly impacted the measured response and thus those variables which are critical to the process (the red X, pink X, and pale pink X of the Shainin approach).
      The X’s are normalized over the ranges they were actually run (as opposed to just blindly plugging in the ideal -1, +1 matrix). Then both backward elimination and forward selection with replacement are run.  In most instances, if the actual design is reasonably orthogonal, the two methods will converge to the same model.  Because the X’s have been normalized, their respective coefficients can be directly compared and the variables ranked according to the impact a unit change in the variable (from -1 to +1) would have on the response.
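      To make the normalization-and-ranking step concrete, here is a sketch with invented data (the factor names, ranges, and effect sizes are mine): each X is coded over the range it was actually run, a least squares fit is made, and the factors are ranked by |coefficient|. A full analysis would add the backward elimination and forward selection passes described above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented as-run settings for three factors over 19 runs
# (e.g. stop position 1-4, peg height 2-4, hook position 2-4).
X_raw = rng.uniform([1.0, 2.0, 2.0], [4.0, 4.0, 4.0], size=(19, 3))
# Invented response: a large stop-position effect, a small peg-height
# effect, no hook effect, plus noise.
y = (3.0 * (X_raw[:, 0] - 2.5) / 1.5
     + 0.5 * (X_raw[:, 1] - 3.0) / 1.0
     + rng.normal(0.0, 0.1, 19))

# Normalize each X over the range it was actually run: -1 at the
# observed minimum, +1 at the observed maximum.
lo, hi = X_raw.min(axis=0), X_raw.max(axis=0)
X = 2.0 * (X_raw - lo) / (hi - lo) - 1.0

# Least squares with an intercept; because the X's are normalized, the
# |coefficients| rank the impact of a full -1 -> +1 swing in each factor.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
ranking = sorted(zip(["StopPos", "PegHeight", "HookPos"], abs(coef[1:])),
                 key=lambda t: -t[1])
print(ranking)   # StopPos should rank first by a wide margin
```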
      The DOE equivalent of “capping runs” are identified by studying the regression equation and using its predictive capability to identify combinations of X’s whose predicted response is considered to be an optimum.  These confirming experiments are run and if the results are within the confidence limits of the predicted value it is reasonable for the investigator to reach the same conclusions as those provided by the results of the Shainin “capping runs”.
    The Paper – Factorial Analysis
      In the paper, the results of the initial work have reduced the variables from 7 to 3. The paper then uses a standard 2 level, three variable, full factorial design to investigate the main effects and their two way interactions.  The investigators ran 3 replicates on the 8 point design, computed the median values of the results of each experiment and evaluated the results using ANOVA and interaction plots.
    Experimental Design – Factorial Analysis
      There are many ways to analyze a factorial design and the Shainin approach to this part of the analysis differs from traditional methods only in its use of medians. Like the traditional design, Shainin methods appear to leave the choice of analytical tools to the discretion of the investigator. The authors chose to use ANOVA and interaction plots.
      My view is that the authors’ choice of tools and replicates will give results that are much less informative than they could be.  Since the final three parameters are all continuous variables, and since all of them have a center point, I would choose to run a full factorial two level design with a replicate on the center point (10 points total).  I would analyze the results of this design to ensure there is no significant curvature effect, and I would take the resultant equation and test it to see if it had identified the optimum setting. I would also use Box-Meyers methods to check for variables impacting the process variation.
      Only after this would I give any consideration to more replication and/or design augmentation.  The regression would identify significant interactions and, more importantly, it would quantify the effects of the X terms in the model.  ANOVA will identify significant effects, but you cannot take the direct results of an ANOVA and start turning knobs.  Similarly, interaction plots are useful, but if there are too many of them they can be difficult to use for process control.
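      To illustrate the curvature check that the center point makes possible, here is a minimal sketch with invented responses: compare the average of the eight cube points to the average of the replicated center point.

```python
import statistics

# Invented responses for the 2^3 cube points and the replicated center
# point of the 10-point design described above.
factorial_y = [52, 60, 55, 64, 58, 66, 61, 70]   # the 8 cube points
center_y = [59.5, 60.5]                          # center point, replicated

cube_mean = statistics.mean(factorial_y)
center_mean = statistics.mean(center_y)
curvature = cube_mean - center_mean
print(f"curvature estimate: {curvature:+.2f}")
# A formal test would compare this difference to the pure-error estimate
# from the center replicates (e.g. via a t-test); a value near zero means
# the linear-plus-interactions model is adequate over the region studied.
```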
    Some Final Thoughts

    The initial search for significant variables, as outlined in the paper, is logical and organized; however, it does not appear to be particularly efficient.  In particular, it would be very easy to set up combinations of “best” and “worst” settings for variables whose results were inconclusive.  As offered by the paper, the only remedy to this problem (pp.406) is “If this condition (DM > 1.25) is not satisfied, then either the list does not yet contain the critical or key process variables or else the setting of one or more variables must be reversed.” (Since at least one of the Stans, Vinny, Darth, and Andy U has given me the impression they have done a lot of work with Shainin techniques, I’d be interested in your thoughts on this.) Experiments run in this fashion usually have to be treated as “stand alone” studies, which means that if their results are inconclusive you cannot combine them with other similarly run experiments to identify variable effects, because of confounding.  It doesn’t take too many variables to turn a search of this type into a random walk through your design space.  By contrast, a well thought out screen design will let you know, with a high degree of certainty, whether or not the variables you have chosen really matter.
    Multiple Y’s – The paper only addresses the issue of a single Y.  Most efforts I’ve been involved in have at least 5 Y’s of interest, and a large percentage of the work I’ve done has involved 10 or more.  As described, the initial variable search would seem to require an independent effort for each Y.  While you could measure multiple Y’s at each of the points in the preliminary study, you would have to face the fact that the “best” and “worst” settings may be different for different Y’s. This, in turn, would only add to the number of runs you would have to make. On a screening design, this is not an issue.  You run the design, you measure the Y’s, and you build predictive equations for each Y.  From this you can quickly see which variable is significant for which Y and where “best” settings are in conflict.
     I apologize for the length of this post. I hope it has been of some value and I hope it will provide some understanding of the similarities and differences of the two methods.



    Mr. Butler,
    Well and masterfully done.   Thanks for the time taken and detail provided.   I for one am going to have to take some time and noodle it over to assure I fully comprehend everything said, but I saw nothing on the surface worthy of picking at.   Thanks again for your efforts and insights; your postings are always informative and well considered.



    The DM > 1.25 criterion is a surrogate for an F ratio and is statistically valid. In other words, failing it is the same as not getting a statistically significant F ratio, which would also give you an inconclusive experiment.
    The issue is one of modeling: the Shainin approach will only tell you the significant factors, and the runs will probably not be balanced, so any conclusion in terms of Y = f(x) probably has a lot of error. The traditional approach will yield a linear relationship and is structured to be augmented with additional runs to describe non-linear relationships as well.



    BTW – Robert, thanks for taking the time to do this.
    In my opinion, variable and component search are best as troubleshooting techniques and should be taught to, and required of, all that do repair and failure analysis work.


    Sander van Wijngaarden

    Being a supporter of the Shainin tools myself I read the paper with great interest. The fact that I am a supporter is maybe shown through my website.
    What I found strange is that only 3 out of the 7 variables were swapped. Although the combination of the 3 variables gave a complete reversal, and therefore they could be the only important variables, I would have taken the opportunity to learn more about the others. From my experience I am not so impressed by engineers’ judgement in ranking the variables.
    Further, the variable search has a strong requirement that the direction of the effect of each variable must be known. There is always a possibility that this knowledge is not present.
    So until there is a complete reversal with a single variable, I would complete the search.


    Drew Peregrim

    I know this is an old post, but I was doing new research on the Shainin method.  The link to the article “Training For Shainin’s Approach to Experimental Design Using a Catapult” is severed, and the search engine is not showing any new link to the article.  Do you have a PDF copy you could send me?
    Drew Peregrim


    Robert Butler

      No, I’m sorry I don’t.  If no one else saved a copy I would assume you could order a reprint from the journal.



    New research – what does that mean?
    Get World Class Quality by Keki and Adi Bhote if you want to learn more about the Shainin approach. There are some useful tools.
    If you only want to know the significant factors and nothing more, Shainin’s approach is just fine. You will not have enough information for modeling y = f(x).

