DoE Design Selection for Raw Material Screening




    My objective is to develop a chemical process that has a raw material (A), a supplement (B), a catalyst (C) and the response variable is yield (R).

    I have 15 different raw materials (A1, A2… A15), 4 catalysts (C1, C2, C3, C4) and 8 supplements (B1, B2…B8). I would like to design a DoE that finds the combination of A, B and C giving the maximum possible yield.


    Robert Butler

    You will have to provide some more information before anyone can offer much in the way of advice.  You said, “I have 15 different raw materials (A1, A2… A15), 4 catalysts (C1, C2, C3, C4) and 8 supplements (B1, B2…B8).”


    1. Is this a situation where the final process can have combinations of raw materials, catalysts and supplements or is this a situation where you can only have a raw material, a catalyst and a supplement?

    2. If this is the first case, and you have absolutely no prior information about the impact of any of the raw materials, catalysts, or supplements on your measured response, and no sense of any rank ordering among them, then your best bet would be to construct a D-optimal screening design with 27 variables (15 + 8 + 4). This would probably be a design with a maximum experiment count of 40 or less.

    a. If you do have prior information and you can toss in any combination from raw materials, catalysts, and supplements then I’d recommend building a smaller screen design using only those items from raw materials, catalysts, and supplements that you suspect will have either a minimum impact or a major impact on the response.

    3. If this is a case where you can only have one from column A, one from column B and one from column C, is there any way to characterize the raw materials, catalysts, and supplements with some underlying physical/chemical variable that will allow you to treat them as continuous variables?

    a. For example, let’s say the raw materials are from different suppliers but are all the same sort of thing (plastic resin, say) and their primary differences can be summarized by their funnel flow rates; the catalysts are all from different suppliers but can be characterized by their efficiency at doing what they do; and the supplements can be characterized by whatever aspect of the final product they impact, say viscosity.  Then what you have is three variables at two levels.  The low level for A would be the raw material that had the least of whatever the property is and the high level would be the raw material that had the most, and so on for catalysts and supplements.  This would be a design of about 10 experiments: a 2**3 with a couple of repeats.

    4. If there isn’t any way to characterize the variables in the three categories and you have to treat everything as a categorical (type) variable, then you are stuck with having to run all combinations: 15 × 8 × 4 = 480.
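To make the run counts in points 3a and 4 concrete, here is a minimal Python sketch. The labels A1..A15, B1..B8 and C1..C4 come from the original question; the "low"/"high" level names and the choice of which two runs to repeat are illustrative assumptions, not part of the advice above.

```python
from itertools import product

# Case 4: every factor is categorical, so every combination must be run.
raw_materials = [f"A{i}" for i in range(1, 16)]   # A1..A15
supplements = [f"B{i}" for i in range(1, 9)]      # B1..B8
catalysts = [f"C{i}" for i in range(1, 5)]        # C1..C4

full_factorial = list(product(raw_materials, supplements, catalysts))
print(len(full_factorial))                        # 15 * 8 * 4 combinations

# Case 3a: each category is summarized by one continuous property and run
# at its lowest and highest available value, giving a 2**3 factorial.
levels = ("low", "high")
two_level = list(product(levels, repeat=3))       # 8 runs

# Assumption: repeat two of the eight runs to estimate run-to-run variation.
repeats = [two_level[0], two_level[-1]]
small_design = two_level + repeats
print(len(small_design))                          # 8 runs + 2 repeats
```

The gap between the two counts is why it pays to look for an underlying continuous property before committing to the full categorical design.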



    The process has one raw material, one supplement and one catalyst: RM (A) + Supplement (B) + Catalyst (C) –> Yield.

    I have 15 raw materials available from 15 different suppliers: each raw material has a different composition (not disclosed for intellectual property reasons) and is identified by a proprietary name.

    Same is the case with the supplements: I have 8 different supplements from 8 suppliers, each supplement has a different composition.

    And, I have four catalysts that I have synthesized and are named as 1A9, 2C4, 3BD, 6F3.

    Since all of them are identified only by name, do I have to treat them all as categorical variables?

    As per my viewpoint, there are two options for representing the levels:

    a) Categorical: (+, present) and (-, absent)

    b) Continuous: I can choose the levels based on the concentration of the raw material, such as (1X) and (3X).

    I would prefer categorical factors to continuous ones, because for continuous factors I have to prepare two concentrations of each raw material and supplement. For categorical factors, each raw material and supplement has a single concentration and is either present or absent.

    I don’t have any prior information in the first place about how these are going to affect the response variable.

    The number of runs is not a constraint for me as long as yield is maximized. If I have to perform 480 runs, that’s OK, but I have to divide them into 10 blocks or so, with one block run per day.

    I would appreciate a prompt reply.

    Thank you very much



    Robert Butler

    If you can only run one raw material at a time with no combinations of raw materials, and similarly for the catalysts and the supplements, and you have no underlying variables that would allow you to convert them to continuous measures, then you are stuck with running all the possible combinations, which would be 480.  I understand the need for blocking, so what you will want to do is draw up the list of the 480 combinations and put them in order on an Excel spreadsheet. Then go out on the web and find a random number generator that will generate without replacement, and generate a random number sequence for 1-480.  Enter that list in a second column of the spreadsheet so there is a 1-1 match between the two columns, sort on the second column so that 1-480 are in order, and use that template to guide your selection of experiments (the first 10, then the second 10, etc.).

    As a check, it would be worth randomly adding a few repeat experiments to the list (again chosen by random selection) so you can assess run-to-run variability.
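The spreadsheet procedure above can also be scripted. The following is a rough Python sketch rather than a prescribed method: the function name `randomized_plan`, the 10 blocks (taken from the question), the 5 repeat runs and the seed are all assumptions chosen for illustration.

```python
import random
from itertools import product

def randomized_plan(n_blocks=10, n_replicates=5, seed=2024):
    """Return a list of blocks; each block is a list of (A, B, C) runs."""
    rng = random.Random(seed)  # fixed seed so the plan can be regenerated

    runs = list(product(
        [f"A{i}" for i in range(1, 16)],
        [f"B{i}" for i in range(1, 9)],
        [f"C{i}" for i in range(1, 5)],
    ))                                   # all 480 combinations

    # Randomly chosen repeats for the run-to-run variability check.
    runs += rng.sample(runs, n_replicates)

    # Shuffling plays the role of sorting on a column of random numbers
    # drawn without replacement.
    rng.shuffle(runs)

    # Cut the shuffled list into consecutive blocks (one block per day).
    size = -(-len(runs) // n_blocks)     # ceiling division
    return [runs[i:i + size] for i in range(0, len(runs), size)]

blocks = randomized_plan()
for day, block in enumerate(blocks, start=1):
    print(f"Day {day}: {len(block)} runs, first run {block[0]}")
```

Because the repeats are shuffled in with everything else, they land in random blocks, which lets them pick up day-to-day variation as well as within-day variation.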





    I can convert the raw materials and supplements into continuous numbers by measuring one of their properties like Total Nitrogen, etc. In that case, what would be the option?


    Robert Butler

    That would make sense if the aspect of the supplements you believe most impacts the outcome is the percentage of total nitrogen, otherwise it won’t matter.

    The idea is that if total nitrogen were understood to be the critical component of the supplements, you would choose the two supplements with the lowest and highest levels of total nitrogen and run the analysis on the basis of nitrogen content. This would allow you to treat the supplements as continuous.

    Another option would be sub-classes of raw materials and supplements.  In that case you might have some raw materials with an underlying continuous variable of one kind and others with a different one.  If you have this case then you would do the same thing as mentioned above.  Of course, this would mean building experiments with combinations of raw materials and supplements.  My guess would be you couldn’t do this with the catalysts so they would have to remain categorical variables.  However, if you can express the raw materials and supplements as functions of critical underlying continuous variables and can mix them together then the number of experiments would be drastically reduced.

    For example, let’s say your raw materials fall into three groups with respect to a critical continuous variable and your supplements can be grouped into 4 separate groups – you could put together an 8 run saturated design with these and then run the saturated design for each one of the separate catalysts. That would give you a 32 run minimum.  If you tossed in a couple of replicates (I’d go for a random draw from each of the 4 saturated designs so I had one replicate for each catalyst) you would have 36 runs and you could get on with the analysis.
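One way to build an 8-run saturated design like the one in this example is a 2^(7-4) fractional factorial: three base columns plus their four interaction columns give 7 two-level factors (3 raw-material groups + 4 supplement groups) in 8 runs. The post does not say how the groups map to factors or which generators to use, so the generators (D=AB, E=AC, F=BC, G=ABC) and the factor names RM1..RM3 and S1..S4 below are assumptions; the catalyst names are the ones given earlier in the thread.

```python
import random
from itertools import product

# Base 2**3 full factorial in coded units (-1 = low, +1 = high).
base = list(product((-1, 1), repeat=3))

# Fill the remaining four columns with interaction products
# (assumed generators: D=AB, E=AC, F=BC, G=ABC).
saturated = [(a, b, c, a * b, a * c, b * c, a * b * c) for a, b, c in base]

factors = ["RM1", "RM2", "RM3", "S1", "S2", "S3", "S4"]  # hypothetical names
catalysts = ["1A9", "2C4", "3BD", "6F3"]

# Run the 8-run design once per catalyst: 4 * 8 = 32 runs.
runs = [(cat, row) for cat in catalysts for row in saturated]

# Add one randomly drawn replicate per catalyst: 32 + 4 = 36 runs.
rng = random.Random(1)  # arbitrary seed, for reproducibility only
replicates = [(cat, rng.choice(saturated)) for cat in catalysts]
plan = runs + replicates

# Sanity check: every pair of columns is orthogonal (dot product is zero).
cols = list(zip(*saturated))
orthogonal = all(
    sum(x * y for x, y in zip(cols[i], cols[j])) == 0
    for i in range(7) for j in range(i + 1, 7)
)

print(dict(zip(factors, saturated[0])))  # settings for the first run
print(len(saturated), len(runs), len(plan), orthogonal)
```

A saturated design leaves no degrees of freedom for estimating error, which is why the replicates matter here. In practice the 36 runs would also be randomized before execution.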


    Joseph E. Johnson


    Being a PhD chemist, my thought would be NOT to do a massive DOE in a shotgun approach and waste time, effort and money.  THINK about what the goals (final product and yield) are, and understand the chemistry behind the materials, their interactions, and the processes.  This means knowing the materials and the chemistry behind the reactions, which usually comes from experience in the area and from reading the background literature on the reactions, handling conditions, and costs.  With that in mind, narrow down your raw materials, supplements, and catalysts, as well as amounts, order of addition, and reaction conditions, and then revisit doing a limited DOE.

    If the project is exploratory or time-sensitive (e.g., a biochemical or medical material), then look at automated micro-studies using fractional factorial experiments. Once you see an effect, follow up with a full factorial focused on it.



    Ted Karr


    Great reply. I would also ask for a definition of productivity.  Max output or max output at minimal cost or throughput times?

    Yes, you absolutely must know if the A,B, and C items are 1 per batch or is there a combo of raw materials? Additionally, do the catalysts function with all the raw materials for their desired effect? If not, then there will be constraints on the DOE factor levels.

    To roughly apply a Shainin method you may want to run all factors at their lowest cost level, at the mid level cost, and at the high cost end. After all, the typical goal of DOE is to develop a product that meets specifications at the lowest cost. Those 1st 3 simple experiments could put you in the ballpark for future experiments. For instance, if the middle cost catalyst works the best, you can now focus on using the 2 catalysts in the mid cost range. The same thinking can be applied to your raw materials. As a result you can cut all of your factors in half with this approach.

    These ideas are based on supplied information. They may be invalid depending on supplied future details.

    • Ted

    Shamshul othman

    I believe your concern is the large number of levels of material and supplement. In Six Sigma, we should complete the Analyze phase before we do a DOE. That analysis not only tells us which variables are significant and which are not, but also shows how the different levels of those variables affect the output variable. So by sorting the levels of material and supplement in ascending or descending order of their correlation with yield, you can choose a high and a low level of each for your DOE. From these two levels you will be able to optimise once you understand the direction and size of the coefficients for your variables.
