Most Practical DOE Explained (with Template)

Key Points

An eight-run array is the most practical means of constructing a DOE.
You can do this with Microsoft Excel, but dedicated math software is better suited for the task.
There are multiple forms of DOE, each with an intended use.

For purposes of learning, using, or teaching design of experiments (DOE), one can argue that an eight-run array is the most practical and universally applicable array that can be chosen. These eight-run arrays come in several forms and go by several names (e.g., 2^3 Full Factorial, Taguchi L8, 2^(4-1) Half Fraction, Plackett-Burman 8-run), but they are all very similar.
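To make the idea of an eight-run array concrete, the following is a minimal Python sketch of how a 2^3 full factorial could be generated. It is not the article's spreadsheet template. The function name `full_factorial_2k` and the -1/+1 coding for low/high are assumptions chosen for illustration.

```python
from itertools import product


def full_factorial_2k(factors):
    """Return the runs of a two-level full factorial as dicts of -1/+1 codes.

    -1 stands for the low level and +1 for the high level (an assumed coding).
    """
    runs = []
    # product() varies its last position fastest, so the tuple is reversed
    # to make the first factor alternate fastest ("standard order").
    for levels in product((-1, 1), repeat=len(factors)):
        runs.append(dict(zip(factors, reversed(levels))))
    return runs


design = full_factorial_2k(["A", "B", "C"])
for number, run in enumerate(design, start=1):
    print(number, run)
```

Three factors at two levels each give 2 x 2 x 2 = 8 runs, which is where the eight-run array comes from.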

Generic Steps for Creating a Practical DOE


1. Determine the acceptance criteria you need (i.e., the acceptable alpha error or confidence level that you will treat as passing). This is typically alpha = 0.05, or 95 percent confidence, for solid decisions; see additional advice below.
2. Pick two or three factors to be tested and assign them to columns A, B, and C as applicable (use the key provided).
3. Pick two different test levels for each of the factors you picked (e.g., low/high or on/off).
4. Determine the number of samples per run (the template has room for one to eight; this affects normality and accuracy, not confidence).
5. Randomize the run order to the extent possible.
6. Run the experiment and collect data. Keep track of everything you think could be important (e.g., people, material lot numbers). Keep all other possible control factors as constant as possible, as these may affect the validity of the conclusions.
7. Analyze the data by entering it into the yellow boxes of the spreadsheet and reading the results. A review of the ANOVA table will show you which effects meet the acceptance criteria established in step 1. If the alpha value in the table is less than or equal to the acceptance criterion, accept the effect as statistically significant; if it is greater, do not. Similarly, the higher the confidence, the more likely it is that the factor's effect is real rather than noise. Signal-to-noise measurements are helpful when selecting factors for retesting in subsequent experiments.
8. Confirm your results by performing a separate test, another DOE, or in some other way before fully accepting any results. You may want to more closely define results that are close to your acceptance criteria by retesting the factor using larger differences between the levels.
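The randomization and analysis steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet's actual calculation. The response values are made up, the helper names `randomized_order` and `effect` are hypothetical, and levels are coded -1 (low) and +1 (high). The sketch estimates effects only; it does not produce the ANOVA alpha values.

```python
import random
from itertools import product

# 2^3 design in coded units (-1 = low, +1 = high); factor A alternates fastest.
DESIGN = [dict(zip("ABC", reversed(levels)))
          for levels in product((-1, 1), repeat=3)]


def randomized_order(n_runs, seed):
    """Return the run indices 0..n_runs-1 in a shuffled (random) order."""
    order = list(range(n_runs))
    random.Random(seed).shuffle(order)
    return order


def effect(design, responses, term):
    """Effect of a term such as "A" or "AB": mean at high minus mean at low."""
    total = 0.0
    for run, y in zip(design, responses):
        sign = 1
        for factor in term:          # an interaction sign is the product
            sign *= run[factor]      # of the signs of its factors
        total += sign * y
    return total / (len(design) / 2)


# Hypothetical run averages, listed in standard order (not in run order).
responses = [10.0, 14.0, 11.0, 15.0, 10.0, 14.0, 13.0, 17.0]

print("Run order:", randomized_order(len(DESIGN), seed=1))
for term in ("A", "B", "C", "AB", "AC", "BC"):
    print(term, effect(DESIGN, responses, term))
```

With these made-up numbers, factor A shows the largest effect (4.0), followed by B (2.0), while the AB and AC interactions are zero. Whether any of these effects meets the acceptance criteria still has to be read from the ANOVA table.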

How Does a DOE Work?

Note that in the eight-run array, each factor is tested at its high level in four runs and at its low level in the other four. All eight data points therefore contribute to every comparison: the four high runs are judged relative to the four low runs (4 high + 4 not high = 8 relative to high) and vice versa, for each factor and for the interactions between the three factors.

Therefore, using this balanced multifactor DOE array, our eight-run test becomes the statistical equivalent of a 96-run, one-factor-at-a-time (OFAT) test [(8 Ahigh)+(8 Alow)+(8 Bhigh)+(8 Blow)+(8 Chigh)+(8 Clow)+(8 ABhigh)+(8 ABlow)+(8 AChigh)+(8 AClow)+(8 BChigh)+(8 BClow)].
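The balance behind this count can be checked with a short Python sketch. It assumes the usual -1/+1 coding for low/high, which the article does not state, and it simply reproduces the bracketed tally above rather than proving the equivalence.

```python
from itertools import product

# The eight runs of a 2^3 design as (A, B, C) tuples of -1/+1 codes.
runs = list(product((-1, 1), repeat=3))

# Main-effect columns and two-factor interaction columns.
columns = {
    "A": [a for a, b, c in runs],
    "B": [b for a, b, c in runs],
    "C": [c for a, b, c in runs],
    "AB": [a * b for a, b, c in runs],
    "AC": [a * c for a, b, c in runs],
    "BC": [b * c for a, b, c in runs],
}

# Balance: every column has four highs and four lows.
high_counts = {name: col.count(1) for name, col in columns.items()}
low_counts = {name: col.count(-1) for name, col in columns.items()}


def dot(u, v):
    """Sum of element-wise products of two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))


# Orthogonality: every pair of distinct columns has a zero dot product.
names = list(columns)
orthogonal = all(dot(columns[p], columns[q]) == 0
                 for i, p in enumerate(names) for q in names[i + 1:])

# The article's tally: 8 points for "high" and 8 for "low" in each column.
ofat_equivalent = len(columns) * 2 * len(runs)

print(high_counts, low_counts, orthogonal, ofat_equivalent)
```

Because the columns are balanced and mutually orthogonal, each effect can be estimated from all eight runs without being distorted by the other effects.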

Other advantages to using DOE include the ability to use statistical software to make predictions about any combination of the factors in between and slightly beyond the different levels, and generating various types of informative two- and three-dimensional plots. Therefore, DOEs produce orders of magnitude more information than OFAT tests at the same or lower test costs.

DOEs don’t directly compare results against a control or standard. They evaluate all effects and interactions and determine if there are statistically significant differences among them. They also calculate statistical confidence levels for each measurement.

A large effect might be measured, but if the statistical confidence for that measurement is low, then the effect should not be believed. On the other hand, a small measured effect with high confidence tells us that the effect really is small and therefore isn't important.

Why Use This Resource?

You might be asking why you'd need a practical DOE guide. You could spend the time building your own resources and instructing your team on their use, or you could save that time and effort by starting from this one. The approach has been vetted by industry veterans and introduces new users to the concepts behind DOEs with ease.

Why This Array?


Selecting a test array requires balancing your test objectives, test conditions, test strategy, and available resources. It is usually more advantageous to run several DOEs that each test only a few factors than to run one large DOE.

Comparing the statistical power of this array (its inherent ability to resolve differences between test factors; 1-β) with the cost of experimenting (the number of runs needed) also shows why this array is advantageous: it requires only eight runs and yields successful results in most situations.

Picking Acceptance Criteria

Acceptable confidence depends upon your needs. If health could be affected, then you may want more than 99 percent confidence before making a decision. This author’s rule of thumb is that 51-80 percent is considered low but in some cases is worth considering (see precautions below).

A confidence level of 80-90 percent is considered moderate with the results likely to be at least partially correct and a confidence level greater than 90 percent is considered high with the results usually considered very likely to be correct.
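The rule of thumb above can be written down as a small Python helper. This is a sketch, not part of the template. The function names are hypothetical, and the handling of the exact boundaries (80 and 90 percent counted as moderate, anything at or below 50 percent reported as outside the stated bands) is an assumption, since the text leaves those cases open.

```python
def confidence_percent(alpha):
    """Convert an alpha (p) value into a confidence level in percent."""
    return (1.0 - alpha) * 100.0


def confidence_band(percent):
    """Classify a confidence level using the author's rule of thumb.

    Boundary handling is an assumption: 80 and 90 count as moderate.
    """
    if percent > 90:
        return "high"
    if percent >= 80:
        return "moderate"
    if percent > 50:
        return "low"
    return "below stated range"


for level in (95, 85, 60):
    print(level, confidence_band(level))
```

For example, an alpha value of 0.05 corresponds to 95 percent confidence, which falls in the high band.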

Calculating Samples Per Run

Statistical confidence doesn't increase with additional samples per run (replications are needed to have that effect). Additional samples per run are therefore only needed when there are concerns about non-normal data (per the central limit theorem) and/or to improve the accuracy of the measured effects (the significance of changes between test levels).

Since test costs and knowing the statistical confidence in our effects are usually more important than statistical significance and normality effects, running two to three samples per run is usually ideal.

Other Useful Tools and Concepts

Do you want to learn more about DOEs? Learning how Plackett-Burman experimental design factors into your DOE can be a lifesaver. This method screens out the chaff and lets you home in on the most important factors to focus on in your experiments.

Additionally, learning how full factorial design works can show the impact of variables on your outputs. This isn't for beginners but certainly merits attention from anyone conducting their own DOEs.

Precautions

Since this array has less power than others available, we need to remember that when optimizing a process that isn’t critical to human safety, using test results with a low confidence level can often be much better than not knowing which way to go with machine settings, etc.

Assuming all the errors in one’s experiment are evenly distributed (random distribution of error), a confidence level of 60 percent (measured to be true via DOE), for example, might seem horrible but means the equivalent of 80 percent, since the 40 percent that we are unsure of could go either way [60 percent + (40 percent / 2) = 80 percent].
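The arithmetic in this rule of thumb is easy to reproduce. The Python sketch below only restates the author's calculation under the same assumption of evenly distributed error. The function name `effective_confidence` is hypothetical, and the sketch does not validate the statistical reasoning itself.

```python
def effective_confidence(measured_percent):
    """Apply the author's rule: measured confidence plus half of the remainder."""
    if not 0 <= measured_percent <= 100:
        raise ValueError("confidence must be between 0 and 100 percent")
    unsure = 100 - measured_percent      # the part that "could go either way"
    return measured_percent + unsure / 2


for measured in (60, 70, 80):
    print(measured, "->", effective_confidence(measured))
```

A measured 60 percent gives 60 + 40 / 2 = 80 percent, matching the example above.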

About the Author

Kim Niles
