Today I was reflecting on a topic that can come up in any traditional project involving a test or DOE that uses a "p-value" criterion – it has actually come up for me a few times in the past.

Hypothetically, say there is a process with very low inherent variation (process sigma is very low) and a very high Cp or Pp (depending on how you define sigma, but say 2.0 or greater), yet a very low Cpk or Ppk (or Z, for that matter). In other words, the process is hugging – or sits outside – the upper or lower customer acceptance limit, but is very stable at that level. The situation is a classical optimization problem.
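To make the high-Cp/low-Cpk combination concrete, here is a minimal sketch using the standard textbook formulas, Cp = (USL − LSL) / 6σ and Cpk = min(USL − μ, μ − LSL) / 3σ. The spec limits and process statistics are hypothetical, chosen only to illustrate a tight but badly centered process:

```python
# Illustrative capability indices: high Cp (low spread) but low Cpk
# (mean hugging the upper spec limit). All values are hypothetical.

def cp(usl, lsl, sigma):
    # Cp ignores centering: potential capability from spread alone
    return (usl - lsl) / (6 * sigma)

def cpk(usl, lsl, mu, sigma):
    # Cpk penalizes an off-center mean
    return min(usl - mu, mu - lsl) / (3 * sigma)

usl, lsl = 10.0, 4.0    # hypothetical spec limits
mu, sigma = 9.8, 0.4    # very low spread, mean right at the USL

print(round(cp(usl, lsl, sigma), 2))       # 2.5 -> "capable" spread
print(round(cpk(usl, lsl, mu, sigma), 2))  # 0.17 -> badly centered
```

The spread alone says the process could easily fit the specs; the centering says it currently does not – exactly the situation described above.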

A black belt performs a DOE and finds a factor with a p-value of 0.05 or less on the output. Using this information, the new factor setting is applied to the process, and the new Z value barely improves. The black belt is frustrated because the factor was "supposed" to be significant. What gives?

In this case, the effect of the factor was much smaller than the amount of optimization needed to make a real difference, but it was large enough to be statistically significant given the low inherent variation of the process. A quick check of the main-effects plot (and of the coefficients in the analysis, for that matter) against what is needed to achieve the required optimization would confirm the situation.
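The mechanics of this trap can be sketched with a small calculation (all numbers hypothetical, and using a normal approximation with a known sigma rather than a full DOE analysis): with very low process sigma, even a tiny factor effect produces a convincingly small p-value, yet the shift is a tiny fraction of what is needed to move the mean off the spec limit.

```python
# Hypothetical scenario: a factor effect that is statistically
# significant (thanks to tiny process sigma) yet practically useless.
import math

sigma = 0.1          # very low inherent process standard deviation
n = 50               # observations per factor level in the DOE
effect = 0.06        # observed shift in the output between levels
needed_shift = 2.0   # shift required to move the mean off the limit

# Two-sample z statistic (normal approximation, sigma assumed known)
se = sigma * math.sqrt(2 / n)
z = effect / se
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(f"p-value = {p:.4f}")                    # well below 0.05
print(f"fraction of needed shift = {effect / needed_shift:.1%}")
```

The p-value clears the 0.05 hurdle comfortably, while the factor delivers only about 3% of the required shift – "significant" and useless at the same time.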

So why would this happen anyway? I've seen some black belts "go by the numbers" (specifically p-values) without looking at the graphics – and without looking at the real picture of what's needed on the output side of the project.

A great teacher once told me to look at the graphs first… he was right.



Robert… absolutely right. I think some black belts get a little zealous, especially in the beginning stages of their training. So much stress is placed on statistical significance in some of the training I've seen that I think it's sometimes easy for new black belts to lose themselves in the statistics… could be an explanation…

I guess my question would have to be – where in the world did anyone get the idea that a "p-value criterion" is an adequate means of assessing the impact of a variable on a process?

As you noted, p-value is a measure of statistical significance only – it doesn’t tell you a thing about the actual physical changes you could expect to see in the output based on a unit change in the input.

If you have run a DOE and are interested in understanding the impact of a variable on your output, a common practice is to use regression methods to identify the direction and magnitude of the change in the output relative to the input.

After significance testing and final model construction, the usual procedure is a simple examination of the coefficients of the terms left in the regression equation. This examination tells you everything you need to know about the physical changes you could expect from a significant independent variable. If those physical changes are of no practical benefit, then what you have demonstrated is that your measurement system is capable of detecting changes in the process that don't matter.
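The coefficient check described above can be sketched in a few lines. The coded factor levels and responses below are hypothetical; for a ±1 coded factor, the fitted slope is half the low-to-high effect, which is the physical change you then compare against what the project actually needs:

```python
# Sketch: after fitting a model to coded DOE data, inspect the
# coefficient to see the physical change per unit change in the
# factor. Data are hypothetical.

# Coded factor levels (-1/+1) and responses from a hypothetical DOE
x = [-1, -1, -1, -1, 1, 1, 1, 1]
y = [9.81, 9.79, 9.80, 9.82, 9.85, 9.87, 9.86, 9.84]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Least-squares slope: covariance of x and y over variance of x
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# Predicted change from the low (-1) to the high (+1) setting
print(f"coefficient b1 = {b1:.4f}")
print(f"low-to-high effect = {2 * b1:.4f}")
```

If the project needs the output moved by, say, 2.0 units and the low-to-high effect is 0.05, the coefficient has told you all you need to know, regardless of the p-value.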

Analytically speaking, using the p-value alone is like driving without looking out the windshield – all you know is that you're moving.

I was taught three very important principles when being trained. First, look for practical significance: can I really expect an effect? Does it make sense with any prior wisdom or knowledge?

Next, I was taught to look for the same relationship in a graphical output. Did it agree with my practical expectation?

Third, and most often last, the analysis of the DOE was considered – the analytical relationship. It, too, must agree with the expectation of practical knowledge and with the graph of the data. We were taught this as Practical, Graphical and Analytical – PGA. This has Never failed me in ANY assessment.