# degree of freedom for DoE

• #48000

Raymond Tate
Participant

I am running a DoE with three factors, two levels each, and four replicates. Why are the degrees of freedom for each factor equal to 1?

#160601

Dennis Craggs
Participant

Think of the total number of degrees of freedom as the number of effects or coefficients that can be estimated. For example, it is possible to estimate n coefficients with n data points. In DOE, the effect on a dependent variable of a factor moving from one level to another is being analyzed. In your problem, each factor has two levels, so a single coefficient is required to quantify the effect. One degree of freedom is used for this coefficient. If a factor had three levels, then two coefficients and two degrees of freedom would be used. I hope this helps and is clear.

#160602

Raymond Tate
Participant

Thank you very much.

#160612

The New MB
Member

DOF is  always n-1

#160614

accrington
Participant

No it isn’t. Who told you that?

#160617

Robert Butler
Participant

As an aside, if you haven’t already run the experiments I’d recommend you run your analysis after just one run of the design and then decide if you want to run even another full replicate.  As I’ve said on this forum many times the whole point of DOE is minimum effort for maximum information and, unless you have some really compelling reason for doing so, 4 complete replicates of a design is overkill.
There have only been a few times in my career when I’ve replicated an entire design and every time it was because I was ordered to do so.  In each instance I ran an analysis after the first run of the design and in each case the difference between what I learned from just analyzing the first run and analyzing all of the replicates was negligible.
If you have already run the experiments I’d recommend you analyze the results from just the first run of the design and compare your findings to the results of analyzing all four replicates and examine the differences for yourself.

#160626

Mayes
Participant

If the degree of freedom for the factors depends on the number of levels, then why is the degree of freedom for the error 24?

#160628

annon
Participant

Hey Robert,
It is my understanding that determining experimental error (EE) and the resulting SE is possible using replicates, or through invoking the sparsity of effects assumption and thus using the higher-order interaction(s) as your source of error or noise.  That being said:

Are you recommending not using replicates for this determination, or simply that four is excessive?  Do you always need an estimate of EE or SE for effective analysis?
How are you analyzing the effects, and doesn’t this affect your use of EE?  For example, ANOVA, least squares, etc.
How do you determine which of the analysis methods mentioned above to use?  It would appear that all one needs to know is the effect, the standard error, and the ratio of the two for simple screening and characterization.  For example, ANOVA seems a bit unnecessary to me, requiring multiple assumptions and lacking robustness.  Is this accurate or am I missing something?
THANKS!!!!!!

#160631

annon
Participant

accrington,
I think you posted a helpful statistical link in a previous post, but I forgot it. I thought it was statsoft.com, but when I tried it, all I got was a home page for products and services, no educational material. Any advice?  Thanks!

#160632

Dennis Craggs
Participant

Assuming you are running a full factorial experiment, then with 3 factors at two levels, the full factorial experiment requires 8 runs (2^3). With 4 replicates, there are 32 runs (8*4). Now, analyzing the full factorial experiment requires 1 df for the grand average, 3 degrees of freedom for the main effects (F1, F2, F3), 3 degrees of freedom for the 2nd order interactions (F1xF2, F2xF3, F1xF3), and 1 degree of freedom for the 3rd order interaction (F1xF2xF3). This uses up 8 degrees of freedom, leaving 24 degrees of freedom for the error term.
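
The bookkeeping in this post can be sketched in a few lines of Python. This is only an illustration of the counting argument, assuming a full factorial with every interaction in the model; the function name `df_budget` and its output labels are made up for this example and do not come from Minitab or any other package.

```python
from itertools import combinations
from math import prod


def df_budget(levels, replicates):
    """Degrees-of-freedom table for a replicated full factorial.

    levels     -- number of levels for each factor, e.g. [2, 2, 2]
    replicates -- number of times the whole design is run
    Assumes every main effect and interaction is kept in the model.
    """
    k = len(levels)
    runs = prod(levels) * replicates
    table = {"grand average": 1}
    for order in range(1, k + 1):
        # A term involving a given set of factors uses prod(levels - 1) df.
        table[f"order-{order} terms"] = sum(
            prod(levels[i] - 1 for i in combo)
            for combo in combinations(range(k), order)
        )
    # Whatever is not spent on the model is left for error.
    table["error"] = runs - sum(table.values())
    table["total runs"] = runs
    return table


print(df_budget([2, 2, 2], 4))
# {'grand average': 1, 'order-1 terms': 3, 'order-2 terms': 3,
#  'order-3 terms': 1, 'error': 24, 'total runs': 32}
```

Calling `df_budget([2, 2, 2], 1)` gives 0 error degrees of freedom for the unreplicated design, and `df_budget([2, 2, 2], 2)` gives 8.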

#160634

Raymond Tate
Participant

Thank you Dennis. Very clear indeed!!!!!!!!!!

#160635

Regression
Participant

Raymond,
In your original post, did you run the DOE as a General Linear Model with 3 main effects, but without the two-way interactions and the three-way interaction? If so, that explains why you had 1 degree of freedom per factor. In this case you should have 28 degrees of freedom for error, which together with the 3 for the main effects sums to 31 degrees of freedom. If you run the DOE with 3 factors (at two levels), the three two-way interactions, and the three-way interaction, you will get Chad’s results.
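
A small Python sketch of the two ANOVA tables being compared here, assuming 32 runs and 1 df per term. The helper `anova_df` is a hypothetical name used only for illustration. Note that an ANOVA table usually reports the corrected total (32 - 1 = 31), which is why the total here is 31 rather than the 32 obtained when the grand average is counted as a term.

```python
def anova_df(n_runs, term_dfs):
    """Split the corrected total df (n_runs - 1) into model and error."""
    total = n_runs - 1  # corrected total: the grand average is not listed
    model = sum(term_dfs.values())
    return {"model": model, "error": total - model, "total": total}


# Main effects only: interactions are left out, so their df fall into error.
main_effects_only = anova_df(32, {"A": 1, "B": 1, "C": 1})

# Full model: three two-way interactions and the three-way interaction added.
full_model = anova_df(
    32,
    {"A": 1, "B": 1, "C": 1, "A*B": 1, "A*C": 1, "B*C": 1, "A*B*C": 1},
)

print(main_effects_only)  # {'model': 3, 'error': 28, 'total': 31}
print(full_model)         # {'model': 7, 'error': 24, 'total': 31}
```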

#160694

accrington
Participant

What was the topic?

#160708

annon
Participant

Shoot, I forget, perhaps referring to a question about R sq…but that’s a wag…the site itself was fairly informative at a high level and I thought it was statsoft….but I can’t seem to find it again….thanks again.

#160719

Dennis Craggs
Participant

The basic experiment is 3 factors with two levels for each factor. The number of runs required for a factorial experiment is 8 (2^3). This allows calculation of the grand average, 3 first order effects, 3 2-factor interactions, and one 3-factor interaction. Estimating these parameters does not leave any degrees of freedom for experimental error since 8 items have been estimated from 8 data points. There are two possible choices:
1) Consider Minitab’s ANOVA table. The right column shows the statistical significance of each term. If a term’s p-value exceeds some critical value, perhaps 0.1 (i.e., p > 0.1), then remove that term from the analysis. Minitab will then use its degrees of freedom for the error estimate.
2) Replicate the experiment to allow error estimation. Running one replicate will be sufficient. It provides 8 degrees of freedom for experimental error.

#160730

accrington
Participant

This site maybe? http://www.keithbower.com.
It has some good stuff in it

#160738

Robert Butler
Participant

Annon,
There are several possibilities here. Which one you use will depend on time/cost/efficiency constraints.  If you are really pressed for time and money you can run the single design without replicates and use the estimate of the 3 way for your error.  I’ve had to do this in the past but it is not my preference.  A single replication of a single run, for a total of 9 runs, will give you the estimate of the error term.
If all of the variables are continuous and if you have the time and money to spare, my preference in this case would be to run two center points.  This gives you replication on the center, allows you to estimate and examine all of the mains, two way and the three way interaction, gives you an error estimate, and allows a check for possible curvilinear behavior.  In my experience a replication of one or two experiments in the design is all that is needed for an acceptable research effort.
All of the work I’ve done with DOE has had as its focus the development of a predictive equation, so the method of analysis has been regression.  One could use ANOVA to analyze a design, but this would only tell you that a variable was significant. With additional work you could get some estimate of the magnitude and direction of the variable’s effect on the outcome of interest, but why bother with all of the extra work when regression will give that information to begin with?
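
To illustrate the point about regression giving magnitude and direction directly, here is a minimal Python sketch for an unreplicated 2^3 design in coded (-1/+1) units. Because the columns of such a design are orthogonal, each least-squares coefficient is just the sum of (column × response) divided by the number of runs. The function name, the factor labels A, B, C, and the response values are all assumptions made for this example; the responses are synthetic, generated from a known equation so the recovered coefficients can be checked.

```python
from itertools import combinations, product
from math import prod


def fit_two_level_factorial(responses, names=("A", "B", "C")):
    """Least-squares coefficients for an unreplicated 2^k full factorial.

    Responses must be listed in the run order produced by
    itertools.product((-1, 1), repeat=k), i.e. the last factor changes fastest.
    """
    k = len(names)
    runs = list(product((-1, 1), repeat=k))
    if len(responses) != len(runs):
        raise ValueError("expected one response per factorial run")
    n = len(runs)
    coefs = {"intercept": sum(responses) / n}
    for order in range(1, k + 1):
        for combo in combinations(range(k), order):
            # Coded column for this term: product of the factor settings.
            column = [prod(run[i] for i in combo) for run in runs]
            label = "*".join(names[i] for i in combo)
            coefs[label] = sum(x * y for x, y in zip(column, responses)) / n
    return coefs


# Synthetic responses (not real data): y = 10 + 2A - 3B + 0.5AB, no noise.
y = [10 + 2 * a - 3 * b + 0.5 * a * b for a, b, c in product((-1, 1), repeat=3)]
coefs = fit_two_level_factorial(y)
for term, value in coefs.items():
    print(f"{term:10s} {value:+.2f}")
```

The sign of each coefficient gives the direction of the effect, and twice the coefficient is the change in the response when that factor moves from its low to its high level. With all 8 terms fitted from 8 runs there is nothing left over for error, which is why replication, center points, or dropping terms is needed before anything can be tested for significance.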

#160739

Dennis Craggs
Participant

Most of the discussion is around a 3 factor 2-level experiment. This does allow one to create a predictive equation that is linear in the significant terms. If the linear model is accepted, then an optimum point will reside on one of the test conditions. A more general approach is to use response surface methods, which include squared terms to be examined. If these are significant, an optimum can be calculated that will probably not lie at one of the test conditions.

Replication and/or pooling of statistically insignificant terms will allow one to estimate experimental error. But arbitrarily selecting a term to use as error is dangerous.

#160744

Regression
Participant

great site … but only in america :-))))

#160747

annon
Participant

A,
That’s not it, but just as good!  Thanks again!
JB

#160750

annon
Participant

RB,
First, thanks so much.  Very insightful.  As usual, I have a couple of questions:

So running a single replicate (i.e., one additional run) provides an adequate measure of experimental error?  Is there not a concern with so few DOF?
Would combining a standard 2-level, 3-factor full factorial with a center point with a follow-up central composite design (with an additional center point) provide the necessary replication for the EE calculation?  This would appear to allow a fairly straightforward approach to first-order optimization, as well as allowing for second-order investigation and an EE calculation, all in approximately 15 runs. Does that sound about right?
THANKS!!
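
The run count in this question can be checked with a short Python sketch that lists the points of a three-factor central composite design in coded units. The function name `central_composite` is hypothetical, and the default axial distance (the rotatable choice, the fourth root of the number of factorial runs, about 1.682 for three factors) is an assumption; the thread does not say which axial distance would be used.

```python
from itertools import product


def central_composite(k=3, n_center=2, alpha=None):
    """Coded design points: 2^k factorial + 2k star points + center points."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable axial distance (assumed choice)
    factorial = [tuple(float(x) for x in p) for p in product((-1, 1), repeat=k)]
    star = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha  # move one factor out along its axis
            star.append(tuple(point))
    center = [tuple([0.0] * k)] * n_center
    return factorial + star + center


design = central_composite()
print(len(design))  # 16 = 8 factorial + 6 star + 2 center
```

With a single center point the total is 15, which matches the figure in the question; with one center point in the factorial stage and another in the follow-up stage it is 16, and the two center runs supply 1 degree of freedom for pure error.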

#160752

Robert Butler
Participant

To your first point you have to ask yourself the following:  What is the purpose of your DOE?  Are you running a design so you can get a great estimate of experimental error or are you running a design so you can identify those variables that are having a major impact on your process?
I suppose if you have a situation where every single term in the design (mains, 2 way, 3 way, curvilinear, etc.) tests significant then you may want to worry about the number of degrees of freedom you have planned to allot to your error.  While I can’t say this will never happen I can say I have never seen anything remotely resembling this situation.  What this means is that all of the degrees of freedom associated with factors that aren’t significant get rolled into the final estimate of error for the study and the end result is an estimate of error with a reasonable number of degrees of freedom.
If you are uncomfortable with this idea then I’d recommend planning your experimental effort to budget for a single full replicate of the design.  Your strategy will be to first run a complete design and then pick one or two of the design points from the full replicate and run these.  Once the results of the design plus the one or two extra runs are in, analyze your data.  I strongly suspect that the results of the analysis will cause all concerned to lose interest in further replication and focus instead on model confirmation and process optimization.
To your second point – you could certainly do that – essentially run star points of the design after running the factorial and the center.  Another possibility – if you are limited with respect to additional runs would be to take the factorial and the center point and use them as a basis for design augmentation with the augmentation asking for mains, two way, and curvilinear on all three factors.

#160753

annon
Participant

RB,
Thank you sir.  Much appreciated.

#160760

annon
Participant

Does anyone have a reference for how to exhaustively examine the residuals when looking for adequate fit?  I want to better understand what the different graphical patterns are telling me…
Is there a relationship between the value of the effects (main versus interaction) and the order of the model?

Can one determine if the model displays primarily linear effects or if a second order model is required from an analysis of the effects alone, or is the residual analysis the dominant method?
THANKS

#160764

Lila STANCY
Participant

Could you guide us to an example where a combination of ANOVA and DOE exists? Thanks.

#160767

annon
Participant

You can see an example of DOE with ANOVA analysis in the MTB (Minitab) help sections.

#160768

Robert Butler
Participant

Chapter 3 “The Examination of Residuals” – Applied Regression Analysis-2nd Edition – Draper and Smith is the best one chapter discussion of residual examination of which I’m aware (if there is a later edition I’d go with the same chapter heading in that version).
If you want a whole book – Residuals and Influence in Regression – Cook and Weisberg should keep you occupied.

#160769

annon
Participant

Thank you sir.
