# Why do we care about df?


• #35582

Dev Shrestha
Participant

Someone still needs to explain why it is important to consider the so-called degrees of freedom. Say, in ANOVA?

#100462

GGBB
Participant

If you were thirsty and wanted something to quench your thirst, and you had water, juice, or beer in your fridge, the outcome of your behaviour would be quite different than if all you had was some water. Hope you're with me so far.
What's your degree of freedom in the latter scenario? None: you drink water. But in the former scenario you've got more options, so your df is higher.
So?
Well, the same goes for all the other things, like MSS, SS, etc., that you see in, say, ANOVA when the df changes.
Let's take a simple case. Statistics is mostly about prediction. If I told you that in (a+b+c)/2=d we had an idea about a, b, and c, that tells you all you need about d. But if I were to tell you we had no idea what, say, c was, now you have d misbehaving all over again, and so on and so forth. The greater the degrees of freedom, the more the implications of the outcome differ. That shall do for now. If you don't bother about it, well, the results won't bother to help you either. I suggest you read Statistics for Managers by Levin et al.
All the best, and do take care of the df.
GGBB

#100464

Hemant Gham
Participant

Why Degrees of Freedom or n-1?

The use of DF or n1 can be considered as a mathematical device used for the purpose of deriving an unbiased estimator of the population variance.

n1 is referred to as degrees of freedom.

When the total sums-of-squared deviations is given and the pair-wise deviation contrasts are made for n observations, the last contrast is fixed; hence, there are n1 degrees of freedom from which to accumulate the total.

More specifically, degrees of freedom can be defined as (n1) independent contrasts out of n observations. For example, in a sample with n = 5, measurements X1, X2, X3, X4, and X5 are made. The additional contrast, X1X5, is not independent because its value is known from

(X1X2) + (X2X3) + (X3X4) + (X4X5) = (X1X5)

Therefore, for a sample of n = 5, there are four (n1) independent contrasts of  degrees of freedom. In this instance, all but one of the contrasts are free to vary in magnitude, given that the total is fixed. But when n is large, the degree of bias is small; therefore, there is little need for such a corrective device.
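A minimal sketch of the contrast argument, in Python. The five measurements are made up for illustration; only the bookkeeping matters.

```python
# Five made-up measurements standing in for X1..X5.
x = [12.1, 11.8, 12.6, 12.0, 11.5]

# The n - 1 adjacent contrasts: (X1 - X2), (X2 - X3), (X3 - X4), (X4 - X5).
adjacent = [a - b for a, b in zip(x, x[1:])]

# The extra contrast X1 - X5 carries no new information: it is their sum.
extra = x[0] - x[-1]
print(len(adjacent))                        # 4
print(abs(sum(adjacent) - extra) < 1e-9)    # True

# Same idea with deviations from the mean: they must sum to zero,
# so once n - 1 of them are known the last one is fixed.
mean = sum(x) / len(x)
dev = [xi - mean for xi in x]
last_from_others = -sum(dev[:-1])
print(abs(last_from_others - dev[-1]) < 1e-9)   # True
```

Changing any of the values in `x` changes the contrasts, but the two checks still hold, which is the sense in which only n - 1 pieces are free to vary.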

Example: An experiment was run on two different days, Monday and Thursday. The company involved three operators, O1, O2, and O3. The wear-testing process was run at three settings, A, B, and C. Each operator made two determinations of wear on each day at each setting (mixed-model ANOVA).

So we have the following DF.

| Source | DF | Independent contrasts |
|---|---|---|
| Day | 1 (2 - 1) | One: (Mon - Thu) |
| Operator | 2 (3 - 1) | Two: (O1 - O2) and (O2 - O3) |
| Setting | 2 (3 - 1) | Two: (A - B) and (B - C) |
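The main-effect df in this example follow a levels-minus-one rule, and the total df follow from the run count. A small sketch, assuming a fully crossed design with the levels and replicates stated in the example. How the remaining df split between interactions and error depends on the model chosen, so they are only totalled here.

```python
from math import prod

# Factor levels taken from the example; 2 determinations per cell.
levels = {"Day": 2, "Operator": 3, "Setting": 3}
replicates = 2

# Each main effect has (levels - 1) degrees of freedom.
main_effect_df = {name: k - 1 for name, k in levels.items()}

# Total observations, and total df (one df is spent on the grand mean).
n_obs = prod(levels.values()) * replicates
total_df = n_obs - 1

# What the main effects do not use is left for interactions and error.
remaining_df = total_df - sum(main_effect_df.values())

print(main_effect_df)                   # {'Day': 1, 'Operator': 2, 'Setting': 2}
print(n_obs, total_df, remaining_df)    # 36 35 30
```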

#100465

Hemant Gham
Participant

For e.g. n1 = n – 1
Also read n as ‘ n ‘ and
(X1X2) + (X2X3) + (X3X4) + (X4X5) = (X1X5) as (X1-X2) + (X2-X3) + (X3-X4) + (X4-X5) = (X1-X5)
Regards,
Hemant

#100478

Ken Feldman
Participant

The two responses so far have described what df is. I believe the original poster asked why he or she should care about it. What actions are taken, one way or the other, based on the df?

#100486

Hemant Gham
Participant

Darth,
Both the what and the why are answered.
Regards,
Hemant

#100490

Bart
Participant

In many uses, such as ANOVA, the degrees of freedom are essentially an “adjusted” measure of sample size for the sample used to estimate the respective variance(s).
If you divide the sum of squared deviations from the sample mean by n instead of (n-1), you will get a biased estimate of the population variance. That means that, averaged over many samples, the estimate comes out at ((n-1)/n)*sigma^2 rather than sigma^2, so it runs low, most noticeably for small n. By dividing by (n-1) instead of just n, we create an unbiased estimate of the population variance, one whose average over many samples is the true value of sigma^2.
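A quick simulation can illustrate the bias. This is a sketch rather than anything from the original post: the normal population, sigma^2 = 4, n = 5, and the number of trials are all arbitrary choices.

```python
import random

random.seed(0)       # fixed seed so the run is repeatable
sigma2 = 4.0         # assumed true population variance (sigma = 2)
n = 5                # small sample, where the bias is easiest to see
trials = 20000

sum_div_n = 0.0
sum_div_n_minus_1 = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
    sum_div_n += ss / n                         # divide by n
    sum_div_n_minus_1 += ss / (n - 1)           # divide by df = n - 1

mean_biased = sum_div_n / trials
mean_unbiased = sum_div_n_minus_1 / trials

print(mean_biased, "vs", (n - 1) / n * sigma2)   # should be close to 3.2
print(mean_unbiased, "vs", sigma2)               # should be close to 4.0
```

With n = 5 the divide-by-n estimate averages about 20% low, while the divide-by-(n-1) estimate averages close to the assumed sigma^2. Raising `n` shrinks the gap.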
In the ANOVA, the F-statistic is simply a ratio of variances, so each of these variances needs to be calculated using the appropriate degrees of freedom so that it is not biased.
Keep in mind that the idea behind most statistical tests is that we find a test statistic whose distribution can be predicted under the null hypothesis assumption. If the test statistic calculated from your data shows sufficient evidence of not coming from the hypothesized distribution, then we may decide that the null hypothesis is not true.
In the case of ANOVA, the test statistic is the ratio of two variances. The degrees of freedom are a necessary bit of information when trying to predict the distribution of the F-statistic under the null hypothesis assumption that there are no significant model terms (e.g. no differences between the means in a one-way ANOVA).
The same thing can be said about predicting the distribution of the t-statistic under the null hypothesis for a t-test.
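To see that the null distribution of F depends on the df, one can simulate it. The sketch below is an illustration under assumed settings (a one-way layout, three groups, standard normal data with no real differences); the helper names `f_statistic` and `simulated_cutoff` are made up for this example.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F = MS_between / MS_within, each SS divided by its df."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def simulated_cutoff(k, n_per_group, trials=5000, level=0.95, seed=1):
    """Empirical quantile of F when the null hypothesis is true."""
    rng = random.Random(seed)
    fs = sorted(
        f_statistic([[rng.gauss(0, 1) for _ in range(n_per_group)] for _ in range(k)])
        for _ in range(trials)
    )
    return fs[int(level * trials)]

cutoff_small = simulated_cutoff(k=3, n_per_group=3)     # df = (2, 6)
cutoff_large = simulated_cutoff(k=3, n_per_group=20)    # df = (2, 57)
print(cutoff_small, cutoff_large)
```

The two simulated cutoffs should land near the tabulated 5% critical values, roughly 5.1 for (2, 6) df and 3.2 for (2, 57) df. The same F value can therefore be unremarkable with few error df and significant with many, which is why the df have to be stated alongside the statistic.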

#100494

Sean
Member

You have clearly explained the “what”, but not the “who cares”.  You state that it is used for the “purpose of deriving an unbiased estimator of the population variance”, but the questions “so what?” and “who cares?” still stand.  Please explain further if you could…
SP

#100535

Bart
Participant

If the estimate of sigma is biased, then the calculated F-statistic may not follow the theorized F distribution, even if the null hypothesis is true. The result could be a much greater tendency to reject the null hypothesis than expected. You’ll see differences where differences don’t exist.

#100541

abasu
Participant

Think of df as statistical currency. You have to spend df to get information, for example 1 df to calculate the sample mean, and so on.
One instance where you care about df is when analyzing single-replicate DOEs. With every term in the model there are no df left over for error, so Minitab (or other software) will not generate p-values for the different terms. So one trades the df associated with higher-order interactions to get an estimate of experimental error and p-values.
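The df budget for a single-replicate design can be tallied in a few lines. This is a sketch assuming a two-level full factorial with four factors (an illustrative choice, not from the post), where every effect costs 1 df.

```python
from math import comb

k = 4                    # number of two-level factors (illustrative)
runs = 2 ** k            # one replicate of the full factorial
total_df = runs - 1      # one df is spent on the grand mean

# In a two-level design each effect uses 1 df, and there are
# C(k, order) effects of each order (1 = main effects, 2 = two-way, ...).
effects_by_order = {order: comb(k, order) for order in range(1, k + 1)}

# Full model: every effect is estimated, nothing is left for error.
error_df_full = total_df - sum(effects_by_order.values())

# Reduced model: give up the three-way and four-way interactions
# and use their df as the error estimate.
error_df_reduced = total_df - sum(
    count for order, count in effects_by_order.items() if order <= 2
)

print(effects_by_order)     # {1: 4, 2: 6, 3: 4, 4: 1}
print(error_df_full)        # 0
print(error_df_reduced)     # 5
```

With 0 error df there is no denominator for an F or t test, which is why no p-values appear; dropping the higher-order terms buys back 5 df in this case.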
