I would like to reply to Drew: Six Sigma implies six, not three, standard deviations between the mean and each specification limit. If you believe there are only 3 SD between the mean and the closest specification limit, consider what happens when there is only one specification limit (USL or LSL). In that case a process could never be Six Sigma if there were only 3 SD between the mean and the closest specification limit.

Nice article to begin the discussion of ANOVA.

In addition to the previous comment, I am not sure your definition of “Six Sigma” is correct.

“(“Sigma” is the statistical notation used to represent one standard deviation; the term “Six Sigma” is used to indicated that a process is so good that six standard deviations – three above and three below the mean – fit within the specification limits)”

I believe that Six Sigma refers to the number of standard deviations that fit between the mean and the closest specification limit.

Test for equal variances: this is one of the assumptions associated with ANOVA, and should be performed. This is where you may be concerned with the low sample sizes that Norbert indicated. There may be an interesting story with Site B: they have the highest mean, but they appear to have the least variation. This could point out one of those practical differences as well.
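For readers who want to run this check outside Minitab, here is a minimal sketch of an equal-variances check using Levene's test in SciPy. The site values are hypothetical stand-ins, not the article's actual data.

```python
from scipy import stats

# Hypothetical subgroup data for the three sites (n = 5 each).
site_a = [10.2, 9.8, 10.5, 9.9, 10.1]
site_b = [11.0, 11.1, 10.9, 11.2, 11.0]
site_c = [10.4, 10.0, 10.8, 9.7, 10.6]

# Levene's test is fairly robust to non-normality; a small p-value
# would suggest the equal-variances assumption of ANOVA is violated.
stat, p = stats.levene(site_a, site_b, site_c)
print(f"Levene statistic = {stat:.3f}, p-value = {p:.3f}")
```

With small subgroups like these, the test has limited power, which is exactly the concern raised about the sample sizes.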

Multiple comparisons: this can help narrow the identification of practical differences. One note of caution: Minitab does return confidence intervals for the groups, but these are individual CIs. Formal multiple comparisons assist in identifying which groups actually differ based on the “family error rate.” I believe Minitab offers the ability to explore multiple comparisons.
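As a sketch of what a formal multiple-comparison procedure looks like, SciPy's `tukey_hsd` (available in recent SciPy releases) adjusts the pairwise p-values to control the family error rate rather than the individual error rate. The data here are hypothetical.

```python
from scipy import stats

# Hypothetical subgroup data for the three sites.
site_a = [10.2, 9.8, 10.5, 9.9, 10.1]
site_b = [11.0, 11.1, 10.9, 11.2, 11.0]
site_c = [10.4, 10.0, 10.8, 9.7, 10.6]

# Tukey's HSD compares every pair of groups while controlling the
# family error rate, unlike inspecting individual CIs one at a time.
res = stats.tukey_hsd(site_a, site_b, site_c)
print(res)  # table of pairwise mean differences and adjusted p-values
```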

ANOVA is a great tool when the X’s are attribute (categorical) and the Y is quantitative.

My issue is always the very low sample size per subgroup. A sample size of 5 seems very low. As I learned, in order to estimate the standard deviation properly we need a minimum of 10 data points per subgroup, and I always encourage Green Belts to do so. With a sample size lower than 10 I would always look at the range instead.
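This point about small subgroups can be illustrated with a quick simulation: drawing many subgroups of size 5 and size 10 from the same normal population and comparing how much the sample standard deviation estimates scatter. This is an illustrative sketch, not data from the article.

```python
import random
import statistics

random.seed(42)

def sd_spread(n, trials=2000):
    """Spread (sd) of sample-sd estimates from subgroups of size n,
    drawn from a standard normal population (true sd = 1)."""
    estimates = [statistics.stdev([random.gauss(0, 1) for _ in range(n)])
                 for _ in range(trials)]
    return statistics.stdev(estimates)

print(f"spread of sd estimates, n=5:  {sd_spread(5):.3f}")
print(f"spread of sd estimates, n=10: {sd_spread(10):.3f}")
# The n=10 estimates cluster noticeably more tightly around the
# true value of 1 than the n=5 estimates do.
```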

I know it is just an example and I fully agree with the point:

Something that is significant statistically has to be significant practically and vice versa

Regards,

Norbert

You have explained the graph very clearly. Keep it up!

It’s not clear how the box plots were constructed, but by default the box covers the middle 50% of the data, from the 25th percentile (first quartile) to the 75th percentile (third quartile). In that case it is wrong to say that they represent +/- 1 standard deviation as stated in Figure 2, unless they were constructed specifically that way. My software shows the same box plots, so the boxes are not one standard deviation from the median or mean (although they are quite close).

However, even if they were, using +/- one standard deviation would not be a test at the 95% confidence level, and a test of means would use not the standard deviation of the data for each group but the standard error of the mean, which is the data standard deviation divided by the square root of n.
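The distinction between the data's standard deviation and the standard error of the mean is a one-line calculation (hypothetical subgroup of n = 5):

```python
import math
import statistics

data = [10.2, 9.8, 10.5, 9.9, 10.1]   # hypothetical subgroup, n = 5

s = statistics.stdev(data)            # sample standard deviation
se = s / math.sqrt(len(data))         # standard error of the mean

print(f"sd = {s:.4f}")
print(f"se = {se:.4f}")  # smaller than sd by a factor of sqrt(5)
```

Confidence intervals for a group mean are built from `se`, which is why they are much narrower than the spread of the raw data.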

More importantly, the box plots are unnecessary as the two analyses are included in the output shown in Figure 1. You correctly stated that the p value is 0.007 indicating that there is a statistical difference among two or more groups. Since we have three groups, we must then determine which groups are different.
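For completeness, the kind of one-way ANOVA p-value that Minitab reports in Figure 1 can be reproduced with SciPy's `f_oneway`. The values below are hypothetical, so the resulting p-value will not match the article's 0.007.

```python
from scipy import stats

# Hypothetical subgroup data for the three sites.
site_a = [10.2, 9.8, 10.5, 9.9, 10.1]
site_b = [11.0, 11.1, 10.9, 11.2, 11.0]
site_c = [10.4, 10.0, 10.8, 9.7, 10.6]

# One-way ANOVA: a small p-value suggests at least one group mean
# differs, but not which one - hence the follow-up examination of
# the group confidence intervals.
f, p = stats.f_oneway(site_a, site_b, site_c)
print(f"F = {f:.3f}, p-value = {p:.4f}")
```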

To see which means are different you need to look at the confidence intervals shown at the bottom of Figure 1. They are the dotted lines with parentheses at each end and an asterisk in the middle, indicating the sample mean for that group. The means that do not overlap are statistically different.

Thus, we see that Site B mean is statistically different (at 95% confidence) from Site A mean but NOT statistically different from Site C mean. Site A and Site C means are also not statistically different.

Now we have a conclusion that is not uncommon in statistical analyses: Site B’s mean is not different from Site C’s mean, which is not different from Site A’s mean. Doesn’t that imply that Site B’s mean is not different from Site A’s mean? No, because we are only testing whether we can detect a difference, not whether the means are mathematically identical.

Think of it this way, using colors. White is not the same as black, but we can show shades of gray from white to black such that any two consecutive shades are not visibly different to you or me. To see whether they are different we would need a more powerful “eye.” Similarly, in this case Site B’s mean may actually be different from Site C’s mean, but the sample size was too small to provide a powerful enough “eye.”

One minor comment: The title of the article is “When Does a Difference Matter?” I was expecting a discussion on practical significance and not on statistical significance. Perhaps the next article.

David
