Thank you for your detailed comment. I admit that my investigation was for the most part intellectual. I began evaluating estimates of sigma on a known normal set of data and couldn’t stop. I agree that there is limited application to this work, and for the most part the classic standard deviation formula is pretty robust. I do think that the IQR/1.55 formula has some validity when one knows the data should be normally distributed, but an outlier is found. However, I do want to look deeper into your comment that “formula A adjusted by dividing by c4” is good for all samples. Obviously a hole in my education. Thanks!

I will say that capability (Cp/Pp) studies in business get taken beyond evaluating the ratio. In the drive for “six sigma” capability, the tails of the data’s distribution become very important, which makes our assumption about the distribution of the data (and the calculation of variation) important. Of course, calculating the confidence limits of Cp or Pp helps account for error due to small sample sizes.

I hope that my article, along with insightful comments such as yours, generates a discussion in which we all learn new things.

Thanks!!

]]>Nice thought provoking article, thanks for sharing.

]]>My conclusion: I’ll stick with the Shewhart techniques for estimating s.d. for control charts.

I’d assume we all agree that any estimate of the population s.d. based on small samples carries a wide degree of uncertainty, so I’m not sure anyone could advocate strongly for a calculation other than the classic ones. (I’m not focusing on control chart applications, on the assumption that we agree there.) I think it would be neat to study small-sample calculations with these techniques to see whether the uncertainty is any better.

Thanks for the read. Truly enjoyed it.

]]>You do mention in the introduction that your well-detailed analysis of sigma estimates was spurred by a question on capability. In that case, you identify two critical characteristics of estimates:

“1. The center of the estimates would be equal to the true population sigma.

2. The variation of the estimated sigma would be tighter than the variation observed when using the classic formula for standard deviation.”

The first is also known as unbiasedness and the second concerns the variance of the estimate. Thus, we typically want an estimate that is unbiased with minimum variance. But sometimes it is not possible to have both characteristics, and so some bias is often accepted to achieve less variation. The mean-squared error (MSE) captures this trade-off by combining the two characteristics into one formula: MSE equals the squared bias plus the variance. Using this criterion, an estimate with the smallest MSE would be considered best.
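To make the MSE idea concrete, here is a minimal Python sketch of the decomposition into squared bias plus variance. The function name `mse_decomposition` and the sample numbers are my own illustration, not anything from the article; it assumes you already have a set of repeated estimates and know the true value (as in a simulation).

```python
from statistics import fmean


def mse_decomposition(estimates, true_value):
    """Return (bias, variance, mse) for repeated estimates of a known value.

    Hypothetical helper for illustration. Variance here is the population
    variance of the estimates around their own mean, so that
    mse == bias**2 + variance holds exactly (up to rounding).
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    center = fmean(estimates)
    bias = center - true_value
    variance = fmean((e - center) ** 2 for e in estimates)
    mse = fmean((e - true_value) ** 2 for e in estimates)
    return bias, variance, mse


# Illustrative (made-up) sigma estimates when the true sigma is 1.0
bias, variance, mse = mse_decomposition([0.9, 1.1, 1.0, 0.8], 1.0)
print(f"bias={bias:.4f} variance={variance:.4f} mse={mse:.4f}")
```

Two estimators can then be compared on the single `mse` number, which penalizes both being off-center and being noisy.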

In your study you assumed a normal distribution. The formula A estimate, when squared, is an estimate of the variance. That formula is unbiased for the variance. However, A is biased for the standard deviation, especially for small samples. For example, for n = 10 the true population value is about 2.8% larger than what A gives on average, for n = 5 about 6% larger, and for n = 2 about 25% larger.

To remove the bias, we can divide the formula by what is known as the c4 factor (which depends on n). This factor and its related factors are used in control charts to remove the bias whether using the standard deviation or the range.
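For anyone who wants to compute the factor rather than look it up, here is a small Python sketch under the normality assumption. It uses the standard expression c4(n) = sqrt(2/(n-1)) * Γ(n/2) / Γ((n-1)/2); the function names `c4` and `unbiased_sd` are just illustrative choices of mine.

```python
import math
from statistics import stdev


def c4(n):
    """Bias-correction factor for the sample standard deviation of n
    normally distributed observations: E[s] = c4(n) * sigma."""
    if n < 2:
        raise ValueError("c4 is defined for n >= 2")
    # lgamma avoids overflow of the gamma function for large n
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)


def unbiased_sd(data):
    """Classic (n - 1) sample standard deviation divided by c4.
    Only unbiased if the data really are normal."""
    return stdev(data) / c4(len(data))


for n in (2, 5, 10):
    print(n, round(c4(n), 4))
```

For n = 2, 5, and 10 the factor comes out at roughly 0.798, 0.940, and 0.973, so the correction matters mostly for very small samples.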

If the distribution is not normal, then determining an unbiased standard deviation is more complicated. This brings us to the point of using outliers with small samples. In your example of n = 7, an outlier (however determined) occurred nearly 15% (1/7) of the time. Is this truly an outlier, or does it indicate that the distribution is not normal? Rather than using a test for outliers to exclude data, perhaps it should be used to exclude the assumption of normality. Or, better, to collect more data.

Thus, I would conclude that formula A adjusted by dividing by c4 should be used for all samples, if one assumes normality. However, one doesn’t need normality to calculate process capability, as it is simply the percent or proportion (or probability) of results being within specifications. There are numerous ways of estimating that.
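As one illustration of "proportion within specifications", here is a hedged Python sketch of two of those ways: a distribution-free count of the observed results, and a normal-model estimate for comparison. The function names, the spec limits, and the data are all made up for the example.

```python
from statistics import NormalDist


def proportion_within_spec(results, lsl, usl):
    """Distribution-free estimate: fraction of observed results that fall
    inside [lsl, usl]. Needs no normality assumption, but is coarse for
    small samples."""
    if not results:
        raise ValueError("need at least one result")
    inside = sum(1 for x in results if lsl <= x <= usl)
    return inside / len(results)


def normal_proportion_within_spec(mean, sigma, lsl, usl):
    """Model-based estimate: probability of falling inside [lsl, usl]
    IF the process is normal with the given mean and sigma."""
    dist = NormalDist(mean, sigma)
    return dist.cdf(usl) - dist.cdf(lsl)


# Made-up measurements with hypothetical spec limits 9.7 to 10.3
data = [9.8, 10.1, 10.4, 9.5, 10.0]
print(proportion_within_spec(data, 9.7, 10.3))
print(round(normal_proportion_within_spec(0.0, 1.0, -3.0, 3.0), 4))
```

The empirical version sidesteps the distribution question entirely, while the normal version is where the choice of sigma estimate (and the behavior of the tails) starts to matter.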

]]>Thanks again for the article!

]]>