# Unequal subgroup sizes

- March 15, 2010 at 10:07 pm #53359

Doa

Hi all,

I have a general question regarding data analysis. I am beginning a project to improve an existing process, given 118 samples of variable data. When performing the initial analysis (Xbar-R chart, etc.) with a subgroup size of three, Minitab produces the appropriate charts with the qualification “test performed with unequal sample sizes”.

What is the real or practical impact of these unequal sample sizes? I know the choice of subgroups can change the estimates of within-subgroup variation and what is considered stable or unstable.

This may be small stuff, but if one of the subgroups has two points and all the others have three, is there any real impact?

Thanks,

Marty

- March 15, 2010 at 10:12 pm #189762

Different subgroup sizes will change the calculation of the control limits. Did your graph happen to show the last subgroup with wider limits? 118 does not divide into equal subgroups if you use size 3. The leftover subgroup will end up with fewer data points and thus wider limits. Also, be careful: the control-limit values displayed on the chart are calculated for the last subgroup, not the others.
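To put a rough number on "wider limits": on an Xbar chart the limits for a subgroup sit at the grand mean plus or minus 3σ/√n, where n is that subgroup's size, so a smaller subgroup gets a wider band. A minimal Python sketch, assuming an illustrative grand mean of 10.0 and within-subgroup sigma of 1.0 (neither value comes from the thread, and `xbar_limits` is a name made up for this example):

```python
import math

def xbar_limits(grand_mean, sigma, n):
    """Return (LCL, UCL) for the mean of a subgroup of size n."""
    half_width = 3 * sigma / math.sqrt(n)
    return grand_mean - half_width, grand_mean + half_width

# Illustrative values only; not taken from the poster's data.
grand_mean, sigma = 10.0, 1.0

lcl3, ucl3 = xbar_limits(grand_mean, sigma, 3)
lcl2, ucl2 = xbar_limits(grand_mean, sigma, 2)

# Ratio of the two band widths: sqrt(3/2), about 1.22
widening = (ucl2 - lcl2) / (ucl3 - lcl3)
print(f"n=3: {lcl3:.3f} to {ucl3:.3f}")
print(f"n=2: {lcl2:.3f} to {ucl2:.3f}")
print(f"band is {widening:.2f}x wider for the subgroup of two")
```

Under these assumptions the band for a subgroup of two is about 22% wider than for a subgroup of three, which is the step seen at the end of the chart.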

- March 16, 2010 at 3:24 pm #189771

Doa

Hey Ken,

Nice, you got the correct icon! Yes, I am aware of the change in control limits: at the right end of the Minitab Xbar chart there is a step reflecting the “odd” sample and the change in control limit. But beyond this, are there any other impacts from the “odd” subgroup?

I’m just starting to get used to the latitude that one has regarding sample size and subgroup size. We just finished the ANOVA, Sample Size, and Power of Test sections… good stuff.

Thanks for your assistance….

Marty

- March 16, 2010 at 5:15 pm #189772

The Xbar-R chart is not the best chart to use when you have varying subgroup sizes.

I suggest you use the Xbar-S chart.

- March 16, 2010 at 7:04 pm #189775

Doa

Thanks for the response. I’m still somewhat of a greenhorn: why is Xbar-S preferable to Xbar-R? Are we not first supposed to look at the R chart to assess within-group variation, and then proceed to the Xbar chart?

Regards,

Marty

- March 16, 2010 at 8:19 pm #189776

In the old days, you didn’t have a choice but to do the R chart. Now it is just as easy to use the S chart. The s.d. is a better gauge of variation than the range since it uses all the data, not just the extremes in the subgroup. But the same issue will occur with the varying-width control limits.

- March 16, 2010 at 8:49 pm #189779

Doa

Thanks Ken…..

- March 16, 2010 at 10:35 pm #189785

I think you have me confused with someone else. I am Darth aka Stan aka HeeBee aka Stevo aka Truth Teller aka anybody else you want. But no Ken in the mix, sorry.

- March 17, 2010 at 12:34 am #189787

Don’t forget “evil Anakin”: cute as a kid, really annoying as a teenager, highly dangerous as an adult. Seems appropriate to me….

- March 17, 2010 at 3:45 pm #189801

Doa

Hi Stan,

Either way, thank you for the assistance… no slight intended. Great site, great resource…

Regards,

Marty

- March 18, 2010 at 3:59 am #189813

Andrew Banks

Darth:

I understand that the sd is a better gauge than range because it uses all the data rather than just the extremes, but isn’t the range a better gauge than sd for small (<10) data sets?

- March 19, 2010 at 6:07 pm #189846

Doa

Darth,

As a follow-up (I’m still getting comfortable with the repertoire of SSBB tools at one’s disposal): why not just run an I-MR chart, so that the question of subgroups is a non-issue?

My own greenhorn observation is that once one learns the methods and tools, it becomes, to a degree, a matter of which tool to choose. For example, to decompose process variation into its component parts (special cause and inherent/common cause), you could use ANOVA or the process capability indices. Of course, this depends on what level of precision or insight one needs.

Am I off base on this?

thanks again….

Marty

- March 19, 2010 at 7:06 pm #189848

Not sure if it is better, but it was certainly easier in the days when Shewhart decided to use it. It is still possible to have these data sets even in small subgroups:

1, 2, 3, 4, 5 or 1, 1, 1, 1, 5 or 1, 1, 3, 3, 5. All three ranges are the same (4), but the sd’s are not.
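A quick way to check this is to compute both statistics for the three example subgroups. The sketch below uses only Python's standard library and the sample standard deviation (n − 1 in the denominator):

```python
import statistics

subgroups = [
    [1, 2, 3, 4, 5],
    [1, 1, 1, 1, 5],
    [1, 1, 3, 3, 5],
]

# Range looks only at the two extreme values in each subgroup.
ranges = [max(g) - min(g) for g in subgroups]
# Sample standard deviation uses every value in the subgroup.
sds = [statistics.stdev(g) for g in subgroups]

for g, r, s in zip(subgroups, ranges, sds):
    print(g, "range =", r, "sd =", round(s, 3))
# [1, 2, 3, 4, 5] range = 4 sd = 1.581
# [1, 1, 1, 1, 5] range = 4 sd = 1.789
# [1, 1, 3, 3, 5] range = 4 sd = 1.673
```

The range cannot tell these three subgroups apart, while the standard deviation does.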

- March 19, 2010 at 7:10 pm #189849

I-MR is not particularly good, which is why Shewhart didn’t use it as his base method. The MR chart does not show independence of subgroups the way an R or S chart does, and there is no real measure of within-subgroup variation. Tools can be grouped pretty much as a function of the question you wish to answer:

1. So, what’s going on – Descriptive statistics

2. Is anything changing over time – Control Charts

3. Are there any differences – hypothesis testing

4. Is there any relationship – correlation and regression

5. Is there a cause and effect relationship – DOE

Get the picture?
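For reference on the I-MR point above, here is a minimal sketch of how individuals-chart limits are usually derived from the average moving range of span 2, using the standard constant d2 = 1.128. The readings are made up for illustration and `imr_limits` is a hypothetical helper name, not anything from Minitab:

```python
def imr_limits(values):
    """Individuals-chart limits from the average moving range (span 2)."""
    # Each moving range is the absolute difference of consecutive readings.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    center = sum(values) / len(values)
    d2 = 1.128  # standard control-chart constant for a span of 2
    sigma_hat = mr_bar / d2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat, moving_ranges

# Made-up readings for illustration only.
readings = [10.1, 9.8, 10.4, 10.0, 9.7, 10.3]
lcl, center, ucl, mrs = imr_limits(readings)
print(f"LCL={lcl:.3f}  CL={center:.3f}  UCL={ucl:.3f}")
print("moving ranges:", [round(m, 1) for m in mrs])
```

Note that every interior reading feeds two consecutive moving ranges, so neighbouring MR points share data, and sigma is estimated from point-to-point differences rather than from variation within a subgroup.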

- March 20, 2010 at 4:51 pm #189861

On a practical level (and I am prepared here to get yelled at for killing off data), with 118 samples, why not just delete the one group of two and run the remaining data as an Xbar-S (or R, if you feel compelled to), and see what happens? You can always manually compare your one set of “missing” data and see where it would have fallen in the control chart.
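One way to sketch that manual comparison: estimate the limits from the full subgroups only (Xbar-S style, with sigma estimated as s̄/c4), then check the set-aside subgroup's mean against limits scaled for its own size. The data and the helper names `c4` and `check_leftover` are assumptions for illustration, not Minitab's procedure:

```python
import math
import statistics

def c4(n):
    """Bias-correction constant for the sample standard deviation."""
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def check_leftover(full_subgroups, leftover):
    """Limits come from the full subgroups; the leftover is only compared."""
    n = len(full_subgroups[0])
    grand_mean = statistics.mean([statistics.mean(g) for g in full_subgroups])
    s_bar = statistics.mean([statistics.stdev(g) for g in full_subgroups])
    sigma_hat = s_bar / c4(n)
    # Scale the band to the leftover subgroup's own size.
    half_width = 3 * sigma_hat / math.sqrt(len(leftover))
    m = statistics.mean(leftover)
    return grand_mean - half_width <= m <= grand_mean + half_width

# Made-up data for illustration; a real data set would have many more subgroups.
full = [[10.1, 9.8, 10.4], [10.0, 9.7, 10.3], [9.9, 10.2, 10.0]]
print(check_leftover(full, [10.2, 9.9]))
```

The function returns True when the set-aside subgroup's mean falls inside its own limits, and False otherwise.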

- March 23, 2010 at 3:58 pm #189873

Doa

Hi Darth,

Got it, thanks for the assistance.

Marty

- March 26, 2010 at 1:05 am #189898

Severino

You’ve completely missed the boat. Two words: Rational Subgroup.

