Sample Size for Kruskal-Wallis Test
- This topic has 3 replies, 2 voices, and was last updated 1 year, 1 month ago by
Robert Butler.
- January 13, 2020 at 4:08 pm #245008
Wojtek (Participant, @pacocapo)
Hi,
How can I check the recommended sample size for the Kruskal-Wallis test?
I have 13 subgroups (weight groups) with 8 variables each (104 results in total). The data are not normally distributed and the box plot shows no outliers.
I ran Kruskal-Wallis in Minitab to find out whether the medians of the subgroups are the same. P = 0.000, which is less than 0.05, so H0 is rejected.
The test shows that there is at least one difference among the medians of the weight groups.
Is there anything else I should take into consideration?
Regards
January 13, 2020 at 4:35 pm #245009
Robert Butler (Participant, @rbutler)
Your post is confusing. What I think you are saying is the following:
You have 13 groups and each group has 8 samples.
The data is non-normal
You checked the data using box-plots and didn’t see any outliers
You chose the Kruskal-Wallis test because you had more than 2 groups to compare.
You ran this test and found a significant p-value.
If the above is correct then there are a few things to consider:
1. You don’t detect outliers using a boxplot. The term outlier with respect to a boxplot is not the same thing as a data point whose location relative to the main distribution is suspect.
2. ANOVA is robust to non-normality – run ANOVA and see if you get the same thing – namely a significant p-value. If you do get a significant p-value then the two tests are telling you the same thing.
The issue you are left with, regardless of the chosen test, is this: which group or groups are significantly different from the rest? In ANOVA you can run a comparison of everything against everything and use the Tukey-Kramer adjustment to correct for multiple comparisons. In the case of the K-W you would use Dunn’s test for the multiple comparisons to determine the specific group differences.
If you don’t have access to Dunn’s test you will have to run a sequence of Wilcoxon-Mann-Whitney tests and use a Bonferroni correction for the multiple comparisons.
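To give a sense of what the Bonferroni fallback involves with 13 groups, here is a minimal Python sketch. It only enumerates the pairwise comparisons and computes the per-comparison threshold; it does not run the Wilcoxon-Mann-Whitney tests themselves. The group labels G1 to G13, the function names, and the family-wise alpha of 0.05 are assumptions for illustration, not something taken from the Minitab analysis.

```python
from itertools import combinations


def bonferroni_plan(group_names, family_alpha=0.05):
    """List every pairwise comparison and the Bonferroni per-test threshold."""
    pairs = list(combinations(group_names, 2))
    # Bonferroni: split the family-wise alpha evenly across all comparisons.
    per_test_alpha = family_alpha / len(pairs)
    return pairs, per_test_alpha


def significant_pairs(p_values, per_test_alpha):
    """Keep the pairs whose raw p-value falls below the adjusted threshold."""
    return [pair for pair, p in p_values.items() if p < per_test_alpha]


# 13 weight groups, as in the thread; the labels G1..G13 are made up.
groups = [f"G{i}" for i in range(1, 14)]
pairs, per_test_alpha = bonferroni_plan(groups)
print(len(pairs), per_test_alpha)
```

With 13 groups there are 78 pairs, so each individual test has to clear a much stricter threshold than 0.05. The raw p-values passed to `significant_pairs` would come from whatever software runs the pairwise tests.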
January 14, 2020 at 4:49 am #245466
Wojtek (Participant, @pacocapo)
Thank you, Robert, for the reply:
1. The I-MR chart doesn’t show any suspect points.
2. ANOVA gives p-value = 0.000, the same as K-W, despite the fact that:
a) each of the 13 groups is non-normally distributed;
b) the Equal Variances test gives p-value = 0.000.
The Tukey pairwise comparison shows which means are different from each other.
One last question: how can I check whether a sample size of 8 pieces for each of the 13 groups gives a power of 90%?
January 14, 2020 at 8:55 am #245469
Robert Butler (Participant, @rbutler)
To answer your question:
There are two equations, one for alpha and one for beta.
What you need to define is what constitutes a critical difference between any two populations (max value, min value).
What you have is the sample size per population (8), and for the conditions you have specified you have -1.645 (for alpha = .05) and +1.282 (for beta = .1).
The alpha equation is (Yc - Max value)/(2/sqrt(n)) = -1.645
The beta equation is (Yc - Min value)/(2/sqrt(n)) = 1.282
Subtract the two equations and solve for n.
Obviously you can turn these equations around – plug in n = 8 and solve for the beta value and look it up on a cumulative standardized normal distribution function table and see what you have.
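For anyone who would rather not do the algebra by hand, here is a rough Python sketch of both directions: solving the two equations for n, and plugging in n = 8 to get the power. Subtracting the alpha equation from the beta equation gives (Max value - Min value)/(2/sqrt(n)) = 1.645 + 1.282. The function names, the `scale` parameter (standing in for the 2 that appears in the equations), and the example critical difference of 2.0 are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def required_n(delta, scale=2.0, alpha=0.05, beta=0.10):
    """Sample size per group from subtracting the alpha and beta equations.

    delta is the critical difference (Max value - Min value); scale is the
    numerator of the standard error term, 2 in the equations as posted.
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    z_beta = STD_NORMAL.inv_cdf(1 - beta)    # about 1.282 for beta = 0.10
    return (scale * (z_alpha + z_beta) / delta) ** 2


def power_at_n(delta, n, scale=2.0, alpha=0.05):
    """Turn the equations around: fix n, solve for the beta z-value."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = delta * sqrt(n) / scale - z_alpha
    # The cumulative standard normal at z_beta is the power (1 - beta).
    return STD_NORMAL.cdf(z_beta)


delta = 2.0  # hypothetical critical difference, in the units of the data
print(required_n(delta))
print(power_at_n(delta, n=8))
```

The result is only as good as the chosen critical difference and the normal approximation behind the two equations, so treat it as a planning figure rather than an exact power for the K-W test.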
A point of order: As I mentioned earlier ANOVA is robust with respect to non-normality. Question: with 8 samples how did you determine the samples were really non-normal? If you used one of the many mathematical tests for non-normality – don’t. Those tests are so sensitive to non-normality that you can get significant non-normality declarations from them when using data that has been generated by a random number generator with an underlying normal distribution. What you want to do is plot the values on probability paper and look at the plot – use the “fat pencil test” – a visual check to see how well the data points form a straight line.
If you are interested in seeing just how odd various sized random samples from a normal distribution look I would recommend you get a copy of Daniel and Wood’s book Fitting Equations to Data through inter-library loan and look at the plots they provide on pages 34-43 (in second edition – page numbers may differ for later editions)
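If the book is hard to get hold of, a quick simulation gives a similar impression. The sketch below draws a few samples of 8 from a standard normal distribution and prints the coordinates you would put on a normal probability plot (theoretical quantile against ordered value). The plotting position (i - 0.5)/n is one common convention and is an assumption here, as are the function name and the arbitrary random seed.

```python
import random
from statistics import NormalDist


def normal_plot_points(values):
    """Pair each ordered value with its theoretical standard normal quantile."""
    n = len(values)
    ordered = sorted(values)
    # Plotting position (i - 0.5) / n is one common convention, assumed here.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


rng = random.Random(1)  # arbitrary seed so the run is repeatable
for trial in range(3):
    # Eight draws from a true normal distribution, matching the group size.
    sample = [rng.gauss(0.0, 1.0) for _ in range(8)]
    print(f"sample {trial + 1}")
    for q, x in normal_plot_points(sample):
        print(f"  {q:+.3f}  {x:+.3f}")
```

Plotting the second column against the first for each sample shows how far from a straight line eight genuinely normal points can wander, which is the point of the fat pencil test.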