Is it mandatory to do a sample size calculation for ANOVA or the Kruskal-Wallis test?


Viewing 4 posts - 1 through 4 (of 4 total)


    Is it mandatory to do a sample size calculation for ANOVA or the Kruskal-Wallis test? If yes, how can we do it in Minitab?

    I will appreciate the response.

    Thanks in advance
    Navin rohilla
    Black belt


    Robert Butler

    No. If sample size estimates are to be anything more than an exercise in futility they require some kind of prior knowledge about whatever it is that you are investigating. Usually this prior knowledge would take the form of some data-based estimate of either means and standard deviations or expectations of percentage differences.

    At the heart of sample size calculations is concern for the power associated with a statistically significant finding. At the beginning of an investigation the focus is on finding something of significance (alpha level < .05 or .01 or whatever); the degree of certainty with respect to the ability to repeat this finding of significance (the power) is of secondary interest.

    Consequently, significant findings in most initial/pilot studies are very underpowered. The operating philosophy is that it is better to spend money confirming a possibly significant finding (alpha level) after it is “found” as opposed to spending large sums of money just so that you can say you are extremely confident there is nothing worth investigating.

    If your investigation falls into the category of initial investigation (pilot study) and a due diligence search of the literature/in-house efforts does not uncover applicable prior art then the usual practice is to use time/cost estimates to drive sample size choices.

    If you do not have meaningful prior data and if you are forced to provide a sample size (external regulation, non-negotiable demands of the 500 pound gorilla riding your back, etc.) then you will have to sit down and write some creative science fiction. The easiest way to do this is to estimate the number of samples you can afford/have time to run and then cook the wild guesses of means and standard deviations or percentage differences to give 80% power for this sample size.
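To make the "start from the sample size you can afford" step concrete, here is a minimal sketch that estimates the power of a one-way ANOVA by simulation for a given number of samples per group and a set of guessed means and a guessed common standard deviation. It is not a Minitab procedure and not the textbook noncentral-F calculation: it swaps in a Monte Carlo estimate so that it runs on the Python standard library alone. The function names (`f_statistic`, `simulate_f`, `anova_power`) and the example numbers are hypothetical, and the sketch assumes normally distributed data with equal variance in every group.

```python
import random
import statistics


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [statistics.fmean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulate_f(means, sd, n_per_group, n_sims, rng):
    """Draw n_sims F statistics from normal data with the given group means."""
    return [
        f_statistic(
            [[rng.gauss(m, sd) for _ in range(n_per_group)] for m in means]
        )
        for _ in range(n_sims)
    ]


def anova_power(means, sd, n_per_group, alpha=0.05, n_sims=4000, seed=1):
    """Monte Carlo estimate of one-way ANOVA power (assumes normal, equal-variance data)."""
    rng = random.Random(seed)
    # Empirical critical value: the (1 - alpha) quantile of F when all means are equal.
    null_f = sorted(simulate_f([0.0] * len(means), sd, n_per_group, n_sims, rng))
    critical = null_f[int((1 - alpha) * n_sims)]
    # Power: fraction of simulated experiments under the guessed means that exceed it.
    alt_f = simulate_f(means, sd, n_per_group, n_sims, rng)
    return sum(f > critical for f in alt_f) / n_sims


if __name__ == "__main__":
    # Hypothetical guesses: three groups, 8 affordable samples each, SD of 2.
    print(anova_power([10.0, 10.0, 13.0], sd=2.0, n_per_group=8))
```

Changing the guessed means or standard deviation until the printed value is near 0.80 is exactly the "cooking" described above, so the result is only as trustworthy as the guesses that go into it.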

    As for Minitab – I don’t know the program. Perhaps someone else can help.



    The short answer is no, it isn’t “mandatory”. But if you don’t, your confidence level is indeterminate. I haven’t used Minitab for several years, but as I recall it did this indirectly: give it the population size for your sample data and it calculated the confidence level. You can find a simple sample size calculator in previous posts on this site. Just search for “calculate sample size”. Note that the formula is different for proportional vs. variable data. The formulas require that you know the population size (i.e. prior knowledge). If you can’t get that, then refer to Robert Butler’s advice.


    Robert Butler

    I’m not quite sure that I understand the last post to this thread. If you run an ANOVA or any other statistical test with a given block of data the results (level of significance of the difference, confidence intervals etc.) will be whatever they will be – there’s nothing indeterminate about this.

    If you have some prior understanding of the process and you want to run a check on what your efforts might uncover, given that your prior knowledge really applies to your current situation, then it is worth running a sample size calculation. One of the biggest reasons for running a sample size calculation has to do with detectable differences.

    Part of the knowledge that comes with prior understanding of a process is the kind of a difference one could reasonably expect to see – given that nothing else has changed. Under these circumstances a sample size estimate is useful because it will give some assurance that a difference you would like to see could, in principle, be seen given the size of your experimental effort.
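As a rough illustration of the link between sample size and detectable difference, the sketch below uses the standard normal-approximation formula for comparing two group means, n per group = 2 * ((z_(1-alpha/2) + z_power) * sd / delta)^2. This is a simplification chosen for illustration: it covers a two-group comparison only, not a full ANOVA or Kruskal-Wallis calculation, and it slightly understates what a t-based calculation would give. The function names and the example values are hypothetical; the standard deviation and the difference of interest are assumed to come from prior data.

```python
import math
from statistics import NormalDist


def z_sum(alpha, power):
    """Sum of the two-sided alpha quantile and the power quantile of N(0, 1)."""
    std_normal = NormalDist()
    return std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)


def samples_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group needed to detect a mean difference delta (two groups)."""
    return math.ceil(2 * (z_sum(alpha, power) * sd / delta) ** 2)


def detectable_difference(n_per_group, sd, alpha=0.05, power=0.80):
    """Approximate smallest mean difference detectable with n per group (two groups)."""
    return z_sum(alpha, power) * sd * math.sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Hypothetical prior knowledge: SD of 1 unit, difference of interest 1 unit.
    print(samples_per_group(delta=1.0, sd=1.0))           # 16
    print(samples_per_group(delta=0.5, sd=1.0))           # 63
    print(round(detectable_difference(16, sd=1.0), 2))    # 0.99
```

Halving the difference of interest roughly quadruples the required sample size, which is why it pays to ask first whether the smaller difference would matter to anyone.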

    However, the question that needs to be asked at the beginning of any discussion about a difference you would like to see is this: If this difference were found to be statistically significant – would anyone care? That is, if the difference exists and if we could reliably offer this difference to the marketplace – would anyone pay a premium for its existence? All too often, in my experience, the answer is – no.

    If you have done your homework it has also been my experience that the differences you do find are much greater than the differences you used to generate your sample size estimate. My usual recommendation concerning sample sizes is this – if you have prior data, go ahead and run a sample size estimate. Next run the MINIMUM number of points necessary for whatever test you are planning (for example – for an ANOVA this could be as few as one measure per cell) and then run that analysis before you run any additional samples. More often than not – the results from the analysis on the MINIMUM number of samples will be such that you will want to do other things besides run more samples.

