iSixSigma

Data points.


Viewing 12 posts - 1 through 12 (of 12 total)
  • #32086

    Dutta
    Member

    How many data points (5, 10, 20 or 30?) are needed to have statistical significance?

    #85270

    marklamfu
    Participant

    The number of data points needed depends on the nature of the case. If you want to study process capability, 30 data points are the minimum; generally, 100 or more are preferable. For an X-bar control chart, you can take 5 data points as a subgroup to plot the chart. For a variables sampling plan (e.g. MIL-STD-414), we take 5, 10, 15, 20… samples, based on lot size and AQL, to calculate and determine the lot's acceptance or rejection.

    #85274

    Dutta
    Member

    Where is the minimum “30” number derived from to calculate process capability? Need help!

    #85415

    Hariprasad
    Participant

    The number 30 comes from the Central Limit Theorem (CLT), which says that as the sample size grows, the sample means approximately follow a normal distribution whatever the population distribution may be. A sample size of about 30 is the usual rule of thumb for "large enough".
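
    A small simulation can make this claim concrete. The sketch below is only an illustration: the exponential population, the seed, and the function names are assumptions, not something from the thread. It draws repeated samples from a strongly skewed population and reports how skewed the sample means still are at several sample sizes.

```python
import random
import statistics


def skewness(values):
    """Skewness as the mean cubed z-score (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def sample_means(n, trials, rng):
    """Means of `trials` samples of size n from an exponential population (mean 1)."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(trials)]


rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 5, 30, 100)}
for n, s in skew_by_n.items():
    print(f"n = {n:3d}: skewness of sample means = {s:.2f}")
```

    The skewness shrinks as n grows but is not exactly zero at n = 30, which suggests that 30 is a convenient rule of thumb rather than a hard threshold. How large n must be depends on how non-normal the population is.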

    #85416

    Mike Nellis
    Participant

    Hello Suman,
    The error around your mean equals the standard deviation divided by the square root of the sample size (mean error = sigma/n^(1/2)).  Quadrupling the sample size will decrease the width of your confidence interval by half.  So, if there is a magic minimum sample size, it is determined by how much uncertainty (how wide a confidence interval) about your mean you are willing to accept.  Don’t forget one of the most important aspects of sampling from populations: getting the data randomly.  In practice, I tend to take as large a sample as my budget and time allow.  In my experience, this is more often than not more than enough.  It is better to be approximately right than precisely wrong.
    Hope this helps,
    -Mike 
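
    The square-root relationship above can be checked numerically. This is a minimal sketch assuming a known sigma and a z-based 95% interval; the value sigma = 2 and the function name are illustrative only.

```python
import math
from statistics import NormalDist


def ci_half_width(sigma, n, confidence=0.95):
    """Half-width of a z-based confidence interval for the mean (sigma known)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / math.sqrt(n)


sigma = 2.0  # hypothetical standard deviation
for n in (5, 10, 20, 30, 120):
    print(f"n = {n:3d}: mean +/- {ci_half_width(sigma, n):.2f}")
```

    Going from n = 30 to n = 120 halves the interval, so each further gain in precision costs four times the data. When sigma is estimated from the sample, a t-based interval would be somewhat wider at small n.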

    #85430

    Bob Johnson
    Participant

    Hi Suman!
    My experience is that the sample size decision is technically driven by several considerations.  These are data type (attribute or variable), variance of the underlying population (value, known/unknown), the necessary power as well as the applicable test that I am performing.  For example, suppose you are running a 2 sample t test with a power of .95 (alpha = 0.5) and a known historical standard deviation of 2.  If we want to detect a difference between the two tested sample means of 0.5 then we will need 417 samples to meet the power requirement.  if we only need to detect a difference between the two tested sample means of 1.0, then we will need 105 samples.
    As a general rule (after all has been said and done) I follow the same considerations as Mike in his post.  Usually I have to limit my samples for economic reasons, and I end up either calculating the difference that my test will “see” (understanding that any smaller difference will be subject to Type II error) or going in with a target difference and finding the resulting power based on the data characteristics.  If the power is too low, I use this to justify more samples or scrap the test.
    Hope this helps!
    Bob Johnson
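
    The two budget-limited calculations described above (the difference a test will "see", and the power for a target difference) can be sketched as follows. This uses a normal approximation instead of an exact t-based calculation, takes alpha = 0.05 as corrected later in the thread, and assumes sample sizes are per group. The function names and n = 50 are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()


def power_two_sample(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test (normal approximation)."""
    se = sigma * math.sqrt(2 / n_per_group)   # std. error of the difference in means
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(delta / se - z_crit)         # ignores the negligible opposite tail


def detectable_difference(sigma, n_per_group, alpha=0.05, power=0.95):
    """Smallest difference detected with the requested power (same approximation)."""
    se = sigma * math.sqrt(2 / n_per_group)
    return (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) * se


sigma = 2.0   # historical standard deviation from the post
n = 50        # hypothetical budget-limited sample size per group
print(f"detectable difference at 95% power: {detectable_difference(sigma, n):.2f}")
print(f"power to detect a difference of 1.0: {power_two_sample(1.0, sigma, n):.2f}")
```

    If the reported power comes out too low for the difference that matters, that is the number to take to whoever controls the sampling budget.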

    #85437

    Ropp
    Participant

    Bob –
     
    You wrote “For example, suppose you are running a 2 sample t test with a power of .95 (alpha = 0.5)”
    This seems to imply that power is 1-alpha. The rest of your post reveals you must understand the concepts pretty well.
    You probably meant a power of 0.95 (beta = 0.05, alpha = 0.05)? This could be a little confusing for those new to these things. Many I have encountered seem to believe that alpha has something to do with power, when in fact it does not.
    Alpha could be 0.05 with power at 0.95 as is customary or most other values that you might choose.
     
     

    #85440

    Savage
    Participant

    In Advanced Topics in Statistical Process Control, Wheeler quotes Shewhart as follows:
    “… It appears reasonable, therefore, that the criterion [control limits] may be used even when we have only two subsamples of size not less than four.” (Wheeler quotes this from p. 315 of Shewhart’s Economic Control of Quality of Manufactured Product.)
    Wheeler goes on to say, “So when limited amounts of data are available, go ahead and calculate control limits, and then, if and when additional data becomes available, recalculate the limits.”
    I have heard the expression, "more data is better, but less will do." I am of the opinion that if you have only 10 or 20 samples, you can still calculate the control limits. You should recognize that the limits carry more error with fewer samples than with 30 or more.
    Matt
    http://www.pqsystems.com
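
    To illustrate how limits can be calculated from very little data and then recalculated later, here is a minimal X-bar and R chart sketch. The chart constants are the standard tabulated values for small subgroups. The measurements and the function name are made up for the example.

```python
from statistics import fmean

# Standard Shewhart chart constants, indexed by subgroup size.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}
D3 = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the X-bar chart and for the R chart."""
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    grand_mean = fmean(fmean(g) for g in subgroups)
    r_bar = fmean(max(g) - min(g) for g in subgroups)   # average subgroup range
    xbar = (grand_mean - A2[n] * r_bar, grand_mean, grand_mean + A2[n] * r_bar)
    r = (D3[n] * r_bar, r_bar, D4[n] * r_bar)
    return xbar, r


# Hypothetical measurements: two subgroups of four, the minimum in the Shewhart quote.
data = [[10.2, 9.9, 10.1, 10.0], [10.3, 10.0, 9.8, 10.1]]
xbar_limits, r_limits = xbar_r_limits(data)
print("X-bar chart (LCL, center, UCL):", [round(v, 3) for v in xbar_limits])
print("R chart     (LCL, center, UCL):", [round(v, 3) for v in r_limits])
```

    Calling `xbar_r_limits` again with more subgroups appended to `data` gives the recalculated limits. With only two subgroups the average range is a rough estimate, so the limits should be treated as provisional.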

    #85443

    Bob Johnson
    Participant

    Dave,
    Exactly right!  Sorry about the error….
     
    Best Regards,
    Bob

    #85452

    Brad
    Participant

    Isn’t there a consideration for what type of statistical calculation you will be performing?  You said “how many data points (5, 10, 20 or 30)”. Are you referring to control chart data points or hypothesis testing (F test, t test, etc.)?  The rule of thumb I use is that for control charting I need at least 20 data points (these can be 20 individual data points or 20 subgroups of ‘x’ data points averaged within each subgroup).  For hypothesis testing I use at least 40 data points (typically individual data points).

    #85522

    A.S
    Member

    Hi Bob,
    “My experience is that the sample size decision is technically driven by several considerations.  These are data type (attribute or variable), variance of the underlying population (value, known/unknown), the necessary power as well as the applicable test that I am performing.  For example, suppose you are running a 2 sample t test with a power of .95 (alpha = 0.5) and a known historical standard deviation of 2.  If we want to detect a difference between the two tested sample means of 0.5 then we will need 417 samples to meet the power requirement.  if we only need to detect a difference between the two tested sample means of 1.0, then we will need 105 samples.”
    Could you explain the math behind arriving at a sample size of 417 or 105 based on the size of the difference between the two samples?
    Thanks.

    #85527

    Bob Johnson
    Participant

    Hi Subbu!
    I’ll have to research the underlying equations for you, but I use Minitab (V13.32) to do the calculations for me.
    Best Regards
    Bob Johnson
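
    For readers who want the math, the usual textbook normal approximation for a two-sided two-sample test gets close to the quoted figures. This is a sketch, not necessarily the calculation Minitab performs. It assumes alpha = 0.05, power = 0.95 (beta = 0.05), sigma = 2, equal group sizes, and that the quoted sample sizes are per group.

```latex
n \;\approx\; \frac{2\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}\sigma^{2}}{\delta^{2}},
\qquad z_{0.975} \approx 1.960,\quad z_{0.95} \approx 1.645

\delta = 0.5:\quad n \approx \frac{2\,(1.960 + 1.645)^{2}\,(2)^{2}}{(0.5)^{2}} \approx 415.8 \;\Rightarrow\; 416

\delta = 1.0:\quad n \approx \frac{2\,(1.960 + 1.645)^{2}\,(2)^{2}}{(1.0)^{2}} \approx 104.0 \;\Rightarrow\; 104
```

    Both results are one less than the 417 and 105 quoted in the thread. A gap of that size is what one would expect between a normal approximation and a calculation based on the t distribution, which needs slightly more data because sigma is estimated from the sample. Note also that halving the difference to detect roughly quadruples the required sample size.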
