
Minimum Number of Data Points on a Control Chart?


    #241022

    tgause
    Participant

    In the book “Understanding Statistical Process Control (Second Edition),” authors Wheeler and Chambers say on page 40 “… there is no requirement regarding the number of subgroups needed before an out-of-control point can be interpreted as an indication of uncontrolled variation.” They go on to say “Thus, even if there are only two or three subgroups, any out-of-control point can be interpreted as evidence of uncontrolled variation.”

    • Am I correct in interpreting this to mean that even if I have only two or three data points on a control chart, those two or three data points are sufficient for me to determine whether the process is in or out of control? (I’m sure more would be better, but if that’s all I have, is it enough?)
    • The authors say “number of subgroups.” Does this also apply to subgroups of one, as in an IMR chart with individual data points?
    • If the answer to the first bullet is “yes,” why? The authors say the reason is “The logic outlined above shows why this is the case.” I’ve read and reread the preceding paragraphs and it’s still not clear to me. Can someone please explain?
    • This seems outrageous to me, only because for years I’ve been told you need a minimum of 20 or 25 data points. Perhaps these counts come from Minitab’s suggested minimums?

    thanks!

    #241030

    Chris Seider
    Participant

    I’d be too conservative to put out a control chart, with out-of-control actions attached, based on only 2 or 3 subgroups.

    Remember, the estimate of the standard deviation improves with the number of observations; this holds for the estimated s.d. even when it is derived from Rbar.

    If you are using just a few points for “analysis,” go ahead, although I doubt you’ll learn much. I just never put SPC on a process unless I expect actions or documentation when the process shows “out of control.”
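    For illustration, a minimal Python sketch (simulated data; NumPy assumed) of how the run-to-run spread of an Rbar-based sigma estimate shrinks as the number of subgroups grows:

        import numpy as np

        rng = np.random.default_rng(1)
        D2 = 2.326  # control chart constant d2 for subgroups of size 5

        def rbar_sigma_estimate(num_subgroups, subgroup_size=5, true_sigma=1.0):
            # Simulate in-control subgroups and estimate sigma as Rbar / d2.
            data = rng.normal(0.0, true_sigma, size=(num_subgroups, subgroup_size))
            rbar = (data.max(axis=1) - data.min(axis=1)).mean()
            return rbar / D2

        # Repeat the estimate many times: its spread around the true sigma of 1.0
        # shrinks as subgroups accumulate, so more subgroups give a steadier Rbar/d2 estimate.
        for k in (3, 10, 25):
            estimates = [rbar_sigma_estimate(k) for _ in range(2000)]
            print(f"{k:2d} subgroups: spread of sigma-hat = {np.std(estimates):.3f}")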

    #241033

    Mike Carnell
    Participant

    @chebetz I have seen this same issue repeatedly since my first control chart in 1983. I have never had a decent answer, and yes, I have always heard the sample-size-of-30 rule.

    This is what I do: 25 subgroups of 5 (5 being the preferred subgroup size).

    #241035

    tgause
    Participant

    @cseider: I agree with you, Chris. I wouldn’t feel comfortable making any kind of final judgement based on 2 or 3 data points on a control chart. I would definitely do my best to get more. However, if for whatever reason I couldn’t get more than 3 at the time, then, according to the authors, I could go with it and make inferences about the process. I’d be cautious, though.

    #241055

    Robert Butler
    Participant

    As others have noted, more data points are better when it comes to assessing out-of-control behavior, for all of the reasons already mentioned. However, waiting until you have more (we will arbitrarily define “more” as at least 20 data points) presumes the time/money/effort needed to get more data is forthcoming. I’ve worked on any number of processes where the total number of data points available for constructing a control chart to assess process control was 8 or fewer, and the hope of getting any more data points in a “reasonable” period of time (say a few days or maybe a few weeks) was a forlorn one.

    These processes were batch-type processes run on an as-needed basis, and “as needed” translated into elapsed times of 6 months or more. Since we needed an assessment of the level of process control, I used all of the usual methods for assessing variation in the presence of little data. One of the big issues was the question of whether the first data point in the next cycle of production would be in control. If it was “in control” we would run the process as set up. If it was “out of control” we had no choice but to adjust the process. This was a decision that was not taken lightly, because the effort required to adjust the process was involved, time consuming, and expensive. What I found was that even with a control chart consisting of data from a single run I was able to use what I had to make accurate assessments of the process when we started running the next cycle of production. Of course, once I had data from the second cycle it was immediately incorporated point by point, and the chart was updated each time a new data point was added.
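    A minimal sketch of one such check, assuming an individuals (XmR) chart built from the 8 points of a single run (Python/NumPy; the measurement values are hypothetical):

        import numpy as np

        # Hypothetical data: the 8 measurements from one production run.
        first_run = np.array([10.2, 10.5, 9.9, 10.1, 10.4, 10.0, 10.3, 10.2])

        # Individuals-chart limits: centre line +/- 2.66 * average moving range.
        mr_bar = np.abs(np.diff(first_run)).mean()
        centre = first_run.mean()
        ucl = centre + 2.66 * mr_bar
        lcl = centre - 2.66 * mr_bar

        # When the next cycle starts months later, judge its first point
        # against the limits built from the earlier single run.
        next_point = 10.9
        status = "in control" if lcl <= next_point <= ucl else "out of control, adjust"
        print(f"limits: [{lcl:.2f}, {ucl:.2f}]  first new point: {next_point} -> {status}")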

    #241057

    Sergey Glukhov
    Participant

    The whole discussion is an example of a complete misunderstanding of a control chart. A control chart with one or two dots is not a control chart. You cannot apply the eight Shewhart tests because you need many more dots. The minimum empirical rule in statistics is to have at least 30 data points for even a very rough statistical analysis. One point will sit on the mean line, with no possibility of calculating the upper and lower control limits. So what will this show us?

    #241059

    Mike Carnell
    Participant

    @Sergey Glukhov I don’t think there is any misunderstanding of control charts by at least 3 of the 4 people in this thread. Probably 4, but I only have first-hand knowledge of 3. So let’s look at your statement. Here is the ASQ definition of a control chart: “The control chart is a graph used to study how a process changes over time. Data are plotted in time order. A control chart always has a central line for the average, an upper line for the upper control limit, and a lower line for the lower control limit. These lines are determined from historical data.” I don’t see a requirement to do the tests. As a matter of fact, I have seen very few control charts in my lifetime that use all 8 control chart rules. So basically the ability to do testing doesn’t really determine whether something is or is not a control chart. That is the esoteric, academic view of the world.

    In the pragmatic world, 1 piece of data is better than none, 2 pieces of data are better than 1, and so on. It is a pattern. There is a reason that confidence intervals become smaller as the sample size increases. So if I run with 3 points my risk is higher. It may be higher, but it is better than sitting doing nothing (which is still making decisions, by the way). Using my 3 points narrows my confidence interval far more than doing nothing; it decreases the standard error. In this case you either throw up your hands, close your eyes, and use the force to run the process, or you use what you have and recalculate as you get more data.
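    As a rough illustration of the standard-error point (plain Python; the process sigma of 1.0 is an assumption for the example):

        import math

        sigma = 1.0  # assumed process standard deviation
        # The width of a rough 95% interval for the mean (about +/- 2 standard
        # errors) shrinks as the sample size grows; 3 points already beat none.
        for n in (1, 2, 3, 5, 10, 30):
            se = sigma / math.sqrt(n)
            print(f"n = {n:2d}  standard error = {se:.3f}  approx 95% half-width = {2 * se:.3f}")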

    This whole thing comes down to what everybody has said in one way or another: do what you want to do, but understand the assumptions and risks based on your data. The same goes for putting spec limits on a control chart, which was asked about in a different thread. Basically, if this comes down to blindly following rules, then nobody needs Chris, or Robert, or chebetz, or me, or even you. You get some programmer to write a little code and then put Alexa or Siri on the line. The AI people love these types of answers. Those people who blindly follow rules are the first to volunteer to go home when AI shows up.

    Somewhere along the line there was the quote “You cannot improve a process that is not in control.” I have found many BBs sitting doing nothing because they ran their control chart, it was out of control, and based on that they were waiting for it to come into control before they did anything. This whole idea of using rules to handcuff yourself makes no sense. Management wants results. Telling them you did nothing because of a rule is a career-limiting strategy.

    Just my opinion.

    #241060

    Robert Butler
    Participant

    No, I’m afraid this discussion is not an example of a complete misunderstanding of a control chart – Wheeler and Chambers know their stuff.

    I agree that with 5 data points I can’t apply all of the control chart tests, but I can do this: I can get an estimate of the ordinary process variance by employing the method of successive differences, I can get an estimate of the overall mean, and I can make a judgement call with respect to the 5 data points I have as far as addressing the question of grossly out-of-control points. When you have a process that generates 8 data points per run and, at best, is run twice a year, insisting on 30 data points before attempting an analysis means waiting a minimum of 2 years before doing anything, and that approach means you will have to run your process using the old W.G.P.G.A.* rule of thumb, which is an excellent way to go wrong with great assurance.
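    A minimal Python sketch of that kind of calculation, using five hypothetical data points and the usual d2 = 1.128 constant for two-point moving ranges:

        import numpy as np

        # Hypothetical example: the only 5 data points available from the process.
        x = np.array([12.1, 11.8, 12.4, 12.0, 12.3])

        # Ordinary (short-term) variation from successive differences:
        # average moving range divided by d2 = 1.128.
        mr_bar = np.abs(np.diff(x)).mean()
        sigma_hat = mr_bar / 1.128

        mean_hat = x.mean()
        ucl = mean_hat + 3 * sigma_hat
        lcl = mean_hat - 3 * sigma_hat
        print(f"mean = {mean_hat:.2f}, sigma-hat = {sigma_hat:.2f}, limits = [{lcl:.2f}, {ucl:.2f}]")

        # Even with 5 points, a value far outside these limits is a signal worth
        # investigating rather than a reason to wait for 30 points.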

    A check of the back of any basic book in statistics will usually reveal a t-table. An examination of that table indicates the minimum number of data points needed for a t-test is 2.  With two data points you can get an estimate of the mean and the variance and you can test that sample mean to see if it is statistically significantly different from a target.
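    For example, a hand-calculated one-sample t statistic from just two hypothetical measurements (plain Python), compared against the tabled critical value:

        import math
        from statistics import mean, stdev

        # Hypothetical example: two measurements tested against a target of 10.0.
        sample = [10.6, 10.9]
        target = 10.0
        n = len(sample)

        t_stat = (mean(sample) - target) / (stdev(sample) / math.sqrt(n))
        print(f"t = {t_stat:.2f} with {n - 1} degree of freedom")
        # Compare with the t-table value for 1 df (about 12.7, two-sided, 5% level).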

    When you run a two-level, two-variable design it is quite common to have a total of 5 experimental runs, and it is also quite common to use the results of those five experiments to develop predictive equations for a process. Once the equations have been built, the first order of business is confirming their validity; typically this will consist of two or three process runs where the equations are used to identify optimum settings. If the equations do a decent job of prediction (and they do this with appalling regularity), those equations will be incorporated in the methods used for process control.
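    A minimal sketch of fitting such an equation from five runs (a hypothetical 2x2 design in coded units plus a centre point; NumPy assumed):

        import numpy as np

        # Hypothetical two-level, two-variable design: 4 corner runs plus a
        # centre point (coded units), with one response measured at each run.
        x1 = np.array([-1.0, 1.0, -1.0, 1.0, 0.0])
        x2 = np.array([-1.0, -1.0, 1.0, 1.0, 0.0])
        y = np.array([8.2, 9.1, 8.9, 10.4, 9.2])

        # Fit y = b0 + b1*x1 + b2*x2 + b12*x1*x2 by least squares.
        X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
        b0, b1, b2, b12 = np.linalg.lstsq(X, y, rcond=None)[0]
        print(f"y-hat = {b0:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2 + {b12:.2f}*x1*x2")

        # The fitted equation would then be checked with two or three
        # confirmation runs before being trusted for process control.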

    In short, the empirical rule of needing 30 data points before running a rough statistical analysis has no merit.

    *Wonder, Guess, Putter, Guess Again

    #241118

    Mark Graban
    Participant

    A few thoughts…

    Mike Carnell wrote: Somewhere along the line there was the quote “You cannot improve a process that is not in control.”

    I agree with Mike that you can’t sit and do nothing. You can improve the process by working to eliminate the causes of the out-of-control points!

    I also agree that Robert Butler is on the right track… Wheeler teaches that FOUR data points is the minimum for a useful XmR chart (a.k.a. “process behavior chart”). Having MORE data points, of course, increases the validity of the lower and upper limits, but I’d rather use a PBC with just four or five data points than react to every up and down in the data.

    Wheeler teaches (and shows the math that says) that there are diminishing returns: having 30 data points in your XmR chart baseline is only marginally better than having 20, while having 15 or 20 data points is FAR better than having just 4, 5, or 10.
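    As a rough illustration of those diminishing returns (simulated in-control data; NumPy assumed), recomputing the XmR limits from the first n points of the same series shows the limits settling down as the baseline grows:

        import numpy as np

        rng = np.random.default_rng(7)
        series = rng.normal(50.0, 2.0, size=30)  # simulated in-control individual values

        # Limits typically shift noticeably between n = 5 and n = 20,
        # and only a little between n = 20 and n = 30.
        for n in (5, 10, 20, 30):
            x = series[:n]
            mr_bar = np.abs(np.diff(x)).mean()
            centre = x.mean()
            print(f"n = {n:2d}: limits = [{centre - 2.66 * mr_bar:.1f}, {centre + 2.66 * mr_bar:.1f}]")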

