iSixSigma

Robert Butler

Forum Replies Created


Viewing 100 posts - 2,401 through 2,500 (of 2,532 total)
  • #85941

    Robert Butler
    Participant

    To Dave’s point-pp.22 of Draper and Smith Applied Regression Analysis, Second Edition states with respect to regression analysis “Up to this point we have made no assumptions at all that involve probability distributions. A number of specified algebraic calculations have been made and that is all.”
    On that same page and the one following they are very clear about the fact that assumptions about normality apply to the residuals only.

    0
    #85757

    Robert Butler
    Participant

    The geometric mean is an integral part of the Box-Cox transform.  In order to compute the geometric mean you must take the log of all of your numbers, and the log of a negative number is undefined.  I’m not sure what you mean by your question about non-normality as numbers approach zero.  If you have a unilateral tolerance where zero represents perfection then as you drive your process closer and closer to that limit the only skew (error) that can occur is to one side and your data will, in general, become more and more non-normal the closer you approach zero as a limit.
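      If it helps to see why, here is a small Python sketch (the data values are made up purely for illustration) showing the geometric mean and Box-Cox working on strictly positive data, and why a negative value breaks the computation:

    import numpy as np
    from scipy import stats

    # Hypothetical positive-valued data; the numbers are illustrative only.
    y = np.array([0.8, 1.2, 2.5, 3.1, 4.7, 6.0, 9.3])

    # The geometric mean is built from logs, so every value must be strictly positive.
    geo_mean = np.exp(np.mean(np.log(y)))

    # Box-Cox (scipy raises a ValueError if any value is <= 0).
    y_transformed, lam = stats.boxcox(y)

    print(f"geometric mean = {geo_mean:.3f}, estimated lambda = {lam:.3f}")

    # np.log(-1.0) returns nan (with a runtime warning), which is why Box-Cox
    # cannot be applied directly to data containing negative numbers.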

    0
    #85755

    Robert Butler
    Participant

      Neither your X’s nor your Y’s need to be normal.  Issues surrounding normality only arise when attempting to assess the significance of the X’s relative to the Y’s and then these issues only apply to the distribution of the residuals.  The first chapter of Applied Regression Analysis by Draper and Smith (pp.17 of the 1st edition) has a very good summary of this point.

    0
    #85658

    Robert Butler
    Participant

      Your proposal looks to be in agreement with Chapter 8 – Capability for Non-Normal Data in Bothe’s book Measuring Process Capability.  From that standpoint it would appear that you are correct.  According to Bothe (pp.438) the drawbacks are as follows:
    “However, on account of the units of measurements associated with this measure (the equivalent 6 sigma long term spread) the equivalent 6 sigma long term for one type of process cannot easily be compared to the equivalent 6 sigma long term of another.”
    ” Another disadvantage is that measurement data must be plotted on NOPP to determine the percentiles, which does require some time.”

    0
    #85542

    Robert Butler
    Participant

      If your project was done properly then even if you didn’t get the dollars to the bottom line, successful product/process rollout, etc. you did gain something that is often not  the case at the end of other failed efforts-you have carefully verified and documented a number of approaches, ideas, methods, etc. that won’t work within a given framework and you know why.
      This isn’t nearly as exciting as a success that drops to the bottom line but if the lessons and the analysis are kept in mind it will eliminate a lot of future attempts at re-inventing a square wheel.  It will put everyone in the position of being able to ask very intelligent questions about any proposed future changes-specifically, if someone attempts to revisit one of the issues, you and your team members are in the position of being able to present what was done and then ask what, if anything, the future proposed effort will do differently.
      About 10 years ago I was part of a team that had done all the right things, -pareto chart, VOC, cause and effect, carefully crafted and executed experimental design, …everything…and the result was absolutely nothing.  What we did do was demonstrate that, as the process then existed, a lot of closely held beliefs simply were not true.  In addition we demonstrated that several “key” variables weren’t.  Time went by and about 5 years later a new manager with a new team tried to put together a similar study.  I was no longer working in that department but I had kept a well documented copy of our work.  I invited myself to the group meeting and presented our findings.  The whole point of my presentation was to ask what had changed.  As it turned out, not much had and instead of launching a new team down the same path, a much smaller group went out, confirmed our old findings, and then got down to the business of really trying to figure out what was wrong with the process.  Their efforts took them in directions that no one had imagined and about 8 months later they had a real success.
      It’s been said that “there is nothing worse than the sight of a beautiful theory murdered by a brutal gang of facts” but if you are witness to such an event you are at least in the position of knowing how the murder was committed and you may be able to prevent another like it in the future.

    0
    #85502

    Robert Butler
    Participant

    Thanks for the tip Stan. I’ll check it out.

    0
    #85498

    Robert Butler
    Participant

    Stan,
     I’ve heard of the method but I don’t know anything about it.  If you could either provide a “how-to” description of the technique or recommend a good paper/reference I would be interested in learning more. 

    0
    #85494

    Robert Butler
    Participant

    If we call the seven factors A,B,C,D,E,F,G then for a 16 point design the alias structure for the two way interactions is:
    AB + CE + FG + ACDF + ADEG + BCDG + BDEF + ABCEFG
    AC + BE + DG + ABDF + AEFG + BCFG + CDEF + ABCDEG
    AD + CG + EF + ABCF + ABEG + BCDE + BDFG + ACDEFG
    AE + BC + DF + ABDG + ACFG + BEFG + CDEG + ABCDEF
    AF + BG + DE + ABCD + ACEG + BCEF + CDFG + ABDEFG
    AG + BF + CD + ABDE + ACEF + BCEG + DEFG + ABCDFG
    BD + CF + EG + ABCG + ABEF + ACDE + ADFG + BCDEFG
    so AB,AC,AD,AE,AF,AG,and BD are clear of one another.
    For a res IV 32 point design the alias structure for the two way interactions is:
    AB + CDF + DEG + ABCEFG
    AC + BDF + AEFG + BCDEG
    AD + BCF + BEG + ACDEFG
    AE + BDG + ACFG + BCDEF
    AF + BCD + ACEG + BDEFG
    AG + BDE + ACEF + BCDFG
    BC + ADF + BEFG + ACDEG
    BD + ACF + AEG + BCDEFG
    BE + ADG + BCFG + ACDEF
    BF + ACD + BCEG + ADEFG
    BG + ADE + BCEF + ACDFG
    CD + ABF + DEFG + ABCEG
    CE + FG + ABCDG + ABDEF
    CF + EG + ABD + ABCDEFG
    CG + EF + ABCDE + ABDFG
    DE + ABG + CDFG + ABCEF
    DF + ABC + CDEG + ABEFG
    DG + ABE + CDEF + ABCFG
    So the six two way interactions that are not clear of each other are CE, CF, CG, FG, EG, EF.  However, CE, CF, and CG (or FG, EG, and EF,) are clear of all of the other two way interactions so the number of two way interactions that are actually clear of one another (as defined for the 16 point design) should be 18 and not 15 as I stated in my first post. 
      Since you can assign any variable you want to the coded letters “A”, “B”, etc. you do have some latitude with respect to defining the interactions that you wish to examine.
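      If you want to reproduce alias tables like the ones above without a software package, here is a rough Python sketch.  It assumes the 16 run design is generated by the defining words ABCE, BCDF, and ACDG (one common choice that reproduces the alias rows listed above); a different set of generators will give a different, but equivalent, confounding pattern:

    from itertools import combinations

    factors = "ABCDEFG"

    # Assumed defining words for a 2^(7-3) resolution IV design.
    generators = [frozenset("ABCE"), frozenset("BCDF"), frozenset("ACDG")]

    # Build the defining contrast subgroup (all products of the generators,
    # where "product" is the symmetric difference of the letter sets).
    subgroup = {frozenset()}
    for g in generators:
        subgroup |= {g ^ w for w in subgroup}
    subgroup.discard(frozenset())   # drop the identity I

    def alias_chain(effect):
        """All effects aliased with the given effect, shortest first."""
        return sorted(("".join(sorted(effect ^ w)) for w in subgroup),
                      key=lambda s: (len(s), s))

    # One row per two-factor interaction; aliased pairs will show the same
    # chain more than once (e.g. the AB row and the CE row).
    for pair in combinations(factors, 2):
        print("".join(pair), "+", " + ".join(alias_chain(frozenset(pair))))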

    0
    #85489

    Robert Butler
    Participant

      For a 16 run res IV design with 7 factors at 2 levels you will have all main effects clear of all two-way interactions and at most seven specific two-way interactions clear of one another.  All other two-way interactions will be confounded with one another. If you choose to go for a 32 run res IV design with 7 factors at two levels you will have all main effects clear of all two-way interactions and 15 of the 21 possible two-way interactions clear of one another. The other six two-way interactions will be confounded with each other.  Thus, increasing your runs from 16 to 32 will increase the number of two-way interactions that you can examine.

    0
    #85436

    Robert Butler
    Participant

    I would have to agree with SAM concerning equating VOC with innovation and I would also have to insist that VOC has been a driver of many inventions.
    James Watt – the Newcomen engine was a standard fixture at the mines and Watt first became aware of the engine and its drawbacks when he repaired a model of the engine.  At that time there were lots of people who were interested in improving the Newcomen engine and high on the list of improvements was the need to reduce its fuel consumption.  Watt realized that the engine was wasting tons of fuel with a constant (and un-needed) cooling and reheating of the cylinder.  The impact of VOC in this instance is quite clear.
    Bell – His case is different in the sense that while he was focused on the VOC (the need for improvements in the multiple telegraph) his understanding of the process, coupled with his interest in developing mechanical devices to permit deaf people to understand the spoken word, put him in the position of being able to go well beyond the more limited and immediate needs of telegraphic systems.
      Edison – Menlo Park was a prime example of using VOC to drive innovation and invention.  Edison’s approach to invention and innovation was very straightforward.  Find out what the customer wants, generate ideas that may satisfy those wants, put a proposal together, get financial backing, and then and only then, go out and invent the thing.
      The main point of all of this is that method and reason do not stifle creativity.  There are many people whose views of creativity can best be summed up as dabbling and drifting.  Back in 1910 A.G. Webster had the following to say about that view:  “In matters of scientific investigation the method that should be employed is think, plan, calculate, experiment, and first, last, and foremost, think.  The method that is most often employed is wonder, guess, putter, guess again, and above all avoid calculation.”

    0
    #85173

    Robert Butler
    Participant

    AQL’s are usually determined by the customer.  In my field (aerospace) they are driven not by type of defect but by characteristic definition.  For all of our customers the most severe characteristic definition is “critical”. “Critical” is defined as a characteristic whose non-conformance will result in a failure of the performance of the manufactured item.  Any characteristic that meets this requirement has a stipulated inspection frequency of 100%.
      After “critical” there are other categories which go by various names depending on customer or supplier.  Two fairly common labels are “Major” and “Minor”.  While the labels are common the definitions aren’t and with each customer/supplier the AQL’s vary along with the definitions.  It is not unusual to have an AQL for one individual’s “Minor” characteristic which is higher than someone else’s “Major” characteristic even though the form, fit, and function of the two parts are similar.
      So, the short answer to your question is that within a given industry there may be an agreed upon set of defect definitions and associated AQL’s but to the best of my knowledge there is no general connection between AQL’s and product properties or defects.

    0
    #84678

    Robert Butler
    Participant

      When running a correlation analysis the main question is data quality and not sample size.  There are many issues connected with data quality.  Some of them are questions like – Is the data representative? Does the data span the region of interest? Has the data been gathered in such a manner that the various “independent” variables are in fact independent?
      The last question plays a crucial role in correlation analysis and it is one that often brings to a halt analysis efforts involving “lots of data”. For example, if you have two variables X1 and X2 that you wish to correlate with a given response and your measurements were such that every time you increased X1 you increased X2 and every time you decreased X1 you decreased X2 you will have “lots of data” and the confounding of X1 and X2 will be such that you cannot separate their effects.  In such a case, four data points from a two level factorial design coupled with two replicates on the centerpoint will provide more information than all of the data points gathered in the manner first described.
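      As a quick illustration of that last point, here is a Python sketch with made-up numbers.  When X1 and X2 always move together the design matrix loses a rank and the two effects cannot be separated, no matter how many rows you have, while the four factorial points plus center points keep them separable:

    import numpy as np

    # Hypothetical "lots of data" where X1 and X2 always move together.
    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x2 = 2.0 * x1                      # perfectly confounded with x1
    X_confounded = np.column_stack([np.ones_like(x1), x1, x2])
    print("rank of confounded design matrix:",
          np.linalg.matrix_rank(X_confounded))   # 2, not 3 -> effects inseparable

    # Four points of a two-level factorial (coded -1/+1) plus two center points.
    x1_f = np.array([-1, -1,  1,  1, 0, 0])
    x2_f = np.array([-1,  1, -1,  1, 0, 0])
    X_factorial = np.column_stack([np.ones_like(x1_f), x1_f, x2_f])
    print("rank of factorial design matrix:",
          np.linalg.matrix_rank(X_factorial))    # 3 -> X1 and X2 separable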

    0
    #84636

    Robert Butler
    Participant

     Using only the information provided, about all that can be said is that there isn’t much that you can say about differences in the two models.  R2 is one measure of model adequacy and without any additional information there is no apparent difference in the statistics for the lagged vs. non-lagged model.  However as I read your post I must admit that I’m not all that certain that your question is about R2.  If you could provide more information about the regression and what you are trying to do I’d be glad to try to give you an answer to your question.

    0
    #84603

    Robert Butler
    Participant

      ANOVA can be thought of as a tool that asks the question – “All other things being statistically equal, is there a difference in population means?”  You can use a one way ANOVA to compare more than two means which don’t have equal variances.  The question that you must answer is – how unequal are your population variances?  If they are numerically different but are not statistically different then ANOVA should give you a worthwhile summation of the population mean differences.
      If the variances exhibit enough inequality so that one or more are significantly different you can still use ANOVA and it will probably indicate that there are no significant differences between population means which, in light of the unequal variances, would be the case.  In the latter case your investigation should shift to asking questions concerning the “why” of the significant differences in population variances.
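      For what it is worth, here is a minimal Python sketch of that sequence (the data are simulated and purely illustrative): first ask how unequal the variances really are, then run the one-way ANOVA on the means.

    import numpy as np
    from scipy import stats

    # Hypothetical measurements from three populations.
    rng = np.random.default_rng(1)
    a = rng.normal(10.0, 1.0, size=12)
    b = rng.normal(10.5, 1.1, size=12)
    c = rng.normal(11.5, 1.0, size=12)

    # Step 1: are the variances statistically different, or just numerically different?
    lev_stat, lev_p = stats.levene(a, b, c)
    print(f"Levene test for equal variances: p = {lev_p:.3f}")

    # Step 2: one-way ANOVA on the population means.
    f_stat, anova_p = stats.f_oneway(a, b, c)
    print(f"One-way ANOVA: F = {f_stat:.2f}, p = {anova_p:.3f}")

    # If the Levene p-value is small, the variances differ significantly and the
    # investigation should shift to the "why" of those differences, as noted above.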
     

    0
    #84351

    Robert Butler
    Participant

     The simplest way to handle this is to build a regression model for taxes and then build a second one for number of rooms.  You can view your number of rooms measurement as “very granular data” and run your regression as though number of rooms was a continuous variable.  Your predicted room values will have to be rounded to the nearest integer value.
      Your residual plots for rooms will look a bit odd. If your regressors do an adequate job of predicting the number of rooms you will see stratified bands of parallel data points (assuming you are plotting the residuals against the predicted variable) all crossing the zero line at the same angle.
     

    0
    #84169

    Robert Butler
    Participant

     If you run a DOE and have done a proper analysis (normalized the X’s, checked the residual plots for trending etc.) and you find that none of your variables are significant then your design is telling you that, over the ranges of the X variables you examined, the X variables have no significant effect on the responses that you measured.  In this instance there is no best setting.  The usual practice in such cases is to set the X’s at the levels that are most convenient.  If convenience is defined in terms of lowest cost then the most convenient settings would be those that minimized raw material cost.
     In those cases where I get a result like the one you describe I check the statistics of the stepwise regression on a step-by-step basis to make sure that the level of significance of each of the rejected variables is well below my cut point.  If I have terms that would be significant between 90 and 95% I will open up the selection criteria and let the machine run the analysis again.  I will report that nothing met my predetermined level of significance and I will present the findings of the lower probabilities of success. I ask everyone concerned if the terms that entered the model under the relaxed selection conditions make physical sense and if the “extremes” in the Y responses that are predicted with such models are physically meaningful.  If the answer to both of these questions is affirmative we usually will test the “relaxed” model with further work.

    0
    #84125

    Robert Butler
    Participant

      Yes, ten molds and ten stations makes sense and certainly makes things more interesting.  Your problem could be viewed as a two way ANOVA with the variables being stations and molds.  If you have to view every mold and every position as distinctly different from one another then the “simplest” matrix becomes a 10 x 10 array with the response of each mold recorded at each station. If you have prior data that you can trust you may be able to fill in the matrix, identify the blank spaces and then run the missing combinations.  You could run an ANOVA on this array (it is possible to run an ANOVA with only one measurement per cell).  Assuming that you had sufficient data, the biggest problem would be the trust that you would have to put in your prior data-if the data covered a long period of time you would have to worry about process shifts unrelated to mold or position that would wind up either being attributed to mold or position or would mask their respective effects.  Of course, if you don’t have this data then you are faced with the formidable and time consuming task of taking 100 measurements.
      If you have acceptable prior data from some positions and some molds which would permit you to assemble a two way array of data on a subset of molds and positions you could run an analysis on this data.  While this wouldn’t answer the question about every mold and every position it would help you understand the magnitude and significance of differences in means and variances that you could quantify. It would aid in quantifying when an apparent difference is really a difference. Since we are assuming that every mold and every position is unique, it would also act as a guide for further work.
      If you have selection criteria that permit you to group your molds and stations into classes, the problems outlined above shrink and the data gathering and analysis becomes easier.  For example, if your 10 molds can be grouped into one of 4 distinct groups and your stations can be grouped in a similar fashion then you could proceed with a much reduced matrix of molds and positions. In such a case you would randomly pick a representative mold and position for each grouping and gather data on these combinations.
      If you haven’t already, I would urge you to give serious thought to possible classes of molds and positions.  I used to work in the plastics industry and I know that there are many ways to classify molds. 
     While all of the above has been couched in terms of ANOVA the data can also be analyzed using regression methods.

    0
    #84121

    Robert Butler
    Participant

      As you describe it your concern is the apparent difference in a particular mold behavior at only two out of your 10 different stations.  If this is really the case then you have a two variable two level design.
            Mold            Station
            #1                 #1
            Other Mold         #1
            #1                 #9
            Other Mold         #9
       Where the “Other Mold” is either a mold that you “know” doesn’t seem to vary or is just a mold picked at random from your group of molds.  Replicate one of the design points to provide an estimate of error and you can check for the effect of mold, station, and mold x station interaction.  If you can’t afford to run a single replicate you can use the mold x station term as your estimate of error.
      This design will tell you about mold and station differences but unless you have evidence that suggests mold #1 and the other mold and station #1 and station #9 are representative of all of the other stations and molds in your system the results of the above investigation should only be applied to the particular subset of molds and stations that you used.
     

    0
    #83884

    Robert Butler
    Participant

      The exchange between you and Jamie gives me a better understanding of what you are trying to do. I’d like to offer the following:
      There are a couple of problems with trying to compare Bartlett’s method with the Box-Meyer approach.  Jamie pointed out some but there are other things to consider.  The first is the number of degrees of freedom associated with each of the standard deviations.  Bartlett’s test is not very robust with respect to lack of normality and if you only have a few degrees of freedom for each standard deviation it would be very easy for Bartlett’s test to be fooled by your data. 
      The second is the fact that you really should choose a test method before you begin your analysis.  As any statistician will tell you if you try enough tests on a data set you are bound to find one or more that will give you a significant result.  Since you began with the Box-Meyer approach I would recommend that you go with the results that you had from that test.  As Jamie noted, if you haven’t already done so, it would be worthwhile doing the Box-Meyer analysis graphically as well as numerically just to see how close to a significant difference you might have-that is compare graphs and compute the exact levels of significance.  If you have a situation where you are 92% confident of a difference but not 95% confident you should certainly consider the possibility that there really is something there and that perhaps, as Jamie noted, the lack of 95% confidence may be due to nothing more than the degrees of freedom that you have available for a test.
      However, if you are determined to try another test I would suggest the Foster-Burr test instead of Bartlett’s.  It is very robust with respect to non-normality and it does very well when the degrees of freedom associated with each standard deviation are 10 or less.

    0
    #83859

    Robert Butler
    Participant

       Your initial post gives the impression that you are trying to determine the number of replicates of an experimental design needed to assess the effects of DOE variables on measured responses by using a Box-Meyers analysis.  This doesn’t make sense.
      If you have an unreplicated design (or one with no replicates except at a few chosen points) and you are interested in assessing the impact of design variables on response variance then the Box-Meyers approach for assessment of dispersion effects is the method of choice and it is run about the way you described in the second part of your initial posting:
    From Understanding Industrial Designed Experiments-Schmidt and Launsby
    Box-Meyer
    1. Use a Res 4 or higher design to avoid confounding interaction and dispersion effects.
    2. Fit the best model to the data.
    3. Compute the residuals
    4. For each level of each design factor, compute the standard deviation of the residuals.
    5. Compute the difference between the two standard deviations for each factor.
    6. Rank order the differences from #5
    7. Use a pareto chart, normal probability paper, or log of the ratio of the high and low variances of the residuals of each factor to determine the important dispersion effects.
    8. For factors with important dispersion effects, determine the settings for least variation and set location effect factors to optimize the response average.
    9. If there is a conflict of settings, use a trade-off study to determine which setting is most crucial to the product quality.
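      For anyone who wants to try steps 3 through 7 outside of a package, here is a rough Python sketch.  The function name and data layout are mine, not Schmidt and Launsby’s, and it assumes you already have the residuals from the best location model and the coded (-1/+1) factor settings in a pandas DataFrame:

    import numpy as np
    import pandas as pd

    def box_meyer_dispersion(design, residuals):
        """Rough sketch of the Box-Meyer dispersion screen (steps 3-7 above).

        design    : DataFrame of coded factor settings (-1 / +1), one column per factor
        residuals : residuals from the best location model (step 3)
        """
        residuals = np.asarray(residuals, dtype=float)
        rows = []
        for factor in design.columns:
            s_low = residuals[design[factor] == -1].std(ddof=1)    # step 4
            s_high = residuals[design[factor] == 1].std(ddof=1)
            rows.append({
                "factor": factor,
                "s_low": s_low,
                "s_high": s_high,
                "difference": s_high - s_low,                       # step 5
                "log_var_ratio": np.log(s_high**2 / s_low**2),      # step 7 statistic
            })
        out = pd.DataFrame(rows)
        # Step 6: rank order by the absolute size of the dispersion statistic.
        return out.reindex(out["log_var_ratio"].abs().sort_values(ascending=False).index)

    # Usage (hypothetical coded design and residuals):
    # design = pd.DataFrame({"A": [...], "B": [...], "C": [...]})
    # print(box_meyer_dispersion(design, residuals))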

    0
    #83799

    Robert Butler
    Participant

      If it is really a matter of comparing the averages of two sample populations then a two sample t-test with unequal variances and unequal sample sizes would be the statistical tool of choice (Minitab gives you these options.  If you don’t have Minitab then pp.299-304 of Statistical Theory and Methodology 2nd Edition by Brownlee has the details for manual computation).  It’s worth checking for data normality but you should remember that the t-test, like ANOVA, is very robust to non-normality.
      What concerns me is that the wording of the original post suggests that the situation is not that of comparing populations but comparing populations of averages.  The sentence “33 daily avg. cycle times were collected (x-bar equals 6.37days)” suggests that 33 averages were gathered and the average of these averages was then computed.  If this is the case then the way to check for process changes would be to go back to the averages, note their associated ranges and plot Xbar R charts for before and after and compare these to see if there has been a change in process.
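      If you do have the two raw populations, the test itself is a one-liner in most packages; here is a Python/scipy sketch with made-up cycle-time numbers:

    from scipy import stats

    # Hypothetical cycle-time samples with unequal sizes and unequal variances.
    before = [6.1, 7.4, 5.9, 6.8, 7.0, 6.3, 8.1, 5.5, 6.9, 7.2]
    after = [5.2, 5.8, 6.0, 5.5, 4.9, 5.7, 6.1]

    # Two-sample t-test without assuming equal variances (Welch's t-test).
    t_stat, p_value = stats.ttest_ind(before, after, equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")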

    0
    #83686

    Robert Butler
    Participant

    Process Improvement Spirits (also known as the 5 improvement spirits)
    1. Challenge all fixed ideas
    2. Do it now, no excuses
    3. Use your wisdom, not your money
    4. Get to the root cause by asking why 5 times
    5. Improvement is infinite, better is not good enough
     
     
     

    0
    #83659

    Robert Butler
    Participant

      If you are dealing with a unilateral tolerance and your measure is non-normal (e.g. taper, flatness, surface finish, concentricity, eccentricity, perpendicularity, roundness, straightness, squareness, weld bond strength, casting hardness, shrinkage, hole location, parallelism, etc) then you will need to use the following:
    Equivalent P’pu = maximum{ (USL-T)/(x99.865-x50), (USL-x50)/(x99.865-x50)}
    where
    x99.865=the value at the 99.865% point of the data when plotted on normal probability paper
    x50 = value at the 50% point of the data when plotted on normal probability paper.
    T = target of the process average
    For further reading you should check Bothe – Measuring Process Capability pp.445-447

    0
    #83527

    Robert Butler
    Participant

      The linear congruential method is also known as the simple multiplicative congruential method.  The basic method  generates a series of pseudo random numbers using residuals obtained from division.  The problem with the method is that the random numbers will exactly repeat after a given number of computations.  How long the string of random numbers is before it repeats is a function of the size of the initial seed number, the choice of the modulo, and the choice of the multiplier. 
      It was only when attempting to study the generation of random 3 dimensional points that serious problems with the cycling of this method were discovered. The first attempt at correcting this problem involved work done by Marsaglia in 1968. Marsaglia and Bray built a composite generator that was a mix of three congruential generators – hence the mixed congruential method.
      Unless you are trying to generate 3 dimensional random numbers the pseudo random sequence that you get from the linear method will probably be fine.  The program probably has warnings concerning the minimum size of the seed, multiplier, etc. and if you observe those cautions you shouldn’t have any trouble.
      If you are concerned about the sequence that your program generates there are a number of tests for randomness that you can apply to the final sequence.  If you are going to test the sequence for randomness you will need to decide on the test beforehand.  This is necessary since a sample of numbers, no matter how random, cannot be expected to pass a large number of statistical tests at an arbitrary level of significance.
      If you want more details you might check pp.20-22 of Quality Control and Industrial Statistics by Duncan and pp.354-355 of Methods for Statistical Analysis of Reliability and Life Data by Mann, Schafer, and Singpurwalla.
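      If you want to see the cycling for yourself, here is a bare-bones Python sketch of a linear congruential generator.  The first set of constants is deliberately poor so the repeat shows up within a handful of draws; the second set is the commonly cited Park-Miller “minimal standard” choice:

    def lcg(seed, a, c, m, n):
        """Generate n pseudo-random numbers in [0, 1) with a linear
        congruential generator: x_{k+1} = (a * x_k + c) mod m."""
        x = seed
        out = []
        for _ in range(n):
            x = (a * x + c) % m
            out.append(x / m)
        return out

    # Deliberately small constants so the cycling is easy to see (period 16 here).
    print(lcg(seed=7, a=5, c=3, m=16, n=20))

    # Constants of the size used in practice (Park-Miller "minimal standard"),
    # which give a much longer period before the sequence repeats.
    print(lcg(seed=12345, a=16807, c=0, m=2**31 - 1, n=5))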

    0
    #83480

    Robert Butler
    Participant

    You should check chapter 13 of Bothe’s book Measuring Process Capability. With machine tool wear you are up against the problem of measuring capability in the presence of repeating trends. You can, of course, eliminate the trends by upping the frequency of tool sharpening but if you do that then you are increasing quality and increasing cost-which isn’t the idea of process improvement.
      Your estimates of short term capability won’t differ that much from the standard Cpk computations (assuming tool wear or the effect being measured is normally distributed), however, long term capability measurements like Pp and Ppk will be impacted by the mean shifts that will arise from tool wear and it is recommended that you not compute these. For the cases of a moving average there is a measure Cpk,mod which can be computed instead. You also need to check the normality assumptions for tool wear. The data that I’m dealing with in this venue does not exhibit normality.
      If you find that you are up against some of the things mentioned above then the biggest problem that you will encounter once you have sorted out all of the capability computation issues is that of acceptance. Your modified computation of capability will differ from the standard Cpk computations and you will have to deal with the small army of people who seem to think that there is one and only one way to compute capability and that that computation applies to everyone for all time.

    0
    #83417

    Robert Butler
    Participant

      Subtracting or adding or dividing or multiplying your data by a constant or any combination of the latter will not change your distribution ( we are, of course, assuming that you won’t try to multiply or divide by zero).
       To illustrate this idea think of the way you transform a general normal distribution into a standardized normal distribution. You take each value and subtract off the mean and then divide this result by the standard deviation.  The result is a normal distribution with mean zero and unit variance.  The distribution has been centered and scaled but its shape and properties remain unchanged.
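      A quick way to convince yourself of this is to compute the shape statistics before and after the transformation; here is a Python sketch with made-up skewed data:

    import numpy as np
    from scipy import stats

    # Hypothetical skewed (non-normal) data.
    rng = np.random.default_rng(2)
    x = rng.exponential(scale=3.0, size=500)

    # Center and scale: subtract the mean, divide by the standard deviation.
    z = (x - x.mean()) / x.std(ddof=1)

    # The location and scale change, but the shape statistics do not.
    print("skewness before/after:", stats.skew(x), stats.skew(z))
    print("kurtosis before/after:", stats.kurtosis(x), stats.kurtosis(z))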

    0
    #83362

    Robert Butler
    Participant

      When you center and scale your regressors you are guarding against roundoff errors which can completely invalidate your results.  Even with double precision and all of the other safeguards of the current computer systems roundoff error can wreak havoc.  The differences due to roundoff error will cause different regression programs to give wildly different estimates of parameter coefficients for the same set of data.
      There are two major causes of roundoff error.  The first is the situation where the numbers in the regression computations are of widely differing orders.  X1 for example is varied between .001 and .005 and X2 is varied between 200 and 300. 
      The second case occurs when the matrix to be inverted is very close to being singular.  This usually doesn’t happen in a DOE unless you have a situation where some of your experiments failed and you have to run the analysis without the benefit of all of your design points.
      If you wish to read more on this subject you can check pp.257-262 of Applied Regression Analysis, Second Edition by Draper and Smith.
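      If you want to see the effect of coding on the numerics, here is a Python sketch using the widely differing ranges from the example above (the design points themselves are made up).  The condition number of the design matrix is one standard way to gauge how vulnerable the computation is to roundoff; smaller is better:

    import numpy as np

    # Hypothetical design points in raw units: X1 between .001 and .005,
    # X2 between 200 and 300, plus a center point.
    x1 = np.array([0.001, 0.001, 0.005, 0.005, 0.003])
    x2 = np.array([200.0, 300.0, 200.0, 300.0, 250.0])

    def center_and_scale(x):
        """Code a factor to the -1/+1 range used for DOE analysis."""
        return (x - (x.max() + x.min()) / 2) / ((x.max() - x.min()) / 2)

    X_raw = np.column_stack([np.ones_like(x1), x1, x2])
    X_coded = np.column_stack([np.ones_like(x1),
                               center_and_scale(x1),
                               center_and_scale(x2)])

    print("condition number, raw units: ", round(np.linalg.cond(X_raw), 1))
    print("condition number, coded:     ", round(np.linalg.cond(X_coded), 3))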
     

    0
    #83332

    Robert Butler
    Participant

      If you are looking for articles that provide the assumptions underlying the 1.5 figure as well as a discussion of the concept, the paper you want to read is the one by Bothe.  You can find a listing of his paper and the other three papers that have been written on this subject in the discussion thread “Is There Any Empirical Evidence to the 1.5 Shift” which ran on this site in October 2002.

    0
    #83158

    Robert Butler
    Participant

      Plotting data on normal probability paper does not transform the data in any way.  All that you are doing when you plot the cumulated distribution on the paper is giving yourself a sense of the location of your case percentages.  The straight line on normal arithmetic probability paper is primarily for reference purposes-that is, if your plot falls on the straight line there is a pretty good chance that your data is normal.
      If you want to check this run a normal plot of the following data:
    1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,6,6,7,7,8,9,14
      Then take the log of this data and run a normal plot of the logged values.  The first plot will form an arc relative to the straight line whereas the second will hold fairly close to the straight line.
      As for equivalent Cp, if you are interested in the long term then there is an equivalent Phat p which is expressed as
       Tolerance/(Equivalent 6 sigma Long Term Spread)
    where the equivalent 6 sigma spread is computed by taking the difference of xhat(99.865) and xhat(.135). pp. 434-441 of the Bothe book provide additional information.
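      If you would rather let the computer do the plotting, here is a Python/scipy sketch using the data above.  probplot returns the correlation coefficient of the straight-line fit on the probability plot, which makes the raw-versus-logged comparison easy to see.  Keep in mind that with only 27 points the extreme percentiles should really come off the fitted line, as described above, rather than straight from the sample:

    import numpy as np
    from scipy import stats

    # The data from the example above.
    x = np.array([1,1,2,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,6,6,7,7,8,9,14], dtype=float)

    # Correlation coefficient of the straight-line fit on normal probability paper.
    (_, _), (_, _, r_raw) = stats.probplot(x, dist="norm")
    (_, _), (_, _, r_log) = stats.probplot(np.log(x), dist="norm")
    print(f"straight-line fit, raw data:    r = {r_raw:.3f}")
    print(f"straight-line fit, logged data: r = {r_log:.3f}")  # should track the line more closely

    # Equivalent long-term 6 sigma spread from the .135 and 99.865 percentiles.
    # (With a sample this small these are essentially the min and max; in practice
    # read them from the fitted line on the probability plot.)
    x_lo, x_hi = np.percentile(x, [0.135, 99.865])
    print("equivalent 6 sigma spread =", x_hi - x_lo)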
     

    0
    #83098

    Robert Butler
    Participant

    If you have prior seal strength data take a look at it yourself and select a subset of the data that is approximately normal (seal strength is one of those properties that is bounded – the seal strength cannot exceed the strength of the material – so, depending on your current process control, your data may be very non-normal) and plot it manually on a large sheet of graph paper.  (If you want them to see the histogram evolve-print your pre-selected set of data in the form that it would appear after having been recorded on the shop floor and ask someone to read the numbers to you while you plot them in front of the class-it won’t take that long to plot 30 or so data points and it will let everyone see just how it was generated)
      Run a simple bean count and show them how many data points of the total population are contained in the regions below 0, 1, 2, and 3 standard deviations from the mean. Take this thought and turn it around and point out to everyone what this would mean if the lower spec limit were located at these points.  Remind everyone that when a machine is set up for a run you are trying to hit the target (mean) the first time. Point out that if you have done the best you can and the first piece comes back with a value that is exactly on the minimum this would suggest that if your process is behaving as it has in the past then there is an excellent chance that 50% of the run will be rejected.  From this it is an easy jump to justifying the requirements for a first piece trial.
      If you use graphs to illustrate each of the points as you move through them you will probably get your point across.  I’ve used this approach with children as young as 9 years of age and it has worked well.

    0
    #83043

    Robert Butler
    Participant

    Hole location along with a host of other characteristics such as taper, flatness, squareness, shrinkage, etc has the property that it is bounded by some physical limit.  In the case of hole location, you can’t have a location that is less than zero.  As was noted in another post, your limit is zero-perfect location every time.  As you move toward the limit the distribution of the measurements gets more and more skewed.  The lower tail cannot go below zero but the upper tail has no upper limit.  For distributions of this type Cpk does not apply since the distribution violates the fundamental  tenet of normality. 
      In order to compute an equivalent Cp you will have to take your data, plot it on normal probability paper, and identify the values corresponding to the .135 and 99.865 percentiles (-3 and +3).  The difference of these two values will be your equivalent 6 sigma spread.  Your equivalent Cp will be the tolerance (USL-LSL) divided by the equivalent 6 sigma spread.  You may very well find that this number does a poor job of characterizing the performance of the process as you know it.  In that case it might be better to express your process in terms of percent nonconforming.
      You should check Chapter 8 – Measuring Capability for Non-Normal Variable Data in Bothe’s book Measuring Process Capability, for additional information and caveats.

    0
    #83002

    Robert Butler
    Participant

      You can regress anything against anything so from a strictly mechanical standpoint you can run a regression against averages.  The more important question is what is the physical justification for doing this?  A reading of your post gives the impression that your primary concern is the R2 of the final regression model and not the utility of the model itself.  If this is indeed your concern then it would be a mistake to build a regression model on mean responses.
      A regression is nothing more than a geometric fit of a line through a cloud of data points.  Least squares regression generates a line that minimizes the sums of squares of all of the vertical differences between the data points and regression line.  In short, a least squares regression line is nothing more than a map of the location of the mean values of Y’s associated with the various values of the X’s.  If you take averages of your Y values at each X and then run the regression you will automatically see a jump in R2 and all of the other statistics associated with the fit of the regression line for the simple reason that you will have hidden most of the vertical differences between the individual data points and the regression line.  In other words, you will have hidden your measurement uncertainty from your regression package.
     To see just how dangerous this can be try this extreme approach: Split the X values into two groups – X(high) and X(low).  Assign the arbitrary values of -1 for X(low) and 1 for X(high).  Take the grand average of the Y’s associated with these X’s.  You now have exactly two data points, (Xlow, Y1Avg) and (Xhigh, Y2Avg).  A linear regression against these two points will give you an R2 of 1 with zero error.
      If you actually have a situation where you will be making decisions on average values of your response (Y) and where your uncertainty focuses on the uncertainty of average responses and not individual Y measurements  then it would be appropriate to develop a regression model using individual X’s and average Y’s. To do this you will want to make sure that you have about the same number of Y response at each X in question.  You will also want to check the variability of the Y’s at each X . Note that there is no averaging of the X values.
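      Here is a small Python sketch of the R2 inflation (the data are simulated: five replicate Y readings at each of six X settings, with realistic measurement scatter):

    import numpy as np

    rng = np.random.default_rng(3)

    # Hypothetical data: 5 replicate Y readings at each of 6 X settings.
    x_levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x = np.repeat(x_levels, 5)
    y = 2.0 + 1.5 * x + rng.normal(0.0, 2.0, size=x.size)   # real measurement scatter

    def r_squared(x, y):
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (slope * x + intercept)
        return 1.0 - resid.var() / y.var()

    # Fit to the individual readings vs. fit to the averaged readings.
    y_avg = y.reshape(-1, 5).mean(axis=1)
    print("R^2 on individual points:", round(r_squared(x, y), 3))
    print("R^2 on averaged points:  ", round(r_squared(x_levels, y_avg), 3))
    # The second R^2 is larger only because averaging hid the measurement scatter.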
     

    0
    #82987

    Robert Butler
    Participant

      There are two advantages to balanced designs.
      1. If the sample sizes are equal the test statistic is relatively insensitive to small deviations from the assumption of equal variances for the treatments.
      2. The power of the test is maximized if the samples are of equal size.
      While balanced designs are certainly desirable the fact is that you will find numerous situations where it will not be possible to use a balanced design. Indeed, if you do enough design work, the simple circumstance of the complete failure of one of the design points in an otherwise balanced design (with no chance for the repetition of the experiment) will force you to deal with an unbalanced situation.
      From the standpoint of software selection I would want to make sure that the software could build and analyze unbalanced designs.  I would also want to make sure that the package could analyze my data if I had a failure of one or more of my design points. Rather than depending on machine generated warnings I would make sure that anyone attempting to use a computer to generate a design either have the training necessary to understand the issues surrounding design choice or have immediate access to someone who does. 

    0
    #82957

    Robert Butler
    Participant

      Since Cpk requires data normality it is inappropriate for data that does not have a normal distribution.  To get a measure of capability the recommended procedure is to compute an equivalent capability which is based on the .135 and 99.865 points of the normal probability distribution.
      The method is as follows:
       Plot your data on normal probability paper.  Curve fit your data points and identify the .135% and 99.865% points on the graph.  The equivalent capability is expressed as:
           (USL – LSL)/(x(99.865)-x(.135) )  = Tolerance/Equivalent 6 sigma spread.
    If you want further details check Chapter 8 of Measuring Process Capability by Bothe which is titled Capability for Non-Normal Distributions

    0
    #82897

    Robert Butler
    Participant

    Your MBB is taking the conservative approach to ANOVA and there is certainly nothing wrong with that.  However, you should know that both ANOVA and the t-test are quite robust with respect to non-normal data-in other words you can still get accurate results with non-normal data.  Pages 51-53 of The Design and Analysis of Industrial Experiments, Second Edition, by Davies have a good discussion of the issue and the end of the chapter cites papers about this topic if you wish to read more.
       As for the accuracy of a normality test with six pieces of data…it will be as accurate as 6 pieces of data will permit it to be!  I realize that this sounds flippant, I don’t mean it to be.  If you have the luxury of gathering 20,30,40 etc. measures per population before running an ANOVA or a t-test, by all means do so but if your situation is such (as mine often is) that 6 measurements per population is a genuine strain on resources then it is better to proceed with an analysis of the 6 measurements per population than to do nothing. 
      Regardless of the amount of data you have, if you have gathered it over time and if you made a point of keeping track of when you gathered it you should first plot the data by population against time.  Often these kinds of plots will reveal much more about your process than all of your other efforts combined.
      There is an excellent article Statistics and Reality-Part 2 by Balestracci in the Winter 2002 ASQ Statistics Division Newsletter which discusses the benefits of the kind of data plotting I mentioned.

    0
    #82877

    Robert Butler
    Participant

    Woey,  the tone of your 9 February e-mail worries me. If I’ve misunderstood your post please accept my apologies in advance.  You made the comment that you “tried to do a polynomial regression, and got a pretty good y and x relationship. The R^2 is 0.998. ”  You leave the impression that you are more enamoured of R2 than the propriety of the analysis.  R2, by itself, is of little value and it is very easy to make it anything you want.  Indeed, if you have no replicated measures of results of Y for a given X you can, with enough terms, get an R2 of 1. 
      Since you are measuring individual patients you need to run your regression on individual readings.  If you build a model on average results the confidence limits around the regression will be for averages and they will be much tighter than the confidence limits for individual measurements. If you don’t take this into account you could wind up thinking that the precision of your instrument is much better than it is.
      If you are going to use polynomial regression you need to keep a constant check on the significance of your polynomial terms as you add them, the corresponding reduction in model error, the residual vs predicted plots, and your lack-of-fit at each stage of the model building process. It’s a very easy matter to overfit. Finally, your initial missive suggested that you will need to extrapolate outside the ranges that you have measured.  You will need to carefully check the behavior of the regression equation and the confidence limits in these regions.  It is very easy to get an excellent polynomial fit that falls apart the minute you go beyond the range of actual data.
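      Woey, if it helps, here is a short Python sketch of the two dangers mentioned above, using made-up calibration data: with no replicates, a high enough order polynomial gives R^2 = 1 no matter how noisy the data, and that same polynomial can wander badly once you leave the measured range:

    import numpy as np

    rng = np.random.default_rng(4)

    # Hypothetical calibration data: 6 unreplicated points on a linear trend.
    x = np.linspace(1.0, 10.0, 6)
    y = 3.0 + 0.8 * x + rng.normal(0.0, 0.3, size=x.size)

    # With enough terms (degree 5 through 6 points) the fit passes through
    # every point exactly, so R^2 = 1 regardless of the noise.
    coef = np.polyfit(x, y, 5)
    fitted = np.polyval(coef, x)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    print("R^2 inside the measured range:", round(1.0 - ss_res / ss_tot, 6))

    # Extrapolating that same polynomial beyond the measured range can drift
    # far from the underlying linear trend it was supposed to describe.
    for x_new in (11.0, 12.0, 15.0):
        print(f"x = {x_new}: polynomial predicts {np.polyval(coef, x_new):.2f}, "
              f"linear trend is about {3.0 + 0.8 * x_new:.1f}")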

    0
    #82793

    Robert Butler
    Participant

      Metering systems are usually built for response over some part of a range of possible measurements.  From your description it sounds like you have the situation that Opey has cautioned against.  In addition to Opey’s comments I would also recommend against building a regression model on the averages.  Your flow measurements will be individual readings and you will want to know the ability of your model to predict those individual readings. 
      If you run your regression against the individual measurements (we’re assuming the five measurements at each point are replicate and not duplicate measurements) and then plot the fitted line and its associated confidence intervals against the individual points not only will you better highlight the regions of poor regression fit you will also highlight those regions where the sensitivity of your metering system begins to fail. 
     

    0
    #82476

    Robert Butler
    Participant

       The description of your process gives the impression that you never produce the same part twice (“Since at every run the part produced is different “).  However, whether you produce the same part intermittently or actually never produce the same part twice, you do have prior measurements of the time it takes to stabilize a production run.  With these times you can determine the distribution of the time to stability and thus identify the time at which X% of your processes will have stabilized.  You could use this number as a guide for making decisions concerning changes in sampling procedure. 
      100% inspection during a startup is certainly the safest and most conservative approach. It is possible to use other sampling methods if you have some sense of process behavior during startup. However, if you really don’t make the same part twice or there is no obvious relation between the observed extremes in product properties from one run to the next or if startup extremes can be expected to exceed the required spec limits then it would seem that the only way to avoid shipping bad product would be to do 100% inspection.

    0
    #82463

    Robert Butler
    Participant

    You might want to check the thread on this forum “Analyze Stdev or Variance on DOE’s” – Marco Osuna – 11 November 2002.  The discussion covered the issues of multiple samples as well as the Box-Meyer’s approach.  In addition, one of the posts gave a “how to” for doing your own Box-Meyer’s analysis using Minitab.

    0
    #82152

    Robert Butler
    Participant

      If Jamie’s suggestion is of interest to you, a good starting point for research into the 1.5 shift would be the Davis Bothe article which appeared in Quality Engineering last year.  If you want the particulars, as well as the cites for what appear to be the only published papers that provide the empirical basis for 1.5, they are listed in a post to this forum on 11 October 2002 under the thread “Is There Any Empirical Evidence to the 1.5 Shift”

    0
    #81499

    Robert Butler
    Participant

      The special statistics based on the range were developed over 50 years ago.
      From pp.145 Quality Control and Industrial Statistics (Duncan 4th edition) is the following:
    “In 1950 P.B. Patnaik showed that the square of the average range has a distribution that is approximately of the form of a Chi Square distribution.  For the case in which the average range is the average of ranges of g random samples from normal universes with a common variance, each sample containing m cases, Patnaik worked out conversion factors d2 and degrees of freedom for various values of g and m.” 
      His original paper is “The Use of Mean Range as an Estimator of Variance in Statistical Tests” – Biometrika Vol 37 (1950) pp. 78-87
      There has been a great deal done with this since his original paper and I suspect that if you check either the internet or statistical references and look under topics such as “variance estimator” and/or “mean range use” you will probably find one or more technical historical treatments of the subject that will guide you through the finer points of the derivation of d2.

    0
    #81427

    Robert Butler
    Participant

      Unless you have a very curious process that permits the simultaneous measurement of everything at once, almost all data is gathered over some interval of time. Time, by itself, is not the issue.  If we ignore seasonal changes for a moment, the concerns that Steve W. and I are addressing have to do with the too frequent recording of data.
      A key issue of all data gathering efforts is that of independence of measurements.  If the individual measurements are not independent, the data will exhibit significant autocorrelation.  Significant autocorrelation usually means that you will compute estimates of the process variability which will be much less than the actual variation of the process.  Since variance estimates are central to tests for differences in means, a too small variance estimate will give false indications of significant mean shifts.  Many control methods use the detection of a significant change in the mean as a basis for justifying a change in the process. Changes that are made when none are really needed only serve to increase the overall variability of the process.
      If you are going to use process monitoring data to check for significant mean shifts you should check the data for autocorrelation and seasonal changes. If you don’t have access to Minitab or other software that will permit checks of this type you can use the methods in the book by Chatfield that I mentioned in an earlier post.  There are a number of techniques that will allow you to remove the effects of seasonal changes and autocorrelation.  Once this is done, you can test for significant mean shifts in the usual manner.
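      If you do not have Minitab handy, a first-pass autocorrelation check is easy to code.  Here is a Python sketch with simulated data in which each reading carries over most of the previous one, which is what too-frequent recording tends to produce:

    import numpy as np

    def lag_autocorrelation(x, max_lag=5):
        """Sample autocorrelation coefficients r_1 ... r_max_lag."""
        x = np.asarray(x, dtype=float)
        x = x - x.mean()
        denom = np.sum(x * x)
        return [np.sum(x[:-k] * x[k:]) / denom for k in range(1, max_lag + 1)]

    # Hypothetical process data recorded too frequently: each reading is mostly
    # a carry-over of the previous one plus a little noise.
    rng = np.random.default_rng(5)
    noise = rng.normal(0.0, 1.0, size=200)
    x = np.zeros(200)
    for i in range(1, 200):
        x[i] = 0.8 * x[i - 1] + noise[i]

    r = lag_autocorrelation(x)
    print("lag 1-5 autocorrelation:", [round(v, 2) for v in r])
    # A rough screen: |r_k| beyond about 2/sqrt(n) suggests significant
    # autocorrelation, and short-term variance estimates will understate
    # the true process variation.
    print("2/sqrt(n) guide:", round(2 / np.sqrt(len(x)), 2))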

    0
    #81363

    Robert Butler
    Participant

      If you think that the defects are seasonal you should first confirm this by checking such things as changing frequency of defect occurrence as a function of time.  You would want to plot these against time for a visual check and then, if it appears that such is the case, you could check it formally using time series analysis.  As was recommended, in order to do this you will have to do some reading.  The book Time Series Analysis by Box and Jenkins is an often-referenced text; unfortunately, it does require a lot of effort to work through.  Given that you seem to have some doubt about seasonality you might want to first check The Analysis of Time Series, Second Edition, by Chatfield.  Chapter 2 is titled “Simple Descriptive Techniques” and it may be all that you need for the present.
      Even if there is a seasonal effect the issue is still that of finding the most frequently occurring group of defects and defining their drivers.  From this standpoint, the effect of seasonality is not a major issue and, if it is in fact an effect, it may help in the identification of important defect drivers.  For example, if it turns out that your most important group of defects always occurs in the summer, you can begin ruling out any kind of driver that would be limited to winter months.
      If your situation does turn out to be as described in the above paragraph then there really won’t be any need to have to include time in your final regression equation since the model would only need to be invoked for process control during those months when the critical group of defects is known to impact the process.

    0
    #81245

    Robert Butler
    Participant

        The idea of a regression equation is to develop a correlation between one or more “causes” and an “effect” (response).  If the regression equation is to be of value it must be possible to vary the causes at will.  As described, you cannot do this.  The percentage of a given defect is whatever it is and all you can do is record it and sum it with other defect percentages to get a total defect percentage.  In order for a regression equation to be of value you would have to develop correlations between things that you can vary at will (process variables-for example) and certain kinds of defects.
       I would recommend using the raw defect counts from your current data set to build a Pareto chart of the defect types.  This will permit the identification of the most frequently occurring defects.  The focus would then shift to an identification of independent factors that generate the “vital few” defect types.  Once these factors are identified it would be possible to build a regression equation relating them to the frequency of occurrence of a particular kind of defect.  These correlation equations could then be combined to identify system conditions that would minimize the most frequently occurring defects and hence permit a control (and a reasonable prediction) of the total defect count.

    0
    #81204

    Robert Butler
    Participant

      This is one of those cases where the fog of the written word may cause problems.  As I re-read your posts I’m left with the following picture of your data:  You have n items that you have checked for different types of failure and different types of defects and you are looking for those defects that can best be used to characterize the failures.  If this is the case then rather than using regression it would probably be more appropriate to arrange the data in a contingency table as illustrated below.
                          Defect Type
    Failure Type      D1    D2    D3    D4    D5 .....
        F1             7    12     1     0    10
        F2             2     5    15     1     9
        F3            ....etc.
      You would want to include “no failure” in the list of failure types and you would use the Chi squared test for your analysis. Chapter 5 of Statistical Theory and Methodology in Science and Engineering by Brownlee (2nd Edition) has the details.  If you don’t have ready access to that book check your available statistics texts and look under the subject heading of multinomial distributions and contingency tables.
      The Chi squared test will tell you whether or not the different failure types differ in their distributions of defect types, however, it will not tell you in what specific ways the failure types differ. To understand that you will have to look at the table to note the discrepancy between expectation and observation and you will have to draw conclusions based on your understanding of the physical process.
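      For illustration, here is a Python/scipy sketch using the (made-up) counts from the table above plus an invented “no failure” row; the expected counts it returns are what you would compare against the observations:

    import numpy as np
    from scipy import stats

    # Hypothetical failure-by-defect counts, one row per failure type.
    counts = np.array([
        [7,  12,  1,  0, 10],   # F1
        [2,   5, 15,  1,  9],   # F2
        [20, 18, 22, 25, 19],   # "no failure" (invented for the example)
    ])

    chi2, p, dof, expected = stats.chi2_contingency(counts)
    print(f"chi-squared = {chi2:.2f}, dof = {dof}, p = {p:.4f}")

    # Comparing observed counts with the expected counts shows where the
    # failure types depart from a common distribution of defect types.
    print(np.round(expected, 1))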
      If  I’ve made a mistake in terms of understanding your data structure let me know and I’ll try again.

    0
    #81159

    Robert Butler
    Participant

      If you are trying to find the defects that “drive” the process of failure then you will have to start gathering data on both failure(s) and defects.  For example, when a failure is observed you will have to record the failure type (qualitative or quantitative) and note the defects that are observed when that (those) failure(s) occur.  With this kind of data in hand you will then have to examine the matrix of defects to determine which ones are independent of one another and thus which ones can be used in a regression analysis.
      A matrix of independent X’s (types of defects) and the associated Y’s (types of failures) can then be examined using regression techniques to develop a correlation between the two. (I’m assuming that your Y’s are some kind of continuous measurements and your X’s are either discrete or continuous.) 

    0
    #81100

    Robert Butler
    Participant

      I can’t speak to opinions to the contrary but I do know that, in the statistical literature, stepwise regression is viewed as an excellent method for variable selection. From pp. 310 Draper and Smith, Applied Regression Analysis -2nd Edition we have the following:
      ” We believe this (stepwise selection with replacement) to be one of the best of the variable selection procedures discussed and recommend its use…..stepwise regression can easily be abused by the “amateur” statistician.  As with all the procedures discussed, sensible judgement is still required in the initial selection of variables and in the critical examination of the model through examination of the residuals. It is easy to rely too heavily on the automatic selection performed in the computer.”
      On those occasions when I have been asked to review a regression effort that has been the cause of a lot of disagreement what I usually find is exactly what is mentioned above-someone made the mistake of confusing the misuse of a computer package with statistical analysis. 
      “Sensible judgment” includes many things:
       1. Understanding the data origin. Design data (without failures) differs from design data with failures which differs from production data, which in turn differs from data gathered from a variety of sources.
      2. Checking the X’s for such things as confounding, qualitative vs. quantitative, same name different units (plant A measures in psi, plant B measures the same thing in mm of Hg) etc.
      3. Never relying on one form of stepwise regression.  I always run forward selection with replacement and backward elimination on every response.  Design data and even design data with failures will usually give the same model regardless of the method chosen (assuming your entry and exit conditions are identical) but there can be surprises.  Any other kind of data will more often than not give a different model for each type of stepwise regression.
     4. Plot the data-and look at the plots.  After analysis, plot the residuals and look at the residual plots.
    5. Never forget that what you have is a correlation.  You, not the equation, must provide the physical justification for the presence of terms in the model.  If the terms don’t make physical sense-investigate-maybe something is there and maybe it is just dumb luck. 
      There are other things to consider but I have found that if you pay attention to the above you will find stepwise regression to be a very powerful analytical tool.  On the other hand, if you attempt to use stepwise regression as some form of automated analysis you will probably go wrong with great assurance.
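      For what it's worth, here is a bare-bones sketch of one of the stepwise flavors mentioned above (backward elimination).  I'm assuming Python with the statsmodels package, which nobody in this thread has mentioned; the data frame 'df' and its column names are hypothetical, and all of the judgment steps listed above still apply:

    import pandas as pd
    import statsmodels.api as sm

    def backward_eliminate(df, response, predictors, alpha_out=0.10):
        terms = list(predictors)
        while terms:
            X = sm.add_constant(df[terms])
            fit = sm.OLS(df[response], X).fit()
            pvals = fit.pvalues.drop("const")
            worst = pvals.idxmax()
            if pvals[worst] <= alpha_out:
                return fit               # every remaining term meets the exit criterion
            terms.remove(worst)          # drop the least significant term and refit
        return None                      # nothing survived the elimination

    # usage: backward_eliminate(df, "y", ["x1", "x2", "x3"])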

    0
    #81049

    Robert Butler
    Participant

      The problem that you have described is the situation where you have a continuous response variable (time) and you are regressing it on a coded X variable (-1 = fail, 1 = pass).  Usually, this kind of coding is understood to represent the high and low settings of a continuous X variable but it is also used to code a 2 level qualitative variable. 
      If the t values for the beta values in your model were statistically significant then you have a significant correlation regardless of the value of the R2 and you have demonstrated that a correlation does exist. All that R2 says about your regression equation is that the model that results from this effort can explain about 50% of the observed variation in the data.  R2 is one measure of the quality of a regression but one should never accept or reject a correlation just on the R2 value (there are many areas of effort where R2 values of .2 to .3 are the best that can be expected and the resultant models are quite useful) and one should never insist on a given R2 before declaring a model acceptable.
      You may want to check Chapter 1 of Applied Regression Analysis by Draper and Smith which has an excellent discussion of the issues surrounding the significance of a regression equation.
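      If it helps to see the point about significance versus R2 in numbers, here is a small simulated illustration (Python with statsmodels assumed - purely my choice of tool, and the data are fabricated noise):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    x = np.repeat([-1.0, 1.0], 50)              # coded fail/pass variable
    y = 10 + 2.0 * x + rng.normal(0, 2, 100)    # a real effect buried in plenty of noise

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    print(fit.tvalues[1], fit.pvalues[1])       # the slope is clearly significant...
    print(fit.rsquared)                         # ...even though R-squared is only around .5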

    0
    #81022

    Robert Butler
    Participant

     Hemanth, if you are looking for appropriate reading material I would recommend Statistical Models in Engineering by Hahn and Shapiro.  Chapter 6 is an excellent discussion of the Johnson distribution which Eileen referenced.  Given that this distribution enjoys widespread use, a read of Chapter 6 will not only give you an understanding of that method of empirical distribution fitting, it will also show you the importance of skewness and kurtosis to this fitting method.
      Chapter 7 is titled “Drawing Conclusions About System Performance from Component Data” and it explains those situations where an investigator would  need a measurement of skewness and kurtosis.
      The quick take away from Chapter 7 is that these two measurements are important when you need to have an understanding of the tails of your distribution.  “…the normal distribution approximation is frequently least adequate at the extreme tails of a distribution.”
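      If you want a quick feel for these two measures on your own data, scipy will compute them directly (Python and scipy assumed; the lognormal data below are just a deliberately skewed stand-in):

    import numpy as np
    from scipy import stats

    x = np.random.default_rng(7).lognormal(size=500)   # skewed stand-in for your measurements
    print(stats.skew(x))          # greater than 0 indicates a right-skewed tail
    print(stats.kurtosis(x))      # excess kurtosis; 0 corresponds to the normal distribution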

    0
    #80965

    Robert Butler
    Participant

      If you are not aware of the Davis Bothe article on the same subject in Quality Engineering 14(3) 479-487 (2002) you may want to read it before you go to print with your booklet.  Based on your description of what you have done, it sounds like the two of you have done the same thing – worked on the theory that would justify 1.5.

    0
    #80878

    Robert Butler
    Participant

    Before running this experiment make sure that your watch is stopped at a time about an hour or two before the actual running of the test.
    Have an overhead ready in advance and then announce to everyone in the room that, on your mark, they should write down the time (to the nearest second) that is displayed on their respective wristwatches.  Go around the room and ask each person in turn for their recorded time.  Save your “stopped watch” time until last.  The variation about the actual current time will illustrate common cause variation and your stopped watch will provide an example of special cause variation…after the exercise is over you will help drive home the concept of special cause if you admit that you stopped your watch on purpose.

    0
    #80874

    Robert Butler
    Participant

    From MIL-HDBK-5H  pp. 9-30 we have the following:
    ” The quantities (skewness corrections) added to the tolerance-limit factor, k99 and k90, represent adjustments which were determined empirically to protect against anticonservatism even in the case of moderate negative skewness.  Skewness is not easy to detect, and negative skewness in the underlying distribution can cause severely anticonservative estimates when using the normal model, even when the Anderson-Darling goodness-of-fit test is used to filter samples.  These adjustments help to ensure that the lower tolerance bounds computed maintain a confidence level near 95 percent for underlying skewness as low as -1.0.  For higher skewness, the procedure will produce significantly greater coverage.”
      Since an awful lot of people use the Mil Handbook I guess that the answer to your question is that a lot of people use skewness, or at least have to take it into account. 
      I can think of a few instances where I had to worry about kurtosis but these would only constitute personal use as opposed to general use of the property.

    0
    #80862

    Robert Butler
    Participant

     A number of everyday examples come to mind.
    1. If there was only a 99% chance of safely landing an airplane every time it landed – every airport in the country would be littered with wreckage.
    2. If there was only a 99% chance of safely crossing the street every time you stepped off of the sidewalk –  the most common cause of death would be death by crossing the street.
    3. If the postal service only had a 99% success rate in mail delivery there would be mountains of mis-delivered mail piled on street corners.
    4. If the local drug store only had a 99% success rate in correctly filling prescriptions there probably wouldn’t be any such thing as a drug store.
    5. If there was only a 99% chance of not getting a fatal shock from a light switch each time you turned on the lights – candles would be a very popular means of illumination and anyone who actually used light switches would be viewed as a reckless daredevil.
      The point here is, if you are looking for examples of anything where 99% isn’t good enough, just look for things that are done again and again on a daily basis and I’m sure you will be able to work out any number of examples like those listed above.

    0
    #80827

    Robert Butler
    Participant

      Since we seem to be pointing Mr. Osuna in the direction of a Box-Meyer approach and since it was mentioned that Minitab is set up to do this analysis, I sat down with a copy of Minitab (version 13.31) and tried to find the Box-Meyer method.  I have had no success locating it.  A check of the manuals that I have also fails to mention this analytical tool.
      Stan, since I understood your first post to mean that Box-Meyer is on Minitab, could you give all of us a tutorial on how to find it?  (If I am in error here please accept my apologies in advance). In addition to helping Mr. Osuna, I would like to know where this is in order to have an automated method to replace my built-it-myself version that I have over on another package.
     I’d also like to offer the following thoughts on the sidebar discussion in this thread concerning regression:
      1. The issues surrounding normality do not apply to the independent or dependent variables in a regression.  Normality is an issue with the residuals and then it is only an issue when you are attempting to use the residual mean squared error term to identify significant effects.  The Second Edition of Applied Regression Analysis by Draper and Smith pp.22 section 1.4 has the details.
     2. The suggestion to use regression methods to examine historical data is reasonable but there are a number of caveats concerning historical data that one should never forget:
       a. Variables that are known to be important to your process will probably not show up in a regression analysis.  This is because in historical data, important variables are so carefully controlled that they are never given a chance to impact the process. 
       b. Variables that were not part of your control process will have been allowed to change as they see fit.  Consequently, you will most likely have massive confounding of uncontrolled variables.  Regression packages will not know this and they will give you regression equations loaded with variables that are not independent of one another.  In order to check the nature of the confounding of the "independent" variables in historical data you will have to run an eigenvector analysis on your data set and check for such things as VIF's, eigenvalues, and condition indices.
     c. The confounding mentioned in b. also means that some of the uncontrolled variables will mask other variables that were not only uncontrolled but completely unknown.
      The above is not meant to suggest that you ignore historical data.  Rather it is offered to emphasize problems with historical data and to help you avoid the all too prevalent problem with such data and its analysis which can best be summed up as "Garbage in – Gospel out."
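      For what it's worth, a quick sketch of the kind of confounding check mentioned in b. - VIF's plus the condition number of the scaled X matrix - might look like the following.  I'm assuming Python with statsmodels here, and 'X' stands in for whatever data frame of candidate independent variables you have:

    import numpy as np
    import pandas as pd
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    def collinearity_report(X):
        vifs = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
        print(pd.Series(vifs, index=X.columns))   # VIF's much above 10 are a warning flag
        Xs = (X - X.mean()) / X.std()             # scale before looking at the eigenstructure
        print(np.linalg.cond(Xs.values))          # a large condition number also flags near-dependence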

    0
    #80687

    Robert Butler
    Participant

    I can only second SoCalBB’s comments.  Deming’s comment on this kind of a situation bears repeating-Be careful, the woods are full of HACKS!  …and he always put a lot of emphasis on that last word. 
      I’d like to ask you if they actually said, "the 4 week session contains much useless statistical training on theory, the computations, statistical history, etc when everyone uses Minitab anyway."  If they did I’ll make it a point to add this one to my hall of shame collection.

    0
    #80608

    Robert Butler
    Participant

     The requirement for “reasonably normal” data is driven by the construct of the tests.  Both the t-test and ANOVA assume that the data that you are testing consists of independent samples from normal populations.  This is the reason that there is such a focus on understanding the underlying distribution of the data before continuing with an investigation.  If you wish to use either of the above tests and your data proves to be significantly non-normal you will either have to investigate ways to transform the data so that it is “reasonably normal” and/or you will have to study tests that are equivalent to the t-test and ANOVA for non-normal data.
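      If you would like to see the mechanics, here is a minimal sketch of that decision for the two sample case (Python with scipy assumed; 'a' and 'b' are hypothetical samples supplied by you):

    from scipy import stats

    def compare_two_groups(a, b, alpha=0.05):
        normal = stats.shapiro(a)[1] > alpha and stats.shapiro(b)[1] > alpha
        if normal:
            return stats.ttest_ind(a, b)     # data look reasonably normal: ordinary two sample t-test
        return stats.mannwhitneyu(a, b)      # otherwise fall back on the Mann-Whitney test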

    0
    #80541

    Robert Butler
    Participant

    In order to use DOE to investigate the effect of variable change on process variability (standard deviation) you will need to have an estimate of variation at each design point.  This means that you will have to repeat the entire design at least once (in order to have a two point estimate of variance at each design point).  The amount of experimentation involved in such an effort can easily get out of hand.  For example, for 4 factors, a full factorial 2 level design run at least twice would mean a minimum of 32 experimental runs. 
      Two point variance estimates mean that you are not going to be able to chase subtle differences in standard deviation. Consequently, you should view first efforts in this investigation as a screen for main effects.  This would recommend the use of fractionated designs which, in turn, would allow more replication (and hence better estimates of variance at each point).  With 4 variables, a half fraction would be 8 points.  You could do 3 complete runs of the design and thus have a 3 point estimate of variation at each design point for a total of 24 runs (as opposed to 32 runs for a two level full factorial giving only a two point estimate of variation at each design point). 
       Since process variation is rarely uniform across a design space (it is because of this that you try to run as many different replicates of as many different points as you can when using a DOE to examine variable effects on means) the significant shifts in variation that you will detect will highlight variables impacting the variation above and beyond the background non-uniformity of the design space.
      I’ve used the above approach several times and have found it quite useful for identifying variables impacting the standard deviation.  Of course, a DOE run in this fashion will permit an investigation of the variables impacting the mean as well.
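      In case it's useful, one common way to run the numbers on such a replicated design is to compute the standard deviation at each design point and regress its log on the factor settings.  A rough sketch (Python with pandas/statsmodels assumed; the design, the replicate count, and the response are all fabricated for illustration):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from itertools import product

    rng = np.random.default_rng(3)
    pts = pd.DataFrame(list(product([-1, 1], repeat=4)), columns=list("ABCD"))
    runs = pd.concat([pts] * 3, ignore_index=True)                   # three replicates of each point
    runs["y"] = 50 + rng.normal(0, np.where(runs["A"] == 1, 3, 1))   # factor A inflates the spread here

    s = runs.groupby(list("ABCD"))["y"].std().reset_index(name="s")
    fit = sm.OLS(np.log(s["s"]), sm.add_constant(s[list("ABCD")])).fit()
    print(fit.params, fit.pvalues)    # look for factors whose coefficients stand out (A in this fake data)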

    0
    #80435

    Robert Butler
    Participant

    To compute any defect level you wish do the following:
    1. Find a statistics book with the table for the cumulative standardized normal distribution function.  This table will be in the appendix.
    2. Subtract 1.5 from your Sigma Value  – the number that you have is called the Z score.
    3. Using the Z score, look up the associated probability in the table listed in #1.
    4. Multiply this probability by 1,000,000.
    5. Subtract the result obtained in 4 from 1,000,000. This value is your DPMO.
      Examples:
        Sigma Value = 5
        5 - 1.5 = 3.5 = Z score
        From the normal table a Z score of 3.5 has a probability of .9997674.
        .9997674 x 1,000,000 = 999,767.4
        1,000,000 - 999,767.4 = 232.6 DPMO

        Sigma Value = -1.2
        -1.2 - 1.5 = -2.7 = Z score
        From the normal table a Z score of -2.7 has a probability of .003467.
        .003467 x 1,000,000 = 3,467
        1,000,000 - 3,467 = 996,533, therefore your process has 996,533 DPMO
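      For anyone who would rather skip the table lookup, the same arithmetic is a few lines in any package with a normal CDF.  A Python/scipy version (my assumption - use whatever you have):

    from scipy.stats import norm

    def dpmo(sigma_value, shift=1.5):
        z = sigma_value - shift                  # step 2 above
        return 1_000_000 * (1 - norm.cdf(z))     # steps 3 through 5 rolled together

    print(dpmo(5))       # about 233 DPMO
    print(dpmo(-1.2))    # about 996,533 DPMO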
     

    0
    #80408

    Robert Butler
    Participant

    Adam,
      If you are referring to the Davis Bothe’s article in Quality Engineering – Statistical Reason for the 1.5 Sigma Shift, I think that he has provided the answer to your question.  At the beginning of the article he states that “Six Sigma advocates…offer only personal experiences and three dated empirical studies (which he cites) as justification.”  He also points out that these three studies are 25 and 50 years old.  In short, if you are looking for documented and published cases in support of 1.5 his article strongly suggests that you will look in vain.
      The real value of the Bothe article is that he defines the dynamics of a process that could give rise to such a change.  His examination of the dynamics of control chart sensitivity and use is reasonable and does make a case for process drift.  He identifies a range of drift from 1.3 to 1.7 and he also identifies the assumptions that have been made for drift to occur.  In his concluding paragraph he identifies the limitations of the initial assumptions.
      His concluding comments go to the heart of a number of the posts to this thread.  Specifically, if the dynamics of your process do not match the dynamics of the process he defined then your drift may or may not be greater or less than the 1.5 drift that is advocated by the Six Sigma community.

    0
    #80382

    Robert Butler
    Participant

    From the RICOH website we have the following:
     ” In 1951, The Japanese Union of Scientists (JUSE) developed the Deming Award for quality control based on his work.  The competition for the Medal is severe.  Judging is strict. Companies spend years in preparation before even becoming eligible for consideration.”
      The criteria for examination includes corporate policy, quality systems, education and training, results, and future plans.

    0
    #80088

    Robert Butler
    Participant

    Mike,
      I’m left with the impression that we are talking past each other.  Billybob indicated that his experience with M&M’s in training exercises dealt with defects-I’m assuming defects/bag.  All I said was that the only experiments involving M&M’s of which I was aware focused on helping people understand the concepts of distribution, variation, standard deviation, etc.
      The reference to 3rd and 4th grade students was not meant to denigrate, merely to emphasize the fact that by using something such as the M&M experiment you could make these concepts understandable to almost anyone.
      What I do find interesting is that apparently you and Billybob are aware of yet another experiment involving M&M’s that focuses on defects and introduces the concept of attribute gage R&R.  If this has been written up could you provide a citation?  If it hasn’t could you tell me how it is done?
      By way of reciprocity, if you are interested in fine tuning your teaching techniques with respect to using M&M’s, check the past issues of the American Statistical Association’s STN (Statistics Teachers Network) newsletter.  There is a four part article “Some Students and a t-test or Two” that may give you some ideas for using M&M’s for the purposes mentioned in the first paragraph.

    0
    #80059

    Robert Butler
    Participant

      If your post is in earnest and you aren’t just having some Monday fun I don’t think the object of using M&M’s in statistics training is to emphasize defect detection.  All of the exercises that I have taught use M&M’s to illustrate the concepts of distribution and variation.  Properly done, the use of M&M’s can really help people understand the issues that arise when dealing with data.
      As I’ve mentioned on other threads, the M&M experiment is so clear and concise that you can use it to teach 3rd and 4th grade students advanced statistical concepts.  So the next time you look at a Plain M&M packet don’t think “defect” think “distribution”.

    0
    #79918

    Robert Butler
    Participant

      The purpose of any replication in a design is to develop an estimate of the error.  The purpose of a centerpoint in an otherwise two level design is to permit the detection of curvilinear behavior.  Replication of the centerpoint allows the investigator to address both of these questions at once. 
      If you are severely constrained with respect to resources and time, a simple replication of the center point ( i.e. the center point is run twice) will permit an investigation of curvilinear behavior (via residual analysis) and an estimate of error.  If you wish to improve your estimate of error you may increase the number of replicates run on the center, however, you need to understand the assumption that you are making if you take this course of action. 
      Concentrating all of your replicates at the center means that you are assuming that the variance across the experimental space is constant.  If it isn’t constant, the estimate of error that you will get from running only multiple replicates on the center will probably be an underestimate of the variance of the space and you will wind up declaring effects significant when they really are not.
      Thus, the choice of the number of replicates of the centerpoint is driven by the aims of your experimental effort and there is really no such thing as the right number of centerpoint replicates.  If you are looking to conserve resources and time then one or more replicates on only the centerpoint makes sense.  If you have a little more latitude with respect to your investigation, centerpoint replication complemented with replication of other points in the design space will provide you with the same check of curvilinear behavior and a more realistic estimate of experimental error.
     

    0
    #79814

    Robert Butler
    Participant

      Plackett-Burman designs, like highly fractionated 2 level factorial designs, are for screening a large number of factors in order to identify the “big hitters” – the main effect variables that have the biggest effect on your process.
      Fractionated 2 level designs are always 2 to some power. Plackett-Burman designs are based on multiples of 4.  Thus, 2 level designs increase as 4, 8, 16, 32, 64, 128, while Plackett-Burman designs go as 4, 8, 12, 16, 20, etc.  This means that if you are interested in checking out, say, 11 factors the minimum traditional 2 level design that you could build would have 16 experiments whereas the Plackett-Burman would have only 12.
      If you have adequate statistical software to permit a proper analysis of the design the analysis and interpretation of Plackett-Burman results should be no more difficult than when analyzing any other kind of screening design.
      Because it is a main effects design you will have no information concerning interactions or curvilinear behavior. Consequently, the resulting models should be used as a guide for further work and not as a final model for purposes of process control. 
      If all of the variables of interest are quantitative and the design space is such that you can have three distinct levels for each variable then one way to maximize your results and minimize your efforts would be to set up the P-B design and add a center point and run the replicate on the center point alone.  The residual analysis of such a combination will give you a clear indication of the presence of curvilinear effects in your process.  You won’t know which variable (or variables) is responsible but you will know that the effect is present and must be identified before you can claim to have an adequate model of your process.
      Before you decide to go ahead with a P-B design I can only echo what others have posted to this thread-don’t just pick a design and run it.  First consider what you want to do and then let your wants guide you in your design choice.  If your wants are such that none of the “look-up” designs meet your needs, you will need to seek the guidance of your local statistician.
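      If you would like to see what the 12 run layout for 11 factors actually looks like, it can be generated from the published Plackett-Burman generator row by cyclic shifts plus a final row of minus ones.  A sketch in Python/numpy (my own choice of tool; most DOE packages will hand you the same matrix directly):

    import numpy as np

    g = np.array([+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1])   # published generator row for N = 12
    rows = [np.roll(g, k) for k in range(11)]                    # eleven cyclic shifts of the generator
    design = np.vstack(rows + [-np.ones(11, dtype=int)])         # plus the all-minus twelfth run
    print(design.T @ design)    # 12 on the diagonal and 0 elsewhere confirms the columns are orthogonal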

    0
    #79669

    Robert Butler
    Participant

    In his post Mike Carnell re-emphasizes a key point of all statistical analysis-namely that one should always choose the statistical methods of inquiry before running the analysis-not after.  If you don’t do this then, as he points out, it is a very easy matter to cast around and find some statistical tool that will support whatever position it is that you wish to defend.
      The well known statistician, Stuart Hunter, (of Box, Hunter, and Hunter fame) refers to this kind of abuse of statistics as P.A.R.C. analysis  – Planning After Research Completed….and he points out that what you have when you analyze your data in this manner is exactly what you get when you spell P.A.R.C. backwards.

    0
    #79641

    Robert Butler
    Participant

    A P value is just the value of alpha at which a decision regarding the null hypothesis would just be on the borderline between acceptance and rejection.  It is a common practice to set alpha = .05 or .01, however, there is nothing sacred about either of these numbers.  In many fields an alpha level of .05 is unreasonable. 
      A P value of .05 means that you can reject the null hypothesis at a level of significance of .05 and a P value of .15 means that you can reject the null hypothesis at a level of significance of .15.  In other words, at .05 there is a 5% chance of seeing a difference this large when no real difference exists, and at .15 that chance is 15%.
     So, to answer your question, yes, you can declare a statistically significant difference at a P of .15.  The question that you need to address is whether you are willing to live with the possibility that there is roughly a one-in-seven (actually one in 6.6667) chance that you are wrong, as opposed to a one-in-20 chance of being wrong if you go with a P = .05.

    0
    #79593

    Robert Butler
    Participant

      You’ve asked a very good question.  The empirical evidence rests on three old studies –
    Bender, "Statistical Tolerancing as It Relates to Quality Control and the Designer," Automotive Division Newsletter of ASQC, 1975;
    Evans, "Statistical Tolerancing: The State of the Art, Part III, Shifts and Drifts," JQT, 1975, 7(2), 72-76;
    Gilson, New Approach to Engineering Tolerances, Machinery Publishing Co., London, 1951.
      In a past issue of Quality Engineering, 14(3), pp. 479-487, Davis Bothe pulled these together with some additional research that he had done in the article ‘Statistical Reason for the 1.5 Shift’.
      At the very end of his article Bothe points out that the 1.5 shift is reasonable but that it is based on the assumption of a stable process variance.  If the variance is not stable then the 1.5 shift may or may not occur and may or may not be larger or smaller.  He also points out that the above assumes normality and that the situation may be different (or maybe not) for non-normal responses.  I think the Bothe article is the best article on the subject that I have read.  His point concerning variance stability is one that has bothered me in the past.  In particular, given that you have a process that is in control and that you are making adjustments to keep it in control, the random walk nature of such a process coupled with both a drift in the mean and changes in variance could just as likely result in a drift in the process that resulted in improvement over time.
      Thus, this rather lengthy reply to your question can be summarized by recommending that you read the Bothe article not only to understand the justification for the 1.5 shift but to also know the assumptions that have been made concerning its validity.

    0
    #79556

    Robert Butler
    Participant

    Abasu has brought up a point that has been bothering me.  In her first posting Georgette stated that “the best metric is standard deviation”.  After some postings from others, myself included, she responded with a post which included the following: “as we take 30 readings accross the part.  I could average the reading at spot #1”.  These two comments lead me to believe that what she might be concerned about is part-to-part surface variation.
      If this is the case, then examining the parts by grouping measurements made at “spot #1” across all parts or subgroup samples across parts would be a mistake since such a grouping would address variation at spot#1 (and spot#2 etc.) as opposed to the variation of the surface as a whole. Similarly, you wouldn’t want to take the average of 30 readings within a part since this would change the focus of the investigation from that of surface variation to variation of mean surface measurements.

    0
    #79505

    Robert Butler
    Participant

     You are correct with respect to your computations.  A Z score of 4.5, as you noted, does correspond to 3.4ppm.  The Six Sigma claim of 3.4ppm at a Z score of 6 is due to the simple fact of adding 1.5 to the Z score that you would get from a cumulative normal table. 
      The justification for the addition of this 1.5 factor rests on three studies that were done 25 and 50 years ago.  Probably the best (and most recent) discussion of the reasons for this addition can be found in Quality Engineering 14(3), pp. 479-487, in an article by Davis Bothe, "Statistical Reason for the 1.5 Sigma Shift."  It is an excellent and very readable article.  It is the sort of paper that should be committed to memory by anyone claiming any expertise in the world of Six Sigma quality.

    0
    #79455

    Robert Butler
    Participant

    I’d appreciate it if you would give a technical definition of “granular data”.  I am unfamiliar with this term.  I did do a search on the term and it appears to be from data mining.  Unfortunately, while there were numerous cites and sites employing the term, no definitions were forthcoming.  There was enough discussion about various issues on some of the web sites to lead me to believe that perhaps granular data is nothing more than individual data points but I would want to be sure of this before offering anything.
     The lack of response to your question suggests that I'm not the only one unfamiliar with the term.  With a proper definition perhaps I, or someone else, may be able to offer some advice.

    0
    #79398

    Robert Butler
    Participant

    As you describe your problem the question is not one of dependence or independence; rather, it is a comparison of some average response (measurement) made on two populations which may differ from one another because of the addition of a solution to one and not the other. 
      There are two possibilities here.  If you measured each sample prior to the addition of the solution, measured the samples again after the addition, and kept track of the measurements on a sample-by-sample basis, then you can use the t-test to make a paired comparison.  If, on the other hand, you did not keep track of the measurements on a sample-by-sample basis but just simply recorded the before and after results then you will just use a two sample t-test to run your comparison. 
      Since the t-test assumes that your data is from a normal population and since it will usually assume equality of sample population variance (unless you have a program that allows/demands a decision concerning sample population variances) you will have to check to make sure that the sample variances are equivalent and you will also have to check the assumption of normality.  If the sample variances are not equivalent you will have to run a t-test with unequal variance. If the data is not from a normal distribution you will have to use another test such as the Mann-Whitney.
      You mentioned that you are going to check this difference at a specified temperature.  You need to understand that any conclusions that you draw from your analysis will only apply to the temperature that you chose.  For a different temperature you may get very different results.  If temperature is going to vary you may want to re-think your approach and consider revamping your investigation so that you could run a two way ANOVA in order to investigate both temperature and solution addition.
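      If it helps, the whole decision tree above fits in a handful of lines (Python with scipy assumed; the before/after numbers below are fabricated stand-ins for your measurements):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    before = rng.normal(100, 4, 25)             # measurements before the solution addition
    after = before + rng.normal(1.5, 2, 25)     # and after, tracked sample by sample

    print(stats.ttest_rel(before, after))                    # paired comparison (sample-by-sample tracking)
    print(stats.levene(before, after))                       # check the equal variance assumption
    print(stats.ttest_ind(before, after, equal_var=False))   # unpaired comparison if tracking was lost
    print(stats.mannwhitneyu(before, after))                 # nonparametric fallback if normality fails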

    0
    #79375

    Robert Butler
    Participant

      JB, that’s correct.  You will have two response columns, one for the pull strength and a column for yes/no responses.  If you can only run the design once you can run a regression against just the -1/1 or 0/1 responses but you should use the results of the regression only as a guide to give you some sense of the variables impacting burn and not as a final definitive predictive equation.  If you can replicate the entire design several times you should be able to convert the yes/no responses into probabilities (i.e. say for the first experiment, which was run 5 times, you have as responses yes, yes, yes, no, no; then you can use this to convert these responses into the probability of occurrence of a yes).
      There are a number of caveats concerning either approach which are too lengthy to go into in a single post.  If you want to read more on the subject I’d recommend Analysis of Binary Data by Cox and Snell, sections 1.1 to 1.4 of the first chapter.
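      If you do end up with replicates, one way to handle the converted yes/no counts is a binomial (logistic) regression of the number of 'yes' results at each design point on the coded factors.  A sketch with made up counts (Python with statsmodels assumed; the design, the counts, and the column names are all hypothetical):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    d = pd.DataFrame({"x1":   [-1,  1, -1,  1],
                      "x2":   [-1, -1,  1,  1],
                      "burn": [ 1,  4,  1,  3],    # number of 'yes' results out of 5 replicates
                      "n":    [ 5,  5,  5,  5]})

    y = np.column_stack([d["burn"], d["n"] - d["burn"]])          # successes and failures per run
    fit = sm.GLM(y, sm.add_constant(d[["x1", "x2"]]), family=sm.families.Binomial()).fit()
    print(fit.params)     # coefficients are on the log-odds scale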

    0
    #79364

    Robert Butler
    Participant

    Your description of the problem suggests that there are at least two measured responses: occurrence of burn – (this either being a Yes, No response or, if there is some continuous measure of degree of burn, whatever that measure may be) and pull strength of the sonic weld.
      Your design would focus on those variables that you have good reason to believe will impact weld strength and burn occurrence. After the design is run you would build separate models for burn occurrence and for weld pull strength. You can then use these models to hopefully identify a region of minimum burn occurrence and maximum weld strength.

    0
    #79303

    Robert Butler
    Participant

      Wendy asked about the M&M experiment.  There may be more than one but the version which I have used numerous times consists of the following:
    1. Before the talk go out and purchase an entire BOX of M&M plain.  Don't get the party size, get the regular, at-the-check-out-counter size.  If you ask the store clerk or manager many times they will have an unopened box of these in the back.  If you have to purchase an opened box make sure that all of the packets are from the same lot (look at the lot markings on the back side of each packet).
    2. Have ready an overhead slide with a grid.  Label the X axis "Red M&M's" and the Y axis "Number of Packets".
    3. Make sure that everyone has a partner and then give out a packet of M&M's to each participant.
    4. Tell the groups of two to open their packets, one at a time, and count the total number of M&M's and the total number of RED M&M's and write the count down on a piece of paper.  Emphasize that after the first count, the second member of the team must verify the total and RED M&M count.
    5. Start around the room and ask each person for their RED M&M count.  When they give it to you, plot the results on the overhead slide.
    6. If you have about 20-30 participants and thus 20-30 opened bags, you will have a histogram that should look pretty normal.
    7. Pull out a couple of unopened bags from the same material lot and ask everyone to give you an estimate of the "most probable number of RED M&M's in the mystery bags".
    8. People will shout out numbers-write them down so that everyone can see them.  Some will give ranges (which is what you want to emphasize) and some will just pick the mode of the distribution.
    9. Tear open the mystery bags and plot their results, using a different pen color, on your histogram.
    10. You can then go back to the estimated ranges and see if the "final product" was in or out of the suggested product range.
      This exercise will provide a springboard for introducing most of the statistical issues of Six Sigma.  All of the concepts are present – mean, median, mode, standard deviation, customer spec limits, etc.
      From time to time I've had enough time to prepare for this in advance and I've used the experiment twice.  The first time I make sure that I've purchased the box of M&M's at least 6 months before I was going to give the talk.  I then run this experiment at the beginning of the discussion with the original group of M&M's.  I keep the first histogram and then, when we are at the end of the talk, I have ready a second box of M&M's which I had purchased just hours before.  We repeat the experiment with the second lot.  Many times the six month interval has seen economic changes that have significantly impacted the total number of M&M's as well as the total number of RED M&M's and, in addition to talking about process mean shift, I have a ready made example of the proper use of the t-test.  If a significant shift has not occurred, I use the two histograms to discuss process stability and again, I can show the value of the t-test in comparison of means.  In the cases where I have used two different lots of material the experiments are run at the beginning and at the end of a series of training periods.  You don't have to have a series of talks to test both lots of M&M's but if you are going to do both lots in the same presentation you will want to have a supply of large plastic coffee cups so that people will have someplace to put all of those M&M's.
    If you take some time to plan your presentation you will be surprised at how this helps everyone understand basic statistical concepts.  I've had great success using M&M's to teach 3rd and 4th grade students about statistics and the t-test.

    0
    #79260

    Robert Butler
    Participant

    If we go back to Mr. Deveix’s original question the issue is not about the standard deviation of a process; it is about the sigma value that he is getting when he puts a given defect rate in his sigma calculator.  If you have the table generated by Motorola et al with the headings %Yield, Sigma Value, and DPMO and if you also have a statistics book that has the table for the cumulative normal you can set these side by side and see that the Sigma Value (my Motorola table is the one that does NOT have the 1.5 correction added to the Sigma Value) is the Z statistic and the %Yield divided by 100 is the corresponding area under the normal curve.  If you go back to your sigma calculator in whatever program you are using (mine is on Statistica and it does add 1.5 to the Sigma Value) and plug in values such as 999,999.999 for your DPMO you will get a Sigma Value of -4.49 or -5.99 depending on how your particular calculator uses 1.5 when computing Sigma Values.  These Sigma Values just mean that the vast majority of your product is completely outside the customer spec limits.  It says nothing about the standard deviation of your process.  Thus, as Mr. Deveix has discovered, the calculator for Sigma Values will indeed return a negative Sigma Value which is what it should do when the defect rate gets high enough.
      The real problem here is one of labeling.  Someone made a very poor choice when it came to renaming the Z score.  Whoever they were they at least chose to call the Z score the Sigma Value and not Sigma. Unfortunately, the same cannot be said of any number of articles and computer programs.  I know that Statistica calls their program the Sigma Calculator and the final result Sigma.  The problems that can result from confusing Sigma Value (which can have negative values) with process standard deviation (which cannot have negative values) are too numerous to mention.

    0
    #79229

    Robert Butler
    Participant

    The problem in this discussion is that sigma has multiple meanings.  As was noted, you can’t have a negative sigma when sigma refers to the standard deviation of a process.  For the six sigma calculator the sigma value that is produced is (assuming a normal process) just the Z score.  Thus a Z score (Sigma Value) of 4.5 corresponds to .9999966 or 99.99966%.  Consequently, your “negative sigma” value is a Z score indicating that your process has something less than one half of its output in spec. 
      In this context, a Sigma Value (Z score) of 0 would correspond to .5 which would translate into 50% of your product in spec and a Sigma Value of say -.2 would indicate that only 42.07% of your product met spec.

    0
    #79153

    Robert Butler
    Participant

      I don’t know of any plotting routines that would do this.  However, I would like to offer the following thoughts concerning such an effort. 
      Giving people a sense of where they are at some particular moment is laudable, however, just knowing where you are and not knowing where you have been or where you might be going is the woodsman’s definition of being lost.  As you described it, this is exactly what your speedometer will do-location without any sense of direction. 
      A much better approach would be a time plot (not a control chart-just a time sequence plot) of the measurements over time.  The chart can have colored bands corresponding to excellent, good, indifferent, poor, and unacceptable regions.  By providing a snapshot of the process for say 3 months you will have all of the visual interest of a speedometer along with the visual value of a map.
      If the measurements are too frequent so that a point-by-point plot will resemble nothing more than a busy squiggle, try grouping the data on some convenient basis such as hour, day, week, etc. and represent it as a series of time sequenced box-plots. 
      By putting the current measurements in this context you will have a much more meaningful graphic which will help promote independent thinking about your process. 
      If you wish to consider meaningful ways of representing your process you might want to read the three books by Tufte – The Visual Display of Quantitative Information, Visual Explanations, and Envisioning Information.  They are very readable and all three are packed with example after visual example of ways to present your data.

    0
    #78917

    Robert Butler
    Participant

      While 15 variables may be a result of a poor effort in the measure phase of a project it is also quite possible that 15 is the minimum number. 
      If you have no knowledge of statistics and the power of designed experiments you will find it very difficult to deal with much more than 4-5 variables.  Under such circumstances it can become an article of faith that there can’t be more than 4-5 critical variables in a process.  When the no-more-than-five-variables barrier is removed all sorts of interesting things come out of the woodwork and people start to tell you what they REALLY think is impacting the process. Having done this sort of thing for 20+ years I have found that even after doing all of the things that are lumped under Measure in DMAIC it is not uncommon to have a final variable list in the 10-20 range.
      The benefits of a design that investigates the top however many (even if some of them are questionable) far outweighs any other concerns.  A screen that checks them will not only aid in an identification of the really important variables it will also lay to rest a lot of myth, misunderstanding and disagreement about the process itself.

    0
    #78873

    Robert Butler
    Participant

     Replication does not necessarily mean that you have to replicate an entire design.  When you are interested in screening for variables that have a major impact on your process and/or when the experiments are too costly or time consuming to run, you can opt for running a saturated design with replication restricted to only one or two of the points in the design.
      Based on my understanding of your problem it would appear that a way to address the issue of screening 15 variables would be for you to generate the usual 2 level full factorial design for 4 variables and then assign the other 11 variables to the second, third, and 4th level interactions.  If the variables are such that there is a natural center point for each one then I would recommend adding a center point to this design and then replicate just the center point. This would give a total of 18 runs.  If there isn’t a natural center point (a type variable instead of a continuous variable, for example) then toss the numbers 1-16 in a hat and draw the number of the experiment that you will replicate.  If you can afford to replicate more than one experimental condition then do so. 
      It is understood that this approach will not give you the degree of precision with respect to an estimate of your error as would a complete replication of the entire design but it is statistically sound.  Indeed, replication of just the center point for an estimate of error is standard practice with full or fractional composite designs.
      This approach will  permit a check of the 15 variables of interest while keeping the cost of the experimental effort within your budget.
      If you want more information on nonreplicated experiments I would recommend looking at Analysis of Messy Data-Vol 2 – Nonreplicated Experiments by Milliken and Johnson
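      To make the column assignment concrete, here is a sketch of the 16 run saturated layout described above - a 2^4 full factorial in the first four factors with the other eleven factors assigned to its interaction columns.  Python/numpy is my assumption; any DOE package can produce the equivalent:

    import numpy as np
    from itertools import product, combinations

    base = np.array(list(product([-1, 1], repeat=4)))      # the 16 runs in factors 1 through 4
    extra = [np.prod(base[:, list(c)], axis=1)             # products of 2, 3, and 4 of the base columns
             for r in (2, 3, 4) for c in combinations(range(4), r)]
    design = np.column_stack([base] + extra)               # 16 runs by 15 coded factor columns
    print(design.shape)         # (16, 15)
    print(design.T @ design)    # 16 on the diagonal, 0 elsewhere: all 15 columns are orthogonal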

    0
    #78853

    Robert Butler
    Participant

      In addition to following Hermanth’s suggestion for tracking individuals I would also recommend that for the two month test period you track and plot the arrival times and quantity per arriving batch.  If you just track daily time you are assuming that there is no difference between a batch of one and a batch of ten and you are assuming that something arriving at 8:30 A.M. will have the same chance for initial touch time as something arriving five minutes before lunch break or fifteen minutes before quitting time.  Granted that gathering this additional data may be cumbersome and/or inconvenient but if you make it clear to everyone that this data gathering effort has a reason and has a definite “drop dead” date you will probably have little difficulty gathering it.  The combined data will permit a check for trending in waiting time as a function of arrival time and batch size. 
      If the check reveals no connection well and good.  If it does, it will help guide you in making appropriate changes to your plan.

    0
    #78763

    Robert Butler
    Participant

      Based on your posts it sounds as if your primary interest is in having some form of graphical representation of your data to give you some idea of where you have been and a sense of where you might be going rather than trying to control anything in particular.  Given that you are taking your data in monthly blocks, a much better way to view the results of your process would be to run monthly boxplots and plot them over time. 
       Your posts lead me to understand that your data may not be normal and that the issue is really waste per customer order as opposed to average waste for some meaningful groupings of customer types.  In this instance, monthly boxplots make even more sense.  If you want to have some “control limits” for the boxplots take your prior data, plot it on normal probability paper and identify the .135% and 99.865% points and use these as your +-3 standard deviations limits to visually assess how your process is doing over time. 

    0
    #78728

    Robert Butler
    Participant

    The question that you raised concerning problems with non-normal data has come up in other posts to this forum.  Many of the replies to questions such as yours are helpful and offer good advice.  There are a few points concerning some of the advice that need some clarification.
    Box-Cox transform- This transform has been recommended by many and it is indeed a very useful transform.  It allows the investigator to check a huge family of power transforms of the data.  This family includes 1/y, square root of y, and log y.  If you have a package that will permit you to use this transform and if the results of your analysis indicates that no transform is necessary this does not necessarily mean that your data is normal, it only means that a power transform will not be of value to you.  There are, of course other options such as the Johnson transform, Weibull estimation etc.  However, if the issue is one of process capability then it is possible to assess the capability without having to worry about the actual distribution.
      If you take your data and plot it on normal probability paper and identify the .135 and 99.865 percentile values (Z = +-3) then the difference between these two values is the span containing the middle 99.73% of the process output.  This is the equivalent 6 sigma spread, and the capability goal is to have this spread no greater than .75 x Tolerance (a Cp of 1.33 or better).  If you have software that permits you to fit Johnson distributions it will find these values for you, but if you don't, the above will permit you to do it by hand.
      If you would like a further reference try Measuring Process Capability by Bothe.  Chapter 8 is titled "Measuring Capability for Non-Normal Variable Data."
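      If you have the raw data in electronic form you can also pull those two percentiles directly rather than reading them off probability paper.  A quick sketch (Python/numpy assumed; the skewed data below are fabricated):

    import numpy as np

    x = np.random.default_rng(11).lognormal(mean=1.0, sigma=0.4, size=5000)   # skewed stand-in data
    p_low, p_high = np.percentile(x, [0.135, 99.865])
    spread = p_high - p_low            # span containing the middle 99.73% of the output
    print(p_low, p_high, spread)       # compare this spread to .75 x Tolerance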

    0
    #78613

    Robert Butler
    Participant

    RC,
      What Vijay is asking for is your design matrix with some indication as to which experiments gave a result and which did not.  Since I wrote out the complete 2^3 matrix in my second post just indicate which of the eight experiments actually gave you some kind of measurable result.  For example, if 1,2,4,6,7 had measurable results just list them in this order.  Armed with that information we can tell you what you can and cannot estimate with the results from your first experimental effort.

    0
    #78608

    Robert Butler
    Participant

      I wouldn’t recommend substituting 0 for missing values in a design.  If you do this you will drastically alter the model resulting from the analysis.  To check this take the simple 2^3 design below and just assign as a response the numbers 1-8  and build the model. Next set the values 2 and 6 to missing. Your regression diagnostics on the reduced set will show that you cannot estimate the effect of the v1xv2 interaction.  Put all of the other main and two way interactions back in the model expression and, using backward elimination, run the model again. Finally, substitute “0” for those points that you set to missing and run the model the third time. 
    v1  v2  v3    resp1   resp2   resp3
    -1  -1  -1      1       1       1
     1  -1  -1      2       -       0
    -1   1  -1      3       3       3
     1   1  -1      4       4       4
    -1  -1   1      5       5       5
     1  -1   1      6       -       0
    -1   1   1      7       7       7
     1   1   1      8       8       8
    For the first response v1, v2, v3, v1xv2, v1xv3, and v2xv3 are all significant when running a backward elimination with a selection of .1.  In the second case the v1xv2 interaction cannot be estimated but all others can (zero df for error of course) and the coefficients for the remaining terms are the same as those in the first case.  In the last case, all of the terms except v2 and v3 are eliminated because inserting "0" for missing has altered the "measured" responses of the experiments.

    0
    #78566

    Robert Butler
    Participant

    I think Clint has been given some excellent advice concerning the use of Xbar and R charts but I would caution that before he attempts to use these charts to track and perhaps control his process he should first make sure that the data points he extracts for purposes of plotting are independent of one another.  Given the frequency of sampling that he has described I think there is a very good chance that sequential data points are not independent.  If sequential data points exhibit significant autocorrelation the control limits extracted from such data will not reflect the true natural variability of the process.  As a result, the control limits for Xbar and R will be too narrow and Clint will find himself reacting to "significant changes" in his process which really are not significant at all.  To check for autocorrelation take a block of data and do a time series analysis using the time series module in Minitab (assuming you have that package).  If significant autocorrelation exists you can use the results of your analysis to determine the proper selection of data for use in building an Xbar and R chart.
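    For the autocorrelation check itself, a small sketch along these lines will also do the job (Python with statsmodels assumed rather than the Minitab time series module; the deliberately smoothed series below just stands in for a block of sequential data):

    import numpy as np
    from statsmodels.tsa.stattools import acf

    rng = np.random.default_rng(2)
    x = np.convolve(rng.normal(size=600), np.ones(5) / 5, mode="valid")   # deliberately autocorrelated

    r, ci = acf(x, nlags=20, alpha=0.05)
    print(np.round(r[:6], 2))   # lag values well outside the confidence band mean sequential points
                                # are not independent; space the sampled points out accordingly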

    0
    #78449

    Robert Butler
    Participant

      There are a number of things you could do.  If we assume that you have not yet run the design and that your problem is that of looking at the proposed combinations and realizing that one or more cannot be run because, as you said, a weld cannot be made, then you can do any of the following:
    1. Change the high and low limits of one or more of the parameters so that the resulting design can be run.
      Possible problem with this-The ranges become so narrow that the region of interest is not the region you wish to investigate.
    2. Choose a RANGE of low values and a RANGE of high values for each of the variables in your design.  Take ONLY those design points that cannot be run and check to see if by substituting a different low or high value in for one or more of the variables in that particular combination you can convert the design point into something that can be run.
      Possible problem with this- your design will no longer be perfectly orthogonal.  In many cases this is more of a theoretical concern than anything else-most designs when they are actually run will not be perfectly orthogonal.  In order to check for trouble make up a dummy response to the resulting design and run it and check the values for the variance inflation factors and also for any ill conditioning warnings that your particular package may issue.  If the design checks out, set it up and run it.
    3. Find a degreed statistician and have him/her build you a restricted design using one of the many optimality criteria (A,G,D etc.)  There are a number of packages on the market that will do this for you but unless you know how to check the resulting design and how to trick the package into doing what you want instead of what it thinks you want you can go wrong with great assurance.
    If we assume that you have already run the design then your question becomes one of analysis.  For those cases where the horses died just indicate missing data.  Run the full model through your statistics package.  If it is any good it will come back and tell you that one or more of the terms in your model cannot be estimated.  Drop these terms from the initial model and submit the reduced model for consideration.  Keep doing this until you get a series of terms that can be examined.
      For a 2^3 design the dead horses will probably translate into the inability to estimate one or more of the interaction terms. The power of factorial designs is that they are robust-they can really take a beating and still deliver main effect estimates.

    0
    #78443

    Robert Butler
    Participant

      Based on your description I picture a system with a scanner moving vertically across a moving web so that the actual path of the trace is that of a diagonal scan of about a minute duration.  The question that remains is how does the scanner return to its initial position - shut off and quick return, resulting in a sawtooth scan, or continuous scanning, resulting in a zig-zag pattern back and forth across the web?
      Either way, the structure of the data will result in multiple measurements down the web at the same X,Y position over time.  If your automatic data gathering records the X,Y location of the measurement you can take data and bin it across time for particular sets of X,Y coordinates.  For the time (MD) data you will have to check for autocorrelation and identify the time interval needed for data independence if you wish to run a control chart in the MD direction.  Frankly, for a first look I wouldn't even think about a control chart.  I'd just plot the data in meaningful ways and look at the results.  For example, for across web (CD) you could stratify by Y across time (we are assuming MD is the X direction) and look at boxplots of the data arranged by Y location.  Given that you scan about every minute for about an hour you could also do the CD boxplots by Y and by time, grouping the data in terms of Y location for a particular time interval.  This would give you a picture of changes in Y over longer periods of time.  Time plots of this nature across multiple paper rolls will give you a picture of your process.  Once you have these pictures you can sit down and give some serious thought as to what they suggest about the process and what you might want to do next.
      Obviously this is a lot of plotting.  However, if you tape your sequential plots in time order on a wall so you can really “see” your process you will probably find the resulting pictures to be very interesting.
      I’ve done similar things with rolls of flocked material.  The resulting picture forced a complete rethinking of the process, of what we really wanted to do with the data we had collected, and of the information that had been extracted.

    0
    #78370

    Robert Butler
    Participant

    Management thinks that DMAIC is a selection criteria for members of your team and that it stands for Doesn’t Make Any Important Contribution.
    Your champion thinks that rumor, innuendo, and hearsay are the only tools you need for the Define, Measure, and Analyze phases.

    0
    #78349

    Robert Butler
    Participant

    While I agree that Minitab is probably the best choice for general use I do think that it is a mistake to downplay the poor Minitab graphics.  We have found the Minitab graphics to be a real problem not from the "pretty picture" standpoint but from the standpoint that the inflexibility of the graphs does not permit us to do what we need to do.  For example, the simple matter of not being able to adjust the settings of the X and Y values on the plots or make them uniform from one graph to the next can be a real problem particularly when you are doing things like sequential boxplots and you need to illustrate the process over multiple graphs scaled in the same manner.
      I use SAS for my analytical work and Statistica for my graphing.  The engineers I work for use Minitab for their analysis and Statistica for graphing. I’d recommend that you get Minitab and get at least one copy of the Statistica Base product for those situations where the Minitab graphics don’t do what you want them to do.
     

    0
    #78276

    Robert Butler
    Participant

      Jan is pointing you in the right direction.  You will have to investigate the issues of stratified random sampling in order to determine your sample size.  With an expensive perfume, you will need to stratify on such things as income, disposable income spent on luxury items, age, ethnic background, religious background, etc.  If you don't stratify you will wind up with survey results that will tell you the opinions of the people you polled and nothing more.  If you attempt to extrapolate non-stratified findings to the population at large you will be asking for trouble.
      You may want to check Cochran – Sampling Techniques – Wiley for a discussion of the issues surrounding sampling methods and sample size.
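      As a rough illustration of what the sample size arithmetic can look like, here is a short Python sketch using the standard formula for estimating a proportion, a finite population correction, and proportional allocation across strata.  The strata and their sizes are invented; Cochran covers the formulas and their assumptions in detail.

    import math

    def sample_size(N, p=0.5, e=0.05, z=1.96):
        # Sample size for estimating a proportion to within +/- e,
        # with a finite population correction for a population of size N
        n0 = (z ** 2) * p * (1 - p) / (e ** 2)
        return math.ceil(n0 / (1 + (n0 - 1) / N))

    # Hypothetical strata - e.g., counts of potential customers by income band
    strata = {"high income": 20000, "middle income": 120000, "low income": 260000}
    N = sum(strata.values())
    n = sample_size(N)

    # Proportional allocation: each stratum gets its share of the total sample
    allocation = {name: round(n * size / N) for name, size in strata.items()}
    print(n, allocation)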

    0
    #78130

    Robert Butler
    Participant

      The usual focus of safety efforts in any situation is to increase safety by reducing accidents or their causes.  Several years ago I did precisely what James A recommended except instead of looking at just a single year’s worth of data I managed to locate the OSHA reports for the past 20 years.  By running a number of Pareto and time-series plots of the data I managed to identify several broad trends in accident frequency and type which had completely eluded everyone, for the simple reason that most of the safety stats that had been presented over the years were nothing more than a comparison of this year with last year. I was also able to identify the groups most at risk.  This last finding permitted a better focus of safety efforts and it also eliminated a number of meaningless practices that had been put in place.
      Central to the effort was the fact that in spite of the time involved there had been no major change in accident type over the 20 years.  The single biggest problem that I had with the analysis was convincing management that the analysis was valid.  The argument offered was that things had changed so much that I was comparing apples and oranges by looking at such a long time line.  The rebuttal to this was that I was able to show that for any multi-year period there was no significant difference in types of accidents and their frequencies.
      I’m offering this war story to highlight some of the things that you might want to try with your own data.  If you are U.S. based you may have trouble getting more than 5 years’ worth of data.  There is/was a 5-year limit on data retention and many companies will delete data as soon as 5 years have passed.  Even 5 years of data will be better than one or two, and you will have a better chance of spotting trends that simply won’t be apparent with one or two years’ worth of data.  You might also want to contrast the OSHA and non-OSHA types of accidents over time.  In my case there was pretty good agreement but you won’t know for sure unless you check.
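      If you want to see what a first cut at this sort of analysis can look like, the short Python sketch below draws a Pareto chart of accident counts by type.  The counts are invented; the point is only the mechanics of sorting the categories and overlaying the cumulative percentage.

    import matplotlib.pyplot as plt

    # Invented accident counts by type
    counts = {"Strains": 48, "Lacerations": 31, "Falls": 22, "Burns": 9, "Other": 6}
    types = sorted(counts, key=counts.get, reverse=True)
    values = [counts[t] for t in types]
    cum_pct = [100 * sum(values[:i + 1]) / sum(values) for i in range(len(values))]

    fig, ax1 = plt.subplots()
    ax1.bar(types, values)
    ax1.set_ylabel("Number of accidents")

    ax2 = ax1.twinx()                       # cumulative percentage on a second axis
    ax2.plot(types, cum_pct, marker="o", color="black")
    ax2.set_ylabel("Cumulative percent")
    ax2.set_ylim(0, 110)

    plt.title("Accidents by type (Pareto)")
    plt.tight_layout()
    plt.show()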
     

    0
    #78103

    Robert Butler
    Participant

      If you have a program that will permit graph overlays, or one that will permit you to plot multiple Y’s against a given X, then you should be able to do multivariate plotting.
      I would recommend taking each response of interest and normalizing them to their respective +-3sigma limits.  If we call the -3sigma limit Ymin and the +3sigma limit Ymax then to normalize each Y to a -1 to 1 range you would compute the following:
    y(norm) = (y-Yavg)/Ydiff
    where Yavg = (Ymax+Ymin)/2
    and
    Ydiff = (Ymax – Ymin)/2
    Overlaid plots of the averages and ranges of the normalized Y’s will give you your multivariate plots.
      The only problem I can see with graphs of this sort is that for more than 3 Y’s or so the chart will run the risk of being so busy that its utility as an analytical tool may be diminished.
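      For what it is worth, the normalization above is only a few lines of code.  The sketch below uses Python with made-up response names and simulated data; the +/-3 sigma limits are computed from the data themselves, which is an assumption on my part, so use whatever limits are appropriate for your process.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    # Two invented responses on very different scales
    responses = {"thickness": rng.normal(10, 0.5, 50),
                 "weight": rng.normal(200, 8, 50)}

    plt.figure()
    for name, y in responses.items():
        ymax = y.mean() + 3 * y.std(ddof=1)        # +3 sigma limit
        ymin = y.mean() - 3 * y.std(ddof=1)        # -3 sigma limit
        yavg = (ymax + ymin) / 2
        ydiff = (ymax - ymin) / 2
        plt.plot((y - yavg) / ydiff, label=name)   # normalized to -1..1

    plt.axhline(0, color="gray", linewidth=0.5)
    plt.xlabel("Observation order")
    plt.ylabel("Normalized response (-1 to 1)")
    plt.legend()
    plt.show()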

    0
    #78012

    Robert Butler
    Participant

    Here are some of my favorites:
     “In matters of scientific investigation the method that should be employed is think, plan, calculate, experiment and first, last, and foremost, think.  The method most often employed is wonder, guess, putter, theorize, guess again, and above all avoid calculation.”
    -A.G. Webster 1910
    “I don’t mind lying, but I can’t stand inaccuracy.” – Samuel Butler
    “It’s better to be approximately right than exactly wrong.” – Can’t remember author on this one
    The insult
    “There’s lies, damned lies, and statistics.” -Twain and others
    The response
    “Only the truly educated can be moved to tears by statistics.” G.B. Shaw
    “Ignorance won’t kill you but it will make you sweat a lot.” -African Proverb
     “We trained hard-but it seemed that every time we were beginning to form up into teams we would be reorganized. I was to learn later in life that we tend to meet any new situation by reorganizing; and a wonderful method it can be for creating the illusion of progress while producing confusion, inefficiency and demoralization.” – Petronius; 70 A.D.
      “There are no uninteresting problems. There are just disinterested people.”
        -John Bailar Jr.
      “When attacking a problem the good scientist will utilize anything that suggests itself as a weapon.” – George Kaplan
      “There is nothing more tragic than the spectacle of a beautiful theory murdered by a brutal gang of facts.” – Don’t know this author
     “Every now and then a fool must, by chance, be right.” – Don’t know this author

    0
    #77958

    Robert Butler
    Participant

      ANOM and ANOVA under many circumstances will give similar results.  The key difference between the two is that ANOM compares the sub-group mean to the grand mean of the data under examination while ANOVA compares the sub-group means with one another.  Thus the answer to the question “which one is better” is: it depends on the question you wish to answer.  If you have access to Minitab open up the help document and do a search on ANOM.  You will get a brief overview as well as a number of cites of critical papers on this subject.
      If you don’t have access to Minitab then it would appear that the following paper should help answer any other questions you might have:
    Ott – Analysis of Means-A Graphical Procedure – JQT, 15, pp.10-18, 1983
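      To make the distinction concrete, here is a small Python sketch with invented data.  The ANOVA part uses scipy’s standard one-way test; the ANOM part simply shows how far each subgroup mean sits from the grand mean using a plain normal approximation.  The exact ANOM decision limits require the critical values given in Ott’s paper, so treat the z values below as illustration only.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    groups = [rng.normal(loc, 1.0, 20) for loc in (10.0, 10.2, 10.9)]   # invented subgroups

    # ANOVA: are the subgroup means different from one another?
    f_stat, p_value = stats.f_oneway(*groups)
    print(f"ANOVA F = {f_stat:.2f}, p = {p_value:.4f}")

    # ANOM-style view: how far is each subgroup mean from the grand mean?
    all_data = np.concatenate(groups)
    grand_mean = all_data.mean()
    s = all_data.std(ddof=1)
    n = len(groups[0])
    for i, g in enumerate(groups, start=1):
        z = (g.mean() - grand_mean) / (s / np.sqrt(n))
        print(f"group {i}: mean = {g.mean():.2f}, z vs grand mean = {z:.2f}")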

    0
    #77329

    Robert Butler
    Participant

      I guess we need clarification from Aush.  The way I read his/her second post, it is possible to set the two independently; it’s just that they drift over time.  If this is the case then it is possible to set up the (+,+), (-,+), etc. combinations and run the experiment.  On the other hand, if they are linked so that only the (+,+) and (-,-) combinations can be run then, of course, it is not possible to run a design with these two factors, since they cannot be varied independently of one another.

    0
    #77325

    Robert Butler
    Participant

      If you can set your variables so that when they are at the “low setting” they drift over some range of values, all of which are lower than the lowest value of your “high setting” , you can go ahead and generate your experimental design in the usual fashion.  When it comes to analyzing the design you will have to normalize the range of low and high values.  The resulting design matrix will not be a field of -1’s and 1’s but it will be an array with values between -1 and 1. 
      Such a design will not be perfectly orthogonal but you will have enough separation of effects to enable you to make statements concerning the effects of your main variables.  As for being able to identify interaction effects, the answer will depend on the scatter in your low and high settings.
      To simulate this, set up a standard design.  Choose a low and a high value for each factor, pick a range around those values, and substitute random values drawn from the low and high ranges into your design.  Re-normalize the design and then check the matrix using some of the diagnostic tools available.  In particular, look at the aliasing structure.  If you have access to something like SAS, or you know someone who can run it for you, have them put the matrix through Proc Reg and run it with the “vif” and “collin” options.
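      If you don’t have SAS handy, the same sort of check can be roughed out in Python.  The sketch below jitters a replicated 2^2 factorial, renormalizes to the -1 to 1 range actually achieved, and then computes variance inflation factors from the correlation matrix, which is the rough equivalent of what the vif option reports.  The drift range of +/-0.3 is an arbitrary choice for illustration.

    import numpy as np

    rng = np.random.default_rng(3)
    base = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
    base = np.tile(base, (4, 1))                       # replicate the 2^2 design

    # Let the "low" and "high" settings drift around their targets
    drift = base + rng.uniform(-0.3, 0.3, base.shape)

    # Renormalize each column to the -1..1 range actually achieved
    lo, hi = drift.min(axis=0), drift.max(axis=0)
    X = (drift - (hi + lo) / 2) / ((hi - lo) / 2)

    # Add the interaction column and compute VIFs from the correlation matrix
    X = np.column_stack([X, X[:, 0] * X[:, 1]])
    corr = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(corr))
    print("VIFs (A, B, AB):", np.round(vif, 2))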

    0
    #77324

    Robert Butler
    Participant

    The exponential probability density function is:
    p(x) = Theta*exp(-Theta*x)
    The average for an exponential distribution is 1/Theta
    The variance for an exponential distribution is 1/Theta^2
    Thus for a mean of 4 minutes we have
    4 = 1/Theta so Theta = .25
    The probability that the time for stamping is less than three minutes is
    P{X<3} = 1 - exp(-Theta*3) = 1 - exp(-0.25*3) = 0.53
    So, regardless of the time interval chosen you can expect to see parts turned out in less than 3 minutes about 53% of the time.
    Check Brownlee – Statistical Theory and Methodology, pp. 42-60, for more details.
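      The same number falls out of any statistics package; for example, in Python:

    from scipy import stats

    # P(X < 3) for an exponential distribution with mean 4 minutes (scale = 1/Theta = 4)
    p = stats.expon.cdf(3, scale=4)
    print(round(p, 2))      # about 0.53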

    0
    #77212

    Robert Butler
    Participant

    As I read the posts and the replies I’m left with the impression that there are a number of issues here; however, the main thrust of your question seems to focus on the issue of control limits for a log-normal distribution.  The way to deal with this is as follows.
    1. log your data
    2. compute the average (Mln) and standard deviation (Sln) of the logged terms
    i.e. Mln = (Sum(logx))/N,   Sln = sqrt[(Sum(logx - Mln)^2)/(N - 1)]
    3. For control limits in terms of the original, unlogged terms compute the following:
    exp(Mln - t(1-alpha/2)*Sln) = lower limit
    exp(Mln + t(1-alpha/2)*Sln) = upper limit
    These limits will not be symmetric in the unlogged units but they will include whatever percent of the population you wish to call acceptable, i.e. alpha = .05 will give 95% limits, etc.
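      As an illustration of the three steps, here is a short Python sketch on invented data.  I’ve used n-1 degrees of freedom for the t quantile, which is an assumption on my part since the degrees of freedom aren’t spelled out above.

    import numpy as np
    from scipy import stats

    x = np.array([1.2, 0.8, 2.5, 1.7, 3.1, 0.9, 1.4, 2.2, 1.1, 1.8])   # invented data

    # Steps 1 and 2: log the data, compute the mean and standard deviation of the logs
    logx = np.log(x)
    m_ln = logx.mean()
    s_ln = logx.std(ddof=1)

    # Step 3: back-transform the limits to the original units
    alpha = 0.05
    t = stats.t.ppf(1 - alpha / 2, df=len(x) - 1)
    lower = np.exp(m_ln - t * s_ln)
    upper = np.exp(m_ln + t * s_ln)
    print(f"lower = {lower:.2f}, upper = {upper:.2f}")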

    0
    #77142

    Robert Butler
    Participant

      As James A. observed, the trimming function will remove X% of the lowest and highest values in a distribution.  The idea here is to minimize the effect of “outliers” on the estimate of the mean.  Like all statistical methods, this should not be done blindly.  If you have a situation where your data is normal and in the course of your investigation there was an upset that generated a few extreme points that will impact your estimate of the mean, then trimming may be of value.  If, on the other hand, your data is not normal - too-heavy tails, a distribution skewed to one side, etc. - then trimming will encourage underestimation and, as the Inquisition said, you will go wrong with great assurance.
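      For anyone who wants to see the mechanics, scipy has a ready-made trimmed mean; the data below are invented and include one extreme point from a supposed process upset.

    import numpy as np
    from scipy import stats

    x = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.1, 9.7, 10.0, 10.3, 14.6])
    print(np.mean(x))               # ordinary mean, pulled up by the outlier
    print(stats.trim_mean(x, 0.1))  # trims 10% from each tail before averaging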

    0
    #77050

    Robert Butler
    Participant

      Neural nets, like MARS, are just one of a number of black-box non-linear regression techniques.  Early papers on the subject tried to compare the functioning of the nets with that of the brain.  Current papers and books have backed away from such claims. Black-box regression methods are acceptable if you are only interested in mapping inputs to outputs and do not care about cause and effect. Thus neural nets have had demonstrated success in the detection of credit card fraud and in determining the liquid level in complex-shaped vessels.
       The biggest problem with the net literature is that an awful lot of it is written and reviewed by people with no real understanding of statistics.  Consequently, you will find paper after paper whose claims and conclusions are based on what can only be characterized as a misunderstanding of the advantages and disadvantages of statistical analysis. If you are looking for a good introduction to the topic as well as a list of credible authors I’d recommend the following:
    Neural Networks in Applied Statistics – Stern – Technometrics, August 1996, Vol. 38, #3
    For what it is worth, below is a quick translation of some net terms to those of regression:
                   Neural Nets                 Regression Analysis
                   Training Set                Initial Data
                   Weights                     Coefficients
                   Learning                    Parameter Estimation
                   Optimal Brain Surgery       Model Selection/Reduction
                   Network                     Statistical Model
                   Bias                        Constant
                   Nodes                       Sums and Transformations

    0