iSixSigma

John Noguera

Forum Replies Created

Viewing 100 posts - 1 through 100 (of 136 total)
  • #185032

    John Noguera
    Participant

    I agree with your recommendations and the possibility of a bimodal distribution, but caution that the gaps may also be due to sampling error given the small sample size of 35.

    0
    #185014

    John Noguera
    Participant

    Angie,
    Thanks for sharing the data. The problem you have fitting your data is probably due to its “chunkiness”, likely caused by limited measurement discrimination. Andrew Sleeper’s book “Six Sigma Distribution Modeling” has a good discussion of this topic.
    SigmaXL found that the best fit was a three-parameter loglogistic, but unfortunately it had an AD p-value of .001, indicating a poor fit.

    0
    #177450

    John Noguera
    Participant

    Rene,
    One other thing: when you say that you found some transformations that visually look to fit, I assume that the AD p-values were still <.05.  Do you see "chunky" data, i.e. vertical lines with the same values?  This takes us back to measurement discrimination. If this is the case, I would go with the transformation that gives you the best fit, recognizing that the capability will be approximate, but still better than using the untransformed raw data.

    0
    #177445

    John Noguera
    Participant

    Rene,
    Darth’s comment is correct.  In fact the reason for the non-normality and inability to transform could simply be that you have a truncated normal distribution due to the inspection.
    If this is not the case, then check for the usual suspects: outliers (with special causes), bimodal (a hidden X factor), and limited measurement discrimination (as discussed).
    If none of the above apply, then you should consider all of:
    Box-Cox
    Johnson
    Families of Distributions
    Clements (Pearson) and Burr
    Response Modeling Methodology (by Haim Shore – not to be confused with Response Surface Methodology).
     

    0
    #159730

    John Noguera
    Participant

    Robert’s recommendation is good. Here are some additional things to consider:
    You might be able to use Box-Cox if you add 1 to your data and spec limits.
    Johnson is a system of non-linear transformations (logarithmic and inverse hyperbolic sine). See Minitab’s Help on Methods and Formulas.
    See also:
    Y. Chou, A.M. Polansky, and R.L. Mason (1998). “Transforming Non-Normal Data to Normality in Statistical Process Control,” Journal of Quality Technology, 30, April, pp. 133-141.
    Nicholas R. Farnum (1996). “Using Johnson Curves to Describe Non-Normal Process Data,” Quality Engineering, Vol. 9, No. 2, December, 329-336.
    Box-Cox and Johnson can be applied to t-tests if the data are non-normal and the sample size is small.  The central limit theorem will work if the sample size is greater than 80 for extreme skewness (skew approx. 2), 50 for moderate skewness (skew approx. 1), 30 for extreme kurtosis (kurt approx. 2), and 15 for moderate kurtosis (kurt approx. 1).
    See J.F. Ratcliffe, “The Effect on the t Distribution of Non-Normality in the Sampled Population,” Applied Statistics, Vol. 17, No. 1 (1968), pp. 42-48.
    I just happen to be doing research on this topic and will soon publish a paper with an equation that gives a precise:
    Nmin for normality = f(Skew, Kurt).
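    As a rough illustration of the shift-then-transform idea (not part of the original reply), here is a minimal Python/scipy sketch; the example data and the +1 shift are assumptions for demonstration only. Note that scipy’s Yeo-Johnson transform is not the same as the Johnson SB/SL/SU system referenced above.

```python
import numpy as np
from scipy import stats

# Hypothetical skewed data containing a zero (illustration only).
rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=50)
y[0] = 0.0                     # a zero makes a plain Box-Cox impossible

shift = 1.0                    # the same shift must be applied to the spec limits
y_bc, lam = stats.boxcox(y + shift)   # Box-Cox requires strictly positive data
print(f"estimated lambda = {lam:.3f}")

# Check normality of the transformed data (Shapiro-Wilk here; Minitab/SigmaXL use AD).
w, p = stats.shapiro(y_bc)
print(f"Shapiro-Wilk p-value = {p:.3f}")
```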
     

    0
    #143318

    John Noguera
    Participant

    Holly,
    You have to be careful in comparing results from different analyses. One-factor ANOVA assumes that your factor is fixed, and the null hypothesis is equality of means.  Variance components analysis, on the other hand, assumes that your factor(s) are random, and the null hypothesis is that a given variance component is zero.
    Having said that, using your variance components study to determine how to apply control charts, you now know that 70% of your variation is “unexplained”.  So are there other factors that you can consider, such as temporal, location, operator, equipment, etc.?  If you are able to reanalyze or redo your SOV study, you can include these to determine the largest component and Pareto the variance components.  Hans mentioned specialized control charts for variance components, which could then be applied.
    If you are just getting started with SPC, I would keep it simple and use classical X-bar & S charts, looking for assignable causes.  As you mature in the use of the tool, you can look at more advanced techniques such as variance components control charts.

    0
    #142691

    John Noguera
    Participant

    Hey Eric,
    Love that wry sense of humor!
    John
     

    0
    #137330

    John Noguera
    Participant

    Hi Darth,
    Thanks for your post.  Good comments regarding the scale.  Your reference paper does, in the conclusion, allow that the t-test may be somewhat robust.
    Ordinal Logistic Regression can also be used for comparisons if you dummy code the predictors.
    The challenge is what does one teach at the  Green Belt level?

    0
    #137310

    John Noguera
    Participant

    As an analogy, the electric vacuum would be the common tool used by a general practitioner.  The advanced tools would be akin to the tools used by a professional carpet cleaner.
    See http://www.upa.pdx.edu/IOA/newsom/da1/ho_levels.doc
    “Common Practice: Although Likert-type scales are technically ordinal scales, most researchers treat them as continuous variables and use normal theory statistics with them. When there are 5 or more categories there is relatively little harm in doing this (Johnson & Creech, 1983; Zumbo & Zimmerman, 1993). Most researchers probably also use these statistics when there are 4 ordinal categories, although this may be problematic at times. Note that this distinction applies to the dependent variable used in the analysis, not necessarily the response categories used in a survey whenever multiple items are combined (e.g., by computing the mean or sum). Once two or more Likert or ordinal items are combined, the number of possible values for the composite variable begins to increase beyond 5 categories. Thus, it is quite common practice to treat these composite scores as continuous variables.”

    0
    #137292

    John Noguera
    Participant

    Likert data is ordinal. The “technically correct” analysis for this type of data is Ordinal Logistic Regression and/or Kendall’s Coefficient of Concordance; these would be the tools of choice for a statistician.
    Treating ordinal data as continuous is, however, a reasonable simplifying approximation.  Therefore the use of t-tests and ANOVA is not incorrect.

    0
    #130483

    John Noguera
    Participant

    Let’s try that again:
    It depends – what is driving your non-normality: skewness, bimodality, outliers?
    If the data is symmetric or moderately skewed and n for each sample is > 15, the central limit theorem will give you approximate validity.
    You can also try a Box-Cox transformation. If you find a suitable transformation, the same transformation must be applied to both samples.
    The Mann-Whitney test does assume approximately equal variance but is robust against outliers.
    The most robust test that is readily available would be Mood’s Median, which will work for two samples.
    A little-known test that can also be used here, but is not readily available in Six Sigma statistical software, is the Kolmogorov-Smirnov two-sample test.
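    As a quick illustration of the alternatives mentioned above, here is a minimal Python/scipy sketch; the two samples are made-up data for demonstration only.

```python
import numpy as np
from scipy import stats

# Assumed non-normal (lognormal) samples, for illustration only.
rng = np.random.default_rng(7)
a = rng.lognormal(mean=0.0, sigma=0.6, size=30)
b = rng.lognormal(mean=0.3, sigma=0.6, size=30)

u, p_mw = stats.mannwhitneyu(a, b, alternative="two-sided")  # rank-based test
stat, p_mood, med, tbl = stats.median_test(a, b)             # Mood's median test
d, p_ks = stats.ks_2samp(a, b)                               # two-sample Kolmogorov-Smirnov

print(f"Mann-Whitney p = {p_mw:.3f}")
print(f"Mood's median p = {p_mood:.3f}")
print(f"Kolmogorov-Smirnov p = {p_ks:.3f}")
```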

    0
    #130482

    John Noguera
    Participant

    It depends – what is driving your non-normality: skewness, bimodality, outliers?
    If the data is symmetric or moderately skewed and n for each sample is > 15, the central limit theorem will give you approximate validity.
    You can also try a Box-Cox transformation. If you find a suitable transformation, the same transformation must be applied to both samples.
    The Mann-Whitney test does assume approximately equal variance but is robust against outliers.
    The most robust test that is readily available would be Mood’s Median, which will work for two samples.
    A little-known test that can also be used here, but is not readily available in Six Sigma statistical software, is the Kolmogorov-Smirnov two-sample test.

    0
    #130444

    John Noguera
    Participant

    Retract – use two sample t with unequal variance. 

    0
    #130443

    John Noguera
    Participant

    Use Welch’s ANOVA (assuming that your sample size is large enough for the Central Limit Theorem to work).  JMP and SigmaXL include this tool.
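    For readers without either package, here is a minimal sketch of Welch’s (1951) statistic in Python; the three groups are invented data with deliberately unequal variances.

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's ANOVA for unequal variances (Welch, 1951)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                  # weights
    mw = np.sum(w * m) / np.sum(w)             # weighted grand mean
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    f = (np.sum(w * (m - mw) ** 2) / (k - 1)) / (1 + 2 * (k - 2) / (k**2 - 1) * tmp)
    df1, df2 = k - 1, (k**2 - 1) / (3 * tmp)
    return f, df1, df2, stats.f.sf(f, df1, df2)

# Assumed example groups (illustration only).
rng = np.random.default_rng(3)
g1, g2, g3 = rng.normal(10, 1, 20), rng.normal(11, 3, 25), rng.normal(10, 5, 30)
print(welch_anova(g1, g2, g3))
```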

    0
    #124317

    John Noguera
    Participant

    Geckho,
    It is an interesting discussion.  I do appreciate your answer, but I believe that if you look at the work done by people who have dedicated their careers to best practices in SPC (e.g. Donald Wheeler) you will find a consistent theme: Do not recalculate established limits unless you have deliberately improved/changed the process.
    I would challenge you to find any published work (other than Minitab’s help) where auto-recalculation is considered acceptable.

    0
    #124312

    John Noguera
    Participant

    Gekho,
    Sorry I did not properly answer your 3rd question.  It was a moving window but I am not sure of the window size. 
     

    0
    #124311

    John Noguera
    Participant

    Geckho,
    No problem.   See answers below.
    Was it an automated SPC system? Yes.
    Was there an actual person looking at the data, or was it all recorded, analyzed, and controlled by the equipment? There was an owner of the data, but this person incorrectly assumed that they would be alerted whenever there was a problem.
    Were they recalculating using all of the points, or did the calculations only look at the last x number of subgroups? The control limits were automatically recalculated with every new data point.
    Did the database constantly refresh, preventing them from looking at the “whole picture”? I’m not sure what you mean by database refresh.  The problem was automatic recalculation of control limits led to a false picture that the process was stable. 
    Which tests were they using to detect out-of-control conditions? Only Test 1.  An interesting theoretical question would be whether or not the drift would have been detected had Tests 2-8 or EWMA been used.  But again the root cause was the auto-recalculation of limits.

    0
    #124310

    John Noguera
    Participant

    Dave,
    Thanks for your reply.  The issue I have here is  the default of automatic recalculation of limits when the “Update Graph Automatically” is turned on.  You should set the default to continue using existing limits (and provide an optional auto-recalculate).
    This is a matter of poka-yoke not training.  For example you do this very well in Box Cox where you do not even display columns containing values <=0.

    0
    #124299

    John Noguera
    Participant

    Darth is correct.  Automatic recalculation of control limits is wrong.  I have personally seen a semiconductor supply company use automatic recalculation to their detriment.  There was a slow long-term drift over a one-year period. They totally missed it because the SPC alarms failed to trigger due to the automatic recalculation of limits.
    I do not like to speak ill of Minitab, but this is one area where they promote bad practice. (Sorry Keith et al).  This is an opportunity for improvement (14.3?).

    0
    #121824

    John Noguera
    Participant

    In this case there is a better alternative to Kruskal-Wallis or Mood’s Median. It is called Welch’s ANOVA currently only available in JMP, but soon to be included in another tool. 

    0
    #121527

    John Noguera
    Participant

    Hi Lynette,
    Please send me an e-mail: jnoguera at jgna dot com.

    0
    #121437

    John Noguera
    Participant

    I am fine. Keeping very busy with MU internal/external and my usual international. Marc B is  now with us.
    Is Gabrielle a pseudonym for Roberta?

    0
    #121426

    John Noguera
    Participant

    Sorry how could I forget – is this Roberta?

    0
    #121425

    John Noguera
    Participant

    Dawn? Lynette? Julie?
    How are you! By the way it was 2001 – minor correction :-)

    0
    #120356

    John Noguera
    Participant

    Neil,
    Sorry I did not mean to imply that the Pearson Standardized residuals would satisfy a KS or AD test for normality.   The usefulness of these residuals is to examine potential outliers.  Leverage is also important to consider.
    A good reference is “Regression Models for Categorical and Limited Dependent Variables” by J Scott Long.

    0
    #120296

    John Noguera
    Participant

    Neil,
    Try looking at the Pearson standardized residuals for the logistic.  Are they “roughly” normal?  Are the goodness-of-fit p-values all > .05? Some software packages also provide a pseudo R-square value by McFadden.
    Can the proportion of the total area be considered a rate?  If so you may want to consider Poisson regression.
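    As a small, hedged illustration of these diagnostics (the data below are simulated, not from the thread), statsmodels reports McFadden’s pseudo R-square and Pearson residuals for a logistic fit:

```python
import numpy as np
import statsmodels.api as sm

# Assumed example: binary response y driven by a single predictor x.
rng = np.random.default_rng(5)
x = rng.normal(size=200)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * x))))

fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
print("McFadden pseudo R-squared:", round(fit.prsquared, 3))

# Pearson residuals: scan for large values as potential outliers
# (the fully standardized version additionally adjusts for leverage).
print("largest |Pearson residual|:", round(np.abs(fit.resid_pearson).max(), 2))
```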
     

    0
    #119528

    John Noguera
    Participant

    Markov chains are used to take into account conditional probabilities. Practical applications in the Six Sigma world include reliability modelling and figuring out the average run length characteristics of EWMA, CUSUM or Shewhart control charts with Western Electric and Nelson rules turned on.
    The math is not so simple, so in typical practice Monte Carlo simulations are used rather than Markov chains.
     

    0
    #118330

    John Noguera
    Participant

    Blur,
    G-squared (the likelihood-ratio test statistic) can be approximated by a chi-square distribution if n/df > 5.  See “Categorical Data Analysis,” Second Edition, Alan Agresti, Wiley, 2002.
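    For reference, scipy can compute the G-squared statistic directly via the log-likelihood-ratio option of chi2_contingency; the 2x3 table below is assumed purely for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Assumed contingency table (illustration only).
table = np.array([[20, 15, 12],
                  [10, 18, 25]])

g, p, df, expected = chi2_contingency(table, lambda_="log-likelihood")
n = table.sum()
print(f"G-squared = {g:.3f}, df = {df}, p = {p:.4f}")
print("chi-square approximation reasonable (n/df > 5)?", n / df > 5)
```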

    0
    #118144

    John Noguera
    Participant

    Deep,
    In addition to the previous poster’s comments on Likert scales and controlling the environment variables, it is important that you block by taster, or use a paired t-test to compare differences within taster and exclude taster-to-taster variability.
    Taste test labs also “calibrate” with standard samples. For example a diet cola test would use 3 standards: “low” sweetness, “mid” sweetness and “high” sweetness. ( This may not be feasible in your situation).  
     

    0
    #117921

    John Noguera
    Participant

    To the best of my knowledge Minitab uses standard methods of optimization for the Optimizer.  See: Derringer, G., & Suich, R. (1980). “Simultaneous Optimization of Several Response Variables,” Journal of Quality Technology, 12, 214-219. However, newer techniques employing Genetic Algorithms will give better results. See: Francisco Ortiz, Jr., James R. Simpson, and Joseph J. Pignatiello, Jr., “A Genetic Algorithm Approach to Multiple-Response Optimization,” Journal of Quality Technology, Vol. 36, No. 4, October 2004, p. 432.

    0
    #114044

    John Noguera
    Participant

    Jeff,
    You deserve a better answer than given by the previous poster!
    This is an interesting question.  Adding WECO and Nelson rules will improve your chart’s sensitivity to detect small process shifts and, assuming proper corrective action, will reduce the likelihood of a defect.   The easiest way to estimate the impact of the rules on the dpm level would be via simulation (otherwise the math requires the use of Markov chains).  You also need to consider the increased alpha risk, or probability of a false alarm.
    There are several related papers published in the Journal of Quality Technology and  Technometrics searchable via ASQ’s web site.  Use the key word “Average Run Length”.
    If you have an organization that is mature in use of SPC consider the use of the EWMA chart rather than simply adding  WECO to Shewhart charts.  This gives you the advantage of improved sensitivity without the increased alpha risk. 
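    As a rough sketch of the simulation approach mentioned above (all settings are assumptions, and only WECO rule 2 is added for brevity), the following Python code estimates average run lengths with and without the extra rule:

```python
import numpy as np

def arl(shift, add_rule2=False, reps=1000, max_run=5000, seed=11):
    """Monte Carlo ARL for an individuals chart.
    Rule 1: one point beyond 3 sigma.
    Optional WECO rule 2: 2 of 3 consecutive points beyond 2 sigma, same side."""
    rng = np.random.default_rng(seed)
    lengths = []
    for _ in range(reps):
        x = rng.normal(loc=shift, scale=1.0, size=max_run)
        signal = max_run
        for t in range(max_run):
            if abs(x[t]) > 3:
                signal = t + 1
                break
            if add_rule2 and t >= 2:
                w = x[t - 2:t + 1]
                if (w > 2).sum() >= 2 or (w < -2).sum() >= 2:
                    signal = t + 1
                    break
        lengths.append(signal)
    return np.mean(lengths)

print("1 sigma shift, rule 1 only :", arl(1.0))
print("1 sigma shift, rules 1 + 2 :", arl(1.0, add_rule2=True))
print("no shift (false alarms), rule 1 only :", arl(0.0))
print("no shift (false alarms), rules 1 + 2 :", arl(0.0, add_rule2=True))
```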
     

    0
    #113542

    John Noguera
    Participant

    I purchased the tape in 1990 – not sure if it is still available.
    “Mike & Dick” Video
     Title: Planned Experimentation (M00105). Price was approx. $500.
     Xerox Media Center, Webster, NY, 585-422-4915.
     I don’t know about “Fight Back”, but the “Against All Odds: Inside Statistics” series has a program that includes the Energizer Bunny to discuss confidence intervals.  See http://www.learner.org/resources/series65.html

    0
    #112932

    John Noguera
    Participant

    Pitzou:
    See JMP’s Stat Graph Guide, page 121.
    The original work is by Welch, B.L. (1951), “On the comparison of several mean values: an alternative approach,” Biometrika 38, 330–336.
     
     

    0
    #112714

    John Noguera
    Participant

    The RAMS Symposium was 1987.

    0
    #112713

    John Noguera
    Participant

    Bill’s original work was in reliability. He developed Six Sigma as a driver to improve product reliability.
    Two papers by Bill Smith:
    “Integrated Product and Process Design to Achieve High Reliability in Both Early and Useful Life of the Product” IEEE Proceedings Annual Reliability and Maintainability Symposium.
    “Six Sigma Design” IEEE Spectrum, Sept 1993, pp 43-47.

    0
    #111614

    John Noguera
    Participant

    Marc,
    I agree with you. Simplicity is best.  But be careful – extremely skewed data is probably better handled by transformation than assuming CLT will work for you.
    In my previous posts I am assuming that there is a theoretical basis for the exponential distribution.

    0
    #111603

    John Noguera
    Participant

    Doug,
    You are welcome.
    The unequal sample size is not a problem.
    You could simply add a shift, or use the three parameter Weibull which includes a shift parameter.  The exponential distribution is a special case of the Weibull (Shape=1).  Note that with your negative values you will get an error message saying the variance-covariance matrix is singular, but the confidence intervals on the scale and shape ratios will work.
    I am assuming that you have a theoretical basis for using the exponential, otherwise the use of more basic tools discussed earlier such as nonparametrics or shift + Box-Cox would be appropriate. 
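    As an illustrative (not authoritative) sketch of the shifted/three-parameter idea in Python, scipy’s weibull_min exposes the threshold as its loc parameter; the data below are simulated, and maximum-likelihood estimation of a threshold can be numerically delicate, so treat the fit as approximate.

```python
import numpy as np
from scipy import stats

# Assumed data: a shifted exponential (threshold -2, scale 5), illustration only.
rng = np.random.default_rng(2)
data = rng.exponential(scale=5.0, size=100) - 2.0

# Three-parameter Weibull fit: shape (c), threshold (loc) and scale all free.
c, loc, scale = stats.weibull_min.fit(data)
print(f"shape = {c:.3f}, threshold = {loc:.3f}, scale = {scale:.3f}")
# A shape near 1 is consistent with a (shifted) exponential distribution.

# Alternative: apply a known shift yourself and fit the exponential directly.
loc_e, scale_e = stats.expon.fit(data - data.min() + 1e-6, floc=0)
print(f"exponential scale after shifting = {scale_e:.3f}")
```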

    0
    #111533

    John Noguera
    Participant

    Sorry I meant “censored”.

    0
    #111532

    John Noguera
    Participant

    Doug,
    Minitab has a built-in hypothesis test for the exponential distribution.  It is found in:
    Stat > Reliability/Survival > Distribution Analysis (Right Censoring) > Parametric Distribution Analysis. Choose Exponential. Click Test.  You can select scale and shape.
    Note that the data does not have to be censored to use this tool.
     

    0
    #111531

    John Noguera
    Participant

    Doug,
    Kruskal-Wallis is based on ranks, so intuitively unequal variance is not going to be a problem.
    To quote Sheskin:
    It should be pointed out, however, that there is some empirical research which suggests that the sampling distribution for the Kruskal-Wallis test statistic is not as affected by violation of the homogeneity of variance assumption as is the F distribution (which is the sampling distribution for the single-factor between-subjects analysis of variance). One reason cited by various sources for employing the Kruskal-Wallis one-way analysis of variance by ranks, is that by virtue of ranking interval/ratio data a researcher can reduce or eliminate the impact of outliers.

    0
    #110582

    John Noguera
    Participant

    You need to use Weibayes analysis. This is Weibull analysis with no failures (a combined Weibull and Bayesian approach).  Dr. Robert Abernethy’s book “The New Weibull Handbook” is a great resource. The analysis is quite straightforward.  See chapter 6, page 6-10 for a case study very similar to your problem.
    You will need the scale and shape of your original data. Use Minitab’s Reliability tools for this.
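    A minimal sketch of the zero-failure Weibayes bound (assuming the shape parameter is known from prior data; the test times and beta below are invented for illustration):

```python
import numpy as np

def weibayes_eta_lower(times, beta, confidence=0.632):
    """Lower confidence bound on the Weibull characteristic life (eta) with zero
    failures, beta assumed known. confidence=0.632 gives the classic Weibayes
    'first failure imminent' bound."""
    t = np.asarray(times, dtype=float)
    return (np.sum(t ** beta) / np.log(1.0 / (1.0 - confidence))) ** (1.0 / beta)

# Assumed example: 10 units run 1000 hours each with no failures, beta = 2 from prior analysis.
times = [1000.0] * 10
print("eta lower bound (63.2%):", round(weibayes_eta_lower(times, beta=2.0), 1))
print("eta lower bound (90%)  :", round(weibayes_eta_lower(times, beta=2.0, confidence=0.90), 1))
```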

    0
    #110112

    John Noguera
    Participant

    Tony,
    Your question is a bit unclear. If you are asking about the availability of columns for Box-Cox Transformation, Minitab hides any columns that have text, or cells with  negative or zero values.

    0
    #108657

    John Noguera
    Participant

    Hi Robert,
    Thanks for your comments. I agree with you that the actual settings should be used: evaluate VIF and proceed from there. However, as you know, in practice the designs are often analyzed using the original worksheet settings, resulting in increased experimental error and a reduction in model “usefulness”.  Part of the problem lies with software tools not providing easy-to-access/interpret design diagnostics.

    0
    #108475

    John Noguera
    Participant

    Hi Robert,
    Thanks for your note. Good point about the loss of information with varying Xs.  Of course you can analyze the data using the actual Xs but the design is no longer orthogonal.  Collinearity turns what started as a DOE into a “historical regression” problem.

    0
    #108474

    John Noguera
    Participant

    Hi Andy,
    Sorry for the delayed response. I am on vacation in Cairo. Thanks for the very interesting link.  In 1992, I was looking into developing a software product that integrated Prof. Zadeh’s Fuzzy Logic theory with classical SPC techniques (even had venture capital interest).  I did not pursue this because of other consulting & software opportunities. I will have to dig deeper but at first glance it appears that the particular strength of fuzzy regression is in outlier detection, as an alternative to robust regression methods.
    Regards,
    John

    0
    #108387

    John Noguera
    Participant

    Hi Robert,
    Do you know of any software tools that employ Maximum Likelihood to accommodate errors in X?

    0
    #108006

    John Noguera
    Participant

    “Non-normal”

    0
    #108005

    John Noguera
    Participant

    Hi Robert,
    Thanks for the reply. 
    By the way, did you see my earlier post with the email from Davis Bothe on extrapolation of normal probability plots for non-normal capability indices?
     

    0
    #107801

    John Noguera
    Participant

    Hi Robert,
    Have you seen RSM applied in a transactional environment – discrete event simulation or otherwise?

    0
    #107115

    John Noguera
    Participant

    To start with, your sample size must be large enough to satisfy np >= 5.
    If you want to get an estimate of your sigma level, compute the exact 95% confidence interval based on the inverse Beta distribution.
    If you observe zero defects with a sample size of 1 million, then you can claim Six Sigma quality.
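    A small sketch of that exact-interval calculation in Python (the Clopper-Pearson interval via the inverse Beta distribution); the zero-defects-in-a-million example and the +1.5 sigma shift follow the conventions quoted above:

```python
from scipy.stats import beta, norm

def clopper_pearson(x, n, conf=0.95):
    """Exact (Clopper-Pearson) confidence interval for a proportion via the inverse Beta."""
    alpha = 1.0 - conf
    lower = 0.0 if x == 0 else beta.ppf(alpha / 2, x, n - x + 1)
    upper = 1.0 if x == n else beta.ppf(1 - alpha / 2, x + 1, n - x)
    return lower, upper

lo, hi = clopper_pearson(0, 1_000_000)          # 0 defects in 1,000,000 units
print(f"95% CI for p: [{lo:.2e}, {hi:.2e}]")
print("sigma level at the upper bound:", round(norm.ppf(1 - hi) + 1.5, 2))  # ~6
```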
     

    0
    #105276

    John Noguera
    Participant

    Far better to consider Average Run Length (ARL) = 1/(1-Beta).  So in the case of a 1.5 sigma shift, using n=4, the ARL = 2.0.  On average you will detect a 1.5 sigma shift within 2 samples.
    These numbers are not applicable when you add tests for special causes. Markov Chains are required to do the calculations, or estimates can be obtained through simulation. 
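    A back-of-the-envelope version of that calculation (Test 1 only, normality assumed) reproduces both the ARL of roughly 2 for a 1.5 sigma shift with n=4 and the familiar in-control ARL of about 370:

```python
from scipy.stats import norm

def xbar_arl(shift_sigma, n, k=3.0):
    """ARL of an X-bar chart (Test 1 only) for a mean shift given in sigma units."""
    d = shift_sigma * n ** 0.5
    power = norm.cdf(-k + d) + norm.cdf(-k - d)   # P(subgroup mean beyond a limit)
    return 1.0 / power

print(round(xbar_arl(0.0, 4), 1))   # ~370 (false alarm ARL)
print(round(xbar_arl(1.5, 4), 2))   # ~2.0
```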

    0
    #105215

    John Noguera
    Participant

    A proper guardband comes from your actual Sigma Measurement, not some pre-set 20% number.
    Upper Guardband = USL – 6*Sigma Measurement
    Lower Guardband = LSL + 6 * Sigma Measurement
    Of course you could go with a less conservative multiplier like 5.15 or even 3, but do not use less than 3. 
    As mentioned in my previous post, the Reproducibility part of Sigma Measurement should include tester-to-tester and day-to-day variability, not just operator-to-operator. It gets more challenging if you have to take into account other factors like temperature drift, but let’s leave that one for now.
    There is one thing you can do in test that would permit a reduction in the size of the guardband, and that is to repeat the tests and average.  This will reduce the Repeatability component (by S Repeat/SQRT(N)), but of course this is more expensive.  Trade off the yield loss versus the cost of test.

    0
    #105103

    John Noguera
    Participant

    6*Sigma Measurement is the simplistic (conservative) answer, but you have to take into account other factors such as drift over temperature, and ensure that Sigma Measurement includes Reproducibility factors such as lab-to-lab and/or tester-to-tester.
    Of course this does not bode well for yield loss, hence the need to have stable, precise and accurate measurement systems.
    I developed a technique in 1986 using regression to compensate for temperature drift.  Let me know if this would be of interest.
     

    0
    #103564

    John Noguera
    Participant

    You cannot “dial-up” the ARL characteristics. They are a function of lambda, stdev multiplier, and sample size  (EWMA) or h, k, sample size (CUSUM). There are also some start-up options for Fast Initial Response.
    Search through Journal of Quality Technology and Technometrics for additional information.
     

    0
    #103127

    John Noguera
    Participant

    Hi Robert:
    I am attaching an e-mail I received from Davis Bothe on this topic.  By the way he has read some of your responses on isixsigma and said that you were “quite knowledgeable and experienced in many statistical methods”!
    Regards,
    John
    The issue you raise about non-normal distributions is an important one, especially when dealing with small sample sizes (throughout the following discussion, I am assuming the data came from a process that is in statistical control because even just one out-of-control reading in a small sample will greatly influence the apparent shape of the distribution).
    I don’t think defaulting to the minimum and maximum values of the data set for the .135 and 99.865 percentiles is a good idea.  Doing so would produce an artificially high estimate of process capability because the measurement associated with the .135 percentile would be located much farther below the minimum measurement of the data set and the “true” 99.865 percentile would be much greater than the maximum value of the data set.
    For example, in a sample of 30, the smallest measurement would be assigned a percentage plot point of around 2 percent (depending on how you assign percentages).  This point represents the 2.00 percentile (x2.00), which is associated with a much higher measurement value than that associated with the .135 percentile (x.135).
    The largest measurement would be assigned a percentage plot point of somewhere around 98 percent.  This point represents the 98.00 percentile (x98.00), which would be associated with a lower measurement than that associated with the 99.865 percentile (x99.865).
    The difference between x98.00 and x2.00 will therefore be much smaller than the difference between x99.865 and x.135.  Thus, the Equivalent Pp index using the first difference will be much larger than the one computed with the second difference. 
    Equivalent Pp = (USL – LSL) / (x98.00 – x2.00) is greater than 
    Equivalent Pp = (USL – LSL) / (x99.865 – x.135)
     
    So I think it would be best to avoid this approach as it produces overly optimistic results.
     
    So what would be better? I recommend fitting a curve through the points and extrapolating out to the .135 and 99.865 percentiles.  However, this can be difficult when there are few measurements.  First, there should be at least 6 different measurement values in the sample.  Often, with a highly capable process and a typical gage (15% R&R), most measurements will be identical.  For a sample size of 30, I have seen 6 repeated readings of the smallest reading, 19 of the “middle” reading, and 5 of the largest reading.  With only three distinct measurement values, it is impossible to use curve fitting or NOPP.
     
    Second, with really small sample sizes (less than 20), you must extrapolate quite a distance from the first plotted point down to the .135 percentile and quite a ways from the last plotted point to 99.865 percentile.  This presents an opportunity for a sizable error in the accuracy of the extrapolation.  However, I believe this error would always be less than that of using the smallest value as an estimate of the .135 percentile.  This method is always wrong whereas the extrapolation method might produce the correct value.
     
    You had asked about my preference between using the last two plot points for the extrapolation or relying on a smoothing technique.  I would prefer the smoothing technique as this approach takes into consideration all of the plot points.  If there is any type of curvature in the line fitted through the plot points (which there will be since we are talking about non-normal distributions), then this method would extend a curved line out to estimate the .135 and 99.865 percentiles.
     
    The “last-two-points” method will always extend a straight line out to the .135 and 99.865 percentiles (through any two points, there exists a straight line).  Because we are dealing with non-normal distributions, we know a straight line is not the best way to extrapolate.  Thus, the smoothing approach will produce better estimates.
     
    One note of caution: be careful of using a purely automated approach.  It’s always best to look at the data for each process rather than relying solely on an automated computerized analysis.
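    One possible reading of the “smoothing” approach described above is sketched below in Python: fit a smooth curve through all the normal-probability-plot points and extrapolate it to z = -3 and z = +3. The data, spec limits, plotting positions, and the choice of a quadratic are all assumptions for illustration, not Bothe’s prescription.

```python
import numpy as np
from scipy.stats import norm

# Assumed small sample from a skewed process, with assumed spec limits.
rng = np.random.default_rng(9)
data = np.sort(rng.lognormal(mean=2.0, sigma=0.3, size=30))
LSL, USL = 2.0, 20.0

# Normal probability plot coordinates: plotting positions vs. ordered data.
n = len(data)
p = (np.arange(1, n + 1) - 0.3) / (n + 0.4)      # Benard's median-rank approximation
z = norm.ppf(p)

# "Smoothing": fit a low-order curve x = f(z) through all points, then
# extrapolate to the 0.135 and 99.865 percentiles (z = -3 and z = +3).
coeffs = np.polyfit(z, data, deg=2)
x_lo, x_hi = np.polyval(coeffs, -3.0), np.polyval(coeffs, 3.0)

print(f"x(0.135%) ~ {x_lo:.2f}, x(99.865%) ~ {x_hi:.2f}")
print(f"Equivalent Pp ~ {(USL - LSL) / (x_hi - x_lo):.2f}")
```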
     

    0
    #103109

    John Noguera
    Participant

    Robert,
    Thanks for the insight and the Hahn Shapiro reference.
    John

    0
    #103102

    John Noguera
    Participant

    The Anderson Darling test is used in the following places:
    SigmaXL > Statistical Tools > Descriptive Statistics
    SigmaXL > Graphical Tools > Histograms & Descriptive Statistics
    SigmaXL > Statistical Tools > 2 Sample Comparison Tests
     

    0
    #103082

    John Noguera
    Participant

    SigmaXL uses the Anderson Darling test.

    0
    #102989

    John Noguera
    Participant

    Thanks Robert. 
    In looking for an alternative for software automation, what’s your thought on the use of Johnson curves versus Box-Cox?
     

    0
    #102924

    John Noguera
    Participant

    Hi Robert,
    I’m pulling up this old thread, to ask you a question about your use of Bothe’s technique for handling non-normal data.  I am planning to implement a similar approach in a software tool.
    When the sample size is small (say < 100), do you extrapolate the curve on the nplot out to the 0.135 and 99.865 percentiles or do you simply take the min and max values?  If you extrapolate, do you use the last two data points and draw the line, or do you use "smoothing"?
    Thanks!
     
     
     

    0
    #100804

    John Noguera
    Participant

    Folks, whether you like Stan or not is irrelevant. He is correct.
    I have seen people use software that automatically recomputes control limits when new data is added (such as Minitab 14 – unless you enter historical mean and stdev – I think that I have persuaded the folks at Minitab to change the defaults on this). 
    Here is the simple problem you run into:  if a slow long term drift occurs in the process – an assignable cause – the recalculation of the limits keeps adjusting, following along with the slow drift, making the detection very difficult.
    The company doing this automatic recalculation did not see that they were out of control.  By the time the problem was recognized, the required adjustment was severe and caused significant customer grief.
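    As a rough illustration of that failure mode (all numbers invented), the simulation below compares limits fixed from an in-control baseline with limits recalculated from all data to date during a slow drift; typically the fixed limits eventually alarm while the recalculated limits follow the drift and stay silent.

```python
import numpy as np

rng = np.random.default_rng(4)
baseline = rng.normal(0.0, 1.0, 50)                  # in-control period
drift = np.linspace(0.0, 4.0, 200)                   # slow drift of 4 sigma over 200 points
x = np.concatenate([baseline, rng.normal(drift, 1.0)])

# Fixed limits, established once from the in-control baseline.
mu0, s0 = baseline.mean(), baseline.std(ddof=1)
fixed = 50 + np.flatnonzero(np.abs(x[50:] - mu0) > 3 * s0)

# Auto-recalculated limits: mean and sigma recomputed from all data to date.
rolling = [t for t in range(50, len(x))
           if abs(x[t] - x[:t].mean()) > 3 * x[:t].std(ddof=1)]

print("first alarm, fixed limits:       ", fixed[0] if fixed.size else "none")
print("first alarm, recalculated limits:", rolling[0] if rolling else "none")
```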
     

    0
    #99960

    John Noguera
    Participant

    James:
    Your problem has nothing to do with positive & negative values. Very likely you have small within-subgroup variability relative to between-subgroup variability. If you are using Minitab, try the I-MR-R/S (Between/Within) chart. SigmaXL also supports this chart type.
    If this does not resolve your problem, you likely have autocorrelated data. Use Minitab or Excel’s exponential smoothing and chart the residuals.

    0
    #99942

    John Noguera
    Participant

    Hi Jonathon,
    You are correct. I too had lengthy conversations with Bill.  Reliability was a key driver for him.  Get the Cpk right in manufacturing, and this will translate to a significant reduction in latent defects.
    Bill published (at least) two significant papers on Six Sigma: “Six-Sigma Design”, IEEE Spectrum, September 1993, p.43-47,  and an earlier conference paper linking Six Sigma quality and reliability.  
     

    0
    #97198

    John Noguera
    Participant

    Statistical Process Control and Machine Capability in Three Dimensions
    Padgett, Marcus M.; Vaughn, Lawrence E.; Lester, Joseph; Quality Engineering, Vol. 7, No. 4, June 1995, pp. 779-796

    0
    #96037

    John Noguera
    Participant

    The theory is somewhat complex, requiring the use of Markov chains.  See: Champ, C.W., and Woodall, W.H. (1987). “Exact Results for Shewhart Control Charts with Supplementary Runs Rules,” Technometrics, 29, 393-399. Much easier to simulate this problem!  The bottom line is that these rules improve the sensitivity to small process changes (i.e. lower the beta risk) while maintaining a reasonable false alarm rate (alpha risk). Better performance can be achieved with the use of more advanced charts like CUSUM or EWMA, but these are difficult to implement on the shop floor.

    0
    #95274

    John Noguera
    Participant

    Hi Mark,
    Thanks for the response.  I personally have not created/analyzed I-optimal designs, but some believe they are superior because the predicted variance is minimized.  JMP saw fit to include this in their latest release.  I don’t know if Minitab plans to do the same.
    I was just curious. Thanks!

    0
    #95259

    John Noguera
    Participant

    Hi Mark,
    Are you planning to incorporate I-optimality soon?

    0
    #95220

    John Noguera
    Participant

    Hi Andy,
    Thanks for the moto info. Have we met? I teach at MU.
     I would like to see the sample charts.  Could you please e-mail me your web-site URL: [email protected].
    I could not see Mahalanobis used in any of the multivariate charts in Minitab V14.  Is this embedded in the algorithm?
     
     

    0
    #95195

    John Noguera
    Participant

    Minitab V14 now supports Multivariate control charts including Hotelling’s T-Squared and Multivariate EWMA.
    These tools are great for analysis, but difficult to implement at a shop floor level.  The challenge with Multivariate SPC is determining what action to take when you get an out-of-control signal.  Looking at individual charts may not show any instability. For example, the combination of a few variables at, say, + 2.5 sigma would trigger an alarm.
    The other challenge you run into is the Beta risk, when the number of variables being monitored gets large (>10).
    Doug Montgomery’s book, “Introduction to Statistical Quality Control,” is a great resource for Multivariate SPC.
    I would be very interested to know if anyone out there has successfully implemented shop floor multivariate SPC.

    0
    #90494

    John Noguera
    Participant

    Correction, KW is not appropriate. Use Friedman’s available in Minitab.

    0
    #90493

    John Noguera
    Participant

    Kruskal-Wallis would work well for you. This tool is available in Minitab and SigmaXL software.

    0
    #87695

    John Noguera
    Participant

    Tony,
    The reason you get a difference between Excel’s calculation of StDev and Minitab’s Overall StDev (in process capability) is the use of the unbiasing constant c4, i.e. S/c4.
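    For anyone who wants to reproduce the adjustment, here is a small sketch of the c4 constant and the s/c4 correction in Python (the data are assumed for illustration):

```python
import numpy as np
from scipy.special import gammaln

def c4(n):
    """Unbiasing constant c4 for the sample standard deviation."""
    return np.sqrt(2.0 / (n - 1)) * np.exp(gammaln(n / 2) - gammaln((n - 1) / 2))

# Assumed data: Excel's STDEV corresponds to s; Minitab's overall sigma to s/c4.
rng = np.random.default_rng(8)
x = rng.normal(loc=10, scale=2, size=25)
s = np.std(x, ddof=1)
print(f"s = {s:.4f}, c4 = {c4(len(x)):.4f}, s/c4 = {s / c4(len(x)):.4f}")
```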
     

    0
    #87213

    John Noguera
    Participant

    My brain is fuzzy this morning – need more coffee!
    The within stdevs retain their respective unbiasing constants. It’s the overall that causes the confusion within Minitab.

    0
    #87212

    John Noguera
    Participant

    In addition to pooling, Minitab defaults to using unbiasing constant C4.  This makes it difficult to match hand calculations to Minitab.
    By way of introduction to the topic, it is easier to teach the short-term standard deviation calculated using Rbar/d2.  I have my students select the Rbar option and deselect the unbiasing constant for purposes of comparing hand calculations to Minitab.
     

    0
    #86598

    John Noguera
    Participant

    I agree with your recommendation on Box-Meyer.  The Schmidt/Launsby text also has Rules of Thumb for number of replications required per run, for a given Beta risk on the Standard Deviation and number of runs in the experiment.

    0
    #86504

    John Noguera
    Participant

    Hi Jonathon!
    How’s the book coming along?
    I wanted to mention that SigmaXL creates pivot charts for Excel ’97.
    Also, I wanted to mention that SAS JMP has a similar powerful tool for discrete X and discrete Y that they call Mosaic Plots.

    0
    #84643

    John Noguera
    Participant

    Hi Robert,
    Thanks for your post.  I especially agree with your emphasis on asking why there is a significant difference in variances.  This can lead one to a “golden nugget” X factor; the discovery of such is likely far more valuable than that originally searched for with unequal means.

    0
    #84589

    John Noguera
    Participant

    Classical ANOVA assumes that the variances are equal. 
    Your choices in Minitab are the following:
    1. Apply a variance stabilizing transformation like ln(Y).
    2. Use a nonparametric test like Kruskal-Wallis (note that here you are testing medians, not means).
    3. Limit yourself to 2 samples and apply t-test for unequal variance.
    If you have access to JMP, they have a procedure called Welch’s ANOVA which does not assume equal variance.

    0
    #83477

    John Noguera
    Participant

    Hi Chuck!
     
    Good answer.  I will use your example from here on!
    By the way I did some interesting simulation studies on the difference between multiplicative (ie FMEA) vs additive.  The rank results match closely at the high and low end but vary widely in the middle. 
     

    0
    #82994

    John Noguera
    Participant

    I concur with Robert, but would add that VIF score should be included (with a clear explanation linking score to degree of multicollinearity).
    Any relation to Brad Jones?
     

    0
    #81757

    John Noguera
    Participant

    John,
    Good answer.  Just one thing to add – JMP 5 now supports PLS.
     

    0
    #80963

    John Noguera
    Participant

    It looks like you are having problems with rounding. The NORMSINV function is fine.  Here is what I get with your numbers:

    N = 1, D = 4719, O = 1,352,200,000: DPMO = 3.48986836, Sigma = 5.99450016
    N = 1, D = 1413, O = 418,900,000: DPMO = 3.37312008, Sigma = 6.00173753
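    For reference, the same figures can be reproduced with the inverse normal function and the conventional 1.5 sigma shift (a sketch, using the numbers quoted above):

```python
from scipy.stats import norm

def sigma_level(defects, opportunities, shift=1.5):
    """Sigma level from DPMO, adding the conventional 1.5 sigma shift."""
    dpmo = defects / opportunities * 1_000_000
    return dpmo, norm.ppf(1 - dpmo / 1_000_000) + shift

print(sigma_level(4719, 1_352_200_000))   # ~ (3.49, 5.99)
print(sigma_level(1413, 418_900_000))     # ~ (3.37, 6.00)
```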

    0
    #80407

    John Noguera
    Participant

    The only assumptions for Mann-Whitney are: continuous response and both samples random and independent.
    Since the test is based on ranks, non-normality and/or unequal variance does not affect the outcome.   MW is the preferred test here over t, especially with small sample sizes.
    Note that you are testing for unequal medians not means.

    0
    #78243

    John Noguera
    Participant

    Correction!
    Time for coffee….
    If Cpk is estimated to be 1.5, the lower 95% confidence limit is 1.1 if n=30; if n=5 the lower limit is approximately 0.5!
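    A commonly used normal-approximation formula roughly reproduces these corrected figures; the sketch below is approximate, not Minitab’s exact method:

```python
from scipy.stats import norm

def cpk_lower_bound(cpk_hat, n, conf=0.95):
    """Approximate one-sided lower confidence bound for Cpk."""
    se = (1.0 / (9.0 * n) + cpk_hat ** 2 / (2.0 * (n - 1))) ** 0.5
    return cpk_hat - norm.ppf(conf) * se

print(round(cpk_lower_bound(1.5, 30), 2))   # ~1.16
print(round(cpk_lower_bound(1.5, 5), 2))    # ~0.59
```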
     

    0
    #78242

    John Noguera
    Participant

    Cpk is a statistic. You are estimating a population capability.  Therefore sample size is very important to obtain a reasonable margin of error or confidence interval.
    With n=5, and a calculated Cpk=1.5 (i.e. “Six Sigma” quality), your approximate 95% lower confidence limit is Cpk=1.1.
    I believe Minitab 14 will be coming out with confidence intervals for Cpk.

    0
    #78152

    John Noguera
    Participant

    Hi Ken,
    I presented this rationale to Ron Lawson at Motorola about 10 years ago.  He was in agreement with it, but the primary driver of the 1.5 sigma shift was the Bender paper and empirical/simulation modelling.

    0
    #78000

    John Noguera
    Participant

    John,
    Just curious, are you planning to get involved with the rollout of the new GB program at Mot?  Send me an e-mail at [email protected]
    I just saw the dates for the Lean Six Sigma session.  I wish I could attend – I live in Toronto as well, but I will be in Schaumburg then.
     

    0
    #77416

    John Noguera
    Participant

    There is no Six Sigma guide; however, Eckes’ book ‘The Six Sigma Revolution’ is available in Palm Reader. There are a few basic statistics programs available (PDA Stats, ZEN Statistical Expert, and ProStats).  See the Palm and Handango web sites for further info.
    You might want to check with the people who publish pocket guides like Rath & Strong and GOAL/QPC to see if they have any plans to port to the Palm.

    0
    #76413

    John Noguera
    Participant

    Your analysis looks correct, but in this case a simpler approach could be used – the two-sample proportions test. The p-values work out to be the same, but it is easier to do.
    In Minitab: Stat > Basic Statistics > 2 Proportions. Use summarized data. It is recommended that under Options you check the “use pooled estimate of p for test” option.

    0
    #76304

    John Noguera
    Participant

    A great reference text, published by the ASQ: Shapiro, Samuel S.,”How to Test Normality and Other Distributional Assumptions”, Revised Edition.  Volume 3 of the How To Series.
    Note that if you want to determine the p-values from A-squared you will have to use linear interpolation or non-linear regression on the values given in the tables.  There is no convenient equivalent to Chi-Square.

    0
    #76021

    John Noguera
    Participant

    S is the standard deviation of the unexplained variation (ie residuals).  Note that the degrees of freedom for this S term will be N – (# terms in model including constant).  It can also be calculated using the SQRT of the Adj MS  for Residual Error.
    Re the lack of fit – you need replicates to get the pure error.  Did you run only one center point?
     
     

    0
    #75921

    John Noguera
    Participant

    Mike,
    Things have changed for the better since you left. Read the lead article in the May 2002 Six Sigma Forum “Motorola’s Next Generation Six Sigma” by Matt Barney. 
    This article reflects what MU is doing today.
     

    0
    #75551

    John Noguera
    Participant

    You could lower the individual alphas to obtain an overall alpha of .05.  This has the advantage of simplicity but the disadvantage of increased beta.  Alternatively you could apply Hotelling’s T-Squared to the test differences.  (MANOVA will not work here).

    0
    #75463

    John Noguera
    Participant

    See http://www.excelstat.com for a helpful presentation on pareto and multi-vari.

    0
    #75292

    John Noguera
    Participant

    1. Probability is not always intuitive.
    2. Black Belts must understand probability to be effective.

    0
    #75124

    John Noguera
    Participant

    “So going into the game you have a 1/3 chance of winning” Yes – and that does not change if you do not switch!
    Now after the door is open, if you switch, you are taking advantage of the knowledge of the door opener.  You are “buying into” the 2/3 probability associated with both of the other doors.
    Previously an anonymous poster said that several PhDs were stumped on this one.  Those were letters written to Marilyn vos Savant insisting that she was wrong when she stated that the correct probabilities were those posted by Ex-MBB and Calybos.  Unfortunately the PhDs were wrong.
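    For the skeptics, a short simulation settles it (a sketch; door labels and trial count are arbitrary):

```python
import random

def monty_hall(trials=100_000, switch=True):
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        pick = random.randrange(3)
        # The host opens a door that is neither the contestant's pick nor the prize.
        opened = next(d for d in range(3) if d != pick and d != prize)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print("stay  :", monty_hall(switch=False))   # ~1/3
print("switch:", monty_hall(switch=True))    # ~2/3
```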

    0
    #75117

    John Noguera
    Participant

    Ex-MBB is correct. Switching allows you to take advantage of the person’s knowledge who is opening the door.

    0
    #75031

    John Noguera
    Participant

    Michael Hammer, “Process Management and the Future of Six Sigma,” MIT Sloan Management Review, Winter 2002, Volume 43, No. 2, pp. 26-32.
    Hal Plotkin, “Six Sigma: What It Is and How To Use It” Harvard Management Update Article, 6/1/99
     

    0
    #74635

    John Noguera
    Participant

    This topic has been covered before, but let me summarize. If you consider the use of add-ins, you really do get a lot of power (and statistical precision).  Consider the following tools:
    StatPlus, which comes with the text “Data Analysis with Microsoft Excel” by Berk & Carey. This includes normal probability plots, multiple histograms, multiple scatterplots, and nonparametric stats.
    ExcelStat, see http://www.excelstat.com. This includes Multiple Pareto and Multi-Vari.
    SPC XL, by Air Academy Associates, http://www.airacad.com. Excellent SPC.
    DOEKISS, DOEPRO, By Air Academy Associates, Excellent DOE tools.
    Typically if one has Minitab or JMP, they would not use the Excel tools. If, however, you want to provide people with tools that are easy to learn and less expensive, then the Excel add-ins are the way to go.  For example, Motorola University uses StatPlus and ExcelStat in their Green Belt training.
     

    0
    #56158

    John Noguera
    Participant

    Bonjour Yves,
    Will you be doing the training in France?

    0