iSixSigma

Robert Butler

Forum Replies Created

    #247865

    Robert Butler
    Participant

    So if I’m understanding correctly the question you are asking is how to generate a fit to the data in your plot that will be in the form of a predictive equation.

    If that’s the case then, based on just looking at the plots, my first try would be a simple linear regression using the terms pipe motion, the square of pipe motion, stiffness and the interaction of stiffness and pipe motion and/or the interaction of stiffness and pipe motion squared.

    The other first try option could be a form of one of Hoerl’s special functions with an additional interaction term.

    The linear form would be ln(vessel motion) = function of ln(pipe motion), pipe motion, stiffness and an interaction term of either stiffness with pipe motion or stiffness with ln(pipe motion).

    For every attempt at model building you will need to run a full residual analysis.  The plots of the residuals vs the predicted, independent variables, etc. will tell you what you need to know as far as things like model adequacy,  missed terms, influential data points, goodness of fit, etc.

    Repeating this exercise with other terms that the residuals might suggest should ultimately result in a reasonable predictive model – be aware – any model of this type will have an error of prediction associated with it and your final decision with respect to model accuracy will have to take this into account.
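    If it helps to see the mechanics, here is a minimal sketch in Python (pandas/statsmodels) of the kind of first-try fit and residual checks described above.  The file name and column names (vessel, pipe, stiff) are hypothetical stand-ins for your own data, not anything taken from your post.

```python
# Minimal sketch, not a definitive recipe: fit vessel motion as a function of pipe
# motion, pipe motion squared, stiffness, and a stiffness x pipe-motion interaction,
# then examine the residuals.  All names below are illustrative assumptions.
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from scipy import stats

df = pd.read_csv("pipe_data.csv")   # hypothetical file with columns: vessel, pipe, stiff

fit = smf.ols("vessel ~ pipe + I(pipe**2) + stiff + pipe:stiff", data=df).fit()
print(fit.summary())

# Residuals vs predicted - look for curvature, funnels, clusters, influential points
plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, color="gray")
plt.xlabel("Predicted vessel motion")
plt.ylabel("Residual")
plt.show()

# Normal probability plot of the residuals (the "fat pencil" check)
stats.probplot(fit.resid, plot=plt)
plt.show()
```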

    If you don’t know much about regression methods you will need to borrow some books through inter-library loan.

    I’d recommend the following:

    Applied Regression Analysis – Draper and Smith – read, understand, and memorize the first chapter – the second chapter is just the first chapter in matrix notation and may not be of much use to you. You will need to read, thoroughly understand and do everything listed in Chapter 3 – The Examination of the Residuals.

    Regression Analysis by Example – Chatterjee and Price – an excellent companion to the above and, as the book title says – it provides lots of examples.

    Fitting Equations to Data – Daniel and Wood – from your standpoint the most useful pages of the book would probably be pages 19-27 (in the second edition – page numbers might be slightly different in later editions) for the chapter titled “One Independent Variable” – yes, I know, you have 2 variables – but the plots and the methods will help get you where you want to go.

    You will also want to follow what these books have to say about models with two independent variables and interaction terms (also known as cross-product terms).

     

    #247854

    Robert Butler
    Participant

    I’m missing something.

    The idea of an experimental design is that you have little or no idea of the functional relationship (if any) between variables you believe will have an effect on the output and the level(s) of the output itself.  You put together a design which consists of a series of experimental combinations of the independent variables of interest and you then go out into the field/factory/seabed floor/ hospital OR/whatever and physically construct the various experimental combinations, run those combinations, get whatever output you may get from the experimental run, and then use that data to construct a model.

    It sounds like you already have a model which (I assume) is based on physical principles and mirrors what is known about seabed stiffness/pipe motion behavior.  If you already have a model (which your graph suggests is the case) then all that is left is either a situation where you want to go out on the seabed and run some actual experiments with pipe motion to see if the current model matches (within prediction limits) what is actually observed or a situation where you want to run some matrix of combinations of seabed stiffness and pipe motion just to see what the model predicts.

    If it is the former situation then the quickest way to test your model would be to actually run a simple 2×2 design where you have two seabed conditions and two levels of pipe motion and where you try to find settings that are as extreme as you can make them.  If it is the latter then, unless it is very costly to run your model, I don’t see why you wouldn’t just run all of the combinations and see what you see.  The problem with the latter is, if you already have an acceptable model, I don’t see the point of running all of the calculations since it doesn’t sound like you have any “gold standard” (actual experimental data) for purposes of comparison.
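    For the former case, the 2×2 test matrix is small enough to write out directly.  A minimal Python sketch (the level names are placeholders, not real seabed or pipe-motion settings):

```python
# Sketch of the simple 2x2 matrix described above: two seabed stiffness conditions
# and two pipe-motion levels, run order randomized.  Level names are placeholders.
import itertools
import random

stiffness_levels = ["soft", "stiff"]      # hypothetical extreme seabed conditions
pipe_motion_levels = ["low", "high"]      # hypothetical extreme pipe-motion settings

runs = list(itertools.product(stiffness_levels, pipe_motion_levels))
random.shuffle(runs)                      # randomize the run order

for i, (stiff, motion) in enumerate(runs, start=1):
    print(f"Run {i}: stiffness={stiff}, pipe motion={motion}")
```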

    Shifting subjects for a minute.  Let’s just talk about DOE.

    You said, “My question is that: how can I define different levels for each factor (e.g. 6 levels for factor 1, and 11 levels for factor 2) in the DOE method? which method in this theory is beneficial for my project?

    I have searched around these topics and some guys said me: ‘the “I-Optimal” method of the RSM is helpful for your project. because in this method you can define several levels for each factor and there are not any issues if levels of each factor are not the same as other factor’s levels’.”

    If what you have written above is an accurate summary of what you were told then it is just plain wrong.

    If the graph you have provided really mirrors what is going on then you need to ask yourself the following question:

    Why would I need all of those levels of stiffness and pipe motion?  Consider this – two data points determine a straight line, 3 data points will define a curve that is quadratic in nature, 4 data points will define a cubic shaped curve, 5 data points will define a quartic curve, and 6 will give you a quintic.  The lines in your graph are simple curves – no inflections, no double or triple inflections – in short no reason they couldn’t be adequately described with measurements at three different levels and thus, no reason to bother with more than three levels.

    As for the notion concerning I-optimality (or any of the other optimalities) – they are just computer aided designs.  The usual reason one will opt for a computer aided design is because there are combinations of independent variables that are known to be hazardous, impossible to run, or are known to provide results that will be of zero interest (for example – we want to make a solid – we know the combination of interest will only result in a liquid) and if we try to examine the region of interest using one of the standard designs we will wind up with one or more experiments of this type in the DOE matrix.

    Finally, I’ve never heard of any design that requires all variables to have the same levels.  If this was said then my guess is what was meant was when you are running a design it is not necessary for, say, all of the high levels of a particular variable to be at exactly the same level.

    If that was the case then, yes, this is true – but this is true for all designs – basically, as long as the range of “high” values for the “high” setting of a given variable do not cross over into the range of the “low” settings for that variable you will be able to use the data from the design to run your analysis.  Of course, if you do have this situation then you will need to run the analysis with the actual levels used and not with the ideal levels of the proposed design.

     

     

     

    #247849

    Robert Butler
    Participant

    Maybe we are just talking past one another but I’m not sure what you mean by “the numerical output from the process creates the shape of the model.”

    A random variable is a function that takes a defined value for every point in sample space. – Statistical Theory and Methodology in Science and Engineering – Brownlee – pp. 21.  Given this I would say the process is the random variable and the output is just a quantification of what the random variable is doing. In other words – the random variable is going to determine the output distribution and all the numerical record is doing is cataloging that fact.

    #247848

    Robert Butler
    Participant

    Based on what you have posted the short answer to your question is – no – DOE won’t/can’t work in this situation.

    The factors in an experimental design require either the ability to change the levels of each factor independently of one another or, in the case of mixture designs, to vary ratios of the variables in the mix.  In your situation you do not have the ability to change stiffness and pipe motion independently of one another – indeed it does not sound like you have the ability to change either of these variables in the real world setting.

    The point of a DOE is to provide outputs that are the result of controlled, organized changes in inputs.  The DOE does not predict anything it only acts as a means of data gathering and once that data has been gathered you have to apply analytical tools such as regression to the existing data in order to understand/model the outputs of the experiments that were part of the DOE.

    #247444

    Robert Butler
    Participant

    I’m not sure what you mean by the regular rules.  If you mean the issue of control limits based on standard deviations about the means then you can still use them.  The issue is one of expressing the data in a form that will allow you to do this.  When you have an absolute lower bound which is 0 the usual practice is to add a tiny increment to the 0 values, log the data, find the mean and standard deviations in log units and then back transform.  What you will get will be a plot with asymmetric control limits.  Once you have that you can press on as usual.
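    A minimal sketch of that transform-and-back-transform bookkeeping in Python is below; the data values, the size of the increment added to the zeros, and the use of plain ±3 standard deviations are illustrative assumptions, not a prescription for any particular process.

```python
# Sketch: add a tiny increment to the zeros, log the data, get the mean and standard
# deviation in log units, then back-transform.  The limits come out asymmetric.
import numpy as np

x = np.array([0.0, 1.2, 0.4, 0.0, 2.3, 0.7, 1.9, 0.0, 0.5, 1.1])   # made-up data
eps = 0.01                                   # tiny increment for the zero values
logged = np.log(np.where(x == 0, eps, x))

mean_log = logged.mean()
sd_log = logged.std(ddof=1)

center = np.exp(mean_log)
lcl = np.exp(mean_log - 3 * sd_log)          # back-transformed limits, asymmetric
ucl = np.exp(mean_log + 3 * sd_log)
print(f"center = {center:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```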

    #247378

    Robert Butler
    Participant

    That would make sense if the aspect of the supplements you believe most impacts the outcome is the percentage of total nitrogen, otherwise it won’t matter.

    The idea is if total nitrogen was understood to be the critical component of the supplements you would choose the two supplements with the lowest and highest levels of total nitrogen and run the analysis on the basis of nitrogen content. This would allow you to treat the supplements as continuous.

    Another option would be sub-classes of raw materials and supplements.  In that case you might have some raw materials with an underlying continuous variable of one kind and others with a different one.  If you have this case then you would do the same thing as mentioned above.  Of course, this would mean building experiments with combinations of raw materials and supplements.  My guess would be you couldn’t do this with the catalysts so they would have to remain categorical variables.  However, if you can express the raw materials and supplements as functions of critical underlying continuous variables and can mix them together then the number of experiments would be drastically reduced.

    For example, let’s say your raw materials fall into three groups with respect to a critical continuous variable and your supplements can be grouped into 4 separate groups – you could put together an 8 run saturated design with these and then run the saturated design for each one of the separate catalysts. That would give you a 32 run minimum.  If you tossed in a couple of replicates (I’d go for a random draw from each of the 4 saturated designs so I had one replicate for each catalyst) you would have 36 runs and you could get on with the analysis.

    #247351

    Robert Butler
    Participant

    If you have a situation where you can only run one raw material with no combinations of other raw materials, and similarly for the catalysts and the supplements, and you have no underlying variables that would allow you to convert them to continuous measures, then you are stuck with running all the possible combinations, which would be 480.  I understand the need for blocking, so what you will want to do is draw up the list of the 480 combinations, put them in order on an Excel spreadsheet, go out on the web and find a random number generator that will generate without replacement, and generate a random number sequence for 1-480.  Take that list, enter it in a second column in the Excel spreadsheet so there is a 1-1 match between the two columns, sort on the second column so now 1-480 are in order, and use that template to guide you in your selection of experiments (the first 10, then the second 10, etc.)

    As a check it would be worth randomly adding a few of the experiments back into the list (again, do a random selection) to permit a check with respect to run-to-run variability.
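    The same bookkeeping the Excel/random-number-generator recipe produces can be sketched in a few lines of Python; the A/B/C factor labels below are just the generic names from the earlier posts and the number of replicates is an arbitrary choice.

```python
# Sketch: enumerate the 480 combinations, shuffle them into a random run order
# (sampling without replacement), and slip a few replicate runs back in at random
# positions as a run-to-run check.
import itertools
import random

raw_materials = [f"A{i}" for i in range(1, 16)]   # 15 raw materials
supplements   = [f"B{i}" for i in range(1, 9)]    # 8 supplements
catalysts     = [f"C{i}" for i in range(1, 5)]    # 4 catalysts

combos = list(itertools.product(raw_materials, supplements, catalysts))   # 480 runs
random.shuffle(combos)

run_sheet = combos[:]
for rep in random.sample(combos, 5):              # a handful of replicates
    run_sheet.insert(random.randrange(len(run_sheet) + 1), rep)

for run_number, (a, b, c) in enumerate(run_sheet, start=1):
    print(run_number, a, b, c)
```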

     

     

    #247345

    Robert Butler
    Participant

    You will have to provide some more information before anyone can offer much of anything in the way of advice.  You said, “I have 15 difference raw materials (A1, A2… A15), 4 catalysts (C1, C2, C3, C4) and 8 supplements (B1, B2…B8). ”

    Questions:

    1. Is this a situation where the final process can have combinations of raw materials, catalysts and supplements or is this a situation where you can only have a raw material, a catalyst and a supplement?

    2. If this is the first case and you have absolutely no prior information concerning the impact of any of the raw materials, catalysts, or supplements on your measured response and you have no sense of any kind of a rank ordering of raw materials, catalysts, or supplements as far as what you think they might do then your best bet would be to construct a D optimal screen design with 27 variables – this would probably be a design with a maximum experiment count of 40 or less.

    a. If you do have prior information and you can toss in any combination from raw materials, catalysts, and supplements then I’d recommend building a smaller screen design using only those items from raw materials, catalysts, and supplements that you suspect will have either a minimum impact or a major impact on the response.

    3. If this is a case where you can only have one from column A, one from column B and one from column C is there any way to characterize the raw materials, catalysts, and supplements with some underlying physical/chemical variable that will allow you to treat the raw materials, catalysts and supplements as continuous variables?

    a. For example – let’s say the raw materials are from different suppliers but they are all the same sort of thing (plastic resin for example) and their primary differences can be summarized by examining say their funnel flow rates, the catalysts are all from different suppliers but they can be characterized by their efficiency at doing what they do and the supplements can also be characterized by whatever aspect of the final product they impact – say viscosity values.  If you have this situation then what you have is a situation with three variables at two levels.  In this case the lowest level for A would be the raw material that had the least of whatever it is and the high level would be the raw material that had the highest level of whatever and so on for catalyst and supplements.  This would be a design of about 10 experiments – a 2**3 with a couple of repeats (a quick sketch of such a design follows this list).

    4. If there isn’t any way to characterize the variables in the three categories and you have to treat everything as a categorical (type) variable then you are stuck with having to run all combinations – 480.
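    Here is the sketch promised in 3a: a 2**3 design on the hypothetical underlying continuous variables (flow rate, efficiency, viscosity) with a couple of repeat runs added.  The variable names and coded levels are assumptions for illustration only.

```python
# Sketch of a 2**3 factorial with a couple of repeats (about 10 runs), using coded
# -1/+1 levels for the assumed underlying continuous characterizations.
import itertools
import random

factors = ["flow_rate", "efficiency", "viscosity"]      # hypothetical characterizations

design = list(itertools.product((-1, +1), repeat=3))    # the 8 corner points
design += random.sample(design, 2)                      # a couple of repeats -> 10 runs
random.shuffle(design)

for i, run in enumerate(design, start=1):
    settings = ", ".join(f"{name} = {level:+d}" for name, level in zip(factors, run))
    print(f"Run {i}: {settings}")
```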

    #247303

    Robert Butler
    Participant

    The short answer is the CLT doesn’t have anything to do with the capability analysis.

    You are correct – the CLT refers to distributions of means not to the distribution of any kind of single samples from anything.  The statement from the WI (whatever that is)  “based on the CLT, a minimum sample size of n ≥ 30 (randomly and independently sampled parts) is required for a capability analysis. The stated rationale is that generally speaking, n ≥ 30 is a “large enough sample for the CLT to take effect”.”  comes from a complete misunderstanding of the CLT.  As for the second part about the 30 samples rule of thumb – that’s not worth much either.

    If you need a counter-example to help someone understand the absurdity of that statement consider the case where the choices are binary and you take a sample of 30 and get a group of 0’s and a group of 1’s.  If you make a histogram of your measures you will have two bars – one at the value of 0 and the other at the value of 1 and no matter how hard you try that histogram cannot be made to look normal.

    For a capability analysis you have to have some sort of confirmation that the single samples you have gathered exhibit an acceptable degree of normality.  The easiest way to do this is to take the data and plot it on normal probability paper and apply the fat pencil test.  If it passes that test you can use the data to generate an estimate of capability.

    If it doesn’t, in particular if the plot diverges wildly from the perfect normal line and exhibits some kind of extreme curvature, you should consult Chapter 8 in Measuring Process Capability by Bothe. It is titled “Measuring Capability for Non-Normal Variable Data”.  That chapter provides the method for capability calculations when the data is non-normal.
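    A minimal sketch of the probability-plot check in Python (scipy/matplotlib) is below; the sample is just simulated data standing in for your own capability measurements.

```python
# Sketch: plot the capability sample against the ideal normal reference line and
# apply the "fat pencil" test by eye.  The data here are simulated for illustration.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

sample = np.random.default_rng(1).normal(loc=10.0, scale=0.5, size=30)

stats.probplot(sample, dist="norm", plot=plt)
plt.title("Normal probability plot - fat pencil test")
plt.show()
```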

    #247224

    Robert Butler
    Participant

    Control charts don’t test for normal distribution and, while the standard calculation for process capability requires approximately normal data, it is also true that process capability can be determined for systems where the natural underlying distribution of the data is non-normal.  Chapter 8 in Bothe’s book Measuring Process Capability titled “Measuring Capability for Non-Normal Variable Data” has the details concerning the calculation.

    #247032

    Robert Butler
    Participant

    @andy-parr  it’s a spammer.  They get through from time to time – I’ve had the same person/entity trying for the same thing.  I’ve seen a few in the past as well.  This one is just more persistent.

    #246995

    Robert Butler
    Participant

    From what I found after a quick Google search for the definition of TMV the big issue with that process is exactly what you stated in your second post: TMV focuses on the issue of repeatability and reproducibility.  Since assessment of repeatability and reproducibility is the main issue, and since that was what I thought you were asking in your first post, your question collapsed to one of determining a measure of the variability that could be used for test method validation. It was/is this question I am addressing and it has nothing to do with skill in using, or knowledge about, TMV.

    The kind of process variability you will need for your TMV will have to reflect only the ordinary variability of the process. Ordinary process variation is bereft of special cause variation.  The variability associated with bi-modal or tri-modal data contains special cause variation.  As a result, should you try to use the variability measure from such data for a TMV your results will be, as I stated, “all over the map.”

    I appreciate your explaining the issue with respect to the testing method – mandatory is mandatory and, I agree, there’s no point in worrying about it. I only questioned the methods since your first post left me with the impression you were just trying various things on your own.

    So, taking your first and second posts together I think this is where you are.

    1. Your process can be multimodal for a variety of reasons.

    2. You don’t care about any of this since the spec is off in the west 40 somewhere and no one cares about the humps and bumps.

    3. For whatever reason(s) identifying the sources of special cause variation impacting your test setup is not permitted.

    4. When you tried to control for some variables you thought might drive multi-modality you still wound up with a bi-modal distribution in your experimental data.

    5. For the controlled study the two groups of product were distinctly different and both had a narrow distribution.

    6. You want to find some way to ignore/bypass the bi-modal nature of the controlled series of builds and come up with some way to use the data from the controlled build to generate an estimate of ordinary variation you can use for your TMV.

    The easiest way to use your test data to attempt to get some kind of estimate of ordinary variation suitable for a TMV would be to go back to the data, identify which data points went with which mode, assign a dummy variable to the data points for each of the modes (say the number 1 for all of the data points associated with the first hump in the bi-modal distribution and number 2 for all of the data points in the second), and then run a regression of the measured properties against the two levels of the dummy variable.

    Take the residuals from this regression and check them to see if they exhibit approximate normality (fat pencil test).  I’d also recommend plotting them against the predicted values and look at these residual plots just to make sure there isn’t some additional odd behavior.

    If the normal probability plot indicates the residuals are acceptably normal, and if the residual plots don’t show anything odd, then you will take the residuals and compute their associated variability and use this variability as an estimate of the ordinary variability of your process.

    The reason you can do this is because by regressing the data against the dummy variable you have removed the variation associated with the existence of the two peaks and what you will have left, in the form of the residuals, is data that has been correctly adjusted for bi-modality.
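    A minimal sketch of that dummy-variable regression in Python (statsmodels) is below.  The measurements are invented numbers in the neighborhood of the two humps you described, and the column names are assumptions.

```python
# Sketch: tag each measurement with the hump it came from, regress the measurement on
# that tag, then use the residual spread as the estimate of ordinary variation.
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from scipy import stats

df = pd.DataFrame({
    "value": [0.86, 0.87, 0.88, 0.85, 0.89, 1.14, 1.15, 1.16, 1.14, 1.15],  # made up
    "mode":  [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],    # 1 = first hump, 2 = second hump
})

fit = smf.ols("value ~ C(mode)", data=df).fit()

# Fat pencil check of the residuals, then residuals vs predicted
stats.probplot(fit.resid, plot=plt)
plt.show()
plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, color="gray")
plt.show()

print("estimate of ordinary variation (residual SD):", fit.mse_resid ** 0.5)
```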

    …and now the caveats

    1. You only have the results from the one controlled build – you are assuming the results of a second build will not only remain bi-modal but also that the spread of the residuals from the analysis of data from the new build will not be significantly different from the first series.

    2. All you have done with the dummy variable regression is back out the bi-modal aspect – you have no idea if the residuals are hiding other sources of special cause variation, you have no idea how those unknown special causes may have impacted the spread of the two modes, and that means you have no guarantees of what you might see the next time you try to repeat your controlled analysis.

    3. If there is nothing else you can do then, as a simple matter of protecting yourself, I would recommend you insist on running a series of the same controlled build experiment over a period of time (say at three-month intervals) for at least a year and see what you see. My guess is, even if you manage to somehow only have two modes each time you run the experiment, you are going to see some big changes in the residual variability from test-to-test.

    4. If you should ever have to face a quality audit armed with only the results based on the above and the auditor is an industrial statistician you will have some explaining to do.

    #246952

    Robert Butler
    Participant

    After a Google search I’m guessing that TMV stands for Test Method Validation.  If this is the case then I think you need to back up and re-think your situation.

    What follows is a very long winded discourse concerning your problem so, first things first. The short answer to your question is – no – running a TMV and pretending the bi-modal nature of your results does not matter is a fantastic way to go wrong with great assurance.

    You said you have what appears to be a multimodal distribution of your output but everything meets customer spec.  Based on this and on some other things you said it sounds like the main question is not one of off-spec but just one of wondering why you get multimodal results.

    1. You said,”We are now repeating TMV for that test, and due to its destructive nature, we must use ANOVA to determine %Tol. We also calculate STD ratio (max/min). 4 operators are required.”

    2. You also said,”For TMV we limited the build process ranges – one temp, one operator etc and we have a distinctly bimodal distribution (19 data points between 0.850 and .894 and 21 data points between 1.135 and 1.1.163) LSL is 0.500. Reduction to a unimodal distribution is not worth the expense from a process standpoint, and we wouldnt know how to do so, since it may be incoming materials causing this distribution (All are validated, verified, suppliers audited etc. Huge headache, huge expense to make changes.)”

    The whole point of test method validation is an assessment of such things as accuracy, precision, reproducibility, sensitivity, specificity, etc.  Since your product is all over the map and since your (second?) attempt at TMV gave results that were also all over the map for reasons unknown the data resulting from your attempt at TMV, as noted in #2 above, are of no value.  The data from #2 is not accurate and you cannot use that data to make any statements about test method precision.

    I would go back to #2 and do it again and I would check the following:

    1. Did I really have one temperature?

    2. Was my operator really skilled and did he/she actually follow the test method protocol?

    3. What about my “etc.” – were all of those things really under control, or, if I couldn’t control them, did I set up a study to randomize across those elements I thought might impact my results (shift change, in-house temperature change, running on different lines, etc.)?

    4. It’s nice to know the suppliers of incoming raw material are “validated, verified, suppliers audited etc.” but that really isn’t the issue. The two main questions are:

    1) What does the lot-to-lot variation of all of those suppliers look like (both within a given supplier and, if two or more suppliers are selling you the “same” thing, across suppliers)?

    2) When you ran the TMV in #2 did you make sure all of the ingredients for the process came from the same lot of material and from the same supplier?

    I’m sure you have a situation where not all ingredients come from a single supplier but the question is this – for the TMV in #2 did you lock down the supplies for the various ingredients so that only one supplier and one specific lot from each of those suppliers was used in the study in #2?

    The reason for asking this is because I’ve seen far too many cases where the suppliers had jumped through all of the hoops but when it came down to looking at the process the “same” material from two different suppliers or even the “same” material from a single supplier was not, in fact, the “same”.  The end result for this lack of “sameness” is often the exact situation you describe – multimodal distributions of final product properties.

    A couple of questions/observations concerning point #1:

    1. I don’t see why you think you need ANOVA to analyze the data – nothing in your description would warrant limiting yourself to just this one method of analysis.

    2. You stated you needed 4 operators yet in #2 you said you were using one operator. As written both #1 and #2 are discussing TMV so why the difference in number of operators?

     

    #246877

    Robert Butler
    Participant

    I would disagree with respect to your view of a histogram for delivery time.  I would insist it is just what you need to start thinking about ways to address your problem.

    Given: Based on your post – just-in-time (JIT) for everything is a non-starter.

    Under these circumstances the issue becomes one of minimization of the time spread for late delivery.

    You know 75% of the parts have a time spread of 0-6 days.  What is the time spread for 80, 90, 95, and 99%?  Since all parts are created equal and all that matters is delivery time, the focus should be on those parts whose delivery time is in excess of say 95% of the other parts.  Once you know which parts are in the upper 5% (we’re assuming the spread in late delivery time for 95% of the parts is the initial target) the issue becomes one of looking for ways to ensure a delivery time for the 5% is less than or equal to whatever the upper limit in late delivery time is for the 95% majority.
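    A small Python sketch of that percentile bookkeeping is below; the late-delivery times are made-up numbers standing in for your own data.

```python
# Sketch: find the late-delivery spread covering 75/80/90/95/99% of parts and pull
# out the parts sitting beyond the 95th percentile - the ones to dig into first.
import numpy as np

late_days = np.array([1, 2, 0, 3, 5, 6, 2, 4, 1, 9, 14, 3, 2, 6, 21, 4, 0, 5, 7, 2])

for pct in (75, 80, 90, 95, 99):
    print(f"{pct}% of parts are delivered within {np.percentile(late_days, pct):.1f} days")

cutoff = np.percentile(late_days, 95)
print("parts beyond the 95th percentile:", late_days[late_days > cutoff])
```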

    Having dealt with things like this in the past my guess is you will first have to really understand the issues surrounding the delivery times of the upper 5%.  I would also guess once you do this you will find all sorts of things you can change/modify to pull late delivery times for the upper 5% into the late delivery time range of the 95%.

    Once you have the upper 5% in “control” you can choose a new cut point for maximum late delivery time and repeat.  At some point you will most likely find a range of late delivery time which, while not JIT, is such that any attempt at further reduction in late delivery time will not be cost effective.

     

    #246851

    Robert Butler
    Participant

    Found it.   It’s about 8 months back and it is the same question – the question also has other wrong answers as discussed in the thread.

    https://www.isixsigma.com/topic/lssbb-exam-question-help/

     

    #246850

    Robert Butler
    Participant

    This is the same garbage question another poster asked about several months ago – the answer was wrong then and it is wrong now.

    Three factors at two levels is a 2**3 design = 8 points.  There is no such thing as a 32 factorial design.

    Since it is evening I’ll rummage around a bit and see if I can find the earlier post.  If I’m remembering correctly there are other wrong answers on that test.

     

    #246570

    Robert Butler
    Participant

    It’s not a matter of calculating the t value and then transferring  it into a p-value.  What you are doing is computing a t statistic and then checking that to see if the value you get meets the test for significance.  Small t-values breed large p-values and conversely.  :-)

    You are doing what I recommended and your plot is telling you what you computed – there is a slight offset between the two groups of data which means that the averages are numerically different but there really is no statistically significant difference.  If you want an assessment of the means then you could turn the analysis around – treat the yes/no as categorical and run a one way ANOVA on the results.  You’ll get the mean values for the two choices and you will also get no significant difference.

    All of the above is based on commenting on the overall plot you have provided.  There are a couple of interesting things about your plot you should investigate. It is a small sample size and it looks like your lack of significance is driven  by 4 points – the two points at -1 that are much lower than the main group and the two points at 1 that are higher than the main group.

    As a check – remove those 4 points and run the analysis again – my guess is you will either have statistical significance (P < .05) or be very close.  If you do get significance, and if it is a situation where you were expecting significance, then you will want to go back to the data to see if there is anything out of the ordinary with respect to those 4 points.  If there is something that physically differentiates those points from their respective populations and if their deletion results in significance then I would recommend the following:

    1. Report out the findings with all of the data.

    2. Include the plot as part of the report.

    3. Emphasize the small size of the sample.

    4. Report the results with the deletion of the 4 points.

    5. Comment on what you have found to be different about the  4 points.

    6. Recommend additional samples be taken with an eye towards controlling for whatever it was that you found to be different about the 4 points and re-run the analysis with an increased sample size to see what you see.

    #246525

    Robert Butler
    Participant

    For the casual reader who is late to this spectacle and who does not wish to plow through all 94 posts allow me to provide a summary.

    The OP has a pure theory, based on axioms and postulates of his choosing with no demonstrable connection to empirical evidence, which he has used to evaluate various and sundry aspects of MINITAB analysis and works by Montgomery.  His theory generates results that are wildly at odds with tried and tested methods of analysis as delineated by MINITAB and Montgomery.

    Through a series of polite and courteous posts it has been pointed out to the OP that extraordinary claims, such as his, require extraordinary proof and that the burden of providing this proof is on his shoulders and not on the shoulders of those whom he has insisted are incorrect.

    The OP’s response to this well understood tenet of scientific inquiry has been one continuous run of Proof-By-Volume consisting of raw, naked, ugly, disgusting, hateful, ad hominem attacks on anyone questioning his theory and the conclusions he has drawn from it. The OP has made it very clear he not only has no intention of providing such proof but he also does not think proof is necessary.

    His intransigence with respect to the idea of the need to provide extraordinary proof is mirrored in his choice of words. An assessment of his 51 posts (as of 5 March 2020) indicates his favorite words and word assemblies are (in no particular order)

    Wrong – 47 times – applied to everyone but himself

    Theory – 52 times – by far the favorite

    Ignorance, Incompetence and Presumptuousness – all together or at least one occurrence of any of the three words – 41 times  – again – as applied to everyone else

    As the most prolific responder to his posts I’ve tried to politely point out the shortcomings of his theory and why it isn’t viable.  It has been, as I thought it probably would be, a vain effort. The only thing I’ve received in return is a barrage of ad hominem attacks and multiple accusations of waffling (6 times) (to waffle – to be unable to make a decision, to talk a lot without giving any useful information or answers). Waffling – back at you OP.

    There really isn’t anything else to say – the OP has constructed a theory based on postulates and axioms of his choosing. He has reported the results of his theory as though they were correct while the empirical evidence he himself has compiled in this regard says otherwise, he has provided no proof of the correctness of his theory, and he reacts violently to any suggestion that his theory is wrong.  I’m sure the OP will want to have the last word, as well as the one following that – go for it Fausto  – QED (Quite Enough Discussion)

    #246500

    Robert Butler
    Participant

    As I noted in my post to your thread yesterday your situation is exactly that described by Abraham Kaplan and you have confirmed this to be the case in your post where you specifically stated:

    “For me THEORY is

    · The set of all LOGIC Deductions

    · Drawn

    · From Axioms and Postulates

    · Which allow to go LOGICALLY from Hypotheses to Theses

    · Such that ALL Intelligent and Sensible people CAN DRAW the SAME RESUL”

     

    So, to reiterate – what you have is a pure theory which has no basis in empirical fact and which is, as you are so fond of saying – wrong.  What is particularly interesting about your situation is that you have conclusively proved you are wrong and you have demonstrated that fact time and again.

    In the scientific world when one constructs what they believe to be a new theory the very first thing they do is make sure their theory is capable of accounting for all of the prior known facts in whatever area it is they are working.  Montgomery’s work, the methods and results of T Charts, etc. have been around for a long time and have withstood the close scrutiny and testing by myriads of scientists, engineers, statisticians, health practitioners, etc.  No one has found anything wrong with them.

    If a real scientist constructs a theory and finds it at odds with prior known facts the first thing that individual will do is choose to believe there is something wrong with the theory and spend time either trying to adjust the theory to match the known facts or, if that proves to be impossible, scrap the theory and start over.  In your case, you tested your theory and found it at odds with Montgomery et al. and came to the instant conclusion your theory, based on a set of axioms and postulates of your choosing, was correct and everyone else was wrong.

    As Kaplan noted – the world your theory has deduced will only mirror reality to the extent that the premise mirrors reality. The empirical evidence you have compiled in your uploaded rants about MINITAB and Montgomery’s text repeatedly confirm the fact that your theory (and most certainly the postulates and axioms upon which it is based) does not mirror reality and is therefore wrong.

    Instead of recognizing this very obvious fact you have chosen to spin the failures of your theory as successes, indulge in self-glorification, and shout down and denigrate anyone and anything that challenges your personal view of yourself and your theory. This isn’t science. It is raw, naked, ugly, disgusting, hateful, ad hominem attacks of the lowest order – in other words – Proof By Volume. Mussolini would be so proud.

    As for your demands for my peer reviewed papers all I can say is let’s stay focused on the topic of this thread.  This thread was started by you. It is not about me nor anyone else who has participated. It is about a guy named Fausto Galetto and his pure theory based on axioms and postulates which do not incorporate empirical evidence. Let’s keep it that way.

    #246473

    Robert Butler
    Participant

    In all the time I’ve been a participant on the Isixsigma Forums, this has to be the saddest thread I’ve seen or been a part of.

    Whether you wish to acknowledge it or not @fausto.galetto, this is your situation:

    From: The Conduct of Inquiry – Abraham Kaplan pp. 34-35

    “It is in the empirical component that science is differentiated from fantasy. An inner coherence, even strict self-consistency, may mark a delusional system as well as a scientific one. Paraphrasing Archimedes we may each of us declare, “Give me a premise to stand on and I will deduce a world!” But it will be a fantasy world except in so far as the premise gives it a measure of reality. And it is experience alone that gives us realistic premises.”

    You have a premise – your theory – and you deduced a world. And, based on your scornful rejection of @David007 recommendation for simulations as well as the lack of actual data from processes you personally have run and controlled, that is all you have – a theory bereft of empirical evidence.

    You wrote a paper describing the world resulting from your theory and you sent your description of that world to Quality Engineering. They gave your world some careful thought, compared it to the empirical evidence they had (as well as the current world view based on that evidence), found your world description wanting, and rejected it.

    Based on your posts it appears you did the same thing with MINITAB. They too, examined your world, found it did not agree with, nor adequately describe, the empirical evidence they had on hand, and chose to ignore it.

    All through the exchange of posts to this thread you have made it very clear that Fausto Galetto’s world based on pure theory is the only correct one and anyone who does not agree is ignorant, incompetent and presumptuous.  You, of course, have a right to believe these things but everyone else has a right to believe otherwise.

    What makes this exchange of posts particularly depressing is the idea you believe a company like MINITAB would know about a serious flaw in their T Chart routine and do nothing about it. You seem to forget that hospitals and medical facilities routinely use the MINITAB T Chart.  If there was something seriously wrong with that part of the program there would be ample empirical evidence in the form of lots of dead patients attesting to the fact: There aren’t.   If there had ever been anything seriously wrong with the T Chart program the statisticians at MINITAB would have made proper corrections to the program before it was ever offered in their statistics package.

    #246438

    Robert Butler
    Participant

    @fausto.galetto – sorry, I’m with @David007 on this one.  Your latest response(s) to me (and to him) are true to form – invective and shouting.  I know this is just poking the Tar Baby and prolonging a very sad string of posts at the top of the masthead of a very good site, but you really should re-read what you wrote and contrast that with my last post. There is nothing in any of the 8 statements you cited that have anything to do with THEORY – as you are so fond of putting it.

    Given your violent reaction to anything you perceive as contrary to your views, I actually took some time to look over some of the numerous things you have uploaded to that storage site ResearchGate.  I also took some time to check out the reputation of that site. Three things are immediately evident:

    1. The noble concept of “open access” is DOA and you are taking the life of your grant and the status of your professional reputation/career in your hands if you try to use anything on those sites as a basis/starting point for your work.

    2. If the “paper” you sent to  Quality Engineering looked anything like the stuff you have uploaded to that storage site (open access with no apparent oversight) it is not surprising it was rejected.

    3. Based on what you have posted here and on ResearchGate it appears you strongly favor the PBV method of scientific discourse (Proof By Volume).  That method works well in the world of politics and extreme religious movements but does not advance your cause (and hopefully it never will) in the world of engineering and science.

    #246422

    Robert Butler
    Participant

     

    The following statements – (I won’t bother to bold them since you have already)

    1. “There is no “hidden theory”. Joel Smith has a good paper on t charts …Control charts for Nonnormal data are well documented”

    2. “is your issue that the data is neither exponential nor Weibull, but something else?”

    3. “If you are saying the chart fails to detect assignable causes, try simulating some exponential data.”

    4. “But as I’ve repeatedly said with n=20 the model is going to be wrong anyway.”

    5. “You really should play with some simulated exponential data. You’ll be surprised by what you see as “inherent variation””

    6. “Minitab tech support is not needed. Calculate the scale (Exponential) or scale and shape (Weibull) using maximum likelihood. Compute .135 and 99.865 percentiles.”

    7. “I anxiously await to see your rebuttal paper in any of Quality Engineering, Journal of Quality Technology, Technometrics, Journal of Applied Statistics, etc.”

    8. “along with your spam practice”

    are not goads, are not impolite, and they most certainly are not “IGNORANCE, INCOMPETENCE and PRESUMPTUOUSNESS” as those words are generally understood.

    To goad is to provoke someone to stimulate some action or reaction – there is nothing provocative in any of the cited quotes. A rational assessment of the 8 quotes cited above is as follows: 1-6 are just plain statements of fact and/or polite questions concerning your take on the issue – nothing more.  #7 is a very polite and logical response to all of your posts that preceded it.  And #8 is an objective observation concerning your posting practice.

    In your numerous posts to this thread you have made it very clear you interpret anything that goes against your personal views as arising from “IGNORANCE, INCOMPETENCE and PRESUMPTUOUSNESS”. What you choose to ignore is the obvious fact that your personal views are not shared by the people you are addressing on this forum, by the body of practitioners of science/engineering/process control in general, nor by the body of reviewed and published scientific/engineering literature.

     

    #246416

    Robert Butler
    Participant

    @fausto.galetto – In your post to @David007 on February 29 at 4:39 AM you spewed out a lot of hate and you concluded with the statement “Tell me your publications SO THAT I CAN READ them…. and I will come back to you and to your “””LIKERS”””!”

    @David007 has been courteous and polite in every response he has made to your postings.  Your responses to his posts are nothing but shouting and invective. They are, by turns, irrational, ugly, vicious, denigrating, and hateful.  They have no place on a forum of this type.

    #246342

    Robert Butler
    Participant

    The problem you are describing has two parts.  The first part isn’t one of correlation rather it is one of agreement.  Things can be highly correlated but not be in agreement.  I would recommend you use the Bland-Altman approach to address the agreement part of the problem.

    Bland-Altman

    1. On a point by point basis take the differences between the two methods.

    2. You would like to be able to compare these differences to a true value but, since that is rarely available, use the grand mean of the differences. Compute the standard deviation of the differences and identify the ±1, 2, and 3 standard deviation limits around that mean.

    3. Next compute the averages of the two measures that were used for computing each difference.

    4. Plot the differences (Y) against their respective average values (X).

    5. If you have a computer package that will allow you to run a kernel smoother on this data and draw the fitted line do so.  If you don’t then you can use a simple linear regression and regress Y against X and look at the significance of the slope.  If you can only do the simple linear fit you will want to plot that line on the graph.  The main reason for doing this is to make sure the significance (or lack thereof) isn’t due to a few extreme points.

    To determine agreement you will want to see if the mean of the differences is close to zero. If there is an offset, but no trending then you will have identified a bias in the measures (based on the example you have provided this is to be expected).  If you have a significant trending and it is not being driven by a few data points then you will have a situation where the two methods are not in agreement. If they are not in agreement then you have some major issues to address.
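    A minimal sketch of those Bland-Altman steps in Python is below; method_a and method_b are hypothetical paired measurements, not your data.

```python
# Sketch: point-by-point differences, pairwise averages, mean difference (bias),
# +/-1, 2, 3 SD lines, and a simple linear fit of difference on average.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

method_a = np.array([10.1, 11.3, 12.0, 10.8, 13.1, 11.9, 12.5, 10.4])
method_b = np.array([10.3, 11.6, 12.2, 11.1, 13.3, 12.2, 12.8, 10.7])

diffs = method_a - method_b                 # step 1
means = (method_a + method_b) / 2.0         # step 3
bias = diffs.mean()                         # step 2: grand mean of the differences
sd = diffs.std(ddof=1)

plt.scatter(means, diffs)                   # step 4
plt.axhline(bias, color="black")
for k in (-3, -2, -1, 1, 2, 3):
    plt.axhline(bias + k * sd, linestyle="--", color="gray")
plt.xlabel("Average of the two methods")
plt.ylabel("Difference (A - B)")
plt.show()

res = stats.linregress(means, diffs)        # step 5 fallback: simple linear fit
print("bias:", round(bias, 3), "slope:", round(res.slope, 3), "slope p-value:", round(res.pvalue, 3))
```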

    The second part is the issue of the approximate delta of .25 relative to the lower (9.8) and upper (13.8) limits.

    It looks like you have the following possibilities:

    1. The difference is approximately .25 and both processes are inside the 9.8-13.8 range

    2. The difference is approximately .25 and one process is outside either the upper or lower target and the other process isn’t.

    3. The difference is approximately .25 and both processes are outside of either the upper or lower targets.

    Your post gives the impression that a single instance of the approximate delta of .25 being associated with either of the processes outside the targets counts as a fail.  If this is true then the question becomes one of asking what are the probabilities of getting a difference of approximately .25 given that one or both of the processes are outside the target range and building a decision tree based on what you find.

    #246301

    Robert Butler
    Participant

    As you noted – given a big enough sample (or a small one for that matter), even from a package that generates random numbers based on an underlying normal distribution, there is an excellent chance you will fail one or more of the statistical tests… and that’s the problem – the tests are extremely sensitive to any deviation from perfect normality (too pointed a peak, too heavy tails, a couple of data points just a little way away from 3 std, etc.) which is why you should always plot your data on a normal probability plot (histograms are OK but they depend on binning and can be easily fooled) and look at how it behaves relative to the perfect reference line.  Once you have the plot you should do a visual inspection of the data using what is often referred to as “the fat pencil” test – if you can cover the vast majority of the points with a fat pencil placed on top of the reference line (and, yes, it is a judgement call) it is reasonable to assume the data is acceptably normal and can be used for calculating things like the Cpk.

    I would recommend you calibrate your eyeballs by plotting different-sized samples from a random number generator with an underlying normal distribution on normal probability paper to get some idea of just how odd this kind of data can look (you should also run a suite of tests on the generated data to see how often the data fails one or more of them).
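    A sketch of that calibration exercise in Python is below; the sample sizes, the number of trials, and the particular suite of tests (Shapiro-Wilk, D'Agostino, Anderson-Darling) are arbitrary choices for illustration.

```python
# Sketch: draw samples of various sizes from a genuinely normal generator, run a small
# suite of normality tests on each, and count how often at least one of them rejects.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

for n in (20, 50, 200, 1000):
    flagged = 0
    for _ in range(500):
        x = rng.normal(size=n)
        sw = stats.shapiro(x).pvalue < 0.05              # Shapiro-Wilk
        da = stats.normaltest(x).pvalue < 0.05           # D'Agostino K^2
        ad = stats.anderson(x, dist="norm")
        ad_fail = ad.statistic > ad.critical_values[2]   # Anderson-Darling at the 5% level
        if sw or da or ad_fail:
            flagged += 1
    print(f"n={n}: at least one test rejected {100 * flagged / 500:.1f}% of normal samples")
```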

    I can’t provide a citation for the hyper-sensitivity of the various tests but to get some idea of just how odd samples from a normal population can look I’d recommend borrowing a copy of Fitting Equations to Data – Daniel and Wood (mine is the 2nd edition) and look at Appendix 3A – Cumulative Distribution Plots of Random Normal Deviates.

    I would also recommend you look at normal probability plots of data from distributions such as the bimodal, exponential, log normal, and any other underlying distribution that might be of interest so that you will have a visual understanding of what normal plots of data from these distributions look like.

    #246279

    Robert Butler
    Participant

    I think you will have to provide more information before anyone can offer much in the way of suggestions.  My understanding is you have two populations which have a specified (fixed) difference between them and you want to know if this fixed difference is less than or greater than the difference between the lowest and highest spec limits associated with the two populations.  If this is the case then all you have is an absolute comparison between two fixed numerical differences.  Under these circumstances there’s nothing to test – either the fixed delta is greater or less than the fixed delta between greatest and least spec and things like correlation have nothing to do with it.

    #246276

    Robert Butler
    Participant

    If the measurements within an operator for each part are independent measures and not repeated measures then you could take the results across parts for each operator and regress the measurements against part type (you would have to construct a dummy variable for parts to do this).  The variability of the residuals from this regression would be a measure of the variation within each operator.

    If you then reversed the process and took all of the measurements for a single part across operators and did the same thing with operators as the predictor variable the variation of the residuals for that model would be a measure of the variability within a part.

    If you wanted to get an estimate of the process variation controlling for part type and operator you would take all of the data and build a regression with part and operator as the predictor variables.  The variation of the residuals would be an estimate of ordinary process variation with the effect of operators and parts removed.
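    A minimal sketch of that last regression in Python (statsmodels) is below; the file and column names (measurement, part, operator) are assumptions about how the data might be laid out.

```python
# Sketch: part and operator go in as categorical predictors and the residual standard
# deviation is the estimate of ordinary process variation with both effects removed.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("gage_data.csv")   # hypothetical columns: measurement, part, operator

fit = smf.ols("measurement ~ C(part) + C(operator)", data=df).fit()

print("residual SD (ordinary variation, parts and operators removed):",
      fit.mse_resid ** 0.5)
```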

    In all of these cases you would want to do a thorough graphical analysis of the residuals.  Since you would have the time stamp of each measurement you would want to include a plot of residuals over time as part of the residual analysis effort.  If you’ve never done residual analysis then I would recommend taking the time to read about how to do it.  My personal choice would be Chapter 3 of Draper and Smith’s book Applied Regression Analysis – 2nd edition – you should be able to get this book through inter-library loan.

    The big issue is that of operator measurements on the same part.  You can’t just give an operator a part and have him/her run a sequence of three successive measurements – these would be repeated measures not independent measures and the variation associated with them would be much less than the actual variation within an operator.  You would need to space the measurements of the parts out over some period of time so that each measurement of a given part by each operator has some semblance of independence.

    #246274

    Robert Butler
    Participant

    The yes/no is the X variable and the percentage is the Y, therefore just code no = -1 and yes = 1 and regress the percentage against those two values.  As for the p-value, the “traditional” choice for significance of a p-value is < .05 so, using that criterion, a p of .127 says you don’t have a significant correlation between the yes/no and the percentage.  This would argue for the case that whatever is associated with yes/no is not having an impact on the percentages.

    Since correlation does not guarantee causation and causation will not guarantee you will find correlation what you need to do (you should do this in every instance anyway) is put your residuals through a wringer before concluding that nothing is happening.  You would want to plot the residuals against the predicted values as well as against the yes/no response.  If there are other things you know about the data (for instance, you know it was gathered over time and you have a time stamp for each piece of data) you will want to look at the data and the residuals against these variables as well.

    Since you had a good reason for suspecting a relationship a check of the residual patterns will help you find data behavior that might account for your lack of significance.  If the data structure is adversely impacting the regression you may see things like clusters of data, a few extreme data points, trends in yes/no choices that are non-random over time, etc. which are adversely influencing the correlation.

    If you should find such patterns you would want to identify the data points to which they correspond and re-run the regression. If the revised analysis results in statistical significance then you will need to go back to the process and try to identify the source of the influential data points. If it turns out there is something physically wrong with those data points you could justify eliminating them from the analysis and reporting your findings but you will also want to make sure that you clearly discuss this decision in your report.

     

    #246212

    Robert Butler
    Participant

    There’s any number of ways you could do this.  You could run a regression with percentage as the Y and yes/no as the X.  You could use a two-sample t-test with the percentages corresponding to a “no” as one group and the percentages corresponding to a “yes” as the other group.  If the distribution of the percentages is crazy non-normal you might want to run the t-test and the Wilcoxon-Mann-Whitney test side by side to see if they agree (both either find or do not find a significant difference).
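    A short sketch of the side-by-side comparison in Python (scipy) is below; the percentages are made-up numbers, and the t-test shown is the Welch (unequal variance) form.

```python
# Sketch: split the percentages by the yes/no flag, then run the two-sample t-test and
# the Wilcoxon-Mann-Whitney test on the same split and compare the verdicts.
import numpy as np
from scipy import stats

pct_no  = np.array([12.1, 14.3, 11.8, 13.5, 12.9, 15.0, 13.2])   # made-up data
pct_yes = np.array([13.0, 15.2, 14.1, 12.8, 16.0, 14.6, 13.9])

t_res = stats.ttest_ind(pct_no, pct_yes, equal_var=False)        # Welch two-sample t-test
u_res = stats.mannwhitneyu(pct_no, pct_yes, alternative="two-sided")

print("t-test p-value:      ", round(t_res.pvalue, 3))
print("Mann-Whitney p-value:", round(u_res.pvalue, 3))
```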

    #246013

    Robert Butler
    Participant

    The short answer to your question is there isn’t a short answer to your question.  The explanation provided by Wikipedia is a good starting place but you will have to take some time to not only carefully read and understand what is being said but also put pencil to paper and work through some math.

    I’ve spent most of my professional career as an engineering statistician and biostatistician. I always keep paper and pencil handy when I’m reading a technical article and I take the time to convert what I think I have read into mathematical expressions.  I find when I do this I not only gain a better understanding of what I’ve read but, if I’ve misunderstood what was written, I find it is easier to see that I have misunderstood. This, in turn, helps me on my way to a correct understanding.

    Perhaps if you had written out, in mathematical form, the highlighted section of the quote in your initial post you would have realized your question “How can you have several squared deviations of mean? Square it and it is just (mean)(mean) …?” is incorrect. The highlighted segment does not say “squared deviations of mean”, it says “several squared deviations FROM that mean.”  In mathematical terms it is saying (x(i) – mean)*(x(i) – mean) where x(i) is an individual measurement. The difference between the previous squared expression and what you wrote is all the difference in the world.
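    A tiny numerical illustration of “several squared deviations FROM that mean,” using made-up numbers:

```python
# Each observation's deviation from the mean is squared individually; the squared
# deviations are then averaged (here with n-1) to give the sample variance.
xs = [4.0, 7.0, 9.0, 10.0]
mean = sum(xs) / len(xs)                        # 7.5

squared_deviations = [(x - mean) ** 2 for x in xs]
print(squared_deviations)                       # [12.25, 0.25, 2.25, 6.25]
print(sum(squared_deviations) / (len(xs) - 1))  # sample variance = 7.0
```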

    0
    #245896

    Robert Butler
    Participant

    The phrasing of several of the questions is terrible…and there is one that is simply wrong.  Now that we have that out of the way my thoughts on the questions are as follows:

    Question 46: I would choose B.  When you lay a t-square over the plot there is no way the minimum at 6 is anywhere near 100. If they are claiming D is the correct answer then I don’t see how that’s possible.  “Between 2-4” means anything between those two numbers so you would look at the values from 2.01 to 3.99 and, depending on which line you choose (regression line, 95% CI on means, 95% CI on individuals) you can get a 20-40 range.

    As an aside- if you are going to ask questions like this you owe it to the people taking the test to provide a graph that has enough detail to permit an assessment of questions such as this one – If I ever made a presentation where the required level of detail with respect to output was 10 units my investigators would be really upset if I gave them a graph with a Y axis in increments of 50.

    Question 49: I would choose D.  If the residuals can be described by a linear regression then the residual pattern is telling you that you have some unknown/uncontrolled variable that is impacting your process in a non-random fashion.  The good books on regression analysis will typically show examples of residual plots where the plot is telling you that you have something else you need to consider.  The three most common examples are residuals forming a straight line, residuals forming a curvilinear line, and residuals forming a < or a > pattern.

    Question 52: Of the group of choices C is correct, but the question is terrible.  I’ve never had a case where a manager asked me for the average cost of an item; rather, the issue was always the cost per item, which means you calculate the prediction error based on the CI for individuals.  What they are doing is using the CI on means.  Thus for 3 standard deviations (the industry norm) you would have 4060 + 3*98/sqrt(35) = 4060 + 49.7 = 4109.7.  Your choice is wrong because the issue is not the average cost, rather it is the maximum average cost, and a maximum average of 4200 means you are going to have a fair percentage of the average product costs in excess of 4200 – how much would depend on the distribution of the cost means relative to some grand mean.

    Question 152: I agree, the answer is B.  Please understand that what I’m about to say is not meant to be condescending.  You need to carefully re-read the question – RTY is a function of …..What?

    Question 184:  Your equations shouldn’t have the sqrt of 360 in them – the problem states the process standard deviation is 2.2; therefore that is not s, it is sigma.  The issue is this: for the first equation the Z value is 1 and for the second it is -.681.  If you look at a Z table you will find that for Z = 1 the area under the curve coming in from -infinity is .8413. If you subtract this value from 1 you get the estimate for the percentage of the production in excess of 8.4, which is 15.87%, and 15.87% of 360 = 57.1, which rounds down to their answer.

    However, there is the issue of the product outside the lower bound. This is determined by your second equation, where Z = -.681. The area under the curve coming in from -infinity to -.681 is .2483. This means the total percentage of off-spec product (above the upper limit and below the lower limit) is .1587 + .2483 = .4070, which means the total off-spec count is .4070 x 360 ≈ 146.5, or roughly 146-147 pieces.
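
    If you want to check those table look-ups without a printed Z table, the standard normal CDF in scipy reproduces the areas – a short sketch using the numbers from the problem:

        from scipy.stats import norm

        n = 360
        upper_tail = 1 - norm.cdf(1.0)       # 0.1587 -> fraction above the upper spec
        lower_tail = norm.cdf(-0.681)        # 0.2480 -> fraction below the lower spec

        print(round(upper_tail * n, 1))                  # about 57 pieces above spec
        print(round((upper_tail + lower_tail) * n, 1))   # about 146-147 pieces off spec in total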

    Question 209:  The null hypothesis is that the mean is less than or equal to $4200 therefore the alternate hypothesis is that the mean is greater than $4200.  All of the other verbiage in the problem has nothing to do with the statement of the null and alternative hypothesis so I would agree with E.

    1
    #245682

    Robert Butler
    Participant

    The question you are asking is one of binary proportions. Since you have a sample probability for failure (p = r/n where r = number of failures and n = total number of trials) in the one environment you also have a measure of the standard deviation of that proportion  = sqrt([p*(1-p)]/n).  You also have a measure of the proportion of failures in a different environment where you took 40 samples and found 0 failures.  You can use this information to compare the two proportions as well as determine the number of samples needed to make statements concerning the odds that the failure rate in the new environment is zero and the degree of certainty that the failure rate is 0.

    Rather than try to provide an adequate summary of the mathematics needed to do this I would recommend you look up the subject of proportion comparison and sample size calculations for estimates of differences in proportions.  I can’t point to anything on the web but I can recommend two books that cover the subject – you should be able to get both of these through inter-library loan.

    1. An Introduction to Medical Statistics – 3rd Edition – Bland

    2. Statistical Methods for Rates and Proportions – 3rd Edition – Fleiss, Levin, Paik
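
    If it helps to see the mechanics, here is a minimal sketch in Python of the kind of comparison I'm describing.  The counts are hypothetical (5 failures in 60 trials in the old environment versus your 0 in 40 in the new one), Fisher's exact test is one reasonable choice for small counts, and the “rule of three” gives a quick approximate upper bound when you have seen no failures at all – the books listed above cover the formal versions of all of this.

        import math
        from scipy.stats import fisher_exact

        old_env = (5, 55)      # hypothetical: 5 failures, 55 non-failures (60 trials)
        new_env = (0, 40)      # 0 failures, 40 non-failures (40 trials)

        p_old = old_env[0] / sum(old_env)
        sd_old = math.sqrt(p_old * (1 - p_old) / sum(old_env))   # sqrt([p*(1-p)]/n)

        odds_ratio, p_value = fisher_exact([old_env, new_env])
        print(f"p(old) = {p_old:.3f} +/- {sd_old:.3f}, Fisher exact p = {p_value:.3f}")

        # "rule of three": with 0 failures in n trials, an approximate 95% upper
        # bound on the true failure rate is 3/n
        print("approx. 95% upper bound for the new environment:", 3 / sum(new_env))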

    0
    #245559

    Robert Butler
    Participant

    @cseider – …could be… Happy New Year to you too.   :-)

    0
    #245555

    Robert Butler
    Participant

    The original focus of this exchange was the issue of the validity of the underpinnings of the T-chart.  Your position is that you have developed a theory, that this theory is correct, and that the theory behind the T-chart – apparently unknown to you – is wrong.  Your approach to “proving” this is to

    1. Ask a bunch of MBB’s to provide a solution to the problem(s) you have posed, with the understood assumption that if they don’t exactly reproduce your results then they are wrong.

    2. Demand that the folks at Minitab provide the underlying theory of the T chart and along the way prove you are wrong.

    As I said in an earlier post to this exchange – extraordinary  claims require extraordinary proof and the burden of the proof is on the individual making the claims not on the people against whom the claims are made. I sincerely doubt you will hear anything from Minitab for the simple reason that, to the best of their knowledge, (as well as to the best of the knowledge of anyone who might be using their T-chart program for process control) what they have is just fine.

    It really doesn’t matter what you think about whatever theory you have built nor does it matter that you think you have done whatever it is that you have done correctly.  The issue is this – do you have a “reasonable” amount of actual data where you can conclusively show the following:

    1. Specific instances where the Minitab T-chart control declared a process to be in control only to find that it really wasn’t in control and when your approach was applied to the same data, your approach identified the out-of-control situation.

    2. Specific instances where the Minitab T-chart control declared a process in control, it was found to be in control, and when the data was analyzed with your approach you too found it to be in control.

    3. Instances where your approach declared a process to be out-of-control only to find later it was not out-of-control and, when the data was re-analyzed with the Minitab T-chart, their methods identified the process as in control.

    4. Instances where your approach declared the process to be in control only to find it was out-of-control and when checked with the Minitab approach the out-of-control situation was correctly identified.

    5. …and, once everything is tallied – what kind of results do you have – a meaningful improvement in correct identification of out-of-control situations when dealing with processes needing a T-chart (while simultaneously guarding against increases in false positives) or just some kind of change that, in the long run, is at best, no better than what is currently in use.

    You have made a big point of citing some of Deming’s statements concerning theory.  Although I can’t recall a specific quote, one big point he made was the need for real data before you did anything else.  His book Quality, Productivity, and Competitive Position is essentially a monument to that point.  What you have presented on this forum is a theory bereft of evidence – just claiming everyone else is wrong because it violates your personal theory is not evidence.  The kind of evidence you need is as listed in the points above.  If you have that kind of evidence, and if what you have appears to be a genuine improvement, then, as I’ve said before, the proper venue for presentation is a peer reviewed journal that deals in such matters.

    In this light, your rebuttal concerning “thousands” is without merit.  The point of this thread was that the Minitab T-chart method was wrong, nothing more, and it was in this context that you made the claim that “thousands” of MBB’s were wrong.  The “proof” you provided in your most recent post is based on a claim that the entire Minitab analysis package is wrong. In addition, in an attempt to inflate the “thousands” estimate, you dragged in poor old Montgomery and declared his book has BIG errors.  Statements such as these are not proof – they are just simple gainsaying.

    Now, what to do about the rest of your last post?

    1.  Your rejoinder concerning the Riemann Hypothesis is standard misdirection boilerplate – the fact remains that if your proof had had any merit it would have made the pages of at least one of the journals on mathematics.

    2. How many people have I seen who have admitted a mistake in a publication?  Quite a few – you can find corrections and outright retractions with respect to papers in peer reviewed journals if you take the time to go looking for them.  One good aspect of the peer review process is that the vast majority of mistakes are found, and corrected, before the paper sees publication.

    I happen to know this is true because I do peer reviewing for the statistics sections of papers submitted to one of the scientific journals. I can’t tell you the number of mistakes in statistical analysis I have found in submitted papers.  As part of the review process what I do is point out the mistake(s), provide sufficient citations/instructions concerning the correct method to be used, and request the authors re-run the analysis in the manner indicated.  The length of the written recommendations varies.  The longest I think I’ve ever written ran to almost two full pages of text which were accompanied by attachments in the form of scans of specific pages in some statistical texts.  In that instance the authors re-ran the analysis as requested, were able to adjust the reported findings (most of the significant findings did not need to be changed, a couple needed to be dropped, and, most importantly, they found a few things they didn’t know they had which made the paper even better), addressed the concerns of the other reviewers and their paper was accepted for publication.

    3.  In my career as a statistician, more than 20 of those years were spent as an engineering statistician in industry supporting industrial research, quality improvement, process control, exploratory research, and much more.  I too did not have time to send articles in for peer review and I too made presentations at conferences.  Some of those presentations were picked up by various peer reviewed journals and, after some re-writing for purposes of condensation/clarification saw the light of day as a published paper. It was because of my experience that I asked about yours.  I thought the question was particularly relevant in light of your position concerning your certainty about the correctness of your efforts.

    4. As for reading your archived paper on peer review – no thanks.  I seem to recall that when I checked you had uploaded more than one paper on the subject of peer review to that site and, if memory serves me correctly, all of them had titles suggesting the text was going to be nothing more than a running complaint about the process.

    The peer review process is not perfect and can be very stressful and frustrating – I know this from personal experience.  Sometimes the intransigence of the reviewer is enough to drive a person to thoughts of giving up their profession and becoming an itinerant  beachcomber.  However, based on everything I’ve experienced, the process has far more pluses than minuses and, given the inherent restrictions of time/money/effort of the process, I have yet to see anything that would be a marked improvement.

     

     

    0
    #245538

    Robert Butler
    Participant

    Oh well, in for a penny in for a pound.

    @fausto.galetto – great – so now you have completely contradicted your denial of the comments I made in my first post.

    I said the following:

    /////

    1. A quick look at the file you have attached strongly suggests you are trying to market your consulting services – this is a violation of the TOS of this site.

    2. If you want to challenge the authors of the books/papers you have cited, the proper approach would be for you to submit your efforts to a peer reviewed journal specializing in process/quality control.

    /////

    To which you responded:

    /////

    Dear Robert Butler,

    NO CONSULTING SERVICES TO BE SOLD!

    I am looking for solutions provided by the Master Black Belts (professionals of SIX SIGMA).

    I do not want to challenge the authors!!!

    THEY did not provide the solution!!!!!!!!!!!!!!

    I repeat:

    I am looking for solutions provided by the Master Black Belts (professionals of SIX SIGMA).

    /////

    …and what followed was a series of posts that demonstrated you were indeed challenging authors and you were not looking for solutions since, by your own admission, you had already solved them and you thought they were wrong and you were right.

    So today you post what amounts to an advertisement for your services/training. True, you don’t specifically encourage the reader to purchase anything but the only reference is to you and your publication and the supposed correctness of your approach.

    A couple of asides:

    1. You say “Since then THOUSANDS of MASTER Black Belts have been unable to solve the cases.”

    a. And you determined this count of “Thousands” how?

    b. Why would the fact that “Thousands” of MBB’s have been unable to solve the cases in the way you think they should solve it matter?

    c. Given you are challenging the approach of various authors of books and papers, what makes you think your version is correct?

    d. I only ask “c” because one can find on the web your proof of the “so-called Riemann Hypothesis” – your words – submitted to a general non-peer-reviewed archive on 5 October 2018, which was incorrect (in a follow-up article to the same archive you said it was “a very stupid error”). It would appear that even with this correction the proof was still wrong because, according to Science News for 24 May 2019, the proof of the Riemann Hypothesis (or conjecture) is still in doubt.  The article states “Ken Ono and colleagues” are working on an approach and a summary of their most recent efforts can be found in the Proceedings of the National Academy of Sciences.

    2. It’s nice that Juran praised one of your papers at a symposium – the big question is this – which peer reviewed journal published the work?

    0
    #245473

    Robert Butler
    Participant

    What an IMPRESSIVE retort!!!  @fausto.galetto DO YOU appreciate the INEFFABLE TWADDLE of this entire thread (see, I can yell, insult, and boldface type too) ?

    Let’s do a recap:

    1. OP initial post says he wants a solution for two items in an attached document.

    2. I do a quick skim and offer the comment that I think the correct venue for the OP would be a peer reviewed paper since it certainly looked like a challenge to the authors/papers cited.

    3. The OP responds and assures me this is not the case – all he wants is for some generic six sigma master black belts to provide a solution. Given what follows later it is obvious he doesn’t want a solution – just confirmation of his views.

    4. @Darth offers a suggestion and a recommendation to try running Minitab – and in the following OP post @Darth gets slapped down for his suggestions.

    5. Reluctantly, the OP announces he has downloaded Minitab “BECAUSE NOBODY tird to solve the cases”

    6. The OP then lays into Minitab on this forum because “MInitab does not provide the Theory for T Chart….

    BETTER I DID NOT FIND in Minitab the THEORY!!! IF someone knows it PLEASE provide INFORMATION”

    7. @Darth tries again with a Minitab analysis and provides a graph of the results – the OP slaps him down again.

    8. The OP shifts gears – suddenly, somehow, the OP knows “MINITAB makes a WRONG analysis: the Process is OUT OF CONTROL!!!!!!!!!!!!!!!!!! I asked to Minitab the THEORY of T Charts!!!!!

    9. The OP then slams me for stating he hasn’t tried to solve the cases. OK, fine, I’ll assume he did try. I probably should have said the OP hadn’t tried to do any research with respect to finding the theory of T control charts or really understanding anything about them other than insist Minitab drop everything and get back to him with the demanded information pronto! – but this really doesn’t matter because, as one can see in the other posts, the OP KNOWS Minitab is wrong because he has a proof based on THEORY!  What theory he doesn’t say but apparently it doesn’t matter since this THEORY is better than the THEORY of T Charts – even though, by his own admission, he doesn’t know what that theory is.

    Given the bombast I decided to see what I could find.  I couldn’t find a single peer reviewed paper by the OP listed in either PubMed or JSTOR.  Granted there are other venues but these two happen to be open to the public and are reasonably extensive.

    I did find a list of papers the OP has presented/published at various times, and I found some vanity-press (SPS) publish-on-demand books by the OP.  Of the group, I found the title of his book “The Six Sigma HOAX versus the Golden Integral Quality Approach LEGACY” interesting because the title, as it appears on the illustrated book cover on Amazon, is written in exactly the same manner as the OP’s postings.  Given the book title I find it odd that the OP would ask anyone on a Six Sigma site for help. After all, with a title like that one would assume the OP views practitioners of Six Sigma as little more than frauds and con artists.

    In summary – the OP believes he has found Minitab to be in error and he wants some generic master black belts to confirm his belief.  I do know the folks at Minitab really know statistics and process control. I also know many of the statisticians at Minitab have numerous papers in the peer-reviewed press on statistics and process control and have made presentations at more technical meetings than I could list.  Under those circumstances, if the OP really thinks he is right and Minitab is wrong then the OP first needs to take the time to find, and thoroughly understand, the theory behind the T chart – this would require actually researching the topic – books, peer reviewed papers, etc.

    In science, extraordinary claims require extraordinary proof and the burden of the proof is on the challenger, not on those being challenged.  If, after doing some extensive research,  the OP is still convinced he is right then the proper venue for a challenge of this kind is publication in a peer reviewed journal that addresses issues of this sort.

    2
    #245469

    Robert Butler
    Participant

    To answer your question:

    There are two equations, one for alpha and one for beta.

    What you need to define is what constitutes a critical difference between any two populations (the max value and the min value).

    What you have is the sample size per population (8) and, for the conditions you have specified, the values -1.645 (for alpha = .05) and +1.282 (for beta = .1).

    the alpha equation is  (Yc – Max value)/(2/sqrt(n)) = -1.645

    the beta equation is (Yc – Min value)/(2/sqrt(n)) = 1.282

    subtract the two equations and solve for n.

    Obviously you can turn these equations around – plug in n = 8 and solve for the beta value and look it up on a cumulative standardized normal distribution function table and see what you have.
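
    For what it's worth, a short sketch of that subtraction in Python, taking the 2 in your equations as the standard deviation and assuming a hypothetical critical difference (Max value – Min value) of 2 units:

        import math

        z_alpha = -1.645       # one-sided alpha = .05
        z_beta = 1.282         # beta = .10
        sigma = 2.0            # the 2 in your equations, taken as the standard deviation
        diff = 2.0             # hypothetical critical difference = Max value - Min value

        # subtracting the two equations gives (Min - Max)/(sigma/sqrt(n)) = z_alpha - z_beta
        n = ((z_alpha - z_beta) * sigma / -diff) ** 2
        print(math.ceil(n))    # round up to the next whole sample (9 for these numbers)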

    A point of order:  As I mentioned earlier ANOVA is robust with respect to non-normality. Question: with 8 samples how did you determine the samples were really non-normal?  If you used one of the many mathematical tests for non-normality – don’t.  Those tests are so sensitive to non-normality that you can get significant non-normality declarations from them when using data that has been generated by a random number generator with an underlying normal distribution.  What you want to do is plot the values on probability paper and look at the plot – use the “fat pencil test” – a visual check to see how well the data points form a straight line.
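
    If you don't have probability paper handy, scipy's probplot is a reasonable software stand-in for the fat pencil test – a minimal sketch with a hypothetical sample of 8 values:

        import numpy as np
        import matplotlib.pyplot as plt
        from scipy import stats

        x = np.array([4.1, 5.3, 4.8, 6.2, 5.0, 4.4, 5.7, 5.1])   # hypothetical sample of 8

        stats.probplot(x, dist="norm", plot=plt)   # normal probability plot with a reference line
        plt.show()
        # fat pencil test: if a fat pencil laid along the line would cover the points,
        # treat the data as normal enough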

    If you are interested in seeing just how odd various sized random samples from a normal distribution look I would recommend you get a copy of Daniel and Wood’s book Fitting Equations to Data through inter-library loan and look at the plots they provide on pages 34-43 (in second edition – page numbers may differ for later editions)

    1
    #245009

    Robert Butler
    Participant

    Your post is confusing.  What I think you are saying is the following:

    You have 13 groups and each group has 8 samples.

    The data is non-normal

    You checked the data using box-plots and didn’t see any outliers

    You chose the Kruskal-Wallis test because you had more than 2 groups to compare.

    You ran this test and found a significant p-value.

    If the above is correct then there are a few things to consider:

    1. You don’t detect outliers using a boxplot. The term outlier with respect to a boxplot is not the same thing as a data point whose location relative to the main distribution is suspect.

    2. ANOVA is robust to non-normality – run ANOVA and see if you get the same thing – namely a significant p-value.  If you do get a significant p-value then the two tests are telling you the same thing.

    The issue you are left with – regardless of the chosen test – is this: which group, or perhaps groups, are significantly different from the rest?  In ANOVA you can run a comparison of everything against everything and use the Tukey-Kramer adjustment to correct for multiple comparisons.  In the case of the K-W you would use Dunn’s test for the multiple comparisons to determine the specific group differences.

    If you don’t have access to Dunn’s test you will have to run a sequence of Wilcoxon-Mann-Whitney tests and use a Bonferroni correction for the multiple comparisons.
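
    If you are working in Python, a minimal sketch of running both tests plus the Bonferroni-corrected pairwise follow-up, using made-up data for the 13 groups of 8:

        from itertools import combinations
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(3)
        # hypothetical data: 13 groups of 8, with group 5 shifted so something is there to find
        groups = [rng.normal(loc=10 + 2 * (i == 4), scale=1, size=8) for i in range(13)]

        print("ANOVA p          =", stats.f_oneway(*groups).pvalue)
        print("Kruskal-Wallis p =", stats.kruskal(*groups).pvalue)

        # pairwise Wilcoxon-Mann-Whitney tests with a Bonferroni correction
        pairs = list(combinations(range(13), 2))
        alpha_adj = 0.05 / len(pairs)
        for i, j in pairs:
            p = stats.mannwhitneyu(groups[i], groups[j], alternative="two-sided").pvalue
            if p < alpha_adj:
                print(f"groups {i + 1} and {j + 1} differ (p = {p:.4f} < {alpha_adj:.5f})")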

    0
    #244938

    Robert Butler
    Participant

    I agree nobody, including yourself, tried to solve the cases.  As I’m sure others will tell you, the usual procedure on this site is for you to try to solve the problem, post a reasonable summary of your efforts on a thread and then ask for help/suggestions.

    @Darth pointed you in a direction and gave you a reference.  Given what he offered your approach should have been to get a copy of the book referenced, learn about G and T charts, and do the work manually. Agreed it would have taken more time than if you had a program handy but by doing things manually (at least once) you would have learned a great deal and also solved your problem.

    0
    #244910

    Robert Butler
    Participant

    Based on your post it sounds like you have no idea where your process is or what to expect.  If that is the case then the first thing to do is just run some samples and see what you see.  How many to run will be a function of the time/money/effort you are permitted to spend on what will amount to a pilot study whose sole purpose is to let you know where your process is at the moment and, assuming all is well, to give you an initial estimate of the kind of variation you can expect under conditions of ordinary variation.

     

    0
    #244899

    Robert Butler
    Participant

     

    During the time I provided statistical support to R&D groups my method of support went something like this:

    1. Have the researchers fill me in on what they were trying to do.

    2. Walk me through their initial efforts with respect to quick checks of concepts, ideas – this would usually include the description of the  problem as  well as providing me with their initial data.

    3. Ask them, based on what they had uncovered, what they thought the key variables might be with respect to doing whatever it was they were trying to do.

    4. Take their data, and my notes and look at what they had given me and using graphing and simple regression summaries see if what they had told me was supported in some ways by my analysis of their data.

    5. If everything seemed to be in order I would then recommend building a basic main effects design.  As part of that effort I would ask them for their opinions on combinations of variables they thought would give them what they were looking for.  I would make sure the design space covered the levels of the “sure fire” experiments and, if they were few enough in number (and in the vast majority of cases they were), I would include those experiments as additional points in the design matrix.

    6. Once we ran the experiments I would run the analysis with and without the “sure fire” experiments and report out my findings.  Part of the reporting would include using the basic predictive equations I had developed to run “what if” analysis looking for the combination of variables that gave the best trade off with respect to the various desired outcomes (in most cases there were at least 10 outcomes of interest and the odds of hitting the “best” for all of them was very small).  Usually, we would have a meeting with all of the outcome equations in a program and my engineers/scientists would ask for predictions based on a number of independent variable combinations.

    7. If it looked like the desired optimum was inside the design box we would test some of the predictions by running the experimental combinations that had been identified as providing the best trade-off.

    8. If it looked like the desired optimum was outside the design box we would use the predictive equations to identify the directions we should be looking.  In those cases when it looked like what we wanted was outside of where we had looked my engineers/scientists would want to think about the findings and, usually with my involvement, run a small series of one-at-a-time experimental checks.  Often I was able to take these one-at-a-time experiments and combine them with the design effort just to see what happened to the predictive equations.

    9. If they were satisfied that the design really was pointing in the right direction we would usually go back to #5 and try again.

    10.  If the results of the analysis of the DOE didn’t turn up much of anything it was disappointing but, since we had run a design which happened to include the “sure fire” experiments we would discuss what we had done, and if no one had anything to add it meant the engineers/scientists had to go back and re-think things. However, they were going back with the knowledge that the odds were high that something other than what they had included in the design was the key to the problem and that it was their job to try to figure out what it/they were.

    Point #10 isn’t often mentioned but, in my experience it is often a game changer.  The people working on a research problem are good – the company wouldn’t have hired them if they thought otherwise.  Because they are good and because they have had successes in the past the situation will sometimes occur where the research group gets split into factions of the kind we-know-what-we-are-doing-and-you-don’t.  There’s nothing like a near total failure to get everyone back on the same team – especially when all of the “known” “sure-fire” experiments have been run and found wanting.

    2
    #244849

    Robert Butler
    Participant

    A quick Google search indicates push, pull, and CONWIP are all variations on a theme – basically they focus on having work in process throughout the line (CONWIP = CONstant Work In Process).  The same Google search indicates line balancing is just the practice of dividing the production line tasks into roughly equal portions so that labor idle time is reduced to a minimum. Based on this quick search it would seem the issue is which flavor of production control you want to emphasize rather than wondering about integrating one with the other, because all of them appear to have the same goal – minimum worker idle time and maximum throughput.

    0
    #244538

    Robert Butler
    Participant

    All of the cases of which I’m aware that have run simulations of the results of an experimental design have had actual response measurements from which to extrapolate.  If the simulation is not grounded in fact then all your simulation is going to be is some very expensive science fiction.

    A few questions/observations and an approach to assessing what you have that might be of some value.

    1. You said you identified the critical factors using a number of tools.  In order for those tools to work you had to have actual measured responses to go along with them.

    a. What kind of factors – continuous, categorical, nominal? A mix of all three? Or, what?

    b. Apparently, you have some sort of measure or types of measures you view as correlating with record accuracy. What kind of measurements are these – simple binary – correct/incorrect? Some kind of ordered scale – correct, incorrect but no worries, incorrect but may or may not matter, incorrect and some worries, seriously incorrect? Or is the accuracy measure some kind of percentage or other measure that would meet the traditional definition of a continuous variable?

    2. You said running an experimental design would entail cost.  All experimental efforts cost money. The question you need to ask is this:  Given that running a design would cost money and given that I have taken the time to generate a reasonable estimate of this cost – what is the trade-off?  Specifically, if I were to spend this money, how much could I expect to gain if I identified a way to increase accuracy by X percent?  We are, of course, assuming the accuracy issue is physically/financially meaningful and one where the change in percent would also be physically/financially meaningful.  I make note of this because if you are at some high level of accuracy like 99% then you can only hope for some fraction of a percent change, and the question is what a tiny fraction of a change in the final 1% would translate into with respect to dollars saved/gained.

    If you refuse to run even a simple main effects design (you should note that one does not need to interrupt the process in order to do this) then you are left with the happenstance data you gathered and tested using the tools you mentioned in your initial post.

    In this case you could do the following: Take the block of data you gathered and check the matrix of critical factors you have identified (the X’s) for an acceptable degree of independence from one another.  The usual way to do this is to use eigenvalues and their associated condition indices and run a backward elimination on the X’s – drop the X associated with the highest condition index and then re-run the assessment on the reduced X matrix.  You would continue this process until you have a group of X’s that are “sufficiently” independent of one another within that block of data. An often-used criterion for this measure of sufficiency is having all remaining X variables with a condition index < 10.  If you are interested, the book Regression Diagnostics by Belsley, Kuh, and Welsch has the details.
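
    For what it's worth, a minimal sketch of the condition-index calculation in Python, assuming the candidate X's sit in the columns of a hypothetical numpy array; the Belsley, Kuh, and Welsch recipe scales each column to unit length before looking at the singular values:

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 5))                      # hypothetical happenstance data, 5 candidate X's
        X[:, 4] = X[:, 0] + 0.05 * rng.normal(size=200)    # build in a near-dependency on purpose

        def condition_indices(X):
            Xs = X / np.linalg.norm(X, axis=0)             # scale each column to unit length
            sv = np.linalg.svd(Xs, compute_uv=False)       # singular values, largest first
            return sv.max() / sv                           # one condition index per singular value

        print(np.round(condition_indices(X), 1))
        # indices above roughly 10 flag a near-dependency; drop the offending X and recompute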

    What this buys you is the following: you will know which X’s WITHIN the block of data you are using are sufficiently independent of one another.  What you WON’T KNOW and can never know is the confounding of these variables with any process variables not included in the block of data and the confounding of these variables with unknown variables that might have been changing when the data you are using was gathered.  You will also need to remember your reduced list of critical factors will likely fail to include variables you know to be important to your process.  This failure will probably be due to the fact that variables known to be important to a process are being controlled which means the variability they could contribute to the block of data you have gathered is not significant – in short you will have a case of causation with no correlation.

    Keeping in mind these caveats, you can take your existing block of data and build a multivariable model using those critical factors you have identified as being sufficiently independent of one another. Run a backward elimination regression on the outcome measures using this subgroup of factors to generate your reduced model. Test the reduced model in the usual fashion (residual analysis, goodness of fit – and make sure you do this by examining residual plots – Regression Analysis by Example by Chatterjee and Price has the details). Take your reduced model, apply it to your process and see what you get.  Before you attempt to run the process using your equation, you will want to make sure the signs of the coefficients make physical sense and you will need to make sure everyone understands the model is based on happenstance data and failure of the model to actually identify better practices is to be expected.

    0
    #244328

    Robert Butler
    Participant

    1. A quick look at the file you have attached strongly suggests you are trying to market your consulting services – this is a violation of the TOS of this site.

    2. If you want to challenge the authors of the books/papers you have cited, the proper approach would be for you to submit your efforts to a peer reviewed journal specializing in process/quality control.

    0
    #244140

    Robert Butler
    Participant

    I think you will need to provide more information before anyone can offer much in the way of suggestions.

    1. You said, “The data we have received are sales numbers of 2016-2017-2018 accompanied with purchase number and of course the Sales price and purchase price.”

    2. You also said, “We have to come up with a demand -forecasting model + re-order point for a company.”

    Question: What is the level of detail of the sales numbers?  Monthly/weekly summaries? Actual sale-by-sale reporting for each of the three items and the day they were sold? Or…something else?

    Question: What is the criteria for the need to re-order? X number of items 1,2,3 in non-working inventory, JIT delivery with a known lag time for meeting JIT requirements? Or…something else?

    Question: Regardless of the level of detail, since it is time series data have you plotted the data against time?  If not, why not? If you have, what did you see?  Demand cycling? Steady state sales? etc.

    If you can provide answers to these questions perhaps I or someone else might be able to offer some advice.

     

    0
    #243809

    Robert Butler
    Participant

    In re-reading your comments it occurred to me there is an often overlooked but critical aspect to the issue of variable assessment using the methods of experimental design which should be mentioned.

    In my experience, the worry that one might miss an important interaction when making choices concerning experiments to be included in a design is very often countered by the kind of people for whom the design was run. You should remember, for the most part, the people you are working with are the best your company/organization could get their hands on and they are very good at what they do.  When you give people of this caliber designed data you are presenting them with the following:

    1. Because the data is design data they are guaranteed that the correlations between the variables and the outcomes reflect effects which are independent of one another.  This means they do not have to worry about the inter-correlation relationships between the independent variables because they do not exist.

    2. Because the data is design data they do not have to worry about some hidden/lurking/unknown variable showing up in the guise of one of the variables that were included in the study (you did randomize the runs – right?).

    3. Because the data is design data, if a variable does not exhibit a significant correlation with an outcome they are guaranteed, for the range of the variable examined, the odds of that variable mattering are very low.

    Given the above, and the insight these people have with respect to whatever process is under consideration, what you have is a variation on the saying about card playing – “a peek is as good as a trump.”  In other words the results of the analysis of this kind of data coupled with the knowledge your people bring to the table will very often make up for any shortcomings that were, of necessity, built into the original design.

    What this means is that if there is an interaction that was overlooked in the design construction, their prior knowledge coupled with design data will often result in the discovery of the untested interaction.  This kind of finding will usually occur during the discussion of the design analysis, when the results of the analysis get paired with their knowledge about process performance.  If and when this happens it has often been the case that I was able to take the original design matrix and augment the runs with a couple of additional experiments. These experiments, when added to what was already run, allowed an investigation of the original terms along with an investigation of the “overlooked” interaction.

    Note: If you run an augmentation to an existing design you will want to include one or two design points from the original as a check to make sure things haven’t changed substantially since the time of the initial design run.

    0
    #243712

    Robert Butler
    Participant

    I’m not trying to be mean spirited or snarky but it’s called “experimental” design and not “infallible” design for a reason.  :-) Seriously, this is a standard problem and one every engineer/scientist/MD faces.  Your choices of variables to examine are, of necessity, based on what you know or think you know about a process, and the number of experiments you run is usually determined by factors such as the time/money/effort required.

    The approach I use when building designs for my engineers/scientists/MD’s etc. is to first spend time looking over what data they might have (this is usually a sit down meeting) with respect to the issue at hand and discuss not only what we can see from plots and simple univariate analysis of the data but also what they, as professionals, think might need addressing.

    In the majority of cases the final first design I build for them will be a modified screen with enough runs added to allow for a check for specific two-way interactions their experience suggests might be present.  In addition, if they think one or more of the variables might result in curvilinear behavior I’ll make sure to include enough design points to check that as well.

    Oh yes, one additional item – from time-to-time someone (or a group of someone’s) will have in mind an experiment that they are just positive is really the answer to the problem.  Usually these “sure-fire” experiments are few in number (the sure-fire counts  of 1 and 2 are quite popular) so if they fall inside the variable matrix I will add them to the list of experiments to be run as well (and if they don’t fall inside then my first question is: Why haven’t we extended the ranges of the variables of interest to make sure we have covered the region of the sure-fire experiments?).

    As for building a modified screen of this type my usual procedure is to use D-optimal methods to identify the experimental runs.  What you need to do is build a full matrix for all mains, two-way, and curvilinear effects, force the machine to accept a highly fractionated design for the mains, and then tell the machine to augment this design with the minimum number of points needed for the specific two-way and curvilinear effects of interest.  After you have this design then you can toss in the sure-fire points (if they exist) and you will be ready to start experimenting.
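
    Most DOE software will do the D-optimal augmentation for you, but if you want to see the mechanics, here is a minimal point-exchange sketch in Python under some assumptions of mine: three factors in coded -1/0/+1 units, a model with the three mains plus one specific two-way interaction and one specific curvilinear term, and an assumed budget of 9 runs.

        import numpy as np
        from itertools import product

        def model_matrix(pts):
            # columns: intercept, x1, x2, x3, the x1*x2 interaction, and x1 squared
            x1, x2, x3 = pts[:, 0], pts[:, 1], pts[:, 2]
            return np.column_stack([np.ones(len(pts)), x1, x2, x3, x1 * x2, x1 ** 2])

        # candidate set: the full 3^3 factorial in coded (-1, 0, +1) units
        candidates = np.array(list(product([-1, 0, 1], repeat=3)), dtype=float)

        rng = np.random.default_rng(1)
        n_runs = 9                                  # assumed run budget - adjust to taste
        design = list(rng.choice(len(candidates), n_runs, replace=False))

        def d_crit(idx):
            X = model_matrix(candidates[idx])
            return np.linalg.det(X.T @ X)

        # crude point exchange: swap a design point for a candidate whenever det(X'X) improves
        improved = True
        while improved:
            improved = False
            for i in range(n_runs):
                for j in range(len(candidates)):
                    trial = design.copy()
                    trial[i] = j
                    if d_crit(trial) > d_crit(design) * (1 + 1e-9) + 1e-12:
                        design = trial
                        improved = True

        print(candidates[design])                   # the selected runs in coded units
        print("det(X'X) =", round(d_crit(design), 1))

    This greedy exchange is crude compared to what the commercial packages use, but it shows the idea: pick the candidate runs that maximize det(X'X) for the specific terms you care about.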

    0
    #243697

    Robert Butler
    Participant

    Since it sounds like the issue is that of identifying the “odd” line plot and given that the line plot you have shown is typical it would seem that a better approach would be to use the data you have to identify what looks like a “normal” envelope and then flag any curve that falls outside of that envelope.

    Just looking at the graph it would appear you have the ability to identify an acceptable range of starting points (minimum to maximum X), and if you take a slice through the thickest part of the plot, where the graphs roll over, it looks like you can define a range of acceptability there as well (minimum to maximum Y).

    A way to check this would be to take existing data – identify the unacceptable curves and see where they fall relative to what I assume is the “envelope” of acceptability.

    0
    #243684

    Robert Butler
    Participant

    My understanding of your post is that you have a large number of histograms which summarize elapsed time measurements.  You have taken these histograms and run a curve fitting routine to try to draw the contour of each histogram.  You have overlaid these contour plots and are now confronted with trying to determine which curves tend to follow a certain shape.

    If the above is a fair description of what you are doing then a much easier approach to your problem would be a series of side-by-side boxplots.  They will give you the same information, and you won’t have to worry about colors because all of the groups of data will be present.  Once you have these plots side-by-side it is an easy matter to spot similarities and differences in the boxplot patterns (you will want a boxplot routine that allows an overlay of the actual data points on the boxplot) and quickly assess such things as clustering, means, medians, standard deviations, etc.

    Attached is an example to give you some idea of what I’m describing.  The similarities and differences are readily apparent.  For example, the elapsed times for 2 and 5 years look very similar as do those for 3 and 6 whereas 4 stands alone.
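
    For what it's worth, a minimal matplotlib sketch of this kind of side-by-side display, using made-up elapsed-time samples for five year groups:

        import numpy as np
        import matplotlib.pyplot as plt

        rng = np.random.default_rng(7)
        # hypothetical elapsed-time samples, one array per year group (2 through 6 years)
        samples = {yr: rng.lognormal(mean=1.0 + 0.2 * (yr in (3, 6)), sigma=0.4, size=60)
                   for yr in range(2, 7)}

        fig, ax = plt.subplots(figsize=(7, 4))
        ax.boxplot(list(samples.values()), labels=[f"{yr} yr" for yr in samples])

        # overlay the raw points (with a little horizontal jitter) so clusters stand out
        for i, vals in enumerate(samples.values(), start=1):
            ax.plot(i + rng.uniform(-0.08, 0.08, len(vals)), vals, ".", alpha=0.4)

        ax.set_ylabel("elapsed time")
        plt.show()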

    Attachments: example of side-by-side boxplots (sign-in required to download).
    0
    #243640

    Robert Butler
    Participant

    @jazzchuck – that’s very interesting.  I guess I have mixed feelings about automating the dummy variable.  Question:  Does Minitab take the time to explain to you somewhere in the output what it has done for you and why?  If it doesn’t, and you try running a regression using nominal variables in a package that doesn’t automatically generate a dummy variable matrix, you are really going to go wrong with great assurance.  This might also explain why I have the sense that more people are just dumping nominal variable levels into a regression package, generating stuff that doesn’t make any sense, and then telling me in front of a bunch of people how statistical analysis really won’t be of much value in terms of analyzing the problems “we” deal with.  :-)

     

    0
    #243631

    Robert Butler
    Participant

    You can run a regression with the variables mentioned.  You need to remember the notion that regression variables (either X’s or Y’s) need to be continuous is just standard-issue boilerplate and is overly restrictive.  The reason it is offered in many courses is to make sure you don’t go wrong with great assurance given that all you know about regression is what you learned in some two-week course.

    For the binary variables you can just code them as -1,1 and go from there.  For the nominal you will have to set up dummy variables.  What this amounts to is picking one of the nominal settings as the reference and then coding the others accordingly.  So, for example, if you had a 3 level nominal variable you would code one of them as (d1 = 0, d2 = 0) and the other two would be coded (d1 = 1, d2 = 0) and (d1 = 0, d2 = 1) respectively, and then you would run the regression using d1 and d2 in place of the nominal settings.
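
    If you happen to be doing this in Python rather than a point-and-click package, a minimal sketch of the coding with a hypothetical 3-level nominal variable called region (pandas will build the dummy columns, or you can type them out exactly as described above):

        import pandas as pd

        df = pd.DataFrame({"region": ["north", "south", "east", "north", "east"],
                           "y": [10.2, 11.5, 9.8, 10.6, 9.9]})

        # make "north" the reference level, then d_south and d_east replace the nominal column
        df["region"] = pd.Categorical(df["region"], categories=["north", "south", "east"])
        dummies = pd.get_dummies(df["region"], prefix="d", drop_first=True).astype(int)
        df = pd.concat([df, dummies], axis=1)
        print(df)
        # regress y on d_south and d_east (plus any -1/1 coded binary variables) as usual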

    If you need a more thorough discussion of the subject I would recommend getting a copy of Regression Analysis by Example by Chatterjee and Price through inter-library loan and review the chapter on this subject.

    0
    #243104

    Robert Butler
    Participant

    I don’t understand what you mean by “simulate”.

    If you just want to play games with the variation around a .271 or a .300 batting average over time the easiest thing to do is take a thousand samples – in the first case 271 1’s and 729 0’s and in the second case 300 1’s and 700 0’s.  Put them in two columns with 271 1’s followed by 729 0’s (or the other way around – it doesn’t matter) and the same for the 300 and 700.  Then find a random number generator function out on the web and have it generate a random sequence of integers from 1 to 1000 (sampling without replacement so you only get each integer once). Do this twice and then match the random column one-to-one with the 0’s and 1’s column and then sort the two columns by the random number column so the random numbers are in sequence from 1 to 1000.  This will result in a random draw for the 0’s and 1’s and you can take any sub-sample of any size (1-50,  35-220, 14-700, etc.) and compute the batting averages for those sub-samples and see what you see.
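
    If you have Python handy, numpy's permutation does the random-number-column-and-sort trick in one step – a minimal sketch:

        import numpy as np

        rng = np.random.default_rng(42)

        avg_271 = np.array([1] * 271 + [0] * 729)    # 1 = hit, 0 = out; a .271 "season" of 1000 at-bats
        avg_300 = np.array([1] * 300 + [0] * 700)

        shuffled_271 = rng.permutation(avg_271)      # random draw order, sampling without replacement
        shuffled_300 = rng.permutation(avg_300)

        # batting average over any sub-sample you like, e.g. at-bats 35 through 220
        print(round(shuffled_271[34:220].mean(), 3))
        print(round(shuffled_300[34:220].mean(), 3))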

    1
    #242608

    Robert Butler
    Participant

    The old post cited below should provide some insight into the BOB, WOW issues.

    https://www.isixsigma.com/topic/shainin-and-classical-doe-a-comparison/

     

    0
    #242322

    Robert Butler
    Participant

    In other words, you have noticed some people are not interested in the results of a careful analysis and prefer to go with their gut reactions.  It’s been this way ever since homo sapiens emerged as a distinct species and, barring some natural selection pressure that favors logic over emotion, there’s not much you can do about it in the short term and, on an evolutionary time scale, you won’t be around when and if that kind of selection pressure arises and changes the species from homo sap to homo sensible.

    0
    #242305

    Robert Butler
    Participant

    @cseider – you had to look it up???!!!  What else could CCPS be except Capacitor Charging Power Supply????  A rather esoteric subject but one, I’m sure, that must be of intense interest to some members of the IsixSigma community.

    0
    #242050

    Robert Butler
    Participant

    For a paired frequency test you could have the students run a series of say 50 flips of a clean penny and then run a series of 50 spins of a clean penny to see if there is a significant difference in the odds of the penny coming up heads (or tails).
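
    If the students want to go beyond eyeballing the counts, a minimal sketch of the comparison with hypothetical results (28 heads in 50 flips versus 15 heads in 50 spins) using a chi-square test on the 2x2 table:

        from scipy.stats import chi2_contingency

        #          heads  tails
        flips = [28, 22]          # hypothetical result of 50 flips
        spins = [15, 35]          # hypothetical result of 50 spins

        chi2, p, dof, expected = chi2_contingency([flips, spins])
        print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
        # a small p-value says the odds of heads differ between flipping and spinning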

    0
    #242023

    Robert Butler
    Participant

    Well then, for heavens sake – as a favor to your mental well being – don’t ask it!  ;-)

    Seriously, I can’t decide if you are just venting, asking for civil ways to counter this question when it is put to you, or something else.

    3
    #241834

    Robert Butler
    Participant

    Well, there’s more to it than that.  It is correct that you cannot use the generic curvilinear expression for purposes of prediction because there isn’t any way to assign that curvature to either of the two variables you have.  This means, for purposes of prediction, the best you can do is an equation with mains and interaction.  However, you don’t just remove the term and forget about what it is telling you.

    If you are planning on using the equation you can construct for prediction purposes then there are several things you need to do in order to give yourself some understanding of the risks you are taking by using a predictive equation that has not accounted for all of the special cause variability in your data.

    1. Residual plots – look at the residuals against the predicted – how far away from the zero line are the data points associated with the center points?  If the distance is “large” (yes, this is a judgement call) then you could go wrong with great assurance.  On the other hand if the points associated with the center points are “close to” but still on one side or the other then you are still not accounting for all of the special cause but, if you can’t do any more with respect to a few augmentation runs, it would be worth testing the equation to see how well it performs.

    2. You need to scale and center your X’s in the manner I outlined in my first post and rerun the regression.  As I said, when you do this you will have a way to directly compare the effects of unit changes in the terms in the model.

    Just looking at your equation based on the raw numbers my guess is that the units of X1 and X2 are large which would account for the small coefficient for the interaction.  I’m not sure what to make of the large coefficient for the generic squared term.  If you re-run the analysis with the scaled X1 and X2 you will be able to directly assess the impact of choosing to ignore the existence of curvature.

     

    0
    #241697

    Robert Butler
    Participant

    There are two schools of thought concerning main effects and interactions – one school says if the interaction is significant you need to keep the main effects and the other one says you don’t.

    As for the curvature:

    a) I don’t see that term in your model

    b) Assuming the equation is a typo – the curvature is generic, which means all you know is that it exists; therefore you will need to augment your design with one or perhaps two more experimental runs to identify the variable associated with the curvilinear behavior.

    Something worth considering:

    Take your two variables and scale them from -1 to 1 and re-run the analysis.

    In order to do this you will need to find the actual minimum and maximum settings for each of the two variables and then do the following for each of the two variables.

    A = (maximum + minimum)/2

    B = (maximum – minimum)/2

    Scaled X1 = (X1 – A1)/B1

    Scaled X2 = (X2 – A2)/B2
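
    A short sketch of that centering and scaling in Python, assuming the raw settings sit in two hypothetical lists:

        def scale_to_unit(x):
            a = (max(x) + min(x)) / 2           # A = (maximum + minimum)/2
            b = (max(x) - min(x)) / 2           # B = (maximum - minimum)/2
            return [(xi - a) / b for xi in x]   # scaled values run from -1 to +1

        x1 = [150, 150, 200, 200, 175]          # hypothetical raw settings for X1
        x2 = [1.0, 2.0, 1.0, 2.0, 1.5]          # hypothetical raw settings for X2

        x1_scaled = scale_to_unit(x1)           # [-1.0, -1.0, 1.0, 1.0, 0.0]
        x2_scaled = scale_to_unit(x2)           # [-1.0, 1.0, -1.0, 1.0, 0.0]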

    If you re-run the analysis using the scaled terms you will get an equation that will allow you to directly compare the effects of a unit change in X1, X2, the generic squared term, and the X1*X2 interaction.  What this buys you is the ability to choose between the two schools of thought concerning which terms to keep in the final model.

    For example, if the coefficients of the scaled X1 and X2 are small relative to the coefficient of X1*X2 then, since X1 and X2 alone are not significant nothing much will be lost in the way of model use.  On the other hand if they are equivalent then you might want to consider keeping them in the model.  Of course, all of this assumes you are also doing the usual things one should be doing with respect to assessing the differences in the two models – residual analysis, prediction accuracy, etc.

    0
    #241060

    Robert Butler
    Participant

    No, I’m afraid this discussion is not an example of a complete misunderstanding of a control chart – Wheeler and Chambers know their stuff.

    I agree that with 5 data points I can’t apply all of the control chart tests, but I can do this – I can get an estimate of the ordinary process variance by employing the method of successive differences, I can get an estimate of the overall mean, and I can make a judgement call with respect to the 5 data points I have as far as addressing the question of grossly out of control points.  When you have a process that generates 8 data points per run and when, at best, it is run twice a year, insisting on 30 data points before attempting an analysis means waiting a minimum of 2 years before doing anything – and that approach means you will have to run your process using the old W.G.P.G.A.* rule of thumb, which is an excellent way to go wrong with great assurance.
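
    For what it's worth, a minimal sketch of the successive-difference estimate with five made-up points – the average moving range divided by d2 = 1.128 gives the estimate of ordinary (short-term) variation:

        import numpy as np

        x = np.array([10.2, 10.6, 9.9, 10.4, 10.1])   # hypothetical run of 5 points

        moving_ranges = np.abs(np.diff(x))            # the successive differences
        sigma_hat = moving_ranges.mean() / 1.128      # d2 = 1.128 for subgroups of size 2
        center = x.mean()

        print(f"mean = {center:.2f}, sigma estimate = {sigma_hat:.2f}")
        print(f"rough 3-sigma limits: {center - 3 * sigma_hat:.2f} to {center + 3 * sigma_hat:.2f}")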

    A check of the back of any basic book in statistics will usually reveal a t-table. An examination of that table indicates the minimum number of data points needed for a t-test is 2.  With two data points you can get an estimate of the mean and the variance and you can test that sample mean to see if it is statistically significantly different from a target.

    When you run a two level two variable design it is quite common to have a total of 5 experimental runs and it is also quite common to use the results of those five experiments to develop predictive equations for a process.  Once the equations have been built the first order of business is confirming their validity – typically this will consist of two or three process runs where the equations are used to identify optimum settings.  If the equations do a decent job of prediction (and they do this with appalling regularity) those equations will be incorporated in the methods used for process control.

    In short, the empirical rule of needing 30 data points before running a rough statistical analysis has no merit.

    *Wonder, Guess, Putter, Guess Again

     

    1
    #241055

    Robert Butler
    Participant

    As others have noted – more data points are better when it comes to assessing out of control and for all of the reasons already mentioned.  However, waiting until you have more (we will arbitrarily define “more” as at least 20 data points) presumes the time/money/effort needed to get more data is forthcoming.  I’ve worked on any number of processes where the total number of data points available for constructing a control chart to assess process control was 8 or less and the hope of getting any more data points in a “reasonable” period of time (say a few days or maybe a few weeks) was a forlorn one.

    These processes were batch type processes run on an as needed basis and “as needed” translated into elapsed times of 6 months or more. Since we needed to have an assessment of level of process control I used all of the usual methods for assessing variation in the presence of little data.  One of the big issues was that of the question of the in control nature of the first data point in the next cycle of needed production.  If it was “in control” we would run the process as set up.  If it was “out of control” we had no choice but to adjust the process. This was a decision that was not taken lightly because the effort required to adjust the process was involved, time consuming, and expensive.  What I found was that even with a control chart consisting of data from a single run I was able to use what I had to make accurate assessments of the process when we started running the next cycle of production.  Of course, once I had data from the second cycle it was immediately incorporated on a data-point-by-data-point basis and the chart was updated each time a new data point was added.

    0
    #240852

    Robert Butler
    Participant

    Well @Straydog and @Mike-Carnell that guy @chebetz has found us out!  I guess it’s time we ask Mike Cyger to tell Katie to change our screen names again. I think I’ll go with Dr. Theodore Sabbatini this time around.  :-)

    I can’t answer for Straydog or Mike but, given our interactions over the years, I’m pretty sure their answer will be the same as mine – specifically, no, the forum is not a day job and I don’t sit around waiting for the computer to signal me that someone has posted something new.

    One of my engineers pointed me to this site back in 2002.  I looked over some of the answers to basic statistics questions and couldn’t believe how wrong many of them were.  So, with vicious self-interest as a driving force, I started answering various and sundry stat questions.  I enjoy responding because this gives me an excellent chance to not only revisit some of my textbooks for details but also practice the art of explaining difficult concepts to non-statisticians.  I will usually drop by in the morning before starting work and again in the evening after work to see if there is anything interesting.  If there is and I decide to respond I’ll usually do that during lunch or, if it is short enough, during a coffee break.  If the response is going to be too long or if I want to think about it I will wait until after work to write a response.

    1
    #240841

    Robert Butler
    Participant

    I don’t see any reason why you can’t if you want to.  In most instances though the question will be: which customer spec limits are you going to include on your chart?  I worked in industry for many years before shifting over to medicine and I can’t recall a single product where we were making something for just one customer.

    There is also the issue of having customer specs that are well inside your control limits but, because of their location, the fact that they are is of no major consequence.  For example, let’s say you manufacture polychrome fuzzwarts and the control limits for this sought after product are +- 3 polycrinkles.  Customer #1 has a spec limit of -3 or less to -1 polycrinkles, customer #2 has a spec limit of 2 polycrinkles to 3 or more polycrinkles and everyone else is happy with anything from -3 to +3 polycrinkles.  True, you will have to do lot selection for two of the groups but, if the cost of such an action is less than the cost of shifting the process when orders come in from those two populations, then there is a good chance you will leave the process alone.  Under such circumstances the inclusion of customer spec limits will just clutter the graph.

    1
    #240799

    Robert Butler
    Participant

    Something doesn’t seem right.  I would like to check the calculation. Could you post your total sample size and the values of the two averages?

    0
    #240669

    Robert Butler
    Participant

    Let’s see if we can’t get you off of dead center.

    You said the following:

    Given a set of data, how do you conduct a baseline performance analysis? Is this the same as a capability analysis?

    And, when looking at a set of data, how can you tell where the process changed?

    My assignment asks me to complete a baseline performance analysis and “if the process changed, you must select where it changed and compute your answers.” It later says “Divide up the data as separated in the last question.” What does this mean?

    1. Have you been given a data set to work with?

    2. Is what you posted all that you have been given in the way of a question or is your post a paraphrase of the actual problem?

    3. What is your understanding of the definition of the phrase “baseline performance” as it applies to a data set which (presumably) is supposed to represent process data acquired over some period of time?

    a. Given what you have posted, does the phrase “baseline performance” relate to the process mean, variance, median, IQR, or some other aspect or some combination of the properties listed?

    b. What kind of methods have you been taught to address questions concerning any of the items listed in a)?

    c. What kind of methods have you been taught to address changes in any of the items listed in a)?

    4. What is your understanding of what constitutes a capability analysis?

    5. What is your understanding of what a capability analysis evaluates?

    a. In any of your reading about capability analysis did any author conflate the phrases “capability analysis” and “baseline analysis?”

    b. Have you taken the time to check these two phrases using a Google search?

    1. If so, what did you find?

    If you can provide the answers to these questions you should have a partial answer to your original question and if you post your answers here perhaps I or some other poster may be able to offer additional assistance.

    1
    #240548

    Robert Butler
    Participant

    I’ve been away from any computer contact for the last 10 days, however, Happy belated birthday Mike… 19 July…hmmm interesting day that….

    Circus Maximus catches fire – 64

    15-year-old Lady Jane Grey deposed as England’s Queen after 9 days -1553

    Astronomer Johannes Kepler has an epiphany and records his first thoughts of what is to become his theory of the geometrical basis of the universe while teaching in Graz -1595

    5 people are hanged for witchcraft in Salem – 1692

    HMS Beagle with Charles Darwin arrives in Ascension Island – 1836

    1st railroad reaches Kansas – 1860

    Franco-Prussian war begins – 1870

    0
    #240371

    Robert Butler
    Participant

    I guess what I find most disturbing about the statement “these are actual questions from the exams” is the very real possibility of failing someone because they DID know the material and, conversely, passing someone who was ignorant of the correct answer.

    0
    #240357

    Robert Butler
    Participant

    In this instance the question does not make sense.  The usual way one describes Type I and Type II error is as follows:

    Type I error is the risk of rejecting something when it is true.

    Type II error is the risk of accepting something when it is false.

    So the risk of rejecting something when it is false doesn’t meet the definitions of either Type I or Type II error and Type III error is providing the right answer to the wrong question.

    In this case the null is: No improvement in the process occurred

    The alternative hypothesis is: Improvement in the process occurred.

    So if improvement did occur in the process then the null was false and we correctly rejected it and we did not commit either Type I or Type II error.

     

    0
    #240341

    Robert Butler
    Participant

    It looks to me like they’ve screwed up again.  Answer C is the estimate of the standard deviation (24/6).  Your calculations are correct.

    0
    #240320

    Robert Butler
    Participant

    I’m sure they do. Here’s the issue

    1. What exactly is a 32 factorial design? I’m not aware of any reference that expresses design nomenclature in this manner.

    The usual practice is to say things such as the following:

    1. I have a two level design with 3 factors
    2. I have a full 2 level factorial design for 5 variables for a total run of 32 experiments.
    3. I have a saturated design of 32 runs to investigate 31 variables in 32 experiments.

    As for the claim concerning the correctness of B – just look at the graph – the greatest tool aging occurs with all 3 variables at their highest settings. The only way you could justify the claim would be if you had an interaction plot showing the effects of cutting speed and metal hardness and, on THAT graph, you had a situation where the combination of low cutting speed and high metal hardness resulted in greater tool aging. You cannot deduce answer B from an examination of those main effects plots.

    You are welcome to ask them but, given my experience with such entities it will be nothing more than contrivindum non pissandum.

    0
    #240316

    Robert Butler
    Participant

    I would say C,D,E and I think C is a typo – my guess is that they meant to have 3*2 = 6 experiments.

    Actually, there is another possibility – but if it is the case then the problem is poorly presented and C still remains a typo. Since they do claim it is a factorial design one could take a saturated 3 variable in 4 experiments design and augment it with two additional runs. This would still be a factorial design in 6 experiments but not if we are abiding by the standard definition (which I’m assuming they are) of a factorial design. So, if they mean a factorial design in the usual sense, then it would be a 2**3 design which still means C is a typo.

    Neither A nor B is correct.

    0
    #240269

    Robert Butler
    Participant

    I think you will need to provide some additional details.

    1. In your first post you are talking about weight and in your second post you are talking about height – which is it?
    2. In your first post you will need to clarify what you mean by “I have a normal distribution for a coating weight per part of 1 gram/m2 and a standard deviation of 0.01 g/m2.”
    a. As written it could be taken to mean you have a normal distribution with an average of 1 gram/m2 and a long term population standard deviation of .01 gm/m2 or it could mean you have a part that is 1 gram/m2 which comes from a population that has a population standard deviation of .01 gm/m2. There is also the question of the standard deviation – is it a long term population standard deviation or is it a long term standard deviation of the mean (in which case we will need to know the sample size used to generate the estimate of the standard deviation of the mean)?

    If we assume the 1 gm/m2 represents the long term population mean and if we assume the standard deviation is an estimate of the population standard deviation then the chances of getting a part, never mind a random collection of 100 parts, with an average value of .95 is nil because the mean plus or minus 3 standard deviations only spans a range of .97 to 1.03.
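    If it helps to see the arithmetic, here is a minimal sketch in Python (scipy assumed) using the stated values of a 1 gm/m2 mean and a .01 gm/m2 population standard deviation – the numbers show just how far out in the tail a value of .95 sits:

    from scipy.stats import norm

    mu = 1.0        # assumed long term population mean, gm/m2
    sigma = 0.01    # assumed long term population standard deviation, gm/m2

    z = (0.95 - mu) / sigma                          # a single part at .95 sits 5 standard deviations low
    p_single = norm.cdf(0.95, loc=mu, scale=sigma)   # roughly 3 in 10 million

    # the mean of 100 parts has standard error sigma/sqrt(100) = .001,
    # so an average of .95 would be 50 standard errors below the mean
    p_mean_100 = norm.cdf(0.95, loc=mu, scale=sigma / 10)

    print(z, p_single, p_mean_100)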

    If you can provide more information perhaps I or someone else might be able to offer some suggestions.

    0
    #240187

    Robert Butler
    Participant

    No, it’s not a case of the less you have in the way of repeated measures the less precision you have on the value of the standard deviation of the GRR; rather, it is a case of the less your estimate depends on repeated measures, the better the precision of your estimate of the true value of the standard deviation of the GRR. As I said – repeated measures variation will underestimate the actual standard deviation and that, in turn, will adversely impact your conclusions concerning GRR.

    1
    #240184

    Robert Butler
    Participant

    In addition to what @cseider said you also do not have independent measurements – you have repeated measures on a single item which means your estimates of variation are going to be much less than the real variation you can expect in your process.

    0
    #240005

    Robert Butler
    Participant

    I’m not trying to be funny at your expense but if you really have a process like this then the next Nobel Prize in Physics will be yours on demand because you have found a process that exhibits zero variation which means you have invalidated the second law of thermodynamics – something that heretofore was deemed physically impossible.

    Seriously, it appears you have a measurement system that is completely inadequate with respect to detecting changes in your process. This, in turn, eliminates any possibility of computing process capability. I realize you said in your post to the other thread you started that you couldn’t afford a better measuring device but what I don’t understand is how, with the same measuring device, you could go from the plots you provided in the other thread to the plots you provided in this one.

    If both of these plots are of the same process with the only difference between the two sets of data being some kind of “correction” then I would first want to re-examine the correction process because whatever it is doing is wrong.

    One additional thought – in both of your threads you do not give any sense of the elapsed time for taking the measurements you have illustrated in your plots. I assumed your first plots in the other thread represented independent samples taken over a “reasonable” period of time (at a minimum multiple days). If the time interval is very short then, not only are you at risk of not having independent samples but, if it is short enough it may explain the odd plot you just provided – i.e. the interval was so short that what variation was exhibited was indeed too small to be detected by your measurement device.

    1
    #239995

    Robert Butler
    Participant

    I would recommend you re-examine your data. The IMR plot gives a visual impression of three distinct shifts downward in your data. The first cut point occurs at about data point 44, and the second at about data point 83. It also gives the impression the process is drifting back toward the target around data point 93. In short, the plot would suggest you don’t have the process in control. One of the requirements of a capability study is that the data needs to be from a stable process.

    Before worrying about issues of normality or capability I would want to go back and see if I could identify the reasons for what appears to be special cause variation and change the process so that the data is exhibiting only ordinary process variation. In the interim, if you want to use the data to give yourself a measure of expected ordinary variation (and also check to see if the visual appearance really is signalling the presence of special cause variation) you could generate an estimate of ordinary variability using the method of successive differences.
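    If your software does not have the successive-difference estimate built in, a minimal sketch of the calculation is below (Python with numpy assumed; the data values are placeholders – substitute your own measurements in time order):

    import numpy as np

    rng = np.random.default_rng(1)
    y = rng.normal(50, 2, 60)    # placeholder data – substitute your measurements, in time order

    d = np.diff(y)                                             # successive differences
    sigma_mssd = np.sqrt(np.sum(d**2) / (2 * (len(y) - 1)))    # mean-square successive difference estimate
    sigma_mr = np.mean(np.abs(d)) / 1.128                      # moving-range estimate (d2 = 1.128 for n = 2)
    sigma_overall = np.std(y, ddof=1)                          # ordinary sample standard deviation

    # if sigma_overall is much larger than the successive-difference estimates,
    # that is one more hint the apparent shifts are special cause variation
    print(sigma_mssd, sigma_mr, sigma_overall)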

    1
    #239875

    Robert Butler
    Participant

    I’m not trying to be sarcastic but who is making the claim that linear plots are mostly used in [an examination?] of a linear regression?

    There are any number of reasons why an analysis might result in employing simple linear plots. For example:
    1. Your plan for analysis involves running a two level factorial design. – In this case, by definition, a simple linear plot is all you are going to get.
    2. A plot of your production data says the trend is linear – therefore, if you run a regression on the data, linear is going to be the best fit.
    3. Even though your knowledge of the process says the relationship between your independent and dependent variables is some sort of curvilinear process, the region where you happen to be running your process is a region where the relationship between the independent and dependent variables is essentially linear.

    If you are going to look at curvilinear trending then, at the very minimum, you need to have 3 levels for your independent variables. One of the reasons you run center points on two-level factorial designs (we are assuming the X’s are or can be treated as continuous) is to check for possible deviations from linear. If you find this to be the case then you will often augment the existing design to check for some kind of curvilinear response (usually quadratic/logarithmic) and your final plots will be curvilinear in nature.

    As for cubic (or higher) – the first question you need to ask is when would you expect to encounter such behavior? If you are actually working on something and plots of your production data do suggest you have a real cubic response then you would want to build a design with 4 levels for the various continuous independent variables.

    In all of the years of process analysis the only time I ever encountered a genuine cubic response was when we found out the region we were examining resulted in a phase change of the output (we were adding a rubberized compound to a plastic mix and we crossed the 50% threshold which meant we went from rubberized plastic to plasticized rubber). In short, in my experience, real cubic and higher curvilinear responses are extremely rare.

    There is, of course, data that is cyclic in nature such as monthly changes in demand for a product and I suppose you could try to model data of this type using linear regression methods but the first line of defense when attacking this kind of data is time series analysis and a discussion of that tool is, I think, beyond the focus of your question.

    0
    #239801

    Robert Butler
    Participant

    I see I spoke too soon. What I said above still applies with respect to the graphical presentation and sampling plan you should use when repeating your test but since you provided the actual data there are a couple of other things to consider.

    Again, if we pretend the sequential measures within a given operator are independent (they’re not) then for all of the data we find there is a significant difference between shift means 1817 and 2826 (P < .0001) and we also find there is a significant difference in shift variability with standard deviations of 193 and 98 respectively (P = .016).

    A quick check of the boxplots suggests the single measure of 2280 in the first shift is the cause for the differences in standard deviations. If we re-run the analysis with that value removed what we get are means of 1784 and 2826 – still significantly different (P < .0001) and standard deviations of 150 and 98 (P = .12). So, overall, your first shift has a significantly lower mean scoop measure and a numerically higher (but not significantly different) variability.
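    For anyone who wants to reproduce this kind of check, a rough sketch in Python (scipy assumed) is below. The shift1 and shift2 values are made-up placeholders, not the posted data, and the caveat about non-independent repeated measures still applies:

    import numpy as np
    from scipy import stats

    # made-up placeholder values – substitute the actual scoop measurements
    shift1 = np.array([1600., 1750., 1820., 1900., 1700., 1650., 1780., 1850.])
    shift2 = np.array([2700., 2850., 2900., 2800., 2780., 2950., 2810., 2880.])

    t_stat, p_mean = stats.ttest_ind(shift1, shift2)   # difference in means
    w_stat, p_var = stats.levene(shift1, shift2)       # difference in variability
    print(shift1.mean(), shift2.mean(), p_mean)
    print(shift1.std(ddof=1), shift2.std(ddof=1), p_var)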

    You could go ahead and try to parse the data down to the individual operator but with just five repeated measures per person I wouldn’t recommend doing this for two reasons 1) the measures within each operator are not independent and 2) even if the measures were independent it is highly unlikely that the variability of a sequential 5-in-a-row sample is at all representative of the variability one would see within a given operator over the course of a shift.

    I would recommend summarizing your findings in the form of the graph below along with a few simple summary statistics of the means and the variances and then, before trying to do anything else – address and solve the issues you mentioned in your second post and then re-run the sampling plan in the form I mentioned above.

    Attachments:
    0
    #239799

    Robert Butler
    Participant

    I don’t know Minitab but I do know it has boxplot graphics which I would recommend you use. You would have side-by-side boxplots of the two shifts. What you will want to do is make boxplots which also show the raw data; that way you will not only have the boxplot with its summary statistics but you will also be able to see visually just how the data distributes itself within each shift. See Attached for example.

    For your repetition you will want to take five measurements but not back-to-back – rather randomize the pick through the time of the shift. This will mean extra work for you but it will guarantee that your measurements are representative of what is happening in the course of a shift.

    Attachments:
    0
    #239789

    Robert Butler
    Participant

    Just a visual impression of your data says you have a statistically significant difference in mean scoop size between first and second shift. If we check it formally and pretend the measures within each team member can be treated as independent measures then what you have is 1.22 vs 1.88 (which, given the limit on significant digits reduces to 1.2 vs 1.9) with corresponding standard deviations of .168 and .172 (which rounds to .17 and .17). There is a significant difference in the means (P <.0001) and there is no significant difference (P = .92) with respect to shift variability.

    One thing you should consider. The above assumes the measures within each team member are independent. If the measures you have reported are random draws from a single shift’s work for each team member then it is probably reasonable to treat the measures as independent within a given member. On the other hand, if the measures were just 5 scoops in sequence then you are looking at repeated measures, which means your estimates of the shift variance will be much smaller than the actual variance.

    Before you do anything else you need to understand the cause of the difference in scoop means between shifts. If the material is uniform, the front end loaders are the same, the 6 individuals all have the same level of skill in handling a front end loader and all other things of importance to the process are equal between the two shifts then you shouldn’t have a difference like the one you have in your data.

    0
    #239709

    Robert Butler
    Participant

    Wellll…not counting the mandatory military service I don’t think there’s been any time in my career when I wasn’t involved in process improvement. I started off as a physicist working in advanced optics – the work was a combination of R&D and improving existing systems/processes. One of the groups I worked with had an assigned statistician. I watched what he was doing and liked what I saw so I went back to school and earned my third piece of paper in statistics.

    My role as a statistician in industry was to provide support for R&D and process improvement in specialty chemicals, aerospace, medical measurement systems, and plastics. At my last company our major customer demanded we have six sigma black belts working on their projects. Our management picked three of us to go for training and I was one of them. The end result was I wound up running projects as well as providing statistical support to anyone who needed it.

    Mr. Bin Laden’s actions a few years back got me (along with a number of other people) laid off from that aerospace company so I went job hunting, found a listing for a biostatistician at a hospital, applied, and was accepted. It’s a research hospital but, just like industry, there are process improvement problems so, as before, I work on both R&D and process improvement issues.

    2
    #239658

    Robert Butler
    Participant

    You’re welcome. I hope some of this will help you solve your problem.

    0
    #239629

    Robert Butler
    Participant

    I don’t know the basis for your statement “I know that in order to be able to do the Hypothesis Testing we must also be able to test for Normality and the ‘t-test’,” but it is completely false. There is no prerequisite connection between the desire to test a random hypothesis and ANY specific statistical test. Rather, it is a case of formulating a hypothesis, gathering data you believe will address that hypothesis and then deciding which statistical test or perhaps group of tests you should use to address/test your initial question/hypothesis.

    If you have run afoul of the notion that data must be normally distributed before you can run a t-test – that notion too is wrong. The t-test is quite robust when it comes to non-normal data (see pages 52-58 The Design and Analysis of Industrial Experiments 2nd Edition – Davies). If the data gets to be “crazy” non-normal then the t-test may indicate a lack of significant difference in means when one does exist. In that case check the data using the Wilcoxon-Mann-Whitney test.

    As to what constitutes “crazy” non-normal – that is something you should take the time to investigate. The best way to do that would be to generate histograms of your data, and run both the t-test and the Wilcoxon tests side-by-side and see what you see. You can experiment by generating your own distributions. If you do this what you will see is that the data can get very non-normal and, if the means are different, both the t-test and the Wilcoxon will indicate they are significantly different (they won’t necessarily have the same p-value but in both cases the p-value will meet your criteria for significance).

    With regard to the data you have posted – one of the assumptions of a two sample t-test is that the data samples be independent. Your data strongly suggests you do not have independent samples. As posted, what you have is a group of 7 officials who have tried something using an old and a new method. The smallest unit of independence is the individual official. Therefore, if you want to use a t-test to check this data what you will have to do is use a paired t-test.

    The paired t-test takes the differences within each individual official between the old and the new method and asks the question: Is the mean of the DIFFERENCES significantly different from 0. What this will tell you is, within the group of officials you have tested, was the overall difference in their performances significantly changed – which is to say is the distribution of the DIFFERENCES really different from 0.

    If you want to ask the question – is method new really better than method old you will have to take measurements from a group of officials running the old method, get a second group of officials and have them run just the new method and then run a two sample t-test on the two distributions. If, as I suspect, these methods require training you will need to make sure the two independent groups of officials have the same level of training in both methods.
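    A minimal sketch of the difference between the two tests, in Python with scipy and made-up placeholder scores, might look like this:

    import numpy as np
    from scipy import stats

    # made-up placeholder scores for 7 officials measured under both methods –
    # substitute the actual values, keeping each official's old/new pair together
    old = np.array([72., 68., 75., 70., 66., 73., 71.])
    new = np.array([75., 70., 74., 73., 69., 76., 72.])

    # paired t-test: is the mean of the within-official differences different from 0?
    paired = stats.ttest_rel(new, old)
    print((new - old).mean(), paired.pvalue)

    # a two-sample test (stats.ttest_ind) would only be appropriate if the old and
    # new measurements came from two independent groups of officials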

    0
    #239598

    Robert Butler
    Participant

    As Kaplan put it in his book The Conduct of Inquiry, ” When attacking a problem the good scientist will utilize anything that suggests itself as a weapon.” :-)

    1
    #239447

    Robert Butler
    Participant

    If we couple what you said in your last post with an assessment of the graphs you have generated I think you might have the answer to your question.

    As I said previously, the normal probability plots of cycle time days by department are very important graphs. What they show is that the cycle time data for 4 out of the 5 departments is distinctly bimodal. When you examine the bimodality what you find for Finance, Sales, and Product Plan is a clear break between the distribution of reports with low cycle times and the distribution of reports with high cycle times. The break occurs around 4-5 days. The Prod Dept, probably by virtue of the large report volume relative to the other 3, does have an obvious bimodal distribution with a break around 5 days. One could argue that the Prod Dept plot is tri-modal with a distribution from 1-3 days, a second from 3-6 days and a third with 6 days or more, however, given the data we have I wouldn’t have a problem with calling the Prod Dept data bimodal with a break between low and high cycle time reports at around 5.

    The Exploration Department is unique not only because its report cycle time data is not bimodal but also because it appears that only one report has a cycle time of greater than 5 days.

    What the normal probability plots are telling you is whatever is driving long report cycle times is independent of department. This is very important because it says the problem is NOT department specific, rather it is some aspect of the report generating process across or external to the departments.

    Now, from your last post you said, “Data for this report is received from the production department. Out of this data, 18 were clean while 50 were dirty…” If I bean count the data points on the normal probability plot for Prod Dept for the times of 5 days or less I get 18. This could be just dumb luck but it bears investigation.

    For each department – find the reports connected with dirty data and the reports connected with clean data – data error type does not matter.  If the presence of dirty data is predominantly associated with reports in the high cycle time distribution (as it certainly seems to be with the Prod Dept data) and if it turns out that the Exploration Dept data has no connection (or perhaps a single connection with the 6 day cycle time report), then I think you will have identified a major cause of increased report cycle time.

    If this turns out to be true and someone still wants a hypothesis test and a p-value the check would be to take all of the reports for all of the departments, class them as either arising from clean or dirty data, and then class them as either low or high cycle times depending on which of the two distributions with each of the departments they belong (which is which will depend on the choice of cut point between the two distributions – I would just make it 5 days for everyone – report cycle time <=5 = low and >5 = high). This will give you a 2×2 table with cycle time (low/high) on one side of the table and data dirty/clean on the other. If the driver of the longer cycle times for reports is actually the difference between clean and dirty data then you should see a significant association.
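    If you want to run that 2×2 check in software, a minimal sketch in Python (scipy assumed, with made-up placeholder counts) would be:

    import numpy as np
    from scipy import stats

    # rows: clean / dirty data, columns: low (<=5 days) / high (>5 days) cycle time
    # the counts are made-up placeholders – fill in your own bean counts
    table = np.array([[20, 4],
                      [6, 45]])

    chi2, p, dof, expected = stats.chi2_contingency(table)
    print(p)

    # with small expected cell counts Fisher's exact test is the safer choice
    odds_ratio, p_exact = stats.fisher_exact(table)
    print(p_exact)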

    1
    #239440

    Robert Butler
    Participant

    There’s no need to apologize. You have an interesting problem and I just hope that I will be able to offer some assistance. To that end, I may be the one that needs to apologize. I was re-reading all of your attachments and, in light of what you have in the most recent attachment, I’m concerned that my understanding of your process is incorrect.

    My understanding was you had 5 departments in the CD group and that any one of the 5 could produce any one of the 8 types of reports and could expect to deal with any one of the four types of data errors. In reviewing the normal probability plots of the departments it occurred to me that there is a large difference in data points for the probability plot for the production department when compared to all of the others. This, coupled with the large count for Prod Volume reports is the reason for my concern about a dependence between department and report type.

    My other assumption was that the counts for the types of error occurrence were “large,” which is to say counts in the realm of >25; however, your most recent attachment indicates the boxplots for the types of error are summaries of very small groups of data.

    So, before we go any further – are the assumptions I’ve made above correct or are they just so much fantasy and fiction?
    In particular:
    1. What is the story concerning department and report generation?
    2. Do any of the departments depend on output from the other departments to generate their reports?
    3. What is the actual count for the 4 error categories?
    4. Why is it that production department has so many more data entries on the normal probability plot than any of the other departments?

    If I am wrong (and I think I am) please back up and give me a correct description of the relationships between departments, report generation, and types of error a given department can expect to encounter.

    One thing to bear in mind, the graphs you have generated are graphs that needed to be generated in order to understand your data so if a change in thinking is needed the graphs will still inform any additional discussion.

    0
    #239431

    Robert Butler
    Participant

    As @Straydog has noted, referencing measurements to standards has a very long history. The link below suggests the practice of referring to such checks as MSA began sometime in the 1970s.

    https://www.q-das.com/fileadmin/mediamanager/PIQ-Artikel/History_Measurement-System-Analysis.pdf

    1
    #239427

    Robert Butler
    Participant

    A few general comments/clarifications before we proceed
    1. You said “I guess I was looking at the p-value because I know that would determine the kind of test I run eventually.” I assume this statement is based on something you read – whoever offered it is wrong – p-values have nothing to do with test choices.
    2. When I made the comment about the boxplots and raw data what I meant was the actual graphical display – what you have are just the boxplots. Most plotting packages will allow you to show on the graph both the boxplot and the raw data used to generate the boxplots and it was this kind of graphical presentation I was referencing.
    3. Cycle time is indeed continuous so no problems there and your departments/report types etc. are the categories you want to use to parse the cycle times to look for connections between category classification and cycle time.

    Steps 1- 3

    Overall Data Type

    Your first boxplot of data types – no surprises there – however keep this plot and be prepared to use it in your report – it provides a quick visual justification for focusing on dirty data and not wasting time with DR data

    Error Type

    The graph says the two big time killers are, in order, duplicated data (median 7.1) and aggregated data (looks like mean and median are the same 4).

    Report Type
    The biggest consumers of time, in order, are Prod Volume (median 5.995), Balances (median 5.79) and Demand Volume (median 4.7). So from a strictly univariate perspective of time these would be the first places to look. However, this approach assumes equality with respect to report volumes. If there are large discrepancies in the report output count then these might not be the big hitters. What you should do is build a double Y graph with the boxplot categories on the X axis, the cycle time on the left side Y axis (as you have it now) and report volume on the right-hand Y axis. This will give you a clear understanding of both the time expended and the quantity. If you don’t have a double Y plotting capability, rank order the boxplots in order of number of generated reports and insert the count either above or below the corresponding boxplot. Knowing both the median time required and the report volume will identify the major contributors to increased time as a function of report type.
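    If your package has no double Y option at all, a rough sketch of the idea in Python with matplotlib (all values below are placeholders) is:

    import matplotlib.pyplot as plt

    # placeholder values – one list of cycle times and one report count per report type
    report_types = ["Prod Volume", "Balances", "Demand Volume", "Other"]
    cycle_times = [[5.2, 6.1, 5.8, 6.4], [5.5, 6.0, 5.9, 5.6], [4.1, 4.9, 5.0, 4.6], [2.0, 2.5, 3.1, 2.7]]
    report_count = [50, 22, 18, 30]

    fig, ax1 = plt.subplots()
    ax1.boxplot(cycle_times)
    ax1.set_xticklabels(report_types)
    ax1.set_ylabel("cycle time (days)")

    ax2 = ax1.twinx()    # second Y axis for report volume
    ax2.plot(range(1, len(report_count) + 1), report_count, "o--", color="gray")
    ax2.set_ylabel("number of reports")
    plt.show()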

    Department Type

    Same plot as before and I have nothing more to add as far as this single plot is concerned (more on this later).

    Normality plots

    These tell you a lot. With the exception of the Exploration department, all of the other departments have a bimodal distribution of their cycle times. Distribution ranges: Finance – 1-4 and 5-7 days; Prod Plan – <1-4 and 6-8 days; Sales – <1-3 and approx. 4.5-5 days. The Prod department may have a tri-modal distribution: 0-3, 3-<6, and 6-9 days.

    Based on your graphs you know the following:
    1. For each department you know the data corresponding to the greatest expenditures of time. For Finance it is the 5-7 day data, for Prod Plan it is the 6-8 day data, for sales it is the 4.5-5 day data, and for prod department it is the 6-9 day data.
    2. You know the two biggest types of error with respect to time loss.
    3. Given your graphical analysis of the report types and volumes you will have identified the top contributors to time loss in this category.

    Your first statistical check:
    Since you are champing at the bit to run a statistical test here’s your first. Bean count the number of time entries in each of the 4 error types. Note – this is just a count of entries in each category; the actual time is not important.
    1. Call the counts from aggregates and duplicate data error group 1 and the counts from categories and details error group 2. Run a proportions test to make sure that the number of entries in group 1 is greater than the number of entries in error group 2. (This should make whoever is demanding p-values happy – a minimal sketch of one way to run this check follows this list.)
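    A minimal sketch of that check in Python (scipy 1.7 or later assumed; the counts are placeholders) could be a one-sided binomial test of whether group 1 holds more than half of the entries:

    from scipy import stats

    # made-up placeholder counts – substitute your bean counts
    n_group1 = 42    # entries with aggregated or duplicated data errors
    n_group2 = 19    # entries with category or detail errors

    # one-sided binomial test: is group 1 more than half of all entries?
    result = stats.binomtest(n_group1, n_group1 + n_group2, p=0.5, alternative="greater")
    print(result.pvalue)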

    Given that there really are more entries in aggregate and duplicate and given that you know from your graphical analysis which of the report types are the big hitters, pareto chart those reports, take the top two or three, and then run your second check by grouping the reports into two categories – the top three (report group 1) and all of the others (report group 2).

    Now run your second check on proportions – a 2×2 table with error groups vs report groups. This will tell you the distribution of error groups across report groups. You may or may not see a significant difference in proportions.

    Either way – what you now want to do is take the two report groups and the two error groups and look at their proportions for the two regions for Finance, Prod Plan, Sales, and Prod Department. What you are looking for is a differentiation between the short and long times for each department as a function of report type and error type. To appease your bosses you can run proportions tests on these numbers as well.

    If the error types and the report types are similar across departments for the longer elapsed times then you will know where to focus your efforts with respect to error reduction and report streamlining. If they are quite different across departments then you will have more work to do. If there isn’t a commonality across departments I would recommend summarizing what you have found and reporting it to the powers that be because this kind of finding will probably mean you will have to spend more time understanding the differences between departments and what specific errors/reports are causing problems in one department vs another.

    2
    #239424

    Robert Butler
    Participant

    The histograms and the boxplots suggest some interesting possibilities. Based on what you have posted I’m assuming your computer package will not let you have a boxplot which includes the raw data. On the other hand, if there is a sub-command to allow this you really need to have a plot that shows both.

    If my understanding of your process is correct – namely, the process flow is Draft Report and then it goes to Cleanse Data – the histograms suggest a couple of possibilities:
    1. The Draft Report (DR) phase is completely independent of the Cleanse Data (CD) phase
    2. The folks in the DR phase are ignoring all of the problems with respect to data and are just dumping everything on the CD group.

    Of course, if it is the other way around – first you need to cleanse the data and then you draft a report then what you are looking at is the fact that, overall, the CD group did their job well and DR gets all of the benefits of their work and kudos for moving so quickly.

    My take on the histograms and the comment on the Multivar analysis is that for all intents and purposes both DR and CD have reasonably symmetric distributions. What I see is a very broad distribution for CD. I don’t see where there is any kind of skew to the left worth discussing.

    To me the boxplots indicate two things:
    1. Production Planning has the largest variation in cycle time
    2. Production Department has the highest mean and median times and appears to have some extreme data points on the low end of the cycle time range.

    My next step would be the following:
    1. If you can re-run the boxplots with the raw data do so and see what you see.
    2. Plot the data for all 5 of the CD departments on normal probability paper and see what you can see. These plots will help identify things like multimodality.
    3. The boxplots suggest to me that the first thing to do is look at the product department and product planning.
    4. Go back to the product department data and identify the data associated with the extremely low cycle times and see if there is any way to contrast those data points with what appears to be the bulk of the data from the product department.
    a. Remove those extreme data points and re-run the boxplot comparison across departments just to see what you see.
    5. Take a hard look at what both prod department and product planning do – are they stand alone (I doubt this)? Do they depend on output from other departments before they can begin their work? If they do require data from other departments what is it and what is the quality of that incoming work? Do they (in particular product department) have to do a lot of rework because of mistakes from other departments? etc.

    The reason for the above statements is because the boxplots suggest those two departments have the biggest issues the first because of greater cycle time and the second because of the broad range. The other thing to keep in mind is that prod planning has cycle times all over the map – why? This goes back to what I said in my first post – if you can figure out why prod planning has such a spread you may be able to find the causes and eliminate them. There is, of course, the possibility that if you do this for the one department what you find may carry over to the other departments as well.

    1
    #239405

    Robert Butler
    Participant

    If your post is an accurate description of your approach to the problem then what you are doing is wrong and you need to stop and take the time to actually analyze your data in a proper statistical manner.

    The first rule in data analysis is plot your data. If you don’t know what your data looks like then you have no idea what any of the tests you are running are telling you.

    So, from the top, I’d recommend doing the following:

    You said, “I have … collected data for 2 main activities within the process for analysis – the cycle times for ‘draft report’ and ‘cleanse data’ (with information on report type, department, error type, etc.). I did a multi-vari analysis which indicates that a large family of variations are from the ‘cleanse data’ activity.”

    Run the following plots:

    First – side by side boxplots of cycle times (be sure to select the option that plots the raw data and not just the boxplots themselves) split by activity. Do the same thing with overlaid histograms of cycle times by activity type.
    What this buys you;
    1. An assurance that the “large family of variations from the cleanse data” really is a large family of variation and not something that is due to one or more of the following: a small group of influential data points; multi-modalities within cleanse data and/or within draft report whose overall effect is to give an impression of large variation within one group when, in fact, there really isn’t; some kind of clustering where one cluster is removed from the rest of the data; etc.

    Second – side by side boxplots within each of the two main categories for cycle time where the boxplots are a function of all of the things you listed (“report type, department, error type, etc.”). What you want to do here is have boxplots on the same graph that are plotted as (draft report (DR) report type A), (cleanse data (CD) report type A), (DR report type B, CD report type B)…etc.)
    What this buys you:
    1. An excellent understanding of how the means, medians, and standard deviations of reports, department, error type, etc are changing as a function of DR and CD. I am certain you will find a lot of commonality with respect to means, medians, and standard deviations between DR and CD for various reports, departments, etc. and you will also see where there are major differences between DR and CD for certain reports, departments, etc.

    2. The boxplots will very likely highlight additional issues such as data clustering and extreme data points within and between DR and CD. With this information you will be in a position to do additional checks on your data (why clustering here and not there, what is the story with the extreme data points, etc.)

    3. If the raw data on any given boxplot looks like it is clustered – run a histogram and see what the distribution looks like – if it is multimodal then you will have to review the data and figure out why that is so.

    Third – multivar analysis – I have no idea what default settings you used for running this test but I would guess it was just a simple linear fit. That is a huge and totally unwarranted assumption. So, again – plot the data. This time you will want to run matrix plots. You will want to choose the simple straight line fit command when you do these. The matrix plots will allow you to actually see (or not) if there is trending between one X variable and cycle time and with the inclusion of a linear fit you will also be able to see a) what is driving the fit and b) if a linear fit is the wrong choice.

    1. You will also want to run a matrix plot of all of the X’s of interest against one another. This will give you a visual summary of correlations between your X variables. If there is correlation (which, given the nature of your data, there will be) then that means you will have to remember when you run a regression that your variables are not really independent of one another within the block of data you are using. This means you will have to at least check your variables for collinearity using Variance Inflation Factors (eigenvalues and condition indices would be better but these things are not available in most of the commercial programs – they are, however, available in R). A minimal sketch of the VIF check appears below.
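    Here is that sketch in Python (statsmodels assumed; the predictor columns below are placeholders):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # placeholder predictors – swap in your own X columns
    rng = np.random.default_rng(0)
    X = pd.DataFrame({"report_volume": rng.normal(30, 5, 80),
                      "error_count": rng.normal(4, 1, 80)})
    X["total_entries"] = X["report_volume"] + X["error_count"] + rng.normal(0, 0.5, 80)  # deliberately correlated

    Xc = sm.add_constant(X)
    for i, col in enumerate(Xc.columns):
        if col != "const":
            print(col, variance_inflation_factor(Xc.values, i))
    # VIFs much above roughly 5-10 are the usual warning sign of collinearity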

    Once you have done all of the above you will have an excellent understanding of what your data looks like and what that means with respect to any kind of analysis you want to run.

    For example – You said “I want to know with 95% confidence if the cycle time for cleanse task process is influenced by the source department that prepared the original data file with HO: µD1 = µD2 =µD3 = µD4 = µD5 and HA: At least one department is different from the others. Tests: Based on my A-D normality test done on the cycle time data, I concluded it was not normal and proceeded to test for variance instead of mean by using levene’s test since I have more than 3 samples (i.e.5 departments)”

    Your conclusion that non-normality precludes a test of department means and thus forces you to consider only department variances is incorrect.

    1. The boxplots and histograms will have told you if you need to adjust the data (remove extreme points, take into account multimodality, etc). before testing the means.

    2. The histograms of the five departments will have shown you just how extreme the cycle time distributions are for each of the 5 departments. My guess is that they will be asymmetric and have a tail on the right side of the distribution. All the Anderson-Darling test is telling you is that your distributions have tails that don’t approximate the tails of a normal distribution – big deal.

    Unfortunately, there are a lot of textbooks, blog sites, internet commentary, published papers which insist ANOVA needs to have data with a normal distribution. This is incorrect – like the t-test, ANOVA can deal with non-normal data – see The Design and Analysis of Industrial Experiments 2nd Edition – Davies pp.51-56 . The big issue with ANOVA is variance homogeneity. If the variances across the means are heterogeneous then what might happen (this will be a function of just how extreme the heterogeneity is) is ANOVA will declare non-significance in the differences of the means when, in fact, there is a difference. Since you have your graphs you will be able to decide for yourself if this is an issue. If heterogeneity is present and if you are really being pressed to make a statement concerning mean differences you would want to run Welch’s test.
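    A rough sketch of the homogeneity check and the one-way ANOVA in Python (scipy assumed; dept1 through dept5 are placeholder samples) is below; if Levene flags heterogeneous variances and a statement about the means is still demanded, a Welch-type ANOVA – available in several packages – is the usual fallback:

    import numpy as np
    from scipy import stats

    # placeholder cycle-time samples for the five departments – substitute your data
    rng = np.random.default_rng(3)
    dept1, dept2, dept3 = rng.normal(4, 1, 30), rng.normal(5, 1, 25), rng.normal(5, 2, 40)
    dept4, dept5 = rng.normal(6, 3, 35), rng.normal(4, 1, 20)

    # variance homogeneity first – this is the assumption ANOVA actually cares about
    print(stats.levene(dept1, dept2, dept3, dept4, dept5))

    # ordinary one-way ANOVA on the department means
    print(stats.f_oneway(dept1, dept2, dept3, dept4, dept5))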

    On the other hand, if you do have a situation of variance heterogeneity across departments, what you should do first is forget about looking for differences in means and try to figure out why the variation in cycle time in one or more departments is significantly worse than the rest and since you have your graphs you will know where to look. I guarantee if heterogeneity of the variances is present and if you can resolve this issue the impact on process improvement will be far greater than trying to shift a few means from one level to another.

    1
    #239378

    Robert Butler
    Participant

    Please understand I’m not trying to be snarky, condescending, or mean spirited but your post gives the impression that all you have done is dump some data in a machine, hit run, and dumped out a couple of statistics that suggest your data might not exactly fit a normal curve. To your credit it would appear you do know that one of the requirements for a basic process capability study is data that is approximately normal.

    Let’s back up and start again.

    I don’t have Minitab but I do know it has the capability to generate histograms of data and superimpose idealized plots of various distributions on the data histogram.

    Rule number 1: Plot your data and really look at it. One of the worst things you can do is run a bunch of normality tests (or any kind of test for that matter) and not take the time to visually examine your data. One of the many things plotting does is put all of the generated statistics in perspective. If you don’t plot your data you have no way of knowing if the tests you are running are actually telling you anything – they can be easily fooled by as little as one influential data point.

    So, having plotted your data:

    Question 1: What does it look like?
    Question 2: Which ideal overlaid distribution looks to be the closest approximation to what you have?

    Since you are looking at cycle time you know you cannot have values less than zero. My guess is you will probably have some kind of asymmetric distribution with a right hand tail. If you wish you can waste a lot of time trying to come up with an exact name for whatever your distribution is but given your kind of data it is probably safe to say it is approximately log normal and for what you are interested in doing that really is all you have to worry about.

    One caution: if your histogram has a right tail which appears to have “lumps” in it – in other words it is not a “reasonably” smooth tapering tail you will want to go back to your data and identify the data associated with those “lumps”. The reason being that kind of a pattern usually indicates you have multiple modes which means there are things going on in your process which you need to check before doing anything else.

    If we assume all is well with the distribution and visually it really is non-normal then the next thing to do is choose the selection in Minitab which computes process capability for non-normal data. I don’t know the exact string of commands but I do know they exist. I would suggest running a Google search and/or going over to the Minitab website and searching for something like capability calculations for non-normal data.
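    I can’t give you the Minitab command string, but if you want to see the idea behind one common percentile-based capability calculation for non-normal data, here is a rough sketch in Python (scipy assumed). It assumes a lognormal fit and a single upper spec limit, both of which you would need to confirm for your own data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    cycle_time = rng.lognormal(mean=1.0, sigma=0.4, size=200)    # placeholder data – use your own
    USL = 6.0                                                    # assumed upper spec limit

    shape, loc, scale = stats.lognorm.fit(cycle_time, floc=0)    # fit a lognormal to the data
    x_median = stats.lognorm.ppf(0.5, shape, loc, scale)
    x_99865 = stats.lognorm.ppf(0.99865, shape, loc, scale)

    # percentile analogue of Ppk for a one-sided (upper) spec
    ppu = (USL - x_median) / (x_99865 - x_median)
    print(ppu)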

    1
    #238893

    Robert Butler
    Participant

    I had to look up the definition of IIOT – and the manager in charge of obfuscation proliferation strikes again … :-). If all you have is paper entries then someone somewhere is going to have to sit down and transcribe the data into some kind of electronic spreadsheet.

    I suppose there might be some kind of document scanner that could do this but my guess would be a scanner like that would require uniformity with respect to paper entries – something like a form that is always filled out the same way. Even if you could do that there would have to be a lot of data cleaning because paper records, even when in a form, can and will have all sorts of entries that do not meet the requirements for a given field. For example – say one field is supposed to be a recorded number for some aspect of the process. I guarantee you will get everything from an actual number, for example 10, to anything approximating that entry: 10.0, >10, <10, ten, 10?, na, blanks, etc.

    There are some ways to approach a data set of this kind. Since the process will have changed over time (and, most likely, the reporting and measurement systems will have as well) the first block of data to examine would be the most recent. Let’s say you have a year’s worth of data for a given process. First talk to the people doing the recording. The focus will be on their memories with respect to times during the last year when the process was functioning properly and times when it wasn’t.

    Using their memories as a guide, go into the paper records and pull samples from those time periods, find someone who can do the data entry, and transcribe the data and analyze what you have. If the memory of the people on the line is any good (and in my experience it is usually very good) your sample will provide you with an excellent initial picture of what the process has done and where it is at the moment.

    1
    #238892

    Robert Butler
    Participant

    This looks like a homework problem so rather than just give you the answers – let’s think about how you would go about addressing your questions.

    Question 1: How do you determine the average effect of A?
    Question 2: How do you determine the average effect of B?
    Question 3: If you have the average effect of A – what is another term for that average?
    Question 4: If you have the average effect of B – what is another term for that average?
    Question 5: How do you know that your average effect of A is independent of your estimate of your average effect of B?
    Question 6. If you wanted to know the effect of the interaction AxB what would you do to your design matrix to generate the column for AxB?
    Question 7: Given the column for AxB how would you determine its average effect?
    Question 8: What is another term for the average effect of AxB?

    If you can answer these questions you will not only have the answers to the questions you have asked but you will have an excellent understanding of the power of experimental design methods over any other kind of approach to assessing a process.

    4
    #238766

    Robert Butler
    Participant

    If that’s the case then the issue should be framed and analyzed in terms of what you mean by on-time delivery.

    1. What do you mean by on-time delivery? – need an operational definition here
    2. What does your customer consider to be on-time delivery? – need an operational definition here too.
    3. What do you need to take into account to adjust the on-time delivery definition for different delivery scenarios?
    a. If it is the physical delivery of a product to a company platform – how are you addressing things like, size of load, distance needed for travel, etc.
    b. If it is a matter of delivering something electronically – how are you adjusting for things like task difficulty, resource allocation to address a problem, etc.

    After you have your definitions and an understanding of how your system is set up for production you should be able to start looking for ways to assess your on-time delivery and also identify ways to improve it.

    0
    #238625

    Robert Butler
    Participant

    Not that normality has anything to do with control charting but, in your case, 100% represents a physical upper bound. If your process is operating “close” to the upper bound then the data should be non-normal – which would be expected. If it is “far enough” away from 100% then I could see where the data might be approximately normal – again not an issue – but if your process is that removed from 100% then it would seem the first priority would be to try to find out why this is the case instead of tracking sub-optimal behavior.

    To your specific question, since 100% is perfection and since you want to try tracking the process with a control chart, what you want is a one-sided control chart where the only interest is in falling below some limit.

    1
    #238524

    Robert Butler
    Participant

    Ok, that answers one of my questions. I still need to know what constitutes a rigorous design. Given your answer to the question of order I’m guessing all rigorous means is that you actually take the time to implement the design and not stop at some point and shift over to a WGP analysis (wonder-guess-putter); however, I would still like to know what you actually mean by “rigorous”.

    To your first question concerning two-three level factorial designs or any of the family of composite designs, the biggest issues for running a design are the issues that drove you to wanting to run a design in the first place – time, money, effort. What this typically means is taking the time to define the number of experiments you can afford to run and then constructing a design that will meet your constraints.

    From a political standpoint the biggest issue with running an experimental design is convincing the powers that be that this is indeed the best approach to assessing a problem. You need to remember that simple one-variable-at-a-time experimentation is what they were taught in college, it is what got them to the positions they occupy today and, of course, it does work.

    If you are in a production environment where you will be running the experiments in real time you will have to spend a lot of time not only explaining what you want to do but you will also have to work with the people involved in the production process to come up with ways of running your experiments with minimal disruption to the day-to-day operation. The political aspects of design augmentation can be difficult but, as someone who has run literally hundreds of designs in the industrial setting, I can assure you that the issues can be resolved to everyone’s satisfaction.

    0
    #238508

    Robert Butler
    Participant

    You will have to provide a little more detail before anyone can offer much.
    1. What is a rigorous experimental design?
    2. What are lower and higher orders?

    1
    #238438

    Robert Butler
    Participant

    Given two levels for each factor you have the following:
    Factor A, df = (a-1)
    Factor B, df = (b-1)
    Factor C, df = (c-1)
    Factor D, df = (d-1)

    in each case a = b = c = d = 2

    For each interaction
    AxB, df = (a-1)*(b-1)
    AxC, df = (a-1)*(c-1)
    etc.
    AxBxC, df = (a-1)*(b-1)*(c-1)
    etc.
    and
    AxBxCxD = (a-1)*(b-1)*(c-1)*(d-1)
    Error, df = a*b*c*d*(n-1) where n = number of replicates
    Total df = a*b*c*d*n -1
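    As a quick check of the bookkeeping, here is a sketch (Python) of the df arithmetic for the 2^4 case with n = 3 replicates, i.e. the 48-run scenario discussed below:

    a = b = c = d = 2
    n = 3

    df_main  = 4 * (a - 1)                              # A, B, C, D
    df_2way  = 6 * (a - 1) * (b - 1)                    # AB, AC, AD, BC, BD, CD
    df_3way  = 4 * (a - 1) * (b - 1) * (c - 1)          # ABC, ABD, ACD, BCD
    df_4way  = (a - 1) * (b - 1) * (c - 1) * (d - 1)    # ABCD
    df_error = a * b * c * d * (n - 1)                  # 16 * 2 = 32
    df_total = a * b * c * d * n - 1                    # 48 - 1 = 47

    assert df_main + df_2way + df_3way + df_4way + df_error == df_total
    print(df_main, df_2way, df_3way, df_4way, df_error, df_total)   # 4 6 4 1 32 47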

    As an aside – 3 full replicates is gross overkill. The whole point of experimental design is minimum effort for maximum information. Unless you have a very unusual process, it is unlikely that 3- and 4-way interactions will mean anything to you physically. Under those circumstances, just take the df associated with the 3- and 4-way interactions and build a model using only main effects and 2-way interactions – this dumps all of the df for the higher-order terms into the error and, instead of running 48 experiments, you will only need to run 16.

    The other option, assuming you actually have a process where 3-way and 4-way interactions are physically meaningful and the variables under consideration can all be treated as continuous, is to replicate just one or perhaps two of the design points – this will give you df for error and will also allow you to check the higher-order interactions. Given that the variables can be treated as continuous, a better bet would be to just add 2 center points to your 2^4 design. This will give you a run of 18 experiments, an estimate of all main effects, the 2-, 3-, and 4-way interactions, an estimate of error, and a check on possible curvilinear behavior (a sketch of the 18-run layout follows).
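    A minimal sketch (Python, standard library only; the coded -1/0/+1 levels are the usual convention and the factor order is arbitrary) of that 18-run layout:

    from itertools import product

    design = [list(run) for run in product([-1, 1], repeat=4)]   # 16 factorial runs
    design += [[0, 0, 0, 0], [0, 0, 0, 0]]                       # 2 replicated center points

    for i, run in enumerate(design, start=1):
        print(i, run)
    # Randomize the run order before going to the floor; the replicated center
    # points supply the df for error and the check on curvature.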

    1
    #238179

    Robert Butler
    Participant

    1. Control charts do not require normal data – see page 65 Understanding Statistical Process Control 2nd Edition – Wheeler and Chambers
    2. Count data of the kind you have can be treated as continuous data – see page 3, Categorical Data Analysis 2nd Edition – Agresti
    3. I-MR would be the chart of choice – you should check to make sure your data is not autocorrelated. If it is, you will have to take this into account when building the control limits – Wheeler and Chambers have the details.
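    If it helps, here is a quick sketch (Python/numpy, with made-up counts) of a lag-1 autocorrelation screen before putting the data on an I-MR chart:

    import numpy as np

    counts = np.array([12, 15, 11, 14, 13, 16, 12, 15, 14, 13, 17, 12], float)  # hypothetical

    r1 = np.corrcoef(counts[:-1], counts[1:])[0, 1]   # lag-1 autocorrelation
    print(f"lag-1 autocorrelation = {r1:.2f}")
    # As a rough screen, |r1| well above about 2/sqrt(n) suggests autocorrelation
    # that needs to be accounted for when setting the I-MR limits.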

    1
    #237270

    Robert Butler
    Participant

    DOE is a method for gathering data. Regression (any kind of regression) is a way to analyze that data. As a result, it is not a matter of choosing between DOE and regression.

    Experimental designs often have ordinal or yes/no variables as part of the X matrix. The main issue with a design is that you need to be able to control the variable levels in the experimental combinations that make up the design. In your case you said you have the following:

    A. $
    B. Name
    C. Social proof (others have said the same thing but here’s what they found…)
    D. Regular script

    Before anyone could offer anything with respect to the possibility of including these variables in a design, you will need to explain what they are. You will also need to explain the scheduling rate measure – is it a patient scheduled yes/no, or is it something else?

    However, in the interim, if we assume all of these variables are just yes/no responses, then you could code them as -1/1 (no/yes). The problem is that you cannot build patients; you have to take patients as they come into your hospital(?). What this means is that you have no way of randomizing patients to the combinations of these factors, which in turn means you have no way of knowing what variables external to the study are confounded with the 4 variables of interest. As a result, you really won’t know whether the significant variables in the final model are those variables or a mix of unknown lurking variables confounded with your 4 chosen model terms.

    What you can do is the following: set up your 8 point design and see if you have enough patients to populate each of the combinations. Given that this is the case, you can go ahead and build a model and, if the response is scheduled yes/no, the method of choice for the analysis would be logistic regression (a sketch follows below). The block of data you use will guarantee that, for that block of material, your 4 variables of interest are sufficiently independent of one another. What this approach WILL NOT AND CANNOT GUARANTEE is that the 4 variables of interest actually reflect the effect of just those 4 variables and are not an expression of a correlation with a variety of lurking and unknown variables.
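    A minimal sketch of that analysis (Python/statsmodels; the factor columns form an 8-point, 4-factor fractional design in coded units, and the data and variable names are fabricated placeholders, not your study):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # 8-point design (D = ABC half fraction), each combination repeated for illustration
    df = pd.DataFrame({
        "A": [-1, -1, -1, -1,  1,  1,  1,  1] * 10,
        "B": [-1, -1,  1,  1, -1, -1,  1,  1] * 10,
        "C": [-1,  1, -1,  1, -1,  1, -1,  1] * 10,
        "D": [-1,  1,  1, -1,  1, -1, -1,  1] * 10,
    })
    rng = np.random.default_rng(1)
    df["scheduled"] = rng.integers(0, 2, len(df))      # placeholder 0/1 response

    X = sm.add_constant(df[["A", "B", "C", "D"]])
    fit = sm.Logit(df["scheduled"], X).fit(disp=0)
    print(fit.summary())
    print(np.exp(fit.params))    # odds ratios per unit change in the coded factors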

    Another thing to remember is that the coefficients in the final reduced logistic model translate into odds ratios (by exponentiation), which are not the same as the coefficients in an ordinary regression model and cannot be viewed in the same way.

    If you don’t have enough patients to populate the combinations of an 8 point design you will have to take the block of data you do have and run regression diagnostics on the data to see just how many of the 4 variables of interest exhibit enough independence from one another to permit their inclusion in a multivariable model. All of the caveats listed above will still apply.

    What all of this will buy you is an assessment of the importance of the 4 variables to scheduling, with the caveat that you cannot assume the described relationships mirror the relation between the chosen X variables and the response to the exclusion of all other possible X variables that could also impact the response. While not as definitive as a real DOE, this approach will allow you to do more than build a series of univariable models relating each X to the measured response.

    This entire process is not trivial and if, as I suspect, you are working in a hospital environment I would strongly recommend you talk this over with a biostatistician who has an understanding of experimental design and the diagnostic methods of Variance Inflation Factors and eigenvalue/condition indices.
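    For the diagnostic side, here is a sketch (Python/statsmodels) of the VIF check; the data are random placeholders standing in for whatever block of patient records you actually have:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # X = the observed settings of the four candidate variables, one column each
    X = pd.DataFrame(np.random.default_rng(2).choice([-1, 1], size=(60, 4)),
                     columns=["A", "B", "C", "D"])
    X_c = sm.add_constant(X)

    for i, name in enumerate(X_c.columns):
        if name != "const":
            print(name, round(variance_inflation_factor(X_c.values, i), 2))
    # A common rule-of-thumb flag is a VIF much above 5-10; the eigenvalue /
    # condition-index check is the complementary collinearity diagnostic.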

    0