# Robert Butler

## Forum Replies Created

Viewing 100 posts - 1 through 100 (of 2,504 total)
• #250538

Robert Butler
Participant

I agree with @Straydog – the issue is that of statistics.  If we assume you have a mathematical background that includes algebra and if, for whatever reason, you can’t take a course in basic statistics then I would recommend working your way through the following books:

1. The Cartoon Guide to Statistics – Gonick and Smith – I’ve recommended this book to a number of my engineers over the years.  It is a well written book and it does a good job of highlighting and explaining many of the basic statistical concepts.

Given that you have an understanding of algebra I would recommend you learn the basic principles of least-squares regression.  The best short description I know can be found on pages 8-30 of Applied Regression Analysis, 2nd Edition – Draper and Smith.  If you can borrow this book through inter-library loan you could copy these pages for future reference.

Once you understand the basic idea of simple regression I would recommend you get a copy of

2. Regression Analysis by Example – Chatterjee and Price

and work your way through that book.

The third book I would recommend would be

3. Statistical Methods – Snedecor and Cochran – whatever the latest edition might be

This book is statistical boilerplate – it is written in the form of if-you-want-to-do-this-then-you-will-have-to-run-that.

You don’t have to read all of #3 but it is a good basic how-to book for a lot of statistical techniques. It takes the time to give the reader an understanding of what the methods do and why you might want to use them.

0
#250505

Robert Butler
Participant

Things are fine on my end @cseider – hope all is well with you too.  As for the homework – it’s pretty quiet around these parts and I was feeling generous…    :-)

0
#250350

Robert Butler
Participant

It’s just the sum of the squares of the deviations from the mean divided by the total count minus 1.

So, for a single item the average is  (10 + 0)/2 = 5

The deviations from the mean are

10 – 5 = 5

0-5 = -5

The squares of those deviations are

5^2 = 25

(-5)^2 = 25

there are two observations so  2 – 1 = 1

Therefore the variance per item is (25 + 25)/1 = 50  and the standard deviation is the square root of 50 = 7.07
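For anyone who wants to check the arithmetic, here is a minimal sketch of the same calculation in Python. The function name `sample_variance` is just my label for illustration; the two ratings are the 10 and 0 from the example above.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    n = len(values)
    mean = sum(values) / n
    squared_devs = [(v - mean) ** 2 for v in values]
    return sum(squared_devs) / (n - 1)

ratings = [10, 0]                      # the two ratings for a single item
variance = sample_variance(ratings)    # (25 + 25) / 1
std_dev = math.sqrt(variance)
print(variance, round(std_dev, 2))
```

The standard library's `statistics.variance` and `statistics.stdev` use the same n - 1 divisor and should give the same values.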

If this were a real situation, my first reaction would be: why bother with any calculations – better to find out why two different individuals are coming up with ratings that are the exact opposite of one another. As presented, if the two raters are the system, then there is no calibration whatsoever.

0
#250044

Robert Butler
Participant

I’m not sure where we’re going with this discussion.  The focus of your initial post was that of coil rejection due to defects.  In the follow up discussion I was left with the impression defect reduction and thus increased acceptance of coils was your main concern.  I was also left with the impression that one factor known to be a contributor to defects was improper degreasing.  It was for this reason I suggested looking at defect types and using things like Pareto charts to help and perhaps guide the thinking with respect to identifying defect types and the production practices connected to them.

Your latest post gives the impression this isn’t your goal.  It seems like you are just concerned about your colleagues wanting a defect rating scale while you just want to run a simple defect yes/no count.  Under these circumstances the rating scale amounts to nothing more than colorful jargon and has no value because defect “level” is irrelevant. A defect is a defect and if there is a defect the product will be rejected.

1
#250028

Robert Butler
Participant

I’ve been giving your problem some thought and I can’t think of any single statistic that will capture consistency both with respect to stability of system variability and constancy of trending.  The issue you are facing is that of an evolution of both variation and trend across time. As far as I know, if time is involved you will have to take it into account which means you will have to do more than examine simple population statistics.  All of the methods I know for tracking consistency of variation and trend across time require graphs and regression/time series analysis.

0
#250021

Robert Butler
Participant

Based on your post, there does not seem to be any need for a 1-10 rating and, beyond that, it doesn’t sound like your colleague has any idea of how to build a meaningful 1-10 rating scale that would be of any value to you.

What your post does suggest is you have some understanding of why coils are rejected.  For example, you mentioned dark surface on wire is cause for rejection….and what else?

Assuming you are interested in reducing rejects then the first order of business is to quantify the reasons for rejection over some predetermined period of time and bean count their frequency of occurrence.  Once you have this information you can analyze it and see what you see.

For example:

1. Pareto chart the results to identify the most frequent reasons for rejection.  If all reasons for rejection are created equal then you could use that chart as a driver for examination of the process to find the cause(s) of the top two or three reasons for rejection.

2. Look at the frequency count of reasons for rejection over time – do any of them trend over time (daily, weekly, monthly, quarterly, etc.)?  (I once ran an analysis of manufactured component failures and found a huge AM to PM delta – the reason – the plant wasn’t air conditioned and the temperature delta between morning and afternoon resulted in far more failures for PM production because part of the component production process turned out to be very temperature sensitive).

3. Check your raw material receiving – any trending in rejection reasons that appear to coincide with raw material lot changes/supplier changes, etc.

…and so on.
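As a rough sketch of the bean count behind a Pareto chart, something like the following would do. Everything here is an assumption for illustration: the function name `pareto_table` is mine, and the rejection log is made-up placeholder data ("dark surface" is the only reason taken from your post; "reason B" and "reason C" stand in for whatever else you find).

```python
from collections import Counter

def pareto_table(reasons):
    """Return (reason, count, cumulative percent) rows, most frequent first."""
    counts = Counter(reasons)
    total = sum(counts.values())
    rows, cumulative = [], 0
    for reason, count in counts.most_common():
        cumulative += count
        rows.append((reason, count, 100 * cumulative / total))
    return rows

# Hypothetical rejection log, one entry per rejected coil (placeholder data)
rejection_log = ["dark surface"] * 6 + ["reason B"] * 3 + ["reason C"] * 1

rows = pareto_table(rejection_log)
for reason, count, cum_pct in rows:
    print(f"{reason:15s} {count:3d} {cum_pct:6.1f}%")
```

The same table, split by shift, week, or raw material lot, is the starting point for items 2 and 3.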

So, short answer – if your post is a reasonable representation of your situation then, at the moment, I don’t see any value in a rating scale.

Changing subjects:

You said, “I know that there are plenty of tools to use on continuous data and it is easier to work with continuous data. It also follows normal distribution. And one does not need more data points for the analysis of continuous data compared to attribute data.”

1. There are plenty of tools to use to analyze attribute data too.

2. Continuous data will follow whatever the underlying distribution of that data happens to be – it could be normal but it could also be a myriad of other things. Continuity of data does not guarantee normality.

3. Data quantity can be an issue with respect to analyzing continuous vs attribute data but which will need more data will depend on what it is that you are trying to do and the kind of data you have.

1
#249953

Robert Butler
Participant

The question you are asking is one of a comparison of two proportions.  Specifically you want to know if the difference between 100% and 82.5% is statistically significant for a sample of 315.

The key is to recognize that the proportions are 315/315 and 260/315.  In other words, with a baseline of 315 in both cases, the first case has 315 presentations and 315 successes and the second case has 315 presentations with 260 successes and 55 “failures”.

As a result, you have a 2×2 table with 0 failures and 315 successes in the first case and 55 failures and 260 successes in the second case.  If you set this up and run the analysis you will find you have a p-value of < .0001 – so, there is a statistically significant difference between 100% and 82.5% for a base of 315 samples.
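One way to run that 2×2 table is Fisher's exact test. The sketch below is a plain-Python version built on the hypergeometric distribution; the choice of Fisher's test is my assumption for the example (a chi-square test is another common option), and the function name `fisher_exact_two_sided` is just an illustrative label.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1 = a + b          # total for the first case
    col1 = a + c          # total failures across both cases
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k failures in the first case,
        # holding the row and column totals fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Add up every table that is no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Columns are (failures, successes): 0/315 in the first case, 55/260 in the second
p_value = fisher_exact_two_sided(0, 315, 55, 260)
print(p_value < 0.0001)
```

If SciPy is available, `scipy.stats.fisher_exact` should give the same answer for this table.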

1
#249922

Robert Butler
Participant

The short answer to your second post is: there’s nothing like that out there.

Let’s take another look at what you are asking:

You said: “I was looking for some current affairs metric that can tell me if something like the historical average growth rate will continue or if a certain ratio will stay stable in the future.”

If a measurement such as you are requesting existed, the folks who buy and sell stocks would all be multi-billionaires because they would know with certainty which stock was going to do well forever and conversely. This is the reason you will find the following line in every stock offering prospectus:

“Past performance is no guarantee of future results.”

The best you can do is analyze data with an eye to identifying process characteristics that will adversely impact your process average growth/stability and, once you have identified those variables, put in place process elements that will minimize/eliminate their adverse impact.  Such an approach should give you some measure of growth rate maintenance/ratio stability for some period of time but not for all time. This is because there is no guarantee those variables you have identified and are controlling are the only variables (both now and in the future) which have a significant impact on your process.

0
#249895

Robert Butler
Participant

I might be missing something but, as written, your post appears to be expressing two different requirements.  You say you want to predict variation, specifically predict variation trending, however, the examples you give have nothing to do with prediction.  Both CD and CV are strictly summations of current affairs – they don’t have any predictive capabilities.

For prediction you would need some kind of equation which would relate your process inputs to variation output.  It is possible to build such an item but the effort/cost involved would be substantial.   What you can do is run an experimental design on your process and then analyze the results using the Box-Meyer method to identify those variables that are significantly impacting your variation.  Armed with the results of that analysis you could construct a predictive equation for variation change.

0
#249852

Robert Butler
Participant

….an additional thought.  I was going over the case study and looking at the data set for Case #2 and it occurred to me that, if you want to, you can use that data set as a starting point for what would amount to an excellent self teaching exercise.

In the attachment I’ve rearranged the data to make the composite nature of the design more apparent.  The rows in yellow are the center point replicates and the rows in light blue/gray are the star points.

From my earlier post you know what the expressions for the reduced models look like and you know how to go about generating the full models.  So now you are in a position to run a lot of what-if scenarios.  For example:

1. See what happens to the model coefficients when you reduce the number of center points to just 2.

2. See what happens to the model form when you fractionate the full 2 level factorial design.

3. See what happens when you randomly drop one or more of the factorial experiments (this happens all the time in real life – one or more of the experiments won’t run, can’t be run, was not run at the levels indicated, etc.) and see what impact that has on the final model, on the coefficients of the remaining model terms, and how severe the loss has to be (how many horses have to die) before backward elimination and stepwise regression converge to different models.

4. In every one of the above situations you would want to run a complete regression analysis which would mean, among other things, a thorough graphical assessment of the residuals – I would recommend you borrow Chatterjee and Price’s book Regression Analysis by Example to guide you with respect to the world of regression analysis.

If you try these things and do some reading in the recommended book you should be able to gain a good understanding of DOE.

Switching subjects.

In the presentation you provided the author indicated the minimum and maximum values for the factorial part of the design were going to be coded as -1 and 1.  In real life it is rare to have a situation where, for each experiment in the design, you run at exactly the stated minimum and maximum.  What this could mean is the following:  it may be the situation that the data for Case 2 consists of the measurements of the output of each experiment but only records the ideal settings of the X variables in the design.  If this is the case then that would be another reason for my inability to exactly match the author’s predicted settings for minimum and maximum response output.

###### Attachments:
1. Goodrich1.xlsx
0
#249838

Robert Butler
Participant

The way you would determine the minimum and maximum settings shown on page 28 would be to take the regression models for collapse and burst pressure and put them in an optimizing program and run them.  I don’t know Minitab but I do know it has such an option.

The problem is the author has not provided the models he generated using the data.  Since he did provide the raw data for Case 2 I’m assuming he expects you to use that data to generate models for collapse and burst pressure (assuming you wish to check the validity of his slide)….and thereby hangs a tale….

Since you have asked and since you have indicated in your post you have been trying to confirm everything in his presentation, I went ahead and made up an Excel file of the data on the last slide and had a go at model building.

A test of the data matrix using VIF’s (Variance Inflation Factors) and condition indices indicates the design will support a model consisting of all main effects, all curvilinear effects, and all two way interactions.

Given the differences in the magnitudes of the three variables of interest it is best if, before you start running regressions, you take the time to scale pressure, weld time, and amplitude to the ranges of -1 to 1.  The reason you want to do this is because significant parameter selection can be influenced by parameter magnitude and all you really want is for parameter selection to be influenced only by the degree of correlation.***

To this end you would compute the following for each of the X variables

A = (Max X + Min X)/2

B = (Max X – Min X)/2

Scaled X = (X – A)/B
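As a minimal sketch, the scaling above looks like this in Python. The function name `scale_to_unit_range` and the three pressure values are illustrative assumptions, not values taken from the Case 2 data set.

```python
def scale_to_unit_range(values):
    """Scale a list of X values to the range -1 to 1 using A and B as defined above."""
    x_max, x_min = max(values), min(values)
    a = (x_max + x_min) / 2      # midpoint of the range
    b = (x_max - x_min) / 2      # half-width of the range
    return [(x - a) / b for x in values]

pressure = [18, 20, 22]          # illustrative settings only
scaled_pressure = scale_to_unit_range(pressure)
print(scaled_pressure)
```

The minimum always maps to -1, the maximum to 1, and everything else falls in between in proportion.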

In order to identify the reduced model (the regression model containing only significant terms  – in this case terms with P < .05) you will want to run both backward elimination and stepwise (forward selection with replacement) regression on the design/response matrix.  The reason you want to do this is because you want to make sure both methods converge to the same reduced model.

Once you have the reduced model in terms of the scaled X’s you will  take these terms and re-run the reduced model using the raw X’s so your final model will have coefficients which are associated with actual X values.  If you do all of this what you get are the following models:

Predicted Collapse = pcollapse = -0.01985 + 0.00011785*Amplitude + 0.00138*Pressure + 0.05207*Weld_T -0.00001279*Pressure*Pressure

Predicted Burst = pburstp = -1715.33689 + 10.20952*Amplitude + 69.08662*Pressure + 3832.92683*Weld_T -0.96918*Pressure*Pressure

The problem is, when you plug in the optimal values for Amplitude, Pressure, and Weld Time, as indicated in the presentation, you do not get the indicated predicted values for Collapse and Burst Pressure.

As a check I built the full model just for burst pressure

fmburstp = 1718.56889 - 20.45806*Amplitude + 21.56008*Pressure - 10670*Weld_T + 0.18066*Amplitude*Amplitude - 0.92215*Pressure*Pressure + 18780*Weld_T*Weld_T + 0.06500*Amplitude*Pressure + 15.00000*Amplitude*Weld_T + 162.50000*Pressure*Weld_T

The predicted results for the full model for burst pressure are quite close to the values the author reports but they are not exact.

(pressure, weld time, amplitude) = (18, 0.26, 72) which gives (pcollapse, pburstp, fmburstp) = (0.022869, 945.85, 892.30)

(pressure, weld time, amplitude) = (22, 0.29, 78) which gives (pcollapse, pburstp, fmburstp) = (0.028612, 1243.38, 1222.66)
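If you want to reproduce these numbers yourself, here is a short sketch that plugs the two settings into the three equations given above. The coefficients are copied from this post as printed (so they carry the rounding shown); the Python function and argument names are just my labels.

```python
def pcollapse(amplitude, pressure, weld_t):
    """Reduced model for collapse."""
    return (-0.01985 + 0.00011785 * amplitude + 0.00138 * pressure
            + 0.05207 * weld_t - 0.00001279 * pressure * pressure)

def pburstp(amplitude, pressure, weld_t):
    """Reduced model for burst pressure."""
    return (-1715.33689 + 10.20952 * amplitude + 69.08662 * pressure
            + 3832.92683 * weld_t - 0.96918 * pressure * pressure)

def fmburstp(amplitude, pressure, weld_t):
    """Full model for burst pressure."""
    return (1718.56889 - 20.45806 * amplitude + 21.56008 * pressure
            - 10670 * weld_t + 0.18066 * amplitude * amplitude
            - 0.92215 * pressure * pressure + 18780 * weld_t * weld_t
            + 0.06500 * amplitude * pressure + 15.0 * amplitude * weld_t
            + 162.5 * pressure * weld_t)

settings = [
    {"pressure": 18, "weld_t": 0.26, "amplitude": 72},
    {"pressure": 22, "weld_t": 0.29, "amplitude": 78},
]
for s in settings:
    print(round(pcollapse(**s), 6), round(pburstp(**s), 2), round(fmburstp(**s), 2))
```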

My guess is the author did one or both of the following:

1. He built the full model and did not run the proper regression analysis to identify the reduced models.

2. He rounded off the values for the optimum pressure, weld time, and amplitude when he wrote the report.

If he did use the full model then his analysis is wrong. The whole point of a design is to identify the significant parameters and to use the resulting model (once it is verified) for purposes of prediction and control.

If he rounded off the predicted optimum values then the best predictions you can generate with those values will not give you the Y values shown in his presentation.

*** If, in this instance, we run backward elimination and stepwise regression on the raw X values the two methods do not converge to the same model.

• This reply was modified 1 month, 2 weeks ago by Robert Butler. Reason: typo
0
#249757

Robert Butler
Participant

I might be missing something but my understanding of the scenario you described is as follows:

a. Quality team takes a 10% sample of the total output.  With a total output of 1000 this becomes a sample of 100.

b. Quality team does a 100% inspection of the 100 samples and finds 10 defects in that sample.  Therefore, their claim, based on the examination of the 100 samples, is the quality score is (100 - 10)/100 = 90%.  This value, as I understand your post, is the SLA.

The second crew takes the same batch of 100 samples, pulls 10 samples from this group and finds 1 defect.  From your post I understand this to be the ATA.

1. The ATA defect cannot be viewed as an additional defect to add to the 10 found during the 100% check (the SLA) because the first group did a 100% inspection and found a grand total of 10 defects. Therefore all the second group did was find one of the original defects in their 10 sample sub-sample of the 100 and confirm, via the smaller sample, what was found with the complete inspection of the larger sample.

What I don’t follow is the reasoning concerning the way the ATA sub-sample is taken and assessed.  After all, there is a chance of pulling 10 samples from the original 100 and finding no defects and there is also the chance of pulling 10 samples and finding you have drawn the original 10 defects.
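To put numbers on those chances: drawing 10 from 100 without replacement, when 10 of the 100 are defective, follows a hypergeometric distribution. Here is a small sketch assuming the 10 are pulled at random; the function name `prob_defects_in_subsample` is my own label.

```python
from math import comb

def prob_defects_in_subsample(k, population=100, defects=10, draw=10):
    """Chance of exactly k defects when `draw` items are pulled at random."""
    return (comb(defects, k) * comb(population - defects, draw - k)
            / comb(population, draw))

p_none = prob_defects_in_subsample(0)    # no defects in the sub-sample
p_one = prob_defects_in_subsample(1)     # exactly one defect, as in your example
p_all = prob_defects_in_subsample(10)    # all 10 of the original defects
print(round(p_none, 3), round(p_one, 3), p_all)
```

Under that assumption a sub-sample with no defects turns up about one time in three, so a clean sub-sample of 10 says very little about the batch.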

0
#249476

Robert Butler
Participant

If you are going to re-sample the internal audit sample then the short answer is the sample size is no longer 100 it is 110 and (110 - 11)/110 = 90%. Another way to look at this is you have a sample of 100 – you find 10 defects in that sample.  You dump everything back in the bin and run a re-sample of the same population taking 10 samples and finding 1 defect. Given what you found with 100 samples this is not surprising and your re-sampling is just a short hand confirmation of what you found with the larger sample.

If the people you are talking to insist on only adding an additional found defect to the original defect count and do not take into account all of the samples in the re-sample then why stop there?  Let’s see, I take the 100 samples and I re-test the 100 samples 10 times. The chances are I’ll get 10 defects each time I do this – therefore, at the end of 10 sample/re-sample cycles I’ll probably have 100 defects which, when I add them together, will give me (100 - 100)/100 = 0% accuracy.

1
#249380

Robert Butler
Participant

This sounds like a homework problem but I’m done with my remote work and I’m not really ready to start back in on my latest book read so…

7 = Xbar1 = the average over 153 days.

Therefore

7 x 153 = the sum of the individual measures for the 153 day period (remember: the sum of individual measures/153 = average = 7)

12 = the desired average over 153 + 94 days

Therefore the sum of individual measures for the remaining 94 days = Xbar2*94

and the grand average which is 12 is expressed as:

[(7 x 153) + (Xbar2 x 94)]/(153+94) = 12

Solving for Xbar2 we get 20.1
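The same algebra as a short sketch, in case you want to try other targets. The function name `required_average` is just an illustrative label.

```python
def required_average(current_avg, current_days, target_avg, remaining_days):
    """Average needed over the remaining days to hit the target grand average."""
    total_needed = target_avg * (current_days + remaining_days)
    total_so_far = current_avg * current_days
    return (total_needed - total_so_far) / remaining_days

xbar2 = required_average(7, 153, 12, 94)
print(round(xbar2, 1))
```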

0
#249273

Robert Butler
Participant

To do things like this you really need to spend the money, purchase some good books on experimental design, and read them and keep them close by as a reference.

The two I would recommend are:

1. Understanding Industrial Designed Experiments by Schmidt and Launsby

2. The Design and Analysis of Industrial Experiments by Davies

To your point – all composite designs have as their core construct a two level factorial design. This factorial design can be fractionated, as you have done, so the only remaining questions are:

1. How many center points?

2. What is the level of the alpha of the star points?

From the books we have the following:

To retain orthogonality the number of center points = Nc = 4*(sqrt(Nf) +1) -2k

Where Nc = number of center points,

Nf = number of experiments in the factorial = 16 – since you are running a fractional factorial

k = number of factors in the design = 5

Therefore Nc = 4*(sqrt(16) +1) – 2*5 = 10

The level of alpha for the star points of a rotatable composite is

alpha = (Nf)**(1/4) = (16)**(1/4) = 2.

Thus, the star points are located at +-2 for each of the factors which means you will have 10 additional runs where the minimum and maximum alphas are two times the minimum/maximum levels for each of the factors in the fractional factorial.

So, if you want a perfectly orthogonal rotatable composite design with a 2**(5-1) fractional factorial core you will need 16 +10 + 10 design points.  That is a lot of experimentation and if you are pressed for time and/or the cost per experiment is high it may be far more than you can run.
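Here is a minimal sketch of that run count using the two formulas quoted above. The function name `orthogonal_rotatable_ccd` and the dictionary keys are my own labels; the formulas themselves are the ones from the post and I have not generalized them.

```python
import math

def orthogonal_rotatable_ccd(n_factorial, k):
    """Run counts and star-point alpha from the formulas quoted above."""
    n_center = round(4 * (math.sqrt(n_factorial) + 1) - 2 * k)
    alpha = n_factorial ** 0.25          # rotatable composite: Nf**(1/4)
    n_star = 2 * k                       # one low and one high star point per factor
    return {
        "center_points": n_center,
        "alpha": alpha,
        "star_points": n_star,
        "total_runs": n_factorial + n_star + n_center,
    }

design = orthogonal_rotatable_ccd(n_factorial=16, k=5)
print(design)
```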

A possible alternative:

1. If your situation is that of just starting out then instead of trying to do everything at once (all mains, all two ways, all curvilinear) a better approach from the standpoint of time/money/effort would be to run a near saturated design (5 factors in 8 experiments) with two center points for replication and analyze the results of this design and see what you see.

This approach buys you a couple of things:

a. First – it gives you a check to see if there is any point in investigating all 5 factors and if there is any need to spend time looking for curvilinear behavior.  If you run this smaller design and find a) all five factors matter and/or b) there is serious curvilinear behavior present then, you can augment your basic design with additional points and fill out the full composite design.

b. It provides a reality check with respect to minimum and maximum settings for the design space.  I could bore you senseless with story after story about how everyone KNEW (I mean they REALLY KNEW) we could run the process at the specified minimum and maximum levels for all of the variables of interest in the design. Unfortunately, when we went to do this, Physical Reality raised its ugly head and said, “No way – go back and try again.”  In short, running the smaller design is a great way to make sure you can really investigate the things that are of interest.  If the smaller design works with the levels of interest, all well and good and if it doesn’t you haven’t wasted valuable time/money/effort. Also if things don’t work, you can regroup, use what you have learned, and build a design with new minima and maxima for each variable that will permit an investigation.

2. Perfect orthogonality is a nice idea but, in practice, even with the number of runs listed above, you won’t achieve it for the simple reason that it is highly unlikely you will run each and every design point at the exact specified minimum or maximum settings for every experimental condition.  In those situations where I have actually run a composite design I’ve cut the number of center points to 3-4.  The design is a touch non-orthogonal but the degree of non-orthogonality is such that it doesn’t have any major impact on the results of the analysis.

0
#249222

Robert Butler
Participant

If your question is how do you get the percentages at the bottom of the page the answer is you need information that is not present on the spreadsheet you provided. In order to get the percentages at the bottom you need to know how the efforts are apportioned to each of the consultants. The best you can do with what you have is compute total utilization percent for each customer with all of the consultants lumped together…and even that calculation has issues since the footnote is far too cryptic and also does not provide sufficient information.

Consider:

1. the sum of the efforts for customer A is 402.

2. According to the footnote   Utilization = sum of all efforts against L1 or L2 consultant/available effort per shift (480)

….what exactly does this mean?

If we assume the above means:

Utilization = sum of all efforts for all consultants/available effort per shift (480)

Then – for customer A you have 402/480 x 100 = 83.75% which is somehow divided among 4 consultants in such a way that the total consultant utilization sums to 92% and that the 42% is split among 3 consultants while a single consultant accounts for 50% …and then there is the question concerning the remaining 8% of their collective time.

3. But….

a. 480 minutes = 8 hours

b. If I have 4 consultants for customer A this would mean I actually have 1920 minutes for that shift.

Therefore – for customer A you have 402/1920 x 100 = 20.94% which is somehow divided among 4 consultants such that the split on the 20.94% is 42% and 50%.

So, the short answer to your question is this – based on what you have provided you cannot answer the question you asked.
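For reference, the two readings of the footnote work out as follows. This is only a sketch of the arithmetic above; the variable names are mine, and the 4 consultants and 480 minute shift are the figures already discussed.

```python
total_effort_minutes = 402       # sum of the efforts for customer A
shift_minutes = 480              # available effort per shift (8 hours)
consultants = 4

# Reading 1: all consultants lumped together against a single shift
pooled_utilization = total_effort_minutes / shift_minutes * 100

# Reading 2: each consultant has a full shift available
per_consultant_utilization = (total_effort_minutes
                              / (shift_minutes * consultants) * 100)

print(pooled_utilization, per_consultant_utilization)
```

Neither figure can be split into the 42% and 50% shown on the spreadsheet without knowing how the effort was apportioned.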

0
#249065

Robert Butler
Participant

@cseider – well, welcome back…question – your elapsed time for your most recent group of responses is 62 minutes, 57 minutes, 14 minutes, 8 minutes, and 7 minutes….what happened between 57 and 14?  Lunch break?  :-)

Take Care

P.S. where I work the IT department is really on the ball as far as fighting Covid is concerned.  This morning I logged in to work and shortly after I signed on I happened to sneeze. A split second later the computer started running an anti-virus program.

0
#249046

Robert Butler
Participant

I’m not sure I understand what you mean by “statistical tools to create a baseline.”  Baseline data is just that – data gathered from a process before you attempt to make changes.  As for the data gathering itself – that is an entirely different matter.

1. How often – I’ll take everything I can get with as much detail as I can get. This means if it is hourly – I’ll take hourly, if the best anyone can do is daily – I’ll take that.  The thing to remember is this – you can always take whatever data you may have and make it coarser but you can’t do the reverse.  In other words, it is easy to convert hourly into daily or weekly or whatever but you can’t go the other way. To put it another way – if you are given continuous data in any form – keep it – do not record the data as binned data – after you have entered the data you can have the computer bin the data any way you want.

2. How far back in time – that will depend on a number of things and you will have to get the answers to those questions before you start gathering the data.

a. Are the measurements for whatever it is that you are measuring the same across time?  If not then there will be a time horizon beyond which it won’t make much sense to gather additional information. For example, for the last two years you have been taking measurements in millimeters/second, however, prior to that you were taking measurements in furlongs/fortnight. In this case your data horizon is two years.

b. Have there been any major changes to the way the process was run? Changes in measurement methods, switches in sources of major raw material suppliers, major changes/upgrades in machinery used in production, major changes in operation protocol, etc.

For example, up until last year we ran the process by looking at the settings of three gauges twice a day and made adjustments based on those readings. A year ago we went to hourly readings and adjustments and six months ago we went to automated continuous feedback.  The time horizon is now 6 months.

Consequently, if you have a situation where both a and b are true then there’s no point in going back any further than 6 months.

There are other issues but, in my experience, these are the big ones.

0
#248965

Robert Butler
Participant

In summary:

1.    [The defects are] not deep enough emboss thickness, not enough stretching, or telescoping or collapsing rolls (and what is telescoping and collapsing – are these other ways of expressing not enough stretching or are they something else?)

2.    We know that a majority of these defects are caused by incorrect tension during the winding portion of the embossing process.

3.    One thought I had is to measure tension, unwind and rewind motor outputs, and some other machine outputs (temperatures and pressures)… to determine which machine output has the greatest effect on the stretching of the film

Questions:

1.    How do you know what constitutes a majority of the defects?  Have you done a bean count of defect types and summarized your findings with a pareto chart or some other method to make sure you know which defect occurs the most often?

a. are all defects created equal – that is regardless of the defect the cost in terms of lost time/product/etc. is the same or is there an order to defect severity?

2.    How do you know this majority is connected to incorrect tension?

a.    Do you have required settings for tension?

b.    If so how were these settings determined?

c.     How well are they controlled?

3.  Do you have prior production data where you have simultaneous measures of tension and defect type and counts?

4.  For the third summary point – are the listed variables the ones you wanted to look at in your design or are they something else?

5.  You say you have a new way to check the tension. Can you run the new and old side by side without stopping production?

6.  If you are going to gather the data on the variables for the third summary point how do you plan to record the defects – type and frequency so you have some assurance of a connection between the process parameters and the defects?

If you can provide the answers to the above then perhaps I or someone else might be able to offer some suggestions.

0
#248913

Robert Butler
Participant

It looks like your plotting routine for the histogram is running some kind of default binning which is giving you a false impression of your data.  If you look at the linear plot of the data the histogram should have vertical bars at the same places -15, -10, -5, 0 , 5, 10, 15 but it doesn’t. Rather the histogram looks like the bars are at (maybe) -15, -10, -5, 0(?), 6, 10, 14.

If you know the actual data counts for the measures at -15, -10, -5, 0 , 5, 10, 15 just make a two column data set with -15, -10, -5, 0 , 5, 10, 15 in one column and the associated counts in the other and make a histogram using that data set.  Given that the data is very granular it still looks like the shape of the histogram should be acceptably normal.

After re-reading your initial post I guess the question I have is what is it that you wanted to do with the data?  If you are thinking about process control you could build an individuals control chart using the data but depending on what it is that you are trying to investigate that may not be what you really want to do.

Your focus on differences from an ideal suggests your real concern might be one of checking for random vs non-random trending in the differences over time.  Since everything is in increments of 5 it might be worth considering examining your data looking at runs above and below the median. The question you would answer with an analysis of this type is – are the differences above and below the median random over time or is there a pattern.  If a pattern occurred it would tell you there is special cause variation present and you need to investigate why.

Most good basic statistics text books will have the details for assessing runs above and below the median and the topic in the textbook will have the same title.
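As a rough sketch of what that check involves, the code below counts runs above and below the median and compares the count to what you would expect from a random sequence, using the usual large-sample normal approximation. The function name `runs_about_median` and the eight data values are illustrative assumptions (made-up differences in increments of 5), and dropping points that sit exactly on the median is one common convention, not the only one.

```python
import math
from statistics import median

def runs_about_median(values):
    """Return (observed runs, expected runs, z score) for runs about the median."""
    med = median(values)
    # True = above the median, False = below; points equal to the median are dropped.
    signs = [v > med for v in values if v != med]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    # Assumes at least one point above and one below the median.
    runs = 1 + sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    n = n_above + n_below
    expected = 2 * n_above * n_below / n + 1
    variance = (2 * n_above * n_below * (2 * n_above * n_below - n)
                / (n ** 2 * (n - 1)))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z

differences = [-5, 5, -5, 5, -5, 5, -5, 5]    # made-up example data
runs, expected, z = runs_about_median(differences)
print(runs, expected, round(z, 2))
```

A z score well away from zero in either direction (too many runs or too few) is the signal that the pattern is unlikely to be random. With short series the textbook tables for the exact test are a better guide than the approximation.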

0
#248892

Robert Butler
Participant

@Straydog is correct – no subgrouping here just the individual data points.  Once you have a histogram built as he recommended (I’d also recommend plotting the data on a normal probability plot) post the graphs here and perhaps I or @Straydog or someone else may be able to offer additional thoughts.

0
#248872

Robert Butler
Participant

The central limit theorem applies to distributions of means, so the fact that the distribution of your means passes the Anderson-Darling test isn’t too surprising.

I’m not trying to be nasty or mean spirited here but the bigger issue is a basic mistake you have made – your post suggests you didn’t take the time to really examine your data. As written, all you have done is taken some data, dumped it into a normality test, found the test indicates the data is non-normal, and decided that it is, in fact, non-normal to a degree that really matters with respect to what you are doing.

The Anderson-Darling test, along with the other tests, is extremely sensitive to any deviation from ideal normality.  Indeed it is possible for data taken from a generator of random numbers with an underlying normal distribution to fail one or more of these tests.
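The point about normal data failing a normality test can be checked with a small simulation. The sketch below is an illustration only: it computes the Anderson-Darling statistic by hand (mean and standard deviation estimated from the sample, with the usual small-sample adjustment) and compares it with 0.752, the commonly tabled 5% critical value for that case. The sample size, number of trials and seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Adjusted A-squared statistic for normality, parameters estimated from the sample."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    std_normal = NormalDist()
    cdf = [std_normal.cdf((x - m) / s) for x in sorted(sample)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)


CRITICAL_5_PERCENT = 0.752  # tabled value for the estimated-parameters case
rng = random.Random(1)      # fixed seed only so the run is repeatable
trials, sample_size = 500, 50
failures = sum(
    anderson_darling_normal([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
    > CRITICAL_5_PERCENT
    for _ in range(trials)
)
print(f"{failures} of {trials} truly normal samples failed at the 5% level")
```

By construction a 5% test should reject something like one truly normal sample in twenty, which is why a failed test by itself is not a reason to declare the data unusable.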

Before you do an analysis you need to examine your data and that means the very first thing to do is  plot the data in any way that is meaningful and see what you see. In this case, the minimum you should do is generate a histogram, generate a normality plot and a boxplot of the data.  Given what you are doing I’d also run a time plot of the data to see if there is any obvious underlying trending over time.

Once you have the normality plot you should look at it to see if it deviates from the reference straight line and then apply what is termed “the fat pencil test”.  This amounts to looking at the plot and checking to see if the plotted data can be covered with a fat pencil.

The various plots will tell you the following:

1. The histogram will give you an idea of the overall shape of the distribution of individual points.  The questions you would want to investigate with this plot would be:

a. Is the plot unimodal? – If it is bimodal then you have work to do.

b. Does the overall plot provide a visual appearance of something that is approximately normal?

c. Is there any data that gives a visual impression of being outliers?  Are there just a couple or are there a lot (yes, this is a judgement call)?

d. If there are just a few visual outliers, drop them from consideration and re-run your plots and your statistical tests. What do you see? Does everything change or do things stay much as they were?

2. The normal probability plot will give you a much better sense of the approximate normality of the data.

a. If the plot doesn’t approximate a fit to the reference straight line but veers off sharply at either the low or the high end, or exhibits clean breaks in the data with the subsets approximating straight lines that have large differences in slopes, then you have work to do.

b. On the other hand, if the data is randomly scattered about either side of the line or if it is staying in the relative vicinity of the line but randomly drifting above and below the straight line then it is safe to assume the data approximates normality to an acceptable degree.

3. The boxplot will not only give you a good sense of the distribution shape it will also clearly characterize the behavior of the data in the tails as well as in the central part of the data distribution.

4. A time plot will tell you if your process is changing over time.
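If your software does not offer a normal probability plot, the coordinates are easy to compute and plot by hand. This is a minimal sketch: the measurements are invented, the function name is hypothetical, and (i + 0.5)/n is only one of several plotting position conventions in use.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Pair each sorted observation with its theoretical standard normal quantile."""
    n = len(data)
    std_normal = NormalDist()
    # (i + 0.5) / n is one common plotting position; other conventions exist.
    return [
        (std_normal.inv_cdf((i + 0.5) / n), x)
        for i, x in enumerate(sorted(data))
    ]


# Hypothetical measurements, for illustration only.
measurements = [9.8, 10.4, 10.1, 9.6, 10.0, 10.3, 9.9, 10.2, 9.7, 10.5]
points = normal_plot_points(measurements)
for quantile, value in points:
    print(f"{quantile:6.2f}  {value:5.1f}")
```

Plot the second column against the first. If the points can be covered by a fat pencil laid along a straight line, the data is acceptably normal for most practical purposes.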

There are many individuals who view plotting data as something only a child should do. They find the idea of plotting data before running an analysis to be somewhat insulting and beneath their dignity. What they fail to understand is that a meaningful graph IS a statistical analysis.

I’m a statistician and I could bore you to tears with story after story about running an analysis that began and ended with basic plots of the data – in other words – once my engineer/doctor/line manager/scientist/division head/CEO/technician looked at the plots I had generated the project ended because the graphs told us everything we needed to know with respect to the source of the problem and its solution.

In order to get some idea of how non-normal data generated using a random number generator with an underlying normal distribution can look, I would recommend you borrow a copy of Fitting Equations to Data – Daniel and Wood and look at the probability plots in the appendices of Chapter 3.

In order to get some idea of the power of real graphs I would recommend you borrow all four of Tufte’s books on graphs and graphical methods (The Visual Display of Quantitative Information, Visual Explanations, Envisioning Information and Beautiful Evidence) and “read” them. I put “read” in quotes because, while there is text in his books, the books are graphs, all kinds of graphs, and he provides the reader with a very clear visual understanding of graphical excellence and what a proper graph can do.

4
#248564

Robert Butler
Participant

I think you are still misunderstanding the issue of sample size. Your statement “…covers 80% of the population with 95% confidence…” has no meaning.

Let’s pretend you have done all of the usual things you need to do with respect to visually inspecting the distribution of those 100 samples (histogram, data on a normal probability plot, time plot of data, etc.) and what you have observed is a uni-modal distribution which is “acceptably” symmetric (this being a judgement call) and whose time plot is “acceptably” random (no obvious trending over time).

Given this the next step would be to compute the sample mean of those 100 samples and its associated standard deviation.

Armed with this information you can address any of the following questions:

1. How small a sample would I need to compare the mean of a new sample to the mean of the existing sample and be certain with a probability of 95% that the mean of the new sample was not significantly different from the mean of the 100 sample baseline?

2. How small a sample would I need to compare the standard deviation of a new sample to the standard deviation of the existing sample and be certain with a probability of 95% that the standard deviation of the new sample was not significantly different from the standard deviation of the 100 sample baseline?

3. Given the 100 sample mean and the 100 sample standard deviation what kind of a spread around the sample mean would encompass 95% of the data – or, in your case, what kind of a spread around the sample mean would encompass 80% of the data where the data is:

a. existing/future individual samples
b. the means of samples of size X where X is the count of a new sample whose total is something less than 100
c. the means of future samples of the same size as my original (100 samples)

Given what you have posted my guess (and I could be wrong about this) is question 3a is the one you want answered.

If you want to know the expected range of individual measurements around the mean of the results from your 100 samples then the equation would be:

#1 Range = sample mean of the 100  +- (1.987*sample standard deviation/sqrt(n))  where n is the sample size. Since you are interested in the estimate of the range for a single sample, n = 1.  The 1.987 is an interpolation between the tabled fractional points of a t distribution where, for 95%, the value for 60 degrees of freedom is 2.00 and the value for 120 degrees of freedom is 1.98.

So, the estimate of the range of single values for 80% of the population would be:

#2 Range = sample mean of the 100  +- (1.291*sample standard deviation/sqrt(n))  where, again, n = 1 and 1.291 is the interpolation between the tabled fractional points of a t distribution where, for 80%, the value for 60 degrees of freedom is 1.296 and the value for 120 degrees of freedom is 1.289.

Thus, if you took a new measurement from the same process and you wanted to be sure that single measurement could be viewed as coming from the central 80% of the distribution of your process output (as defined by your 100 sample check) you would examine it relative to the #2 Range – if it was inside those limits then you would have no reason to believe it was not a sample which was representative of the central 80% of your process output.
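To make the arithmetic for the #1 and #2 ranges concrete, here is a small sketch. The baseline mean of 50.0 and standard deviation of 4.0 are made-up numbers, and the function name is hypothetical. The multipliers 1.987 and 1.291 are the interpolated t values quoted above.

```python
import math


def range_about_mean(sample_mean, sample_sd, t_value, n=1):
    """Return (low, high) = mean -/+ t * sd / sqrt(n); n = 1 for single measurements."""
    half_width = t_value * sample_sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Hypothetical summary statistics from the 100 sample baseline.
baseline_mean = 50.0
baseline_sd = 4.0

range_95 = range_about_mean(baseline_mean, baseline_sd, t_value=1.987)  # #1 Range
range_80 = range_about_mean(baseline_mean, baseline_sd, t_value=1.291)  # #2 Range
print("95%:", range_95)
print("80%:", range_80)

# A new single measurement is checked against the 80% limits.
new_measurement = 53.2
inside_80 = range_80[0] <= new_measurement <= range_80[1]
print("inside central 80%:", inside_80)
```

With these invented numbers the 95% range works out to roughly 42.05 to 57.95 and the 80% range to roughly 44.84 to 55.16, so a new reading of 53.2 gives no reason to suspect it came from outside the central 80% of the process.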

0
#248543

Robert Butler
Participant

As written, the answer to your question is to sample 80% of the data – there’s no estimation involved.

Sample size questions focus on moments of a distribution (things like means, standard deviations, etc.) of a sample and are phrased in the following manner: How many samples do I need to take in order to be certain that the mean/standard deviation/percent defective/percent response/percent change/etc. of the sample is not significantly different from a target?  The target can be things like customer specified means or standard deviations, gold standard targets, moment measures of an earlier sample from the process before we made a change, etc.

So, if you are interested in sample sizes then your question needs to be recast – for example:

I have 100 batches of process data. How many samples will I need to take to be 80% certain that my sample mean is within plus/minus some range of a target mean?
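Once the question is recast this way it can be answered with a standard textbook calculation. The sketch below uses the normal approximation n = (z * sd / margin)^2 and treats the standard deviation as known from the 100 batches. The standard deviation of 4.0 and margin of 1.0 are invented values, and this formula is only one of several ways the calculation could be set up.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sd, margin, confidence):
    """Normal-approximation sample size so the sample mean is within +/- margin."""
    # Two-sided: 80% confidence uses the 90th percentile of the standard normal.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sd / margin) ** 2)


# Hypothetical inputs: sd estimated from the 100 batches, margin chosen by the user.
n_needed = sample_size_for_mean(sd=4.0, margin=1.0, confidence=0.80)
print(n_needed)
```

Tightening the margin or raising the confidence level drives the required sample size up quickly, which is why the range and the level of certainty have to be stated before the question has an answer.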

1
#248514

Robert Butler
Participant

@KatieBarry Ah Ha! So it IS resurrect old threads month!!!  :-)

I’m sorry to have to admit this but I don’t think I’ve ever checked the newsletter. Thanks for the heads up.

0
#248509

Robert Butler
Participant

@KatieBarry – is this resurrect old threads month?  This isn’t a complaint – just wondering.

0
#248449

Robert Butler
Participant

Minitab is great so I’m sure they appreciate the 2020 tip of the hat in their direction but this thread has been dormant since 2013 so the original poster might be a bit slow in responding.  :-)

0
#248418

Robert Butler
Participant

There isn’t a simple guideline. Which one of  the design methods you use will depend on what it is that you want to do and the kinds of variables you have.

For example:

1. Are the variables of interest continuous (or can they be treated as continuous), are they nominal, or are they a mixture of the two?

2. Are you trying to assess mixtures?

3. Are there known combinations of independent variables that either cannot be physically run or, if run, would produce something you already know will be of no use for the final results of the experiment?

4. What can you afford in terms of time/money/effort?

5. Are there restrictions with respect to randomizing the run order of your design which will require you to block your design on certain variables?

6. Is this going to be an initial investigation with a number of variables whose impact is: well defined, not well understood, of some interest because we suspect it might be important, included out of curiosity, or maybe a mix of all of the preceding?

7. Is this a situation where you are fairly certain of the effect of the variables of interest and are interested in seeing if there is anything better within the current parameter space?

8. Do you expect the relationship between the variable level and the output to be something other than a simple linear one?  If so do you have the time/money/effort/desire to check for all possible curvilinear behavior or only just a select few?

9. Do you think there might be some synergistic effects (interactions) between certain variables?  If so which ones?

etc.

Once you have the answers to questions like these then you can press on with a discussion concerning the merits of the various choices for DOE.  So, if you have something in mind please provide the answers to the above questions and either I or someone else will be happy to provide additional information/advice.

0
#248274

Robert Butler
Participant

If you have multiple responses then presumably you have a predictive equation for each of those responses.  If that is the case then you should have a minimum and a maximum level of desirability for each response.  If this is true then take your equations – put them in a program, assemble the matrix of X’s and run the equations against the matrix.  Next, tell the machine to sort through the predictions with the restrictions on minimum and maximum acceptable criteria for all of the responses (this done simultaneously) and find those that fall within your specifications.

What you will have will be a matrix consisting of settings and the corresponding predicted responses. Since it is very unlikely you will find a combination of X’s that gives you the best of everything you will have to look over the predictions and decide for yourself which set of X’s results in a group of predicted responses that are the “best”.
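A minimal sketch of that sorting step is shown below. Everything in it is hypothetical: the two predictive equations, the response names, the acceptable limits and the grid of X settings are stand-ins for whatever your own regression models and specifications happen to be.

```python
from itertools import product


# Hypothetical predictive equations for two responses in two X's.
def predicted_strength(x1, x2):
    return 40 + 3.0 * x1 + 1.5 * x2 - 0.4 * x1 * x2


def predicted_cost(x1, x2):
    return 10 + 2.0 * x1 + 0.5 * x2


# Hypothetical minimum and maximum acceptable values for each response.
limits = {"strength": (50.0, 60.0), "cost": (0.0, 20.0)}

x1_levels = [0, 1, 2, 3, 4, 5]
x2_levels = [0, 2, 4, 6]

acceptable = []
for x1, x2 in product(x1_levels, x2_levels):
    preds = {"strength": predicted_strength(x1, x2), "cost": predicted_cost(x1, x2)}
    # Keep a setting only if every response is inside its limits simultaneously.
    if all(low <= preds[name] <= high for name, (low, high) in limits.items()):
        acceptable.append((x1, x2, preds))

for x1, x2, preds in acceptable:
    print(x1, x2, preds)
```

The list that comes out is the reduced matrix of settings and predictions. Choosing the "best" row from it is still a judgement call.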

0
#247865

Robert Butler
Participant

So if I’m understanding correctly the question you are asking is how to generate a fit to the data in your plot that will be in the form of a predictive equation.

If that’s the case then, based on just looking at the plots, my first try would be a simple linear regression using the terms pipe motion, the square of pipe motion, stiffness and the interaction of stiffness and pipe motion and/or the interaction of stiffness and pipe motion squared.

The other first try option could be a form of one of Hoerl’s special functions with an additional interaction term.

The linear form would be ln(vessel motion) = function of ln(pipe motion), pipe motion, stiffness and an interaction term of either stiffness with pipe motion or stiffness with ln(pipe motion).

For every attempt at model building you will need to run a full residual analysis.  The plots of the residuals vs the predicted, independent variables, etc. will tell you what you need to know as far as things like model adequacy,  missed terms, influential data points, goodness of fit, etc.

Repeating this exercise with other terms that the residuals might suggest should ultimately result in a reasonable predictive model – be aware – any model of this type will have an error of prediction associated with it and your final decision with respect to model accuracy will have to take this into account.
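For readers who want to see the mechanics of the first try model, the sketch below fits vessel motion to pipe motion, pipe motion squared, stiffness and the stiffness by pipe motion interaction by ordinary least squares and returns the residuals. The data is fabricated from an arbitrary equation plus a small alternating wiggle, and all function and variable names are hypothetical. Any regression package will do the same job with far less typing.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def design_row(pipe, stiff):
    """Intercept, pipe motion, pipe motion squared, stiffness, interaction."""
    return [1.0, pipe, pipe ** 2, stiff, stiff * pipe]


def fit_least_squares(rows, y):
    """Solve the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Fabricated data: 6 pipe motion levels at each of 3 stiffness levels.
pipe_levels = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
stiffness_levels = [1.0, 2.0, 3.0]
rows, vessel = [], []
for k, (pipe, stiff) in enumerate((p, s) for s in stiffness_levels for p in pipe_levels):
    rows.append(design_row(pipe, stiff))
    wiggle = 0.05 * (-1) ** k  # stand-in for measurement noise
    vessel.append(2.0 + 1.5 * pipe - 0.2 * pipe ** 2 + 0.8 * stiff + 0.3 * stiff * pipe + wiggle)

coefficients = fit_least_squares(rows, vessel)
predicted = [sum(c * x for c, x in zip(coefficients, row)) for row in rows]
residuals = [y - p for y, p in zip(vessel, predicted)]
print([round(c, 3) for c in coefficients])
```

The `residuals` and `predicted` lists are what you would plot against each other, and against each independent variable, for the residual analysis.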

If you don’t know much about regression methods you will need to borrow some books through inter-library loan.

I’d recommend the following:

Applied Regression Analysis – Draper and Smith – read, understand, and memorize the first chapter – the second chapter is just the first chapter in matrix notation and may not be of much use to you. You will need to read, thoroughly understand and do everything listed in Chapter 3 – The Examination of the Residuals.

Regression Analysis by Example – Chatterjee and Price – an excellent companion to the above and, as the book title says – it provides lots of examples.

Fitting Equations to Data – Daniel and Wood – from your standpoint the most useful pages of the book would probably be pages 19-27 (in the second edition – page numbers might be slightly different in later editions) for the chapter titled “One Independent Variable” – yes, I know, you have 2 variables – but the plots and the methods will help get you where you want to go.

You will also want to follow what these books have to say about models with two independent variables and interaction terms (also known as cross-product terms).

0
#247854

Robert Butler
Participant

I’m missing something.

The idea of an experimental design is that you have little or no idea of the functional relationship (if any) between variables you believe will have an effect on the output and the level(s) of the output itself.  You put together a design which consists of a series of experimental combinations of the independent variables of interest and you then go out into the field/factory/seabed floor/ hospital OR/whatever and physically construct the various experimental combinations, run those combinations, get whatever output you may get from the experimental run, and then use that data to construct a model.

It sounds like you already have a model which (I assume) is based on physical principles and mirrors what is known about seabed stiffness/pipe motion behavior.  If you already have a model (which your graph suggests is the case) then all that is left is either a situation where you want to go out on the seabed and run some actual experiments with pipe motion to see if the current model matches (within prediction limits) what is actually observed or a situation where you want to run some matrix of combinations of seabed stiffness and pipe motion just to see what the model predicts.

If it is the former situation then the quickest way to test your model would be to actually run a simple 2×2 design where you have two seabed conditions and two levels of pipe motion and where you try to find settings that are as extreme as you can make them.  If it is the latter then, unless it is very costly to run your model, I don’t see why you wouldn’t just run all of the combinations and see what you see.  The problem with the latter is, if you already have an acceptable model, I don’t see the point of running all of the calculations since it doesn’t sound like you have any “gold standard” (actual experimental data) for purposes of comparison.

Shifting subjects for a minute.  Let’s just talk about DOE.

You said, “My question is that: how can I define different levels for each factor (e.g. 6 levels for factor 1, and 11 levels for factor 2) in the DOE method? which method in this theory is beneficial for my project?

I have searched around these topics and some guys said me: ‘the “I-Optimal” method of the RSM is helpful for your project. because in this method you can define several levels for each factor and there are not any issues if levels of each factor are not the same as other factor’s levels’.”

If what you have written above is an accurate summary of what you were told then it is just plain wrong.

If the graph you have provided really mirrors what is going on then you need to ask yourself the following question:

Why would I need all of those levels of stiffness and pipe motion?  Consider this – two data points determine a straight line, 3 data points will define a curve that is quadratic in nature, 4 data points will define a cubic shaped curve, 5 data points will define a quartic curve, and 6 will give you a quintic.  The lines in your graph are simple curves – no inflections, no double or triple inflections – in short no reason they couldn’t be adequately described with measurements at three different levels and thus, no reason to bother with more than three levels.

As for the notion concerning I-optimality (or any of the other optimalities) – they are just computer aided designs.  The usual reason one will opt for a computer aided design is because there are combinations of independent variables that are known to be hazardous, impossible to run, or are known to provide results that will be of zero interest (for example – we want to make a solid – we know the combination of interest will only result in a liquid) and if we try to examine the region of interest using one of the standard designs we will wind up with one or more experiments of this type in the DOE matrix.

Finally, I’ve never heard of any design that requires all variables to have the same levels.  If this was said then my guess is what was meant was when you are running a design it is not necessary for, say, all of the high levels of a particular variable to be at exactly the same level.

If that was the case then, yes, this is true – but this is true for all designs – basically, as long as the range of “high” values for the “high” setting of a given variable do not cross over into the range of the “low” settings for that variable you will be able to use the data from the design to run your analysis.  Of course, if you do have this situation then you will need to run the analysis with the actual levels used and not with the ideal levels of the proposed design.

0
#247849

Robert Butler
Participant

Maybe we are just talking past one another but I’m not sure what you mean by “the numerical output from the process creates the shape of the model.”

A random variable is a function that takes a defined value for every point in sample space. – Statistical Theory and Methodology in Science and Engineering – Brownlee – p. 21.  Given this I would say the process is the random variable and the output is just a quantification of what the random variable is doing. In other words – the random variable is going to determine the output distribution and all the numerical record is doing is cataloging that fact.

0
#247848

Robert Butler
Participant

Based on what you have posted the short answer to your question is – no – DOE won’t/can’t work in this situation.

The factors in an experimental design require either the ability to change the levels of each factor independently of one another or, in the case of mixture designs, to vary ratios of the variables in the mix.  In your situation you do not have the ability to change stiffness and pipe motion independently of one another – indeed it does not sound like you have the ability to change either of these variables in the real world setting.

The point of a DOE is to provide outputs that are the result of controlled, organized changes in inputs.  The DOE does not predict anything it only acts as a means of data gathering and once that data has been gathered you have to apply analytical tools such as regression to the existing data in order to understand/model the outputs of the experiments that were part of the DOE.

1
#247444

Robert Butler
Participant

I’m not sure what you mean by the regular rules.  If you mean the issue of control limits based on standard deviations about the means then you can still use them.  The issue is one of expressing the data in a form that will allow you to do this.  When you have an absolute lower bound which is 0 the usual practice is to add a tiny increment to the 0 values, log the data, find the mean and standard deviations in log units and then back transform.  What you will get will be a plot with asymmetric control limits.  Once you have that you can press on as usual.
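A minimal sketch of the log and back transform steps follows. The data, the size of the increment (0.001) and the use of 3 standard deviations computed directly from the logged values are all assumptions made for illustration. Your control chart software may estimate the standard deviation differently, for example from moving ranges.

```python
import math
from statistics import mean, stdev


def log_scale_limits(data, increment=0.001, k=3.0):
    """Center line and k-sigma limits computed in log units, then back-transformed."""
    # Only the exact zeros get the small increment.
    logged = [math.log(x + increment if x == 0 else x) for x in data]
    m, s = mean(logged), stdev(logged)
    return math.exp(m - k * s), math.exp(m), math.exp(m + k * s)


# Hypothetical measurements with a hard lower bound of 0.
readings = [0.0, 0.2, 0.5, 0.1, 1.2, 0.3, 0.0, 0.8, 0.4, 2.5, 0.6, 0.3]
lower, center, upper = log_scale_limits(readings)
print(lower, center, upper)
```

The back-transformed lower limit can never drop below zero and the upper limit sits much farther from the center line than the lower one does, which is the asymmetry mentioned above. The limits are sensitive to the increment you pick for the zeros, so it is worth trying a couple of values to see how much the picture changes.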

0
#247378

Robert Butler
Participant

That would make sense if the aspect of the supplements you believe most impacts the outcome is the percentage of total nitrogen, otherwise it won’t matter.

The idea is if total nitrogen was understood to be the critical component of the supplements you would chose the two supplements with the lowest and highest levels of total nitrogen and run the analysis on the basis of nitrogen content. This would allow you to treat the supplements as continuous.

Another option would be sub-classes of raw materials and supplements.  In that case you might have some raw materials with an underlying continuous variable of one kind and others with a different one.  If you have this case then you would do the same thing as mentioned above.  Of course, this would mean building experiments with combinations of raw materials and supplements.  My guess would be you couldn’t do this with the catalysts so they would have to remain categorical variables.  However, if you can express the raw materials and supplements as functions of critical underlying continuous variables and can mix them together then the number of experiments would be drastically reduced.

For example, let’s say your raw materials fall into three groups with respect to a critical continuous variable and your supplements can be grouped into 4 separate groups – you could put together an 8 run saturated design with these and then run the saturated design for each one of the separate catalysts. That would give you a 32 run minimum.  If you tossed in a couple of replicates (I’d go for a random draw from each of the 4 saturated designs so I had one replicate for each catalyst) you would have 36 runs and you could get on with the analysis.

0
#247351

Robert Butler
Participant

If you have a situation where you can only run one raw material with no combinations of other raw materials and similarly for the catalysts and the supplements and you have no underlying variables that would allow you to convert them to continuous measures then you are stuck with running all the possible combinations which would be 480.  I understand the need for blocking so what you will want to do is draw up the list of the 480 combinations, put them in order on an Excel spreadsheet, go out on the web and find a random number generator that will generate without replacement, and generate a random number sequence for 1-480.  Take that list, enter it in a second column in the Excel spreadsheet so there is a 1-1 match between the two columns, sort on the second column so now 1-480 are in order, and use that template to guide you in your selection of experiments (the first 10, then the second 10, etc.).

As a check it would be worth randomly adding into the list a few of the experiments (again do a random selection) to permit a check with respect to run-to-run variability.
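The spreadsheet procedure can also be done in a few lines of code. In this sketch the labels A1 to A15, C1 to C4 and B1 to B8 follow the naming used in the question, while the seed, the choice of 5 repeat runs and the variable names are assumptions made for illustration.

```python
import random
from itertools import product

raw_materials = [f"A{i}" for i in range(1, 16)]  # A1 ... A15
catalysts = [f"C{i}" for i in range(1, 5)]       # C1 ... C4
supplements = [f"B{i}" for i in range(1, 9)]     # B1 ... B8

# Every raw material / catalyst / supplement combination: 15 * 4 * 8 = 480.
combinations = list(product(raw_materials, catalysts, supplements))

rng = random.Random(2024)  # fixed seed only so the run order can be regenerated
run_order = combinations[:]
rng.shuffle(run_order)     # random order without replacement

# Randomly chosen repeat runs dropped in at random positions (5 is arbitrary).
repeats = rng.sample(combinations, 5)
for rep in repeats:
    run_order.insert(rng.randrange(len(run_order) + 1), rep)

# Work through the list ten experiments at a time.
blocks = [run_order[i:i + 10] for i in range(0, len(run_order), 10)]
print(len(combinations), len(run_order), len(blocks))
print(blocks[0])
```

The repeated runs give a direct look at run-to-run variability without changing the randomization of the rest of the list.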

0
#247345

Robert Butler
Participant

You will have to provide some more information before anyone can offer much of anything in the way of advice.  You said, “I have 15 difference raw materials (A1, A2… A15), 4 catalysts (C1, C2, C3, C4) and 8 supplements (B1, B2…B8). ”

Questions:

1. Is this a situation where the final process can have combinations of raw materials, catalysts and supplements or is this a situation where you can only have a raw material, a catalyst and a supplement?

2. If this is the first case and you have absolutely no prior information concerning the impact of any of the raw materials, catalysts, or supplements on your measured response and you have no sense of any kind of a rank ordering of raw materials, catalysts, or supplements as far as what you think they might do then your best bet would be to construct a D optimal screen design with 27 variables – this would probably be a design with a maximum experiment count of 40 or less.

a. If you do have prior information and you can toss in any combination from raw materials, catalysts, and supplements then I’d recommend building a smaller screen design using only those items from raw materials, catalysts, and supplements that you suspect will have either a minimum impact or a major impact on the response.

3. If this is a case where you can only have one from column A, one from column B and one from column C is there any way to characterize the raw materials, catalysts, and supplements with some underlying physical/chemical variable that will allow you to treat the raw materials, catalysts and supplements as continuous variables?

a. For example – let’s say the raw materials are from different suppliers but they are all the same sort of thing (plastic resin for example) and their primary differences can be summarized by examining say their funnel flow rates, the catalysts are all from different suppliers but they can be characterized by their efficiency at doing what they do and the supplements can also be characterized by whatever aspect of the final product they impact – say viscosity values.  If you have this situation then what you have is a situation with three variables at two levels.  In this case the lowest level for A would be the raw material that had the least of whatever it is and the high level would be the raw material that had the highest level of whatever and so on for catalyst and supplements.  This would be a design of about 10 experiments – a 2**3 with a couple of repeats.

4. If there isn’t any way to characterize the variables in the three categories and you have to treat everything as a type variable then you are stuck with having to run all combinations – 480.
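For the case in point 3a, where each category can be reduced to a continuous variable run at a low and a high level, the 2**3 design with a couple of repeats can be laid out as below. This is only a sketch: the factor names are placeholders, -1 and +1 stand for the low and high levels, and the seed and the choice of which two runs to repeat are arbitrary.

```python
import random
from itertools import product

factors = ["raw material", "catalyst", "supplement"]

# Full 2**3 factorial: every combination of low (-1) and high (+1).
design = [dict(zip(factors, levels)) for levels in product([-1, +1], repeat=3)]

rng = random.Random(7)            # fixed seed only so the layout is repeatable
repeats = rng.sample(design, 2)   # two design points chosen at random to repeat
runs = design + [dict(r) for r in repeats]
rng.shuffle(runs)                 # randomize the run order

for number, run in enumerate(runs, start=1):
    print(number, run)
```

That is 10 experiments instead of 480, which is the payoff for finding an underlying continuous variable in each category.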

1
#247303

Robert Butler
Participant

The short answer is the CLT doesn’t have anything to do with the capability analysis.

You are correct – the CLT refers to distributions of means not to the distribution of any kind of single samples from anything.  The statement from the WI (whatever that is)  “based on the CLT, a minimum sample size of n ≥ 30 (randomly and independently sampled parts) is required for a capability analysis. The stated rationale is that generally speaking, n ≥ 30 is a “large enough sample for the CLT to take effect”.”  comes from a complete misunderstanding of the CLT.  As for the second part about the 30 samples rule of thumb – that’s not worth much either.

If you need a counter-example to help someone understand the absurdity of that statement consider the case where the choices are binary and you take a sample of 30 and get a group of 0’s and a group of 1’s.  If you make a histogram of your measures you will have two bars – one at the value of 0 and the other at the value of 1 and no matter how hard you try that histogram cannot be made to look normal.

For a capability analysis you have to have some sort of confirmation that the single samples you have gathered exhibit an acceptable degree of normality.  The easiest way to do this is to take the data and plot it on normal probability paper and apply the fat pencil test.  If it passes that test you can use the data to generate an estimate of capability.

If it doesn’t, in particular if the plot diverges wildly from the perfect normal line and exhibits some kind of extreme curvature, you should consult Chapter 8 in Measuring Process Capability by Bothe. It is titled “Measuring Capability for Non-Normal Variable Data”.  That chapter provides the method for capability calculations when the data is non-normal.

1
#247224

Robert Butler
Participant

Control charts don’t test for normal distribution and, while the standard calculation for process capability requires approximately normal data, it is also true that process capability can be determined for systems where the natural underlying distribution of the data is non-normal.  Chapter 8 in Bothe’s book Measuring Process Capability titled “Measuring Capability for Non-Normal Variable Data” has the details concerning the calculation.

0
#247032

Robert Butler
Participant

@andy-parr  it’s a spammer.  They get through from time to time – I’ve had the same person/entity trying for the same thing.  I’ve seen a few in the past as well.  This one is just more persistent.

1
#246995

Robert Butler
Participant

From what I found after a quick Google search for the definition of TMV the big issue with that process is exactly what you stated in your second post: TMV focuses on the issue of repeatability and reproducibility.  Since assessment of repeatability and reproducibility are the main issue and since it was what I thought you were asking in your first post your question collapsed to one of determining a measure of the variability that could be used for test method validation. It was/is this question I am addressing and it has nothing to do with skill in using, or knowledge about, TMV.

The kind of process variability you will need for your TMV will have to reflect only the ordinary variability of the process. Ordinary process variation is bereft of special cause variation.  The variability associated with bi-modal or tri-modal data contains special cause variation.  As a result, should you try to use the variability measure from such data for a TMV your results will be, as I stated, “all over the map.”

I appreciate your explaining the issue with respect to the testing method – mandatory is mandatory and, I agree, there’s no point in worrying about it. I only questioned the methods since your first post left me with the impression you were just trying various things on your own.

So, taking your first and second posts together I think this is where you are.

1.       Your process can be multimodal for a variety of reasons.

2.       You don’t care about any of this since the spec is off in the west 40 somewhere and no one cares about the humps and bumps.

3.       For whatever reason(s) identifying the sources of special cause variation impacting your test setup is not permitted.

4.       When you tried to control for some variables you thought might drive multi-modality you still wound up with a bi-modal distribution in your experimental data.

5.       For the controlled study the two groups of product were distinctly different and both had a narrow distribution.

6.       You want to find some way to ignore/bypass the bi-modal nature of the controlled series of builds and come up with some way to use the data from the controlled build to generate an estimate of ordinary variation you can use for your TMV.

The easiest way to use your test data to attempt to get some kind of estimate of ordinary variation suitable for a TMV would be to go back to the data, identify which data points went with which mode, assign a dummy variable to the data points for each of the modes (say the number 1 for all of the data points associated with the first hump in the bi-modal distribution and number 2 for all of the data points in the second), and then run a regression of the measured properties against the two levels of the dummy variable.

Take the residuals from this regression and check them to see if they exhibit approximate normality (fat pencil test).  I’d also recommend plotting them against the predicted values and looking at these residual plots just to make sure there isn’t some additional odd behavior.

If the normal probability plot indicates the residuals are acceptably normal, and if the residual plots don’t show anything odd, then you will take the residuals and compute their associated variability and use this variability as an estimate of the ordinary variability of your process.

The reason you can do this is because by regressing the data against the dummy variable you have removed the variation associated with the existence of the two peaks and what you will have left, in the form of the residuals, is data that has been correctly adjusted for bi-modality.
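
A minimal sketch of the dummy-variable regression described above, in plain Python. The data are simulated stand-ins (two narrow humps with made-up centers and spread), not the poster's measurements, and the variable names are only illustrative. It shows how the residual standard deviation drops once the difference between the two modes has been regressed out.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical bi-modal data: two narrow humps (all values are made up).
mode1 = [random.gauss(0.87, 0.012) for _ in range(19)]
mode2 = [random.gauss(1.15, 0.012) for _ in range(21)]
y = mode1 + mode2
dummy = [1] * len(mode1) + [2] * len(mode2)  # 1 = first hump, 2 = second hump

# Simple least-squares fit of y on the dummy variable.
x_bar = statistics.fmean(dummy)
y_bar = statistics.fmean(y)
sxx = sum((x - x_bar) ** 2 for x in dummy)
sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(dummy, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * x for x in dummy]
residuals = [v - p for v, p in zip(y, predicted)]

# Residual standard deviation on n - 2 degrees of freedom
# (one each for the intercept and the slope).
n = len(y)
resid_sd = (sum(r ** 2 for r in residuals) / (n - 2)) ** 0.5
raw_sd = statistics.stdev(y)

print(f"raw SD (modes included):     {raw_sd:.4f}")
print(f"residual SD (modes removed): {resid_sd:.4f}")
```

The residuals would still need the normal probability plot and the residual-versus-predicted plot before the residual SD is used for anything.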

…and now the caveats

1.       You only have the results from the one controlled build – you are assuming the result of a second build will not only remain bi-modal but that the spread around the residuals from the analysis of data from the new build will not be significantly different from the first series.

2.       All you have done with the dummy variable regression is back out the bi-modal aspect – you have no idea if the residuals are hiding other sources of special cause variation, you have no idea how those unknown special causes may have impacted the spread of the two modes, and that means you have no guarantees of what you might see the next time you try to repeat your controlled analysis.

3.       If there is nothing else you can do then, as a simple matter of protecting yourself, I would recommend you insist on running a series of the same controlled build experiment over a period of time (say at three- month intervals) for at least a year and see what you see. My guess is, even if you manage to somehow only have two modes each time you run the experiment, you are going to see some big changes in the residual variability from test-to-test.

4.       If you should ever have to face a quality audit armed with only the results based on the above and the auditor is an industrial statistician you will have some explaining to do.

0
#246952

Robert Butler
Participant

After a Google search I’m guessing that TMV stands for Test Method Validation.  If this is the case then I think you need to back up and re-think your situation.

What follows is a very long winded discourse concerning your problem so, first things first. The short answer to your question is – no – running a TMV and pretending the bi-modal nature of your results does not matter is a fantastic way to go wrong with great assurance.

You said you have what appears to be a multimodal distribution of your output but everything meets customer spec.  Based on this and on some other things you said it sounds like the main question is not one of off-spec but just one of wondering why you get multimodal results.

1. You said, “We are now repeating TMV for that test, and due to its destructive nature, we must use ANOVA to determine %Tol. We also calculate STD ratio (max/min). 4 operators are required.”

2. You also said, “For TMV we limited the build process ranges – one temp, one operator etc and we have a distinctly bimodal distribution (19 data points between 0.850 and .894 and 21 data points between 1.135 and 1.1.163) LSL is 0.500. Reduction to a unimodal distribution is not worth the expense from a process standpoint, and we wouldnt know how to do so, since it may be incoming materials causing this distribution (All are validated, verified, suppliers audited etc. Huge headache, huge expense to make changes.)”

The whole point of test method validation is an assessment of such things as accuracy, precision, reproducibility, sensitivity, specificity, etc.  Since your product is all over the map and since your (second?) attempt at TMV gave results that were also all over the map for reasons unknown the data resulting from your attempt at TMV, as noted in #2 above, are of no value.  The data from #2 is not accurate and you cannot use that data to make any statements about test method precision.

I would go back to #2 and do it again and I would check the following:

1. Did I really have one temperature?

2. Was my operator really skilled and did he/she actually follow the test method protocol?

3. What about my “etc.”?  Were all of those things really under control or, if I couldn’t control them, did I set up a study to randomize across those elements I thought might impact my results (shift change, in-house temperature change, running on different lines, etc.)?

4. It’s nice to know the suppliers of incoming raw material are “validated, verified, suppliers audited etc.” but that really isn’t the issue. The two main questions are:

1) What does the lot-to-lot variation of all of those suppliers look like (both within a given supplier and, if two or more suppliers are selling you the “same” thing, across suppliers)?

2) When you ran the TMV in #2 did you make sure all of the ingredients for the process came from the same lot of material and from the same supplier?

I’m sure you have a situation where not all ingredients come from a single supplier but the question is this – for the TMV in #2 did you lock down the supplies for the various ingredients so that only one supplier and one specific lot from each of those suppliers was used in the study in #2?

The reason for asking this is because I’ve seen far too many cases where the suppliers had jumped through all of the hoops but when it came down to looking at the process the “same” material from two different suppliers or even the “same” material from a single supplier was not, in fact, the “same”.  The end result for this lack of “sameness” is often the exact situation you describe – multimodal distributions of final product properties.

A couple of questions/observations concerning point #1:

1. I don’t see why you think you need ANOVA to analyze the data – nothing in your description would warrant limiting yourself to just this one method of analysis.

2. You stated you needed 4 operators yet in #2 you said you were using one operator. As written both #1 and #2 are discussing TMV so why the difference in number of operators?

0
#246877

Robert Butler
Participant

I would disagree with respect to your view of a histogram for delivery time.  I would insist it is just what you need to start thinking about ways to address your problem.

Given: Based on your post – just-in-time (JIT) for everything is a non-starter.

Under these circumstances the issue becomes one of minimization of the time spread for late delivery.

You know 75% of the parts have a time spread of 0-6 days.  What is the time spread for 80, 90, 95, and 99%?  Since all parts are created equal and all that matters is delivery time then the focus should be on those parts whose delivery time is in excess of say 95% of the other parts.  Once you know which parts are in the upper 5% (we’re assuming the spread in late delivery time for 95% of the parts is the initial target) the issue becomes one of looking for ways to ensure a delivery time for the 5% is less than or equal to whatever the upper limit in late delivery time is for the 95% majority.
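
As a rough sketch of that first step, the percentile cut points and the list of parts above the 95% limit could be pulled out as follows. The late-delivery times and part names here are simulated placeholders, not real data.

```python
import random
import statistics

random.seed(2)

# Hypothetical late-delivery times in days, one per part number (made-up data).
late_days = {f"part-{i:03d}": round(random.expovariate(1 / 4.0), 1) for i in range(200)}

values = sorted(late_days.values())
# quantiles(n=100) returns the 99 cut points for the 1st..99th percentiles.
cuts = statistics.quantiles(values, n=100, method="inclusive")
for pct in (75, 80, 90, 95, 99):
    print(f"{pct}% of parts are late by at most {cuts[pct - 1]:.1f} days")

# Parts whose lateness exceeds the 95th percentile: the first group to investigate.
limit_95 = cuts[94]
worst = {part: d for part, d in late_days.items() if d > limit_95}
print(f"{len(worst)} parts exceed {limit_95:.1f} days")
```

Once the upper 5% has been dealt with, the same calculation can be repeated with a new cut point.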

Having dealt with things like this in the past my guess is you will first have to really understand the issues surrounding the delivery times of the upper 5%.  I would also guess once you do this you will find all sorts of things you can change/modify to pull late delivery times for the upper 5% into the late delivery time range of the 95%.

Once you have the upper 5% in “control” you can choose a new cut point for maximum late delivery time and repeat.  At some point you will most likely find a range of late delivery time which, while not JIT, is such that any attempt at further reduction in late delivery time will not be cost effective.

2
#246851

Robert Butler
Participant

Found it.   It’s about 8 months back and it is the same question – the question also has other wrong answers as discussed in the thread.

https://www.isixsigma.com/topic/lssbb-exam-question-help/

0
#246850

Robert Butler
Participant

This is the same garbage question another poster asked about several months ago – the answer was wrong then and it is wrong now.

Three factors at two levels is a 2**3 design = 8 points.  There is no such thing as a 32 factorial design.
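
For anyone who wants to see the count, here is a short sketch that enumerates the runs of a full factorial with three factors at two levels each. The factor names A, B, C and the -1/+1 coding are just illustrative choices.

```python
from itertools import product

factors = ("A", "B", "C")   # hypothetical factor names
levels = (-1, 1)            # coded low/high levels

# Every combination of two levels across three factors.
design = list(product(levels, repeat=len(factors)))
for run, settings in enumerate(design, start=1):
    print(run, dict(zip(factors, settings)))

print(len(design))  # 2**3 = 8 runs
```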

Since it is evening I’ll rummage around a bit and see if I can find the earlier post.  If I’m remembering correctly there are other wrong answers on that test.

0
#246570

Robert Butler
Participant

It’s not a matter of calculating the t value and then transferring  it into a p-value.  What you are doing is computing a t statistic and then checking that to see if the value you get meets the test for significance.  Small t-values breed large p-values and conversely.  :-)

You are doing what I recommended and your plot is telling you what you computed – there is a slight offset between the two groups of data which means that the averages are numerically different but there really is no statistically significant difference.  If you want an assessment of the means then you could turn the analysis around – treat the yes/no as categorical and run a one way ANOVA on the results.  You’ll get the mean values for the two choices and you will also get no significant difference.
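
A small sketch of why the one-way ANOVA will tell the same story as the t statistic when there are only two groups: the ANOVA F statistic is the square of the pooled two-sample t statistic. The numbers below are made up for illustration; the p-value itself would come from a t or F table (or a statistics package), which is not done here.

```python
import statistics

# Hypothetical percentages for the two groups (made-up numbers).
no_group = [41.0, 44.5, 39.8, 47.2, 43.1, 45.0]
yes_group = [44.2, 46.9, 42.5, 49.8, 45.7, 47.3]

n1, n2 = len(no_group), len(yes_group)
m1, m2 = statistics.fmean(no_group), statistics.fmean(yes_group)

# Pooled two-sample t statistic.
ss1 = sum((v - m1) ** 2 for v in no_group)
ss2 = sum((v - m2) ** 2 for v in yes_group)
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
t_stat = (m2 - m1) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

# One-way ANOVA F statistic for the same two groups.
grand = statistics.fmean(no_group + yes_group)
ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
f_stat = (ss_between / 1) / pooled_var  # 1 df between, n1 + n2 - 2 df within

print(f"means: no = {m1:.2f}, yes = {m2:.2f}")
print(f"t = {t_stat:.3f}, F = {f_stat:.3f}, t squared = {t_stat ** 2:.3f}")
```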

All of the above is based on commenting on the overall plot you have provided.  There are a couple of interesting things about your plot you should investigate. It is a small sample size and it looks like your lack of significance is driven  by 4 points – the two points at -1 that are much lower than the main group and the two points at 1 that are higher than the main group.

As a check – remove those 4 points and run the analysis again – my guess is you will either have statistical significance (P < .05) or be very close.  If you do get significance, and if it is a situation where you were expecting significance, then you will want to go back to the data to see if there is anything out of the ordinary with respect to those 4 points.  If there is something that physically differentiates those points from their respective populations and if their deletion results in significance then I would recommend the following:

1. Report out the findings with all of the data.

2. Include the plot as part of the report.

3. Emphasize the small size of the sample.

4. Report the results with the deletion of the 4 points.

5. Comment on what you have found to be different about the  4 points.

6. Recommend additional samples be taken with an eye towards controlling for whatever it was that you found to be different about the 4 points and re-run the analysis with an increased sample size to see what you see.

• This reply was modified 7 months, 2 weeks ago by Robert Butler. Reason: typo
1
#246525

Robert Butler
Participant

For the casual reader who is late to this spectacle and who does not wish to plow through all 94 posts allow me to provide a summary.

The OP has a pure theory, based on axioms and postulates of his choosing with no demonstrable connection to empirical evidence, which he has used to evaluate various and sundry aspects of MINITAB analysis and works by Montgomery.  His theory generates results that are wildly at odds with tried and tested methods of analysis as delineated by MINITAB and Montgomery.

Through a series of polite and courteous posts it has been pointed out to the OP that extraordinary claims, such as his, require extraordinary proof and that the burden of providing this proof is on his shoulders and not on the shoulders of those whom he has insisted are incorrect.

The OP’s response to this well understood tenet of scientific inquiry has been one continuous run of Proof-By-Volume consisting of raw, naked, ugly, disgusting, hateful, ad hominem attacks on anyone questioning his theory and the conclusions he has drawn from it. The OP has made it very clear he not only has no intention of providing such proof but he also does not think proof is necessary.

His intransigence with respect to the idea of the need to provide extraordinary proof is mirrored in his choice of words. An assessment of his 51 posts (as of 5 March 2020) indicates his favorite words and word assemblies are (in no particular order)

Wrong – 47 times – applied to everyone but himself

Theory – 52 times – by far the favorite

Ignorance, Incompetence and Presumptuousness – all together or at least one occurrence of any of the three words – 41 times  – again – as applied to everyone else

As the most prolific responder to his posts I’ve tried to politely point out the shortcomings of his theory and why it isn’t viable.  It has been, as I thought it probably would be, a vain effort. The only thing I’ve received in return is a barrage of ad hominem attacks and multiple accusations of waffling (6 times) (to waffle – to be unable to make a decision, to talk a lot without giving any useful information or answers). Waffling – back at you OP.

There really isn’t anything else to say – the OP has constructed a theory based on postulates and axioms of his choosing. He has reported the results of his theory as though they were correct while the empirical evidence he himself has compiled in this regard says otherwise, he has provided no proof of the correctness of his theory, and he reacts violently to any suggestion that his theory is wrong.  I’m sure the OP will want to have the last word, as well as the one following that – go for it Fausto  – QED (Quite Enough Discussion)

1
#246500

Robert Butler
Participant

As I noted in my post to your thread yesterday your situation is exactly that described by Abraham Kaplan and you have confirmed this to be the case in your post where you specifically stated:

“For me THEORY is

· The set of all LOGIC Deductions

· Drawn

· From Axioms and Postulates

· Which allow to go LOGICALLY from Hypotheses to Theses

· Such that ALL Intelligent and Sensible people CAN DRAW the SAME RESUL”

So, to reiterate – what you have is a pure theory which has no basis in empirical fact and which is, as you are so fond of saying – wrong.  What is particularly interesting about your situation is that you have conclusively proved you are wrong and you have demonstrated that fact time and again.

In the scientific world when one constructs what they believe to be a new theory the very first thing they do is make sure their theory is capable of accounting for all of the prior known facts in whatever area it is they are working.  Montgomery’s work, the methods and results of T Charts, etc. have been around for a long time and have withstood the close scrutiny and testing by myriads of scientists, engineers, statisticians, health practitioners, etc.  No one has found anything wrong with them.

If a real scientist constructs a theory and finds it at odds with prior known facts the first thing that individual will do is choose to believe there is something wrong with the theory and spend time either trying to adjust the theory to match the known facts or, if that proves to be impossible, scrap the theory and start over.  In your case, you tested your theory and found it at odds with Montgomery et al. and came to the instant conclusion your theory, based on a set of axioms and postulates of your choosing, was correct and everyone else was wrong.

As Kaplan noted – the world your theory has deduced will only mirror reality to the extent that the premise mirrors reality. The empirical evidence you have compiled in your uploaded rants about MINITAB and Montgomery’s text repeatedly confirms the fact that your theory (and most certainly the postulates and axioms upon which it is based) does not mirror reality and is therefore wrong.

Instead of recognizing this very obvious fact you have chosen to spin the failures of your theory as successes, indulge in self-glorification, and shout down and denigrate anyone and anything that challenges your personal view of yourself and your theory. This isn’t science. It is raw, naked, ugly, disgusting, hateful, ad hominem attacks of the lowest order – in other words – Proof By Volume. Mussolini would be so proud.

As for your demands for my peer reviewed papers all I can say is let’s stay focused on the topic of this thread.  This thread was started by you. It is not about me nor anyone else who has participated. It is about a guy named Fausto Galetto and his pure theory based on axioms and postulates which do not incorporate empirical evidence. Let’s keep it that way.

0
#246473

Robert Butler
Participant

In all the time I’ve been a participant on the Isixsigma Forums, this has to be the saddest thread I’ve seen or been a part of.

Whether you wish to acknowledge it or not @fausto.galetto, this is your situation:

From: The Conduct of Inquiry – Abraham Kaplan pp. 34-35

“It is in the empirical component that science is differentiated from fantasy. An inner coherence, even strict self-consistency, may mark a delusional system as well as a scientific one. Paraphrasing Archimedes we may each of us declare, “Give me a premise to stand on and I will deduce a world!” But it will be a fantasy world except in so far as the premise gives it a measure of reality. And it is experience alone that gives us realistic premises.”

You have a premise – your theory – and you deduced a world. And, based on your scornful rejection of @David007’s recommendation for simulations as well as the lack of actual data from processes you personally have run and controlled, that is all you have – a theory bereft of empirical evidence.

You wrote a paper describing the world resulting from your theory and you sent your description of that world to Quality Engineering. They gave your world some careful thought, compared it to the empirical evidence they had (as well as the current world view based on that evidence), found your world description wanting, and rejected it.

Based on your posts it appears you did the same thing with MINITAB. They too, examined your world, found it did not agree with, nor adequately describe, the empirical evidence they had on hand, and chose to ignore it.

All through the exchange of posts to this thread you have made it very clear that Fausto Galetto’s world based on pure theory is the only correct one and anyone who does not agree is ignorant, incompetent and presumptuous.  You, of course, have a right to believe these things but everyone else has a right to believe otherwise.

What makes this exchange of posts particularly depressing is the idea you believe a company like MINITAB would know about a serious flaw in their T Chart routine and do nothing about it. You seem to forget that hospitals and medical facilities routinely use the MINITAB T Chart.  If there was something seriously wrong with that part of the program there would be ample empirical evidence in the form of lots of dead patients attesting to the fact: There aren’t.   If there had ever been anything seriously wrong with the T Chart program the statisticians at MINITAB would have made proper corrections to the program before it was ever offered in their statistics package.

• This reply was modified 7 months, 3 weeks ago by Robert Butler. Reason: typo
0
#246438

Robert Butler
Participant

@fausto.galetto – sorry, I’m with @David007 on this one.  Your latest response(s) to me (and to him) are true to form – invective and shouting.  I know this is just poking the Tar Baby and prolonging a very sad string of posts at the top of the masthead of a very good site, but you really should re-read what you wrote and contrast that with my last post. There is nothing in any of the 8 statements you cited that has anything to do with THEORY – as you are so fond of putting it.

Given your violent reaction to anything you perceive as contrary to your views, I actually took some time to look over some of the numerous things you have uploaded to that storage site ResearchGate.  I also took some time to check out the reputation of that site. Three things are immediately evident:

1. The noble concept of “open access” is DOA and you are taking the life of your grant and the status of your professional reputation/career in your hands if you try to use anything on those sites as a basis/starting point for your work.

2. If the “paper” you sent to  Quality Engineering looked anything like the stuff you have uploaded to that storage site (open access with no apparent oversight) it is not surprising it was rejected.

3. Based on what you have posted here and on ResearchGate it appears you strongly favor the PBV method of scientific discourse (Proof By Volume).  That method works well in the world of politics and extreme religious movements but does not advance your cause (and hopefully it never will) in the world of engineering and science.

1
#246422

Robert Butler
Participant

The following statements – (I won’t bother to bold them since you have already)

1. “There is no “hidden theory”. Joel Smith has a good paper on t charts …Control charts for Nonnormal data are well documented”

2. “is your issue that the data is neither exponential nor Weibull, but something else?”

3. “If you are saying the chart fails to detect assignable causes, try simulating some exponential data.”

4. “But as I’ve repeatedly said with n=20 the model is going to be wrong anyway.”

5. “You really should play with some simulated exponential data. You’ll be surprised by what you see as “inherent variation””

6. “Minitab tech support is not needed. Calculate the scale (Exponential) or scale and shape (Weibull) using maximum likelihood. Compute .135 and 99.865 percentiles.”

7. “I anxiously await to see your rebuttal paper in any of Quality Engineering, Journal of Quality Technology, Technometrics, Journal of Applied Statistics, etc.”

8. “along with your spam practice”

are not goads, are not impolite, and they most certainly are not “IGNORANCE, INCOMPETENCE and PRESUMPTUOUSNESS” as those words are generally understood.

To goad is to provoke someone to stimulate some action or reaction – there is nothing provocative in any of the cited quotes. A rational assessment of the 8 quotes cited above is as follows: 1-6 are just plain statements of fact and/or polite questions concerning your take on the issue – nothing more.  #7 is a very polite and logical response to all of your posts that preceded it.  And #8 is an objective observation concerning your posting practice.

In your numerous posts to this thread you have made it very clear you interpret anything that goes against your personal views as arising from “IGNORANCE, INCOMPETENCE and PRESUMPTUOUSNESS”. What you choose to ignore is the obvious fact that your personal views are not shared by the people you are addressing on this forum, by the body of practitioners of science/engineering/process control in general, nor by the body of reviewed and published scientific/engineering literature.

1
#246416

Robert Butler
Participant

@fausto.galetto – In your post to @David007 on February 29 at 4:39 AM you spewed out a lot of hate and you concluded with the statement “Tell me your publications SO THAT I CAN READ them…. and I will come back to you and to your “””LIKERS”””!”

@David007 has been courteous and polite in every response he has made to your postings.  Your responses to his posts are nothing but shouting and invective. They are, by turns, irrational, ugly, vicious, denigrating, and hateful.  They have no place on a forum of this type.

1
#246342

Robert Butler
Participant

The problem you are describing has two parts.  The first part isn’t one of correlation rather it is one of agreement.  Things can be highly correlated but not be in agreement.  I would recommend you use the Bland-Altman approach to address the agreement part of the problem.

Bland-Altman

1. On a point by point basis take the differences between the two methods.

2. You would like to be able to compare these differences to a true value but since that is rarely available use the grand mean of the differences. Compute the standard deviation of the differences about that grand mean and mark the limits at ±1, 2, and 3 standard deviations.

3. Next compute the averages of the two measures that were used for computing each difference.

4. Plot the differences (Y) against their respective average values (X).

5. If you have a computer package that will allow you to run a kernel smoother on this data and draw the fitted line do so.  If you don’t then you can use a simple linear regression and regress Y against X and look at the significance of the slope.  If you can only do the simple linear fit you will want to plot that line on the graph.  The main reason for doing this is to make sure the significance (or lack thereof) isn’t due to a few extreme points.

To determine agreement you will want to see if the mean of the differences is close to zero. If there is an offset, but no trending then you will have identified a bias in the measures (based on the example you have provided this is to be expected).  If you have a significant trending and it is not being driven by a few data points then you will have a situation where the two methods are not in agreement. If they are not in agreement then you have some major issues to address.
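
The Bland-Altman steps listed above can be sketched in a few lines of plain Python. The paired readings are invented for illustration, the standard deviation is taken over the individual differences (the usual Bland-Altman convention), and only the simple linear fit from step 5 is shown, not a kernel smoother or the plot itself.

```python
import statistics

# Hypothetical paired readings from the two methods (made-up numbers).
method_a = [10.1, 11.4, 12.0, 12.9, 10.8, 13.2, 11.7, 12.4]
method_b = [10.4, 11.6, 12.3, 13.1, 11.0, 13.5, 11.9, 12.7]

# Step 1: point-by-point differences.
diffs = [a - b for a, b in zip(method_a, method_b)]

# Step 2: grand mean of the differences and the 1, 2, 3 SD limits around it.
mean_diff = statistics.fmean(diffs)
sd_diff = statistics.stdev(diffs)
limits = {k: (mean_diff - k * sd_diff, mean_diff + k * sd_diff) for k in (1, 2, 3)}

# Step 3: average of the two measures behind each difference.
avgs = [(a + b) / 2 for a, b in zip(method_a, method_b)]

# Steps 4-5: (avgs, diffs) are the points to plot; here only the least-squares
# slope of difference on average is computed as a check for trending.
x_bar = statistics.fmean(avgs)
sxx = sum((x - x_bar) ** 2 for x in avgs)
slope = sum((x - x_bar) * (d - mean_diff) for x, d in zip(avgs, diffs)) / sxx

print(f"mean difference (bias): {mean_diff:.3f}")
for k, (lo, hi) in limits.items():
    print(f"+/- {k} SD limits: {lo:.3f} to {hi:.3f}")
print(f"slope of difference vs. average: {slope:.4f}")
```

A mean difference well away from zero with a slope near zero would point to a constant bias; a clearly non-zero slope that is not driven by a few points would point to lack of agreement.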

The second part is the issue of the approximate delta of .25 relative to the lower (9.8) and upper (13.8) limits.

It looks like you have the following possibilities:

1. The difference is approximately .25 and both processes are inside the 9.8-13.8 range

2. The difference is approximately .25 and one process is outside either the upper or lower target and the other process isn’t.

3. The difference is approximately .25 and both processes are outside of either the upper or lower targets.

Your post gives the impression that a single instance of the approximate delta of .25 being associated with either of the processes outside the targets counts as a fail.  If this is true then the question becomes one of asking what are the probabilities of getting a difference of approximately .25 given that one or both of the processes are outside the target range and building a decision tree based on what you find.

0
#246301

Robert Butler
Participant

As you noted – given a big enough sample (or a small one for that matter), even from a package that generates random numbers based on an underlying normal distribution, there is an excellent chance you will fail one or more of the statistical tests… and that’s the problem – the tests are extremely sensitive to any deviation from perfect normality (too pointed a peak, too heavy tails, a couple of data points just a little way away from 3 std, etc.).  This is why you should always plot your data on a normal probability plot (histograms are OK but they depend on binning and can be easily fooled) and look at how it behaves relative to the perfect reference line.  Once you have the plot you should do a visual inspection of the data using what is often referred to as “the fat pencil” test – if you can cover the vast majority of the points with a fat pencil placed on top of the reference line (and, yes, it is a judgement call) it is reasonable to assume the data is acceptably normal and can be used for calculating things like the Cpk.

I would recommend you calibrate your eyeballs by plotting different sized samples from a random number generator with an underlying normal distribution on normal probability paper to get some idea of just how odd this kind of data can look (you should also run a suite of tests on the generated data to see how often the data fails one or more of them).
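
One way to set up that eyeball-calibration exercise, sketched with the standard library only. The function names are made up for this example, the (i - 0.5)/n plotting position is one common convention among several, and the correlation printed at the end is only a crude stand-in for actually looking at the plots.

```python
import random
from statistics import NormalDist

random.seed(3)
std_normal = NormalDist()  # mean 0, standard deviation 1

def normal_plot_points(sample):
    """Return (theoretical quantile, ordered value) pairs for a normal probability plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Plotting positions (i - 0.5) / n; other conventions exist.
    return [(std_normal.inv_cdf((i - 0.5) / n), v)
            for i, v in enumerate(ordered, start=1)]

def straightness(points):
    """Correlation between theoretical quantiles and data; 1.0 is a perfect line."""
    xs, ys = zip(*points)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Three samples at each size, all drawn from a true normal distribution.
for n in (10, 20, 50, 200):
    for rep in range(3):
        sample = [random.gauss(0, 1) for _ in range(n)]
        r = straightness(normal_plot_points(sample))
        print(f"n = {n:3d}, sample {rep + 1}: r = {r:.3f}")
```

Feeding the pairs from `normal_plot_points` to any plotting tool gives the picture itself; swapping `random.gauss` for `random.expovariate` or a mixture of two normals shows what non-normal data look like on the same axes.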

I can’t provide a citation for the hyper-sensitivity of the various tests but to get some idea of just how odd samples from a normal population can look I’d recommend borrowing a copy of Fitting Equations to Data – Daniel and Wood (mine is the 2nd edition) and look at Appendix 3A – Cumulative Distribution Plots of Random Normal Deviates.

I would also recommend you look at normal probability plots of data from distributions such as the bimodal, exponential, log normal, and any other underlying distribution that might be of interest so that you will have a visual understanding of what normal plots of data from these distributions look like.

0
#246279

Robert Butler
Participant

I think you will have to provide more information before anyone can offer much in the way of suggestions.  My understanding is you have two populations which have a specified (fixed) difference between them and you want to know if this fixed difference is less than or greater than the difference between the lowest and highest spec limits associated with the two populations.  If this is the case then all you have is an absolute comparison between two fixed numerical differences.  Under these circumstances there’s nothing to test – either the fixed delta is greater or less than the fixed delta between greatest and least spec and things like correlation have nothing to do with it.

0
#246276

Robert Butler
Participant

If the measurements within an operator for each part are independent measures and not repeated measures then you could take the results across parts for each operator and regress the measurements against part type (you would have to construct a dummy variable for parts to do this).  The variability of the residuals from this regression would be a measure of the variation within each operator.

If you then reversed the process and took all of the measurements for a single part across operators and did the same thing with operators as the predictor variable the variation of the residuals for that model would be a measure of the variability within a part.

If you wanted to get an estimate of the process variation controlling for part type and operator you would take all of the data and build a regression with part and operator as the predictor variables.  The variation of the residuals would be an estimate of ordinary process variation with the effect of operators and parts removed.
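
A minimal sketch of that last regression for the simplest case: a balanced layout with one independent measurement per part/operator pair. With balanced data the least-squares fit of the additive part + operator model has a closed form, which is used here in place of building the dummy variables explicitly. All labels and numbers are simulated assumptions.

```python
import random
import statistics

random.seed(4)

parts = ["P1", "P2", "P3", "P4", "P5"]   # hypothetical part labels
operators = ["Op1", "Op2", "Op3"]        # hypothetical operator labels
part_effect = {p: random.gauss(0, 2.0) for p in parts}
oper_effect = {o: random.gauss(0, 0.5) for o in operators}

# One independent measurement per part/operator pair (balanced, made-up data).
data = {(p, o): 50 + part_effect[p] + oper_effect[o] + random.gauss(0, 0.3)
        for p in parts for o in operators}

grand = statistics.fmean(data.values())
part_mean = {p: statistics.fmean(data[p, o] for o in operators) for p in parts}
oper_mean = {o: statistics.fmean(data[p, o] for p in parts) for o in operators}

# For a balanced layout the additive-model prediction is
# part mean + operator mean - grand mean.
residuals = {key: data[key] - (part_mean[key[0]] + oper_mean[key[1]] - grand)
             for key in data}

# Degrees of freedom: observations minus fitted parameters
# (intercept, parts - 1, operators - 1).
df = len(data) - (len(parts) - 1) - (len(operators) - 1) - 1
resid_sd = (sum(r ** 2 for r in residuals.values()) / df) ** 0.5
print(f"residual SD with part and operator effects removed: {resid_sd:.3f}")
```

Unbalanced data would need a general regression routine rather than this shortcut, and the residuals still need the graphical checks.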

In all of these cases you would want to do a thorough graphical analysis of the residuals.  Since you would have the time stamp of each measurement you would want to include a plot of residuals over time as part of the residual analysis effort.  If you’ve never done residual analysis then I would recommend taking the time to read about how to do it.  My personal choice would be Chapter 3 of Draper and Smith’s book Applied Regression Analysis – 2nd edition – you should be able to get this book through inter-library loan.

The big issue is that of operator measurements on the same part.  You can’t just give an operator a part and have him/her run a sequence of three successive measurements – these would be repeated measures not independent measures and the variation associated with them would be much less than the actual variation within an operator.  You would need to space the measurements of the parts out over some period of time so that each measurement of a given part by each operator has some semblance of independence.

0
#246274

Robert Butler
Participant

The yes/no is the X variable and the percentage is the Y, therefore just code no = -1 and yes = 1 and regress the percentage against those two values.  As for the p-value, the “traditional” choice for significance of a p-value is < .05 so, using that criterion, a p of .127 says you don’t have a significant correlation between the yes/no and the percentage.  This would argue for the case that whatever is associated with yes/no is not having an impact on the percentages.

Since correlation does not guarantee causation and causation will not guarantee you will find correlation what you need to do (you should do this in every instance anyway) is put your residuals through a wringer before concluding that nothing is happening.  You would want to plot the residuals against the predicted values as well as against the yes/no response.  If there are other things you know about the data (for instance, you know it was gathered over time and you have a time stamp for each piece of data) you will want to look at the data and the residuals against these variables as well.

Since you had a good reason for suspecting a relationship a check of the residual patterns will help you find data behavior that might account for your lack of significance.  If the data structure is adversely impacting the regression you may see things like clusters of data, a few extreme data points, trends in yes/no choices that are non-random over time, etc. which are adversely influencing the correlation.

If you should find such patterns you would want to identify the data points to which they correspond and re-run the regression. If the revised analysis results in statistical significance then you will need to go back to the process and try to identify the source of the influential data points. If it turns out there is something physically wrong with those data points you could justify eliminating them from the analysis and reporting your findings but you will also want to make sure that you clearly discuss this decision in your report.
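For what it is worth, the coded regression and its residuals can be sketched in a few lines of plain Python. The percentages below are invented for illustration. The p-value itself needs a t distribution with n - 2 degrees of freedom, which the standard library does not provide, so the sketch stops at the t statistic.

```python
from math import sqrt
from statistics import mean

# Hypothetical data: (answer, percentage) pairs.
data = [("no", 42.0), ("no", 47.5), ("no", 39.0), ("no", 51.0),
        ("yes", 49.0), ("yes", 55.5), ("yes", 46.0), ("yes", 58.0)]

x = [1 if answer == "yes" else -1 for answer, _ in data]  # no = -1, yes = 1
y = [pct for _, pct in data]

# Ordinary least-squares slope and intercept.
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residuals: plot these against the fitted values, the yes/no code,
# and time order if a time stamp is available.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
resid_df = len(y) - 2
resid_sd = sqrt(sum(r * r for r in residuals) / resid_df)
t_slope = slope / (resid_sd / sqrt(sxx))

mean_yes = mean([yi for xi, yi in zip(x, y) if xi == 1])
mean_no = mean([yi for xi, yi in zip(x, y) if xi == -1])
print(f"slope = {slope:.3f}, t = {t_slope:.2f} on {resid_df} df")
```

With the -1/+1 coding the slope is half the difference between the "yes" mean and the "no" mean, and the intercept is the midpoint of the two means.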

1
#246212

Robert Butler
Participant

There’s any number of ways you could do this.  You could run a regression with percentage as the Y and yes/no as the X.  You could use a two-sample t-test with the percentages corresponding to a “no” as one group and the percentages corresponding to a “yes” as the other group.  If the distribution of the percentages is crazy non-normal you might want to run the t-test and the Wilcoxon-Mann-Whitney test side by side to see if they agree (both either find or do not find a significant difference).
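A minimal sketch of the two group comparisons, using made-up percentages and only the Python standard library. It computes the pooled two-sample t statistic and the Wilcoxon-Mann-Whitney U statistic. Turning either into a p-value requires the appropriate reference distribution, which any statistics package will supply.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical percentages for the two groups.
yes = [49.0, 55.5, 46.0, 58.0]
no = [42.0, 47.5, 39.0, 51.0]

# Pooled two-sample t statistic (assumes roughly equal group variances).
n_yes, n_no = len(yes), len(no)
pooled_var = ((n_yes - 1) * variance(yes) + (n_no - 1) * variance(no)) / (n_yes + n_no - 2)
t_stat = (mean(yes) - mean(no)) / sqrt(pooled_var * (1 / n_yes + 1 / n_no))

# Mann-Whitney U: count the (yes, no) pairs where the "yes" value is larger,
# with ties counted as one half.
u_yes = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in yes for b in no)
u_no = n_yes * n_no - u_yes

print(f"t = {t_stat:.2f} on {n_yes + n_no - 2} df, U = {min(u_yes, u_no)}")
```

The pooled t statistic here is the same number you would get for the slope in the regression of percentage on a two-level yes/no code, which is why the regression and the t-test approaches agree.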

0
#246013

Robert Butler
Participant

The short answer to your question is there isn’t a short answer to your question.  The explanation provided by Wikipedia is a good starting place but you will have to take some time to not only carefully read and understand what is being said but also put pencil to paper and work through some math.

I’ve spent most of my professional career as an engineering statistician and biostatistician. I always keep paper and pencil handy when I’m reading a technical article and I take the time to convert what I think I have read into mathematical expressions.  I find when I do this I not only gain a better understanding of what I’ve read but, if I’ve misunderstood what was written, I find it is easier to see that I have misunderstood. This, in turn, helps me on my way to a correct understanding.

Perhaps if you had written out, in mathematical form, the highlighted section of the quote in your initial post you would have realized your question “How can you have several squared deviations of mean? Square it and it is just (mean)(mean) …?” is incorrect. The highlighted segment does not say “squared deviations of mean” it says “several squared deviations FROM that mean”  In mathematical terms it is saying (x(i) – mean)*(x(i) – mean) where x(i) is an individual measurement. The difference between the previous squared expression and what you wrote is all the difference in the world.
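To make the distinction concrete, here is a small sketch with invented numbers showing the squared deviations from the mean next to the square of the mean itself.

```python
from statistics import mean, variance

x = [4.0, 7.0, 5.0, 8.0]          # hypothetical measurements
x_bar = mean(x)

# Squared deviations FROM the mean: one value per measurement.
squared_devs = [(xi - x_bar) ** 2 for xi in x]

# Sample variance: sum of squared deviations divided by (count - 1).
sample_var = sum(squared_devs) / (len(x) - 1)

# What the question described instead: the mean squared, a single number.
mean_squared = x_bar * x_bar

print(squared_devs, sample_var, mean_squared)
```

The hand calculation matches what `statistics.variance` returns, and neither has anything to do with (mean)(mean).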

0
#245896

Robert Butler
Participant

The phrasing of several of the questions is terrible…and there is one that is simply wrong.  Now that we have that out of the way my thoughts on the questions are as follows:

Question 46: I would choose B.  When you lay a t-square over the plot there is no way the minimum at 6 is anywhere near 100. If they are claiming D is the correct answer then I don’t see how that’s possible.  “Between 2-4” means anything between those two numbers so you would look at the values from 2.01 to 3.99 and, depending on which line you choose (regression line, 95% CI on means, 95% CI on individuals) you can get a 20-40 range.

As an aside- if you are going to ask questions like this you owe it to the people taking the test to provide a graph that has enough detail to permit an assessment of questions such as this one – If I ever made a presentation where the required level of detail with respect to output was 10 units my investigators would be really upset if I gave them a graph with a Y axis in increments of 50.

Question 49: I would choose D.  If the residuals can be described by a linear regression then the residual pattern is telling you that you have some unknown/uncontrolled variable that is impacting your process in a non-random fashion.  The good books on regression analysis will typically show examples of residual plots where the plot is telling you that you have something else you need to consider.  The three most common examples are residuals forming a straight line, residuals forming a curvilinear line, and residuals forming a < or a > pattern.

Question 52: Of the group of choices C is correct but the question is terrible – I’ve never had a case where a manager asked me for the average cost of an item rather the issue was always the cost per item which means you calculate the prediction error based on the CI for individuals.  What they are doing is using the CI on means.  Thus for 3 standard deviations (the industry norm) you would have 4060 + 3*98/sqrt(35) = 4060 + 49.7 = 4109.7.  Your choice is wrong because the issue is not the average cost, rather it is the maximum average cost and a maximum average of 4200 means you are going to have a fair percentage of the average product costs in excess of 4200 – how much would depend on the distribution of the cost means relative to some grand mean.
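The arithmetic for Question 52 can be checked with a few lines of Python. The numbers come from the calculation above; the variable names are mine. The last line is only a rough contrast for individual items (mean plus 3 standard deviations) and ignores the uncertainty in the mean itself.

```python
from math import sqrt

mean_cost = 4060.0   # sample mean cost
sd = 98.0            # sample standard deviation
n = 35               # sample size
k = 3                # number of standard deviations

# Limit based on the CI on means: uses the standard error sd / sqrt(n).
std_err = sd / sqrt(n)
upper_limit_on_mean = mean_cost + k * std_err

# Rough limit for an individual item: uses sd itself.
upper_limit_individual = mean_cost + k * sd

print(round(upper_limit_on_mean, 1), upper_limit_individual)
```

The gap between the two limits is the practical difference between asking about the average cost and asking about the cost per item.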

Question 152: I agree, the answer is B.  Please understand that what I’m about to say is not meant to be condescending.  You need to carefully re-read the question – RTY is a function of …..What?

Question 184:  Your equations shouldn’t have the sqrt of 360 in them – the problem states the process standard deviation is 2.2 therefore that is not s it is sigma.  The issue is this – for the first equation the Z value is 1 and for the second it is -.681.  If you look at a Z table you will find for Z = 1 the area under the curve coming in from -infinity is .8413. If you subtract this value from 1 you get the estimate for the percentage of the production that are in excess of 8.4 and that is 15.87% and 15.87% of 360 = 57.13 which rounds down to their answer.

However, there is the issue of the product outside the lower bound. This is determined by your second equation where Z = -.681. The area under the curve coming in from – infinity to -.681 is .2483. This means the total percentage of off spec (above the upper limit and below the lower limit) is .1587 + .2483 = .4070 which means the total off spec is .4070 x 360 = 146.
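If you don’t have a Z table handy, the same areas can be computed with Python’s standard library. This sketch uses only the two Z values and the production count quoted above. The lower-tail area comes out a hair smaller than the table value because a table rounds -.681 to -.68, but the off-spec count still rounds to 146.

```python
from statistics import NormalDist

std_normal = NormalDist()      # mean 0, standard deviation 1
units = 360

z_upper = 1.0                  # Z for the upper spec limit
z_lower = -0.681               # Z for the lower spec limit

frac_above = 1 - std_normal.cdf(z_upper)   # area above the upper limit
frac_below = std_normal.cdf(z_lower)       # area below the lower limit
frac_off_spec = frac_above + frac_below

print(f"above upper limit: {frac_above:.4f} -> {frac_above * units:.1f} units")
print(f"below lower limit: {frac_below:.4f} -> {frac_below * units:.1f} units")
print(f"total off spec:    {frac_off_spec:.4f} -> {frac_off_spec * units:.0f} units")
```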

Question 209:  The null hypothesis is that the mean is less than or equal to \$4200 therefore the alternate hypothesis is that the mean is greater than \$4200.  All of the other verbiage in the problem has nothing to do with the statement of the null and alternative hypothesis so I would agree with E.

1
#245682

Robert Butler
Participant

The question you are asking is one of binary proportions. Since you have a sample probability for failure (p = r/n where r = number of failures and n = total number of trials) in the one environment you also have a measure of the standard deviation of that proportion  = sqrt([p*(1-p)]/n).  You also have a measure of the proportion of failures in a different environment where you took 40 samples and found 0 failures.  You can use this information to compare the two proportions as well as determine the number of samples needed to make statements concerning the odds that the failure rate in the new environment is zero and the degree of certainty that the failure rate is 0.
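A minimal sketch of the quantities mentioned above. The failure counts for the first environment are hypothetical. The zero-failure bound is the standard exact binomial argument (if the true failure rate were p, the chance of seeing 0 failures in n trials is (1 - p)^n); it is my illustration and not a substitute for the treatment in the books listed below.

```python
from math import ceil, log, sqrt

def proportion_sd(r, n):
    """Sample proportion p = r/n and its standard deviation sqrt(p(1-p)/n)."""
    p = r / n
    return p, sqrt(p * (1 - p) / n)

def zero_failure_upper_bound(n, confidence=0.95):
    """Largest failure rate still consistent, at the given one-sided
    confidence, with observing 0 failures in n trials."""
    return 1 - (1 - confidence) ** (1 / n)

def trials_needed(max_rate, confidence=0.95):
    """Smallest n for which 0 failures in n trials rules out failure
    rates above max_rate at the given one-sided confidence."""
    return ceil(log(1 - confidence) / log(1 - max_rate))

p1, sd1 = proportion_sd(r=6, n=50)        # hypothetical first environment
bound_40 = zero_failure_upper_bound(40)   # 0 failures in 40 samples
n_for_1pct = trials_needed(0.01)

print(f"p = {p1:.2f}, sd = {sd1:.4f}")
print(f"0 failures in 40: failure rate could still be as high as {bound_40:.3f}")
print(f"failure-free trials needed to rule out rates above 1%: {n_for_1pct}")
```

The point of the last two lines is that 0 failures in 40 samples does not show the failure rate is 0; it only puts an upper limit on it.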

Rather than try to provide an adequate summary of the mathematics needed to do this I would recommend you look up the subject of proportion comparison and sample size calculations for estimates of differences in proportions.  I can’t point to anything on the web but I can recommend two books that cover the subject – you should be able to get both of these through inter-library loan.

1. An Introduction to Medical Statistics – 3rd Edition – Bland

2. Statistical Methods for Rates and Proportions – 3rd Edition – Fleiss, Levin, Paik

0
#245559

Robert Butler
Participant

@cseider – …could be… Happy New Year to you too.   :-)

0
#245555

Robert Butler
Participant

The original focus of this exchange was the issue of the validity of the underpinnings of the T-chart.  Your conjecture is that you have developed a theory and this theory is correct and the, apparently to you, unknown theory behind the T-chart is wrong.  Your approach to “proving” this is to

1. Ask for a bunch of MBB’s to provide a solution to problem(s) you have posed with the understood assumption that if they don’t exactly reproduce your results then they are wrong.

2. Demand that the folks at Minitab provide the underlying theory of the T chart and along the way prove you are wrong.

As I said in an earlier post to this exchange – extraordinary  claims require extraordinary proof and the burden of the proof is on the individual making the claims not on the people against whom the claims are made. I sincerely doubt you will hear anything from Minitab for the simple reason that, to the best of their knowledge, (as well as to the best of the knowledge of anyone who might be using their T-chart program for process control) what they have is just fine.

It really doesn’t matter what you think about whatever theory you have built nor does it matter that you think you have done whatever it is that you have done correctly.  The issue is this – do you have a “reasonable” amount of actual data where you can conclusively show the following:

1. Specific instances where the Minitab T-chart control declared a process to be in control only to find that it really wasn’t in control and when your approach was applied to the same data, your approach identified the out-of-control situation.

2. Specific instances where the Minitab T-chart control declared a process in control, it was found to be in control, and when the data was analyzed with your approach you too found it to be in control.

3. Instances where your approach declared a process to be out-of-control only to find later it was not out-of-control and, when the data was re-analyzed with the Minitab T-chart, their methods identified the process as in control.

4. Instances where your approach declared the process to be in control only to find it was out-of-control and when checked with the Minitab approach the out-of-control situation was correctly identified.

5. …and, once everything is tallied – what kind of results do you have – a meaningful improvement in correct identification of out-of-control situations when dealing with processes needing a T-chart (while simultaneously guarding against increases in false positives) or just some kind of change that, in the long run, is at best, no better than what is currently in use.

You have made a big point about citing some of Deming’s statements concerning theory.  Although I can’t recall a specific quote, one big point he made was the need for real data before you did anything else.  His book Quality, Productivity, and the Competitive Position is essentially a monument to that point.  What you have presented on this forum is a theory bereft of evidence – just claiming everyone else is wrong because it violates your personal theory is not evidence.  The kind of evidence you need is as listed in the five points above.  If you have that kind of evidence and if what you have appears to be a genuine improvement, then, as I’ve said before, the proper venue for presentation is a peer reviewed journal that deals in such matters.

In this light, your rebuttal concerning “thousands” is without merit.  The point of this thread was that the Minitab T-chart method was wrong and nothing more and it was in this context that you made the claim that “thousands” of MBB’s were wrong.  The “proof” you provided in your most recent thread is based on a claim that the entire Minitab analysis package is wrong. In addition, in an attempt to inflate the “thousands” estimate, you dragged in poor old Montgomery and declared his book has BIG errors.  Statements such as these are not proof – they are just simple gainsaying.

1.  Your rejoinder concerning the Riemann Hypothesis is standard misdirection boilerplate – the fact remains that if your proof had had any merit it would have made the pages of at least one of the journals on mathematics.

2. How many people have I seen who have admitted a mistake in a publication?  Quite a few – you can find corrections and outright retractions with respect to papers in peer reviewed journals if you take the time to go looking for them.  One good aspect of the peer review process is that the vast majority of mistakes are found, and corrected, before the paper sees publication.

I happen to know this is true because I do peer reviewing for the statistics sections of papers submitted to one of the scientific journals. I can’t tell you the number of mistakes in statistical analysis I have found in submitted papers.  As part of the review process what I do is point out the mistake(s), provide sufficient citations/instructions concerning the correct method to be used, and request the authors re-run the analysis in the manner indicated.  The length of the written recommendations varies.  The longest I think I’ve ever written ran to almost two full pages of text which were accompanied by attachments in the form of scans of specific pages in some statistical texts.  In that instance the authors re-ran the analysis as requested, were able to adjust the reported findings (most of the significant findings did not need to be changed, a couple needed to be dropped, and, most importantly, they found a few things they didn’t know they had which made the paper even better), addressed the concerns of the other reviewers and their paper was accepted for publication.

3.  In my career as a statistician, more than 20 of those years were spent as an engineering statistician in industry supporting industrial research, quality improvement, process control, exploratory research, and much more.  I too did not have time to send articles in for peer review and I too made presentations at conferences.  Some of those presentations were picked up by various peer reviewed journals and, after some re-writing for purposes of condensation/clarification saw the light of day as a published paper. It was because of my experience that I asked about yours.  I thought the question was particularly relevant in light of your position concerning your certainty about the correctness of your efforts.

4. As for reading your archived paper on peer review – no thanks.  I seem to recall when I checked you had uploaded more than one paper on the subject of peer review to that site and, if memory serves me correctly, all of them had titles suggesting the text was going to be nothing more than a running complaint about the process.

The peer review process is not perfect and can be very stressful and frustrating – I know this from personal experience.  Sometimes the intransigence of the reviewer is enough to drive a person to thoughts of giving up their profession and becoming an itinerant  beachcomber.  However, based on everything I’ve experienced, the process has far more pluses than minuses and, given the inherent restrictions of time/money/effort of the process, I have yet to see anything that would be a marked improvement.

0
#245538

Robert Butler
Participant

Oh well, in for a penny in for a pound.

@fausto

I said the following:

/////

1. A quick look at the file you have attached strongly suggests you are trying to market your consulting services – this is a violation of the TOS of this site.

2. If you want to challenge the authors of the books/papers you have cited, the proper approach would be for you to submit your efforts to a peer reviewed journal specializing in process/quality control.

/////

To which you responded:

/////

Dear Robert Butler,

NO CONSULTING SERVICES TO BE SOLD!

I am looking for solutions provided by the Master Black Belts (professionals of SIX SIGMA).

I do not want to challenge the authors!!!

THEY did not provide the solution!!!!!!!!!!!!!!

I repeat:

I am looking for solutions provided by the Master Black Belts (professionals of SIX SIGMA).

/////

…and what followed was a series of posts that demonstrated you were indeed challenging authors and you were not looking for solutions since, by your own admission, you had already solved them and you thought they were wrong and you were right.

A couple of asides:

1. You say “Since then THOUSANDS of MASTER Black Belts have been unable to solve the cases.”

a. And you determined this count of “Thousands” how?

b. Why would the fact that “Thousands” of MBB’s have been unable to solve the cases in the way you think they should solve it matter?

c. Given you are challenging the approach of various authors of books and papers, what makes you think your version is correct?

d. I only ask “c” because one can find out on the web your proof of the “so-called Riemann Hypothesis” – your words – submitted to some general non-peer reviewed archive on 5 October 2018 which was incorrect (in a follow up article to the same archive you said it was “a very stupid error”). It would appear, even with this correction the proof was still wrong because, according to Science News for 24 May 2019, the proof of the Riemann Hypothesis (or conjecture) is still in doubt.  The article states “Ken Ono and colleagues” are working on an approach and a summary of their most recent efforts can be found in the Proceedings of the National Academy of Sciences.

2. It’s nice that Juran praised one of your papers at a symposium – the big question is this – which peer reviewed journal published the work?

0
#245473

Robert Butler
Participant

What an IMPRESSIVE retort!!!  @fausto.galetto DO YOU appreciate the INEFFABLE TWADDLE of this entire thread (see, I can yell, insult, and boldface type too) ?

Let’s do a recap:

1. OP initial post says he wants a solution for two items in an attached document.

2. I do a quick skim and offer the comment that I think the correct venue for the OP would be a peer reviewed paper since it certainly looked like a challenge to the authors/papers cited.

3. The OP responds and assures me this is not the case – all he wants is for some generic six sigma master black belts to provide a solution. Given what follows later it is obvious he doesn’t want a solution – just confirmation of his views.

4. @Darth offers a suggestion and a recommendation to try running Minitab – and in the following OP post @Darth gets slapped down for his suggestions.

5. Reluctantly, the OP announces he has downloaded Minitab “BECAUSE NOBODY tird to solve the cases”

6. The OP then lays into Minitab on this forum because “MInitab does not provide the Theory for T Chart….

BETTER I DID NOT FIND in Minitab the THEORY!!! IF someone knows it PLEASE provide INFORMATION”

7. @Darth tries again with a Minitab analysis and provides a graph of the results – the OP slaps him down again.

8. The OP shifts gears – suddenly, somehow, the OP knows “MINITAB makes a WRONG analysis: the Process is OUT OF CONTROL!!!!!!!!!!!!!!!!!! I asked to Minitab the THEORY of T Charts!!!!!”

9. The OP then slams me for stating he hasn’t tried to solve the cases. OK, fine, I’ll assume he did try. I probably should have said the OP hadn’t tried to do any research with respect to finding the theory of T control charts or really understanding anything about them other than insist Minitab drop everything and get back to him with the demanded information pronto! – but this really doesn’t matter because, as one can see in the other posts, the OP KNOWS Minitab is wrong because he has a proof based on THEORY!  What theory he doesn’t say but apparently it doesn’t matter since this THEORY is better than the THEORY of T Charts – even though, by his own admission, he doesn’t know what that theory is.

Given the bombast I decided to see what I could find.  I couldn’t find a single peer reviewed paper by the OP listed in either Pubmed or Jstor.  Granted there are other venues but these two happen to be open to the public and are reasonably extensive.

I did find a list of papers the OP has presented/published at various times and I found some vanity press (SPS) – publish on demand books by the OP – of the group I found the title of his book  “The Six Sigma HOAX versus the Golden Integral Quality Approach LEGACY”  interesting because the title, as it appears on the illustrated book cover on Amazon, is written in exactly the same manner as the OP’s postings.  Given the book title I find it odd that the OP would ask anyone on a Six Sigma site for help. After all, with a title like that one would assume the OP views practitioners of Six Sigma as little more than frauds and con artists.

In summary – the OP believes he has found Minitab to be in error and he wants some generic master black belts to confirm his belief.  I do know the folks at Minitab really know statistics and process control. I also know many of the statisticians at Minitab have numerous papers in the peer-reviewed press on statistics and process control and have made presentations at more technical meetings than I could list.  Under those circumstances, if the OP really thinks he is right and Minitab is wrong then the OP first needs to take the time to find, and thoroughly understand, the theory behind the T chart – this would require actually researching the topic – books, peer reviewed papers, etc.

In science, extraordinary claims require extraordinary proof and the burden of the proof is on the challenger, not on those being challenged.  If, after doing some extensive research,  the OP is still convinced he is right then the proper venue for a challenge of this kind is publication in a peer reviewed journal that addresses issues of this sort.

2
#245469

Robert Butler
Participant

There are two equations, one for alpha and one for beta.

What you need to define is what constitutes a critical difference between any two populations (max value, min value)

What you have is the sample size per population (8) and for the conditions you have specified you have -1.645 (for the alpha = .05) and +1.282 ( for beta = .1).

the alpha equation is  (Yc – Max value)/(2/sqrt(n)) = -1.645

the beta equation is (Yc – Min value)/(2/sqrt(n)) = 1.282

Subtract the alpha equation from the beta equation (this eliminates Yc) and solve for n.

Obviously you can turn these equations around – plug in n = 8 and solve for the beta value and look it up on a cumulative standardized normal distribution function table and see what you have.
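The two calculations can be sketched as follows. Subtracting the alpha equation from the beta equation gives (Max value - Min value)/(2/sqrt(n)) = 1.282 + 1.645, so n = ((1.282 + 1.645) * 2 / (Max value - Min value))^2. In the sketch I treat the 2 in the denominators as the standard deviation and expose it as a parameter `sd`; the critical difference of 2 units used in the example is purely hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()
z_alpha = std_normal.inv_cdf(0.05)   # about -1.645 (alpha = .05)
z_beta = std_normal.inv_cdf(0.90)    # about +1.282 (beta = .1)

def sample_size(critical_difference, sd=2.0):
    """n per group from subtracting the alpha equation from the beta equation.
    critical_difference = Max value - Min value."""
    n = ((z_beta - z_alpha) * sd / critical_difference) ** 2
    return ceil(n)

def power_for_n(n, critical_difference, sd=2.0):
    """Turn the equations around: fix n, solve for the beta-side Z value,
    and look up its cumulative probability (this is 1 - beta)."""
    z = critical_difference * sqrt(n) / sd + z_alpha
    return std_normal.cdf(z)

n_needed = sample_size(critical_difference=2.0)
power_at_8 = power_for_n(8, critical_difference=2.0)
print(n_needed, round(power_at_8, 3))
```

With these made-up inputs the required n is 9 per group, and plugging in n = 8 gives a power of roughly .88 (beta of about .12) instead of the targeted .90.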

A point of order:  As I mentioned earlier ANOVA is robust with respect to non-normality. Question: with 8 samples how did you determine the samples were really non-normal?  If you used one of the many mathematical tests for non-normality – don’t.  Those tests are so sensitive to non-normality that you can get significant non-normality declarations from them when using data that has been generated by a random number generator with an underlying normal distribution.  What you want to do is plot the values on probability paper and look at the plot – use the “fat pencil test” – a visual check to see how well the data points form a straight line.

If you are interested in seeing just how odd various sized random samples from a normal distribution look I would recommend you get a copy of Daniel and Wood’s book Fitting Equations to Data through inter-library loan and look at the plots they provide on pages 34-43 (in second edition – page numbers may differ for later editions)
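If you can’t get hold of the book right away, a rough substitute is to generate a few random normal samples of size 8 yourself and list the points you would put on probability paper. The sketch below uses the (i - 0.5)/n plotting position, which is one common convention among several, and the seed and the number of samples are arbitrary choices of mine.

```python
import random
from statistics import NormalDist

def normal_plot_points(sample):
    """Pair each sorted value with its expected standard normal score,
    using the (i - 0.5)/n plotting position."""
    n = len(sample)
    std_normal = NormalDist()
    scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(scores, sorted(sample)))

random.seed(1)  # arbitrary seed so the run is repeatable
for trial in range(3):
    sample = [random.gauss(0, 1) for _ in range(8)]
    print(f"sample {trial + 1}")
    for score, value in normal_plot_points(sample):
        print(f"  {score:6.2f}  {value:6.2f}")
```

Plot value against score for each sample and apply the fat pencil test. Even though every sample here really is normal, with only 8 points the plots can look surprisingly ragged.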

1
#245009

Robert Butler
Participant

Your post is confusing.  What I think you are saying is the following:

You have 13 groups and each group has 8 samples.

The data is non-normal

You checked the data using box-plots and didn’t see any outliers

You chose the Kruskal-Wallis test because you had more than 2 groups to compare.

You ran this test and found a significant p-value.

If the above is correct then there are a few things to consider:

1. You don’t detect outliers using a boxplot. The term outlier with respect to a boxplot is not the same thing as a data point whose location relative to the main distribution is suspect.

2. ANOVA is robust to non-normality – run ANOVA and see if you get the same thing – namely a significant p-value.  If you do get a significant p-value then the two tests are telling you the same thing.

The issue you are left with – regardless of the chosen test is this – which group or perhaps groups are significantly different from the rest. In ANOVA you can run a comparison of everything against everything and use the Tukey-Kramer adjustment to correct for multiple comparisons.  In the case of the K-W you would use Dunn’s test for the multiple comparisons to determine the specific group differences.

If you don’t have access to Dunn’s test you will have to run a sequence of Wilcoxon-Mann-Whitney tests and use a Bonferroni correction for the multiple comparisons.
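To give a sense of what the Bonferroni correction means with 13 groups, here is a short sketch that counts the pairwise comparisons and computes the per-comparison significance level. The overall alpha of .05 is an assumption on my part.

```python
from itertools import combinations
from math import comb

groups = 13
overall_alpha = 0.05                      # assumed overall significance level

n_pairs = comb(groups, 2)                 # every group against every other group
per_test_alpha = overall_alpha / n_pairs  # Bonferroni-adjusted cutoff

# The list of (group, group) pairs you would feed to the pairwise tests.
pairs = list(combinations(range(1, groups + 1), 2))

print(n_pairs, f"{per_test_alpha:.6f}", pairs[:3])
```

That is 78 Wilcoxon-Mann-Whitney tests, each judged against a cutoff of about .00064, which is a very stringent bar with only 8 samples per group.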

0
#244938

Robert Butler
Participant

I agree nobody, including yourself, tried to solve the cases.  As I’m sure others will tell you, the usual procedure on this site is for you to try to solve the problem, post a reasonable summary of your efforts on a thread and then ask for help/suggestions.

@Darth pointed you in a direction and gave you a reference.  Given what he offered your approach should have been to get a copy of the book referenced, learn about G and T charts, and do the work manually. Agreed, it would have taken more time than if you had a program handy but by doing things manually (at least once) you would have learned a great deal and also solved your problem.

0
#244910

Robert Butler
Participant

Based on your post it sounds like you have no idea where your process is or what to expect.  If that is the case then the first thing to do is just run some samples and see what you see – how many to run will be a function of the time/money/effort you are permitted to use for what will amount to a pilot study whose sole purpose will be to let you know where your process is at the moment and, assuming all is well, an initial estimate of the kind of variation you can expect under conditions of ordinary variation.

0
#244899

Robert Butler
Participant

During the time I provided statistical support to R&D groups my method of support went something like this:

1. Have the researchers fill me in on what they were trying to do.

2. Walk me through their initial efforts with respect to quick checks of concepts, ideas – this would usually include the description of the  problem as  well as providing me with their initial data.

3. Ask them, based on what they had uncovered, what they thought the key variables might be with respect to doing whatever it was they were trying to do.

4. Take their data, and my notes and look at what they had given me and using graphing and simple regression summaries see if what they had told me was supported in some ways by my analysis of their data.

5. If everything seemed to be in order I would then recommend building a basic main effects design.  As part of that effort I would ask them for their opinions on combinations of variables they thought would give them what they were looking for.  I would make sure the design space covered the levels of the “sure fire” experiments and, if they were few enough in number (and in vast majority of cases they were) I would include those experiments as additional points in the design matrix.

6. Once we ran the experiments I would run the analysis with and without the “sure fire” experiments and report out my findings.  Part of the reporting would include using the basic predictive equations I had developed to run “what if” analysis looking for the combination of variables that gave the best trade off with respect to the various desired outcomes (in most cases there were at least 10 outcomes of interest and the odds of hitting the “best” for all of them was very small).  Usually, we would have a meeting with all of the outcome equations in a program and my engineers/scientists would ask for predictions based on a number of independent variable combinations.

7. If it looked like the desired optimum was inside the design box we would test some of the predictions by running the experimental combinations that had been identified as providing the best trade-off.

8. If it looked like the desired optimum was outside the design box we would use the predictive equations to identify the directions we should be looking.  In those cases when it looked like what we wanted was outside of where we had looked my engineers/scientists would want to think about the findings and, usually with my involvement, run a small series of one-at-a-time experimental checks.  Often I was able to take these one-at-a-time experiments and combine them with the design effort just to see what happened to the predictive equations.

9. If they were satisfied that the design really was pointing in the right direction we would usually go back to #5 and try again.

10.  If the results of the analysis of the DOE didn’t turn up much of anything it was disappointing but, since we had run a design which happened to include the “sure fire” experiments we would discuss what we had done, and if no one had anything to add it meant the engineers/scientists had to go back and re-think things. However, they were going back with the knowledge that the odds were high that something other than what they had included in the design was the key to the problem and that it was their job to try to figure out what it/they were.

Point #10 isn’t often mentioned but, in my experience it is often a game changer.  The people working on a research problem are good – the company wouldn’t have hired them if they thought otherwise.  Because they are good and because they have had successes in the past the situation will sometimes occur where the research group gets split into factions of the kind we-know-what-we-are-doing-and-you-don’t.  There’s nothing like a near total failure to get everyone back on the same team – especially when all of the “known” “sure-fire” experiments have been run and found wanting.

2
#244849

Robert Butler
Participant

A quick Google search indicates push, pull, and conwip are all variations on a theme – basically they focus on having work in progress throughout the line (CONWIP = Constant Work in Progress).  The same Google search indicates line balancing is just the practice of dividing the production line tasks into what are called equal portions with the idea that labor idle time is reduced to a minimum. Based on this quick search it would seem the issue is which flavor of production control do you want to emphasize rather than wondering about integrating one with the other because all of them appear to have the same goal – minimum worker idle time and maximum throughput.

0
#244538

Robert Butler
Participant

All of the cases I’m aware of that have run simulations of the results of an experimental design have had actual response measurements from which to extrapolate.  If the simulation is not grounded in fact then all your simulation is going to be is some very expensive science fiction.

A few questions/observations and an approach to assessing what you have that might be of some value.

1.       You said you identified the critical factors using a number of tools.  In order for those tools to work you had to have actual measured responses to go along with them.

a.       What kind of factors – continuous, categorical, nominal? A mix of all three? Or, what?

b.       Apparently, you have some sort of measure or types of measures you view as correlating with record accuracy. What kind of measurements are these – simple binary – correct/incorrect? Some kind of ordered scale – correct, incorrect but no worries, incorrect but may or may not matter, incorrect and some worries, seriously incorrect? Or is the accuracy measure some kind of percentage or other measure that would meet the traditional definition of a continuous variable?

2.       You said running an experimental design would entail cost.  All experimental efforts cost money. The question you need to ask is this:  Given that running a design would cost money and given that I have taken the time to generate a reasonable estimate of this cost – what is the trade-off?  Specifically, if I were to spend this money, how much could I expect to gain if I identified a way to increase accuracy by X percent?  We are, of course, assuming the accuracy issue is physically/financially meaningful and one where the change in percent would also be physically/financially meaningful.   I make note of this because if you are at some high level of accuracy like 99% then you can only hope for some fraction of a percent change and the question would be: what would a tiny fraction of a change in the final 1% translate into with respect to dollars saved/gained?

If you refuse to run even a simple main effects design (you should note that one does not need to interrupt the process in order to do this) then you are left with the happenstance data you gathered and tested using the tools you mentioned in your initial post.

In this case you could do the following: Take the block of data you gathered and check the matrix of critical factors you have identified (the X’s) for acceptable degrees of independence from one another.  The usual way to do this is to use eigenvalues and their associated condition indices and run a backward elimination on the X’s, dropping the X with the highest condition index and then re-running the analysis on the reduced X matrix.  You would continue this process until you have a group of X’s that are “sufficiently” independent of one another within that block of data. An often-used criterion for this measure of sufficiency is having all remaining X variables with a condition index < 10.  If you are interested, the book Regression Diagnostics by Belsley, Kuh, and Welsch has the details.
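For anyone who wants to try the condition index check, the following is a minimal Python sketch. It assumes NumPy is available and uses the column scaling described by Belsley, Kuh, and Welsch (each column, including the intercept column, scaled to unit length without centering). The function name `condition_indices` and the two small data sets are made up for illustration. The sketch only computes the indices; deciding which X is responsible for a large index takes a further step (BKW use variance-decomposition proportions) that is not shown here.

```python
import numpy as np

def condition_indices(X):
    """Condition indices of X after scaling every column to unit length."""
    X = np.asarray(X, dtype=float)
    scaled = X / np.linalg.norm(X, axis=0)          # unit-length columns, no centering
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    return singular_values.max() / singular_values  # largest / each singular value

# Illustrative data only: an orthogonal two-level design (intercept, X1, X2) ...
design = np.array([[1, -1, -1],
                   [1,  1, -1],
                   [1, -1,  1],
                   [1,  1,  1]])

# ... and a block of happenstance data in which X2 nearly duplicates X1.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([1.01, 2.0, 3.02, 3.99, 5.0, 6.01])
happenstance = np.column_stack([np.ones(6), x1, x2])

print(condition_indices(design))
print(condition_indices(happenstance))
```

The designed matrix should give indices of about 1 across the board, while the happenstance block should show at least one index far above the cutoff of 10 mentioned above.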

What this buys you is the following: you will know which X’s WITHIN the block of data you are using are sufficiently independent of one another.  What you WON’T KNOW and can never know is the confounding of these variables with any process variables not included in the block of data and the confounding of these variables with unknown variables that might have been changing when the data you are using was gathered.  You will also need to remember your reduced list of critical factors will likely fail to include variables you know to be important to your process.  This failure will probably be due to the fact that variables known to be important to a process are being controlled which means the variability they could contribute to the block of data you have gathered is not significant – in short you will have a case of causation with no correlation.

Keeping in mind these caveats, you can take your existing block of data and build a multivariable model using those critical factors you have identified as being sufficiently independent of one another. Run a backward elimination regression on the outcome measures using this subgroup of factors to generate your reduced model. Test the reduced model in the usual fashion (residual analysis, goodness of fit – and make sure you do this by examining residual plots – Regression Analysis by Example by Chatterjee and Price has the details). Take your reduced model, apply it to your process and see what you get.  Before you attempt to run the process using your equation, you will want to make sure the signs of the coefficients make physical sense and you will need to make sure everyone understands the model is based on happenstance data and failure of the model to actually identify better practices is to be expected.

0
#244328

Robert Butler
Participant

1. A quick look at the file you have attached strongly suggests you are trying to market your consulting services – this is a violation of the TOS of this site.

2. If you want to challenge the authors of the books/papers you have cited, the proper approach would be for you to submit your efforts to a peer reviewed journal specializing in process/quality control.

0
#244140

Robert Butler
Participant

I think you will need to provide more information before anyone can offer much in the way of suggestions.

1. You said, “The data we have received are sales numbers of 2016-2017-2018 accompanied with purchase number and of course the Sales price and purchase price.”

2. You also said, “We have to come up with a demand -forecasting model + re-order point for a company.”

Question: What is the level of detail of the sales numbers?  Monthly/weekly summaries? Actual sale-by-sale reporting for each of the three items and the day they were sold? Or…something else?

Question: What are the criteria for the need to re-order? X number of items 1,2,3 in non-working inventory, JIT delivery with a known lag time for meeting JIT requirements? Or…something else?

Question: Regardless of the level of detail, since it is time series data have you plotted the data against time?  If not, why not? If you have, what did you see?  Demand cycling? Steady state sales? etc.

If you can provide answers to these questions perhaps I or someone else might be able to offer some advice.

0
#243809

Robert Butler
Participant

In re-reading your comments it occurred to me there is an often overlooked but critical aspect to the issue of variable assessment using the methods of experimental design which should be mentioned.

In my experience, the worry that one might miss an important interaction when making choices concerning experiments to be included in a design is very often countered by the kind of people for whom the design was run. You should remember, for the most part, the people you are working with are the best your company/organization could get their hands on and they are very good at what they do.  When you give people of this caliber designed data you are presenting them with the following:

1. Because the data is design data they are guaranteed that the correlations between the variables and the outcomes actually reflect correlations which are independent of one another.  This means they do not have to worry about the inter-correlation relationships between the independent variables because they do not exist.

2. Because the data is design data they do not have to worry about some hidden/lurking/unknown variable showing up in the guise of one of the variables that were included in the study (you did randomize the runs – right?).

3. Because the data is design data if a variable does not exhibit a significant correlation with an outcome they are guaranteed, for the range of the variable examined, the odds of that variable mattering are very low.

Given the above, and the insight these people have with respect to whatever process is under consideration, what you have is a variation on the saying about card playing – “a peek is as good as a trump.”  In other words the results of the analysis of this kind of data coupled with the knowledge your people bring to the table will very often make up for any shortcomings that were, of necessity, built into the original design.

What this means is if there is an interaction that was overlooked in the design construction their prior knowledge, coupled with design data, will often result in the discovery of the untested interaction.  This kind of a finding will usually occur during the discussion of the design analysis when the results of the analysis get paired with their knowledge about process performance.  If and when this happens it has often been the case that I was able to take the original design matrix and augment the runs with a couple of additional experiments. These experiments, when added to what was already run, allowed an investigation of the original terms along with an investigation of the “overlooked” interaction.

Note: If you run an augmentation to an existing design you will want to include one or two design points from the original as a check to make sure things haven’t changed substantially since the time of the initial design run.

0
#243712

Robert Butler
Participant

I’m not trying to be mean spirited or snarky but it’s called “experimental” design and not “infallible” design for a reason.  :-) Seriously, this is a standard problem and one every engineer/scientist/MD faces.  Your choices of variables to examine are, of necessity, based on what you know or think you know about a process and the number of experiments you run is usually determined by factors such as time/money/effort required.

The approach I use when building designs for my engineers/scientists/MD’s etc. is to first spend time looking over what data they might have (this is usually a sit down meeting) with respect to the issue at hand and discuss not only what we can see from plots and simple univariate analysis of the data but also what they, as professionals, think might need addressing.

In the majority of cases the final first design I build for them will be a modified screen with enough runs added to allow for a check for specific two-way interactions their experience suggests might be present.  In addition, if they think one or more of the variables might result in curvilinear behavior I’ll make sure to include enough design points to check that as well.

Oh yes, one additional item – from time-to-time someone (or a group of someones) will have in mind an experiment that they are just positive is really the answer to the problem.  Usually these “sure-fire” experiments are few in number (the sure-fire counts of 1 and 2 are quite popular) so if they fall inside the variable matrix I will add them to the list of experiments to be run as well (and if they don’t fall inside then my first question is: Why haven’t we extended the ranges of the variables of interest to make sure we have covered the region of the sure-fire experiments?).

As for building a modified screen of this type my usual procedure is to use D-optimal methods to identify the experimental runs.  What you need to do is build a full matrix for all mains, two-way, and curvilinear effects, force the machine to accept a highly fractionated design for the mains, and then tell the machine to augment this design with the minimum number of points needed for the specific two-way and curvilinear effects of interest.  After you have this design then you can toss in the sure-fire points (if they exist) and you will be ready to start experimenting.

• This reply was modified 10 months, 4 weeks ago by Robert Butler. Reason: typo
0
#243697

Robert Butler
Participant

Since it sounds like the issue is that of identifying the “odd” line plot and given that the line plot you have shown is typical it would seem that a better approach would be to use the data you have to identify what looks like a “normal” envelope and then flag any curve that falls outside of that envelope.

Just looking at the graph it would appear you have the ability to identify an acceptable range of starting points (minimum to maximum X)  and if you take a slice through the thickest part of the plot where the graphs roll over it looks like you can define a range of acceptability there as well (minimum vs maximum Y).

A way to check this would be to take existing data – identify the unacceptable curves and see where they fall relative to what I assume is the “envelope” of acceptability.

0
#243684

Robert Butler
Participant

My understanding of your post is that you have a large number of histograms which summarize elapsed time measurements.  You have taken these histograms and run a curve fitting routine to try to draw the contour of the histogram.  You have taken these contour plots and overlaid them and are now confronted with trying to determine which curves tend to follow a certain shape.

If the above is a fair description of what you are doing then a much easier approach to your problem would be a series of side-by-side boxplots.  They will give you the same information and you won’t have to worry about colors because all of the groups of data will be present.  Once you have these plots side-by-side it is an easy matter to spot similarities and differences in the boxplot patterns (you will want a boxplot routine that allows an overlay of the actual data points on the boxplot) and quickly assess such things as clustering, means, medians, standard deviations, etc.

Attached is an example to give you some idea of what I’m describing.  The similarities and differences are readily apparent.  For example, the elapsed times for 2 and 5 years look very similar as do those for 3 and 6 whereas 4 stands alone.

###### Attachments:
0
#243640

Robert Butler
Participant

@jazzchuck – that’s very interesting.  I guess I have mixed feelings about automating the dummy variable.  Question:  Does Minitab take the time to explain to you somewhere in the output what it has done for you and why?  If it doesn’t and you try running a regression using nominal variables in a package that doesn’t automatically generate a dummy variable matrix you are really going to go wrong with great assurance.  This might also explain why I have the sense that more people are just dumping nominal variable levels in a regression package, generating stuff that doesn’t make any sense and then telling me in front of a bunch of people how statistical analysis really won’t be of much value in terms of analyzing the problems “we” deal with.  :-)

0
#243631

Robert Butler
Participant

You can run a regression with the variables mentioned.  You need to remember the notion that regression variables (either X’s or Y’s) need to be continuous is just standard issue boiler plate and is overly restrictive.  The reason it is offered in many courses is to make sure you don’t go wrong with great assurance given that all you know about regression is what you learned in some two week course.

For the binary variables you can just code them as -1,1 and go from there.  For the nominal you will have to set up dummy variables.  What this amounts to is picking one of the nominal settings as the reference and then coding the others accordingly.  So, for example, if you had a 3 level nominal variable you would code one of them as (d1 = 0, d2 = 0) and the other two would be coded (d1 = 1, d2 = 0) and (d1 = 0, d2 = 1) respectively and then you would run the regression using d1 and d2 in place of the nominal settings.
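The coding described above can be sketched in a few lines of plain Python. The function names `dummy_code` and `code_binary` and the level labels "A", "B", "C" are hypothetical and only there for illustration; any regression package that accepts numeric columns could take the result.

```python
def dummy_code(levels, reference):
    """Return (column names, rows of 0/1 indicators), one column per non-reference level."""
    others = sorted(set(levels) - {reference})
    rows = [[1 if level == other else 0 for other in others] for level in levels]
    return others, rows

def code_binary(values, high):
    """Code a two-level variable as +1 for the 'high' setting and -1 for the other."""
    return [1 if value == high else -1 for value in values]

# A 3-level nominal variable with "A" chosen as the reference setting.
columns, rows = dummy_code(["A", "B", "C", "A"], reference="A")
print(columns)   # names of the d1, d2 columns
for row in rows:
    print(row)   # the reference level appears as all zeros
```

Which level you pick as the reference does not change the fit, only the interpretation of the coefficients, each of which is the shift relative to the reference setting.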

If you need a more thorough discussion of the subject I would recommend getting a copy of Regression Analysis by Example by Chatterjee and Price through inter-library loan and reviewing the chapter on this subject.

0
#243104

Robert Butler
Participant

I don’t understand what you mean by “simulate”.

If you just want to play games with the variation around a .271 or a .300 batting average over time the easiest thing to do is take a thousand samples – in the first case 271 1’s and 729 0’s and in the second case 300 1’s and 700 0’s.  Put them in two columns with 271 1’s followed by 729 0’s (or the other way around – it doesn’t matter) and the same for the 300 and 700.  Then find a random number generator function out on the web and have it generate a random sequence of integers from 1 to 1000 (sampling without replacement so you only get each integer once). Do this twice and then match the random column one-to-one with the 0’s and 1’s column and then sort the two columns by the random number column so the random numbers are in sequence from 1 to 1000.  This will result in a random draw for the 0’s and 1’s and you can take any sub-sample of any size (1-50,  35-220, 14-700, etc.) and compute the batting averages for those sub-samples and see what you see.
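If you would rather not do this by hand, here is a minimal Python sketch of the same idea using only the standard library. Shuffling the column of 0’s and 1’s is used in place of the sort-by-random-number step, which gives the same kind of random ordering. The function names and the seed values are assumptions made for the example.

```python
import random

def simulate_at_bats(hits, total=1000, seed=None):
    """Return a randomly ordered list of 1's (hits) and 0's (outs)."""
    rng = random.Random(seed)
    outcomes = [1] * hits + [0] * (total - hits)
    rng.shuffle(outcomes)  # same effect as sorting on a column of random numbers
    return outcomes

def batting_average(outcomes, first, last):
    """Average over at-bats first..last (1-based, inclusive)."""
    window = outcomes[first - 1:last]
    return sum(window) / len(window)

hitter_271 = simulate_at_bats(271, seed=1)
hitter_300 = simulate_at_bats(300, seed=2)

# Look at the same sub-samples for both hitters and see what you see.
for first, last in [(1, 50), (35, 220), (14, 700)]:
    print(first, last,
          round(batting_average(hitter_271, first, last), 3),
          round(batting_average(hitter_300, first, last), 3))
```

Over the full 1000 at-bats the averages come back as exactly .271 and .300; the interest is in how much the shorter stretches wander.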

1
#242608

Robert Butler
Participant

The old post cited below should provide some insight into the BOB, WOW issues.

https://www.isixsigma.com/topic/shainin-and-classical-doe-a-comparison/

0
#242322

Robert Butler
Participant

In other words, you have noticed some people are not interested in the results of a careful analysis and prefer to go with their gut reactions.  It’s been this way ever since homo sapiens emerged as a distinct species and, barring some natural selection pressure that favors logic over emotion, there’s not much you can do about it in the short term and, on an evolutionary time scale, you won’t be around when and if that kind of selection pressure arises and changes the species from homo sap to homo sensible.

0
#242305

Robert Butler
Participant

@cseider – you had to look it up???!!!  What else could CCPS be except Capacitor Charging Power Supply????  A rather esoteric subject but one, I’m sure, that must be of intense interest to some members of the IsixSigma community.

0
#242050

Robert Butler
Participant

For a paired frequency test you could have the students run a series of say 50 flips of a clean penny and then run a series of 50 spins of a clean penny to see if there is a significant difference in the odds of the penny coming up heads (or tails).

0
#242023

Robert Butler
Participant

Well then, for heaven’s sake – as a favor to your mental well being – don’t ask it!  ;-)

Seriously, I can’t decide if you are just venting, asking for civil ways to counter this question when it is put to you, or something else.

3
#241834

Robert Butler
Participant

Well, there’s more to it than that.  It is correct that you cannot use the generic curvilinear expression for purposes of prediction because there isn’t any way to assign that response to either of the two variables you have.  This means, for purposes of prediction, the best you can do is an equation with mains and interaction.  However, you don’t just remove the term and forget about what it is telling you.

If you are planning on using the equation you can construct for prediction purposes then there are several things you need to do in order to give yourself some understanding of the risks you are taking by using a predictive equation that has not accounted for all of the special cause variability in your data.

1. Residual plots – look at the residuals against the predicted – how far away from the zero line are the data points associated with the center points?  If the distance is “large” (yes, this is a judgement call) then you could go wrong with great assurance.  On the other hand if the points associated with the center points are “close to” but still on one side or the other then you are still not accounting for all of the special cause but, if you can’t do any more with respect to a few augmentation runs, it would be worth testing the equation to see how well it performs.

2. You need to scale and center your X’s in the manner I outlined in my first post and rerun the regression.  As I said, when you do this you will have a way to directly compare the effects of unit changes in the terms in the model.

Just looking at your equation based on the raw numbers my guess is that the units of X1 and X2 are large which would account for the small coefficient for the interaction.  I’m not sure what to make of the large coefficient for the generic squared term.  If you re-run the analysis with the scaled X1 and X2 you will be able to directly assess the impact of choosing to ignore the existence of curvature.

0
#241697

Robert Butler
Participant

There are two schools of thought concerning main effects and interactions – one school says if the interaction is significant you need to keep the main effects and the other one says you don’t.

As for the curvature:

a) I don’t see that term in your model

b) Assuming the equation is a typo – the curvature is generic which means all you know is that it exists therefore you will need to augment your design with one or perhaps two more experimental runs to identify the variable associated with the curvilinear behavior.

Something worth considering:

Take your two variables and scale them from -1 to 1 and re-run the analysis.

In order to do this you will need to find the actual minimum and maximum settings for each of the two variables and then do the following for each of the two variables.

A = (maximum + minimum)/2

B = (maximum – minimum)/2

Scaled X1 = (X1 – A1)/B1

Scaled X2 = (X2 – A2)/B2

If you re-run the analysis using the scaled terms you will get an equation that will allow you to directly compare the effects of a unit change in X1, X2, the generic squared term, and the X1*X2 interaction.  What this buys you is the ability to choose between the two approaches to selecting the final terms in the model.

For example, if the coefficients of the scaled X1 and X2 are small relative to the coefficient of X1*X2 then, since X1 and X2 alone are not significant nothing much will be lost in the way of model use.  On the other hand if they are equivalent then you might want to consider keeping them in the model.  Of course, all of this assumes you are also doing the usual things one should be doing with respect to assessing the differences in the two models – residual analysis, prediction accuracy, etc.
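The scaling formulas above translate directly into code. The following is a small Python sketch; the function name `scale_to_unit_range` and the sample settings are made up for the example, and the minimum and maximum are taken from the data passed in, which assumes the data include the actual extreme settings of the variable.

```python
def scale_to_unit_range(values):
    """Scale values to run from -1 to 1 using A = (max + min)/2 and B = (max - min)/2."""
    lo, hi = min(values), max(values)
    a = (hi + lo) / 2   # center of the range
    b = (hi - lo) / 2   # half-width of the range
    if b == 0:
        raise ValueError("all settings are identical; nothing to scale")
    return [(value - a) / b for value in values]

# Hypothetical raw settings for X1 (low, center, high).
x1_raw = [100, 150, 200]
print(scale_to_unit_range(x1_raw))
```

The interaction column would then be built as the product of the scaled X1 and scaled X2, not the product of the raw values.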

0
#241060

Robert Butler
Participant

No, I’m afraid this discussion is not an example of a complete misunderstanding of a control chart – Wheeler and Chambers know their stuff.

I agree that with 5 data points I can’t apply all of the control chart tests but I can do this – I can get an estimate of the ordinary process variance by employing the method of successive differences, I can get an estimate of the overall mean, and I can make a judgement call with respect to the 5 data points I have as far as addressing the question of grossly out of control points.  When you have a process that generates 8 data points per run and when, at best, it is run twice a year, insisting on 30 data points before attempting an analysis means waiting a minimum of 2 years before doing anything – and that approach means you will have to run your process using the old W.G.P.G.A.* rule of thumb which is an excellent way to go wrong with great assurance.
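As a rough illustration of the successive differences calculation, here is a short Python sketch. It assumes the estimate meant here is the mean square successive difference, that is, the sum of squared differences between consecutive points divided by 2(n – 1). The function name and the five data values are invented for the example.

```python
import math

def successive_difference_variance(x):
    """Variance estimate from consecutive differences: sum(d^2) / (2 * (n - 1))."""
    if len(x) < 2:
        raise ValueError("need at least two data points")
    diffs = [later - earlier for earlier, later in zip(x, x[1:])]
    return sum(d * d for d in diffs) / (2 * len(diffs))

# Five hypothetical data points in time order.
data = [10, 12, 11, 13, 12]
variance = successive_difference_variance(data)
mean = sum(data) / len(data)
print(mean, variance, math.sqrt(variance))
```

Because the estimate is built from point-to-point changes, the data must be kept in the order in which they were generated.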

A check of the back of any basic book in statistics will usually reveal a t-table. An examination of that table indicates the minimum number of data points needed for a t-test is 2.  With two data points you can get an estimate of the mean and the variance and you can test that sample mean to see if it is statistically significantly different from a target.
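To make the two-data-point case concrete, here is a minimal sketch in Python using only the standard library. The function name, the two measurements, and the target are hypothetical; the value 12.706 is the usual two-sided 5% table entry for 1 degree of freedom.

```python
import math
import statistics

def one_sample_t(data, target):
    """Return (t statistic, degrees of freedom) for testing the sample mean against a target."""
    n = len(data)
    mean = statistics.mean(data)
    std_err = statistics.stdev(data) / math.sqrt(n)
    return (mean - target) / std_err, n - 1

# Two hypothetical measurements tested against a target of 5.0.
t_stat, df = one_sample_t([10.0, 10.2], target=5.0)

T_CRIT_DF1 = 12.706  # two-sided, alpha = 0.05, 1 degree of freedom (from a t-table)
print(t_stat, df, abs(t_stat) > T_CRIT_DF1)
```

With only 1 degree of freedom the critical value is very large, so the test can only detect differences that are big relative to the spread of the two points.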

When you run a two level two variable design it is quite common to have a total of 5 experimental runs and it is also quite common to use the results of those five experiments to develop predictive equations for a process.  Once the equations have been built the first order of business is confirming their validity – typically this will consist of two or three process runs where the equations are used to identify optimum settings.  If the equations do a decent job of prediction (and they do this with appalling regularity) those equations will be incorporated in the methods used for process control.

In short, the empirical rule of needing 30 data points before running a rough statistical analysis has no merit.

*Wonder, Guess, Putter, Guess Again

1
#241055

Robert Butler
Participant

As others have noted – more data points are better when it comes to assessing out of control and for all of the reasons already mentioned.  However, waiting until you have more (we will arbitrarily define “more” as at least 20 data points) presumes the time/money/effort needed to get more data is forthcoming.  I’ve worked on any number of processes where the total number of data points available for constructing a control chart to assess process control was 8 or less and the hope of getting any more data points in a “reasonable” period of time (say a few days or maybe a few weeks) was a forlorn one.

These processes were batch type processes run on an as needed basis and “as needed” translated into elapsed times of 6 months or more. Since we needed to have an assessment of level of process control I used all of the usual methods for assessing variation in the presence of little data.  One of the big issues was that of the question of the in control nature of the first data point in the next cycle of needed production.  If it was “in control” we would run the process as set up.  If it was “out of control” we had no choice but to adjust the process. This was a decision that was not taken lightly because the effort required to adjust the process was involved, time consuming, and expensive.  What I found was that even with a control chart consisting of data from a single run I was able to use what I had to make accurate assessments of the process when we started running the next cycle of production.  Of course, once I had data from the second cycle it was immediately incorporated on a data-point-by-data-point basis and the chart was updated each time a new data point was added.

0
#240852

Robert Butler
Participant

Well @Straydog and @Mike-Carnell that guy @chebetz has found us out!  I guess it’s time we ask Mike Cyger to tell Katie to change our screen names again. I think I’ll go with Dr. Theodore Sabbatini this time around.  :-)

I can’t answer for Straydog or Mike but given our interactions over the years I’m pretty sure their answer will be the same as mine – specifically – no, the forum is not a day job and I don’t sit around waiting for the computer to signal me that someone has posted something new.

One of my engineers pointed me to this site back in 2002.  I looked over some of the answers to basic statistics questions and couldn’t believe how wrong many of them were.  So, with vicious self-interest as a driving force, I started answering various and sundry stat questions.  I enjoy responding because this gives me an excellent chance to not only revisit some of my textbooks for details but also practice the art of explaining difficult concepts to non-statisticians.  I will usually drop by in the morning before starting work and again in the evening after work to see if there is anything interesting.  If there is and I decide to respond I’ll usually do that during lunch or, if it is short enough, during a coffee break.  If the response is going to be too long or if I want to think about it I will wait until after work to write a response.

1
#240841

Robert Butler
Participant

I don’t see any reason why you can’t if you want to.  In most instances though the question will be: which customer spec limits are you going to include on your chart?  I worked in industry for many years before shifting over to medicine and I can’t recall a single product where we were making something for just one customer.

There is also the issue of having customer specs that are well inside your control limits but, because of their location, the fact that they are is of no major consequence.  For example, let’s say you manufacture polychrome fuzzwarts and the control limits for this sought after product are ±3 polycrinkles.  Customer #1 has a spec limit of -3 or less to -1 polycrinkles, customer #2 has a spec limit of 2 polycrinkles to 3 or more polycrinkles and everyone else is happy with anything between -3 and +3 polycrinkles.  True, you will have to do lot selection for two of the groups but, if the cost of such an action is less than the cost of shifting the process when orders come in from the two populations then there is a good chance you will leave the process alone.  Under such circumstances the inclusion of customer spec limits will just clutter the graph.

1
#240799

Robert Butler
Participant

Something doesn’t seem right.  I would like to check the calculation. Could you post your total sample size and the values of the two averages?

0
#240669

Robert Butler
Participant

Let’s see if we can’t get you off of dead center.

You said the following:

Given a set of data, how do you conduct a baseline performance analysis? Is this the same as a capability analysis?

And, when looking at a set of data, how can you tell where the process changed?

My assignment asks me to complete a baseline performance analysis and “if the process changed, you must select where it changed and compute your answers.” It later says “Divide up the data as separated in the last question.” What does this mean?

1. Have you been given a data set to work with?

2. Is what you posted all that you have been given in the way of a question or is your post a paraphrase of the actual problem?

3. What is your understanding of the definition of the phrase “baseline performance” as it applies to a data set which (presumably) is supposed to represent process data acquired over some period of time?

a. Given what you have posted, does the phrase “baseline performance” relate to the process mean, variance, median, IQR, or some other aspect or some combination of the properties listed?

b. What kind of methods have you been taught to address questions concerning any of the items listed in a)?

c. What kind of methods have you been taught to address changes in any of the items listed in a)?

4. What is your understanding of what constitutes a capability analysis?

5. What is your understanding of what a capability analysis evaluates?

a. In any of your reading about capability analysis did any author conflate the phrases “capability analysis” and “baseline analysis?”

b. Have you taken the time to check these two phrases using a Google search?

1. If so, what did you find?

If you can provide the answers to these questions you should have a partial answer to your original question and if you post your answers here perhaps I or some other poster may be able to offer additional assistance.

1
#240548

Robert Butler
Participant

I’ve been away from any computer contact for the last 10 days, however, Happy belated birthday Mike… 19 July…hmmm interesting day that….

Circus Maximus catches fire – 64

15-year-old Lady Jane Grey deposed as England’s Queen after 9 days – 1553

Astronomer Johannes Kepler has an epiphany and records his first thoughts of what is to become his theory of the geometrical basis of the universe while teaching in Graz – 1595

5 people are hanged for witchcraft in Salem – 1692

HMS Beagle with Charles Darwin arrives in Ascension Island – 1836

1st railroad reaches Kansas – 1860

Franco-Prussian war begins – 1870

0
#240371

Robert Butler
Participant

I guess what I find most disturbing about the statement “these are actual questions from the exams” is the very real possibility of failing someone because they DID know the material and, conversely, passing someone who was ignorant of the correct answer.

0
#240357

Robert Butler
Participant

In this instance the question does not make sense.  The usual way one describes Type I and Type II error is as follows:

Type I error is the risk of rejecting something when it is true.

Type II error is the risk of accepting something when it is false.

So the risk of rejecting something when it is false doesn’t meet the definition of either Type I or Type II error, and Type III error is providing the right answer to the wrong question.

In this case the null is: No improvement in the process occurred

The alternate hypothesis is: Improvement in the process occurred.

So if improvement did occur in the process then the null was false and we correctly rejected it and we did not commit either Type I or Type II error.
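The two definitions above amount to a small decision table, which can be sketched in Python as follows. This is only an illustration; the function name `classify_decision` and its arguments are made up for this example.

```python
def classify_decision(null_is_true: bool, null_rejected: bool) -> str:
    """Label one test outcome using the Type I / Type II definitions."""
    if null_is_true and null_rejected:
        return "Type I error"    # rejected something that is true
    if not null_is_true and not null_rejected:
        return "Type II error"   # accepted something that is false
    return "correct decision"    # the two remaining combinations


# Null: no improvement in the process occurred.
# Improvement did occur, so the null is false, and it was rejected.
print(classify_decision(null_is_true=False, null_rejected=True))
# prints: correct decision
```

Rejecting a false null lands in the "correct decision" branch, which is why the exam question has no sensible Type I or Type II answer.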

0
#240341

Robert Butler
Participant

It looks to me like they’ve screwed up again.  Answer C is the estimate of the standard deviation (24/6).  Your calculations are correct.

0
#240320

Robert Butler
Participant

I’m sure they do. Here’s the issue:

1. What exactly is a 32 factorial design? I’m not aware of any reference that expresses design nomenclature in this manner.

The usual practice is to say things such as the following:

1. I have a two level design with 3 factors
2. I have a full 2 level factorial design for 5 variables for a total of 32 experiments.
3. I have a saturated design to investigate 31 variables in 32 experiments.
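The run counts in those three statements can be checked with a short Python sketch. The helper names `full_factorial` and `effect_columns` are invented for this illustration, and the -1/+1 coding of low and high levels is an assumption. Assigning one variable to each of the 31 contrast columns of a 32-run design is one common way to build a saturated design, not necessarily the only one.

```python
from itertools import combinations, product
from math import prod


def full_factorial(n_factors):
    """All 2**n_factors runs of a two-level design, coded -1 (low) / +1 (high)."""
    return list(product((-1, 1), repeat=n_factors))


def effect_columns(runs):
    """One contrast column per non-empty subset of factors
    (main effects plus every interaction)."""
    n = len(runs[0])
    columns = {}
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            columns[subset] = [prod(run[i] for i in subset) for run in runs]
    return columns


runs_3 = full_factorial(3)   # two-level design with 3 factors
runs_5 = full_factorial(5)   # full two-level factorial for 5 variables
cols_5 = effect_columns(runs_5)

print(len(runs_3), len(runs_5), len(cols_5))  # 8 32 31
```

The 32-run design carries 31 mutually orthogonal columns, which is where the "31 variables in 32 experiments" figure for a saturated design comes from.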

As for the claim concerning the correctness of B – just look at the graph – the greatest tool age occurs with all 3 variables at their highest settings. The only way you could justify the claim would be if you had an interaction plot showing the effects of cutting speed and metal hardness and on THAT graph you had a situation where the combination of low cutting speed and high metal hardness resulted in greater tool aging. You cannot deduce answer B from an examination of those main effects plots.
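To see why main effects plots alone cannot settle a claim about a specific combination of settings, here is a minimal Python sketch. The tool-age numbers are entirely made up for illustration and have nothing to do with the exam graph; the point is only that both main effects can favor the high setting while the largest single cell sits at low cutting speed and high metal hardness.

```python
# Hypothetical tool-age values for a 2x2 layout (invented for illustration).
# Keys are (cutting_speed, metal_hardness).
tool_age = {
    ("low", "low"): 0.0,
    ("high", "low"): 9.0,
    ("low", "high"): 10.0,
    ("high", "high"): 6.0,
}


def main_effect_means(data, factor_index):
    """Average response at each level of one factor, averaged over the other factor.
    This is what a main effects plot displays."""
    means = {}
    for level in ("low", "high"):
        values = [y for key, y in data.items() if key[factor_index] == level]
        means[level] = sum(values) / len(values)
    return means


speed_means = main_effect_means(tool_age, 0)
hardness_means = main_effect_means(tool_age, 1)
best_cell = max(tool_age, key=tool_age.get)

print(speed_means)     # {'low': 5.0, 'high': 7.5}
print(hardness_means)  # {'low': 4.5, 'high': 8.0}
print(best_cell)       # ('low', 'high')
```

Only the cell-level values, which are what an interaction plot shows, reveal where the largest response actually occurs.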

You are welcome to ask them but, given my experience with such entities, it will be nothing more than contrivindum non pissandum.

• This reply was modified 1 year, 3 months ago by Robert Butler. Reason: typo
0
#240316

Robert Butler
Participant

I would say C, D, E, and I think C is a typo – my guess is that they meant to have 3*2 = 6 experiments.

Actually, there is another possibility – but if it is the case then the problem is poorly presented and C still remains a typo. Since they do claim it is a factorial design one could take a saturated 3 variable in 4 experiments design and augment it with two additional runs. This would still be a factorial design in 6 experiments but not if we are abiding by the standard definition (which I’m assuming they are) of a factorial design. So, if they mean a factorial design in the usual sense, then it would be a 2**3 design which still means C is a typo.
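The run counts in that paragraph can be sketched in Python as follows. The generator C = A*B for the saturated design and the choice of which two runs to add are assumptions made for this illustration; the problem statement does not specify either.

```python
from itertools import product

# Saturated design: 3 variables in 4 runs, built from a 2**2 design
# with the third column set to C = A*B (an assumed generator).
saturated = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

# Full two-level factorial for 3 variables: 2**3 = 8 runs.
full = list(product((-1, 1), repeat=3))

# Runs of the full factorial that the saturated design leaves out.
remaining = [run for run in full if run not in saturated]

# Augment with two of the omitted runs (taking the first two is arbitrary).
augmented = saturated + remaining[:2]

print(len(saturated), len(full), len(augmented))  # 4 8 6
```

Six runs is reachable only by augmenting the 4-run design; the full factorial in the usual sense has 8.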

Neither A nor B is correct.

0