iSixSigma

Robert Butler

Forum Replies Created

  • #239875

    Robert Butler
    Participant

    I’m not trying to be sarcastic, but who is making the claim that linear plots are mostly used in [an examination?] of a linear regression?

    There are any number of reasons why an analysis might result in employing simple linear plots. For example:
    1. Your plan for analysis involves running a two-level factorial design – in this case, by definition, a simple linear plot is all you are going to get.
    2. A plot of your production data says the trend is linear – therefore, if you run a regression on the data, linear is going to be the best fit.
    3. Even though your knowledge of the process says the relationship between your independent and dependent variables is some sort of curvilinear process, the region where you happen to be running your process is a region where the relationship between the independent and dependent variables is essentially linear.

    If you are going to look at curvilinear trending then, at the very minimum, you need to have 3 levels for your independent variables. One of the reasons you run center points on two-level factorial designs (we are assuming the X’s are or can be treated as continuous) is to check for possible deviations from linear. If you find this to be the case then you will often augment the existing design to check for some kind of curvilinear response (usually quadratic/logarithmic) and your final plots will be curvilinear in nature.
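    For what it’s worth, here is a minimal sketch of the center-point curvature check in Python. The corner and center response values are placeholders, not real process data; if the surface is planar, the average of the factorial corner runs and the average of the center-point runs should agree to within noise.

    import numpy as np

    # Hypothetical 2^2 factorial in coded (-1/+1) units plus replicated center points.
    corner_y = np.array([12.1, 14.3, 13.0, 15.2])   # responses at the 4 corner runs
    center_y = np.array([13.4, 13.6, 13.5])         # responses at the (0, 0) center runs

    # Curvature check: gap between the corner average and the center average.
    gap = corner_y.mean() - center_y.mean()

    # Pure-error estimate from the replicated center points gives a rough t ratio.
    s_pe = center_y.std(ddof=1)
    se_gap = s_pe * np.sqrt(1.0 / len(corner_y) + 1.0 / len(center_y))
    print(f"curvature gap = {gap:.3f}, rough t ratio = {gap / se_gap:.2f}")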

    As for cubic (or higher) – the first question you need to ask is when would you expect to encounter such behavior? If you are actually working on something and plots of your production data do suggest you have a real cubic response then you would want to build a design with 4 levels for the various continuous independent variables.

    In all of my years of process analysis the only time I ever encountered a genuine cubic response was when we found out the region we were examining resulted in a phase change of the output (we were adding a rubberized compound to a plastic mix and we crossed the 50% threshold, which meant we went from rubberized plastic to plasticized rubber). In short, in my experience, real cubic and higher curvilinear responses are extremely rare.

    There is, of course, data that is cyclic in nature, such as monthly changes in demand for a product, and I suppose you could try to model data of this type using linear regression methods, but the first line of defense when attacking this kind of data is time series analysis, and a discussion of that tool is, I think, beyond the focus of your question.

    #239801

    Robert Butler
    Participant

    I see I spoke too soon. What I said above still applies with respect to the graphical presentation and sampling plan you should use when repeating your test but since you provided the actual data there are a couple of other things to consider.

    Again, if we pretend the sequential measures within a given operator are independent (they’re not) then for all of the data we find there is a significant difference between shift means 1817 and 2826 (P < .0001) and we also find there is a significant difference in shift variability with standard deviations of 193 and 98 respectively (P = .016).

    A quick check of the boxplots suggests the single measure of 2280 in the first shift is the cause for the differences in standard deviations. If we re-run the analysis with that value removed what we get are means of 1784 and 2826 – still significantly different (P < .0001) – and standard deviations of 150 and 98 (P = .12). So, overall, your first shift has a significantly lower mean scoop measure and a numerically higher (but not significantly different) variability.

    You could go ahead and try to parse the data down to the individual operator but with just five repeated measures per person I wouldn’t recommend doing this, for two reasons: 1) the measures within each operator are not independent, and 2) even if the measures were independent it is highly unlikely that the variability of a sequential 5-in-a-row sample is at all representative of the variability one would see within a given operator over the course of a shift.

    I would recommend summarizing your findings in the form of the graph below along with a few simple summary statistics of the means and the variances and then, before trying to do anything else – address and solve the issues you mentioned in your second post and then re-run the sampling plan in the form I mentioned above.

    #239799

    Robert Butler
    Participant

    I don’t know Minitab but I do know it has boxplot graphics, which I would recommend you use. You would have side-by-side boxplots of the two shifts. What you will want to do is make boxplots which also show the raw data; that way you will not only have the boxplot with its summary statistics but you will also be able to see visually just how the data distributes itself within each shift. See attached for an example.

    For your repetition you will want to take five measurements but not back-to-back – rather, randomize the picks across the time of the shift. This will mean extra work for you but it will guarantee that your measurements are representative of what is happening over the course of a shift.

    #239789

    Robert Butler
    Participant

    Just a visual impression of your data says you have a statistically significant difference in mean scoop size between first and second shift. If we check it formally and pretend the measures within each team member can be treated as independent measures then what you have is 1.22 vs 1.88 (which, given the limit on significant digits, reduces to 1.2 vs 1.9) with corresponding standard deviations of .168 and .172 (which round to .17 and .17). There is a significant difference in the means (P < .0001) and there is no significant difference (P = .92) with respect to shift variability.

    One thing you should consider. The above assumes the measures within each team member are independent. If the measures you have reported are random draws from a single shift’s work for each team member then it is probably reasonable to treat the measures as independent within a given member. On the other hand, if the measures were just 5 scoops in sequence then you are looking at repeated measures, which means your estimates for shift variance are much smaller than they really are.

    Before you do anything else you need to understand the cause of the difference in scoop means between shifts. If the material is uniform, the front end loaders are the same, the 6 individuals all have the same level of skill in handling a front end loader and all other things of importance to the process are equal between the two shifts then you shouldn’t have a difference like the one you have in your data.

    #239709

    Robert Butler
    Participant

    Wellll…not counting the mandatory military service I don’t think there’s been any time in my career when I wasn’t involved in process improvement. I started off as a physicist working in advanced optics – the work was a combination of R&D and improving existing systems/processes. One of the groups I worked with had an assigned statistician. I watched what he was doing and liked what I saw so I went back to school and earned my third piece of paper in statistics.

    My role as a statistician in industry was to provide support for R&D and process improvement in specialty chemicals, aerospace, medical measurement systems, and plastics. At my last company our major customer demanded we have six sigma black belts working on their projects. Our management picked three of us to go for training and I was one of them. The end result was I wound up running projects as well as providing statistical support to anyone who needed it.

    Mr. Bin Laden’s actions a few years back got me (along with a number of other people) laid off from that aerospace company so I went job hunting, found a listing for a biostatistician at a hospital, applied, and was accepted. It’s a research hospital but, just like industry, there are process improvement problems so, as before, I work on both R&D and process improvement issues.

    #239658

    Robert Butler
    Participant

    You’re welcome. I hope some of this will help you solve your problem.

    #239629

    Robert Butler
    Participant

    I don’t know the basis for your statement "I know that in order to be able to do the Hypothesis Testing we must also be able to test for Normality and the 't-test'," but it is completely false. There is no prerequisite connection between the desire to test a given hypothesis and ANY specific statistical test. Rather, it is a case of formulating a hypothesis, gathering data you believe will address that hypothesis, and then deciding which statistical test, or perhaps group of tests, you should use to address/test your initial question/hypothesis.

    If you have run afoul of the notion that data must be normally distributed before you can run a t-test – that notion too is wrong. The t-test is quite robust when it comes to non-normal data (see pages 52-58 The Design and Analysis of Industrial Experiments 2nd Edition – Davies). If the data gets to be “crazy” non-normal then the t-test may indicate a lack of significant difference in means when one does exist. In that case check the data using the Wilcoxon-Mann-Whitney test.

    As to what constitutes “crazy” non-normal – that is something you should take the time to investigate. The best way to do that would be to generate histograms of your data, and run both the t-test and the Wilcoxon tests side-by-side and see what you see. You can experiment by generating your own distributions. If you do this what you will see is that the data can get very non-normal and, if the means are different, both the t-test and the Wilcoxon will indicate they are significantly different (they won’t necessarily have the same p-value but in both cases the p-value will meet your criteria for significance).
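    If you want a quick way to run that kind of side-by-side experiment, here is a minimal sketch in Python (the lognormal samples are just an illustration of non-normal data with different means, not anything from your process):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    # Two clearly non-normal (lognormal) samples whose underlying means differ.
    a = rng.lognormal(mean=0.0, sigma=0.7, size=60)
    b = rng.lognormal(mean=0.5, sigma=0.7, size=60)

    t_stat, t_p = stats.ttest_ind(a, b)                              # two-sample t-test
    w_stat, w_p = stats.mannwhitneyu(a, b, alternative="two-sided")  # Wilcoxon-Mann-Whitney

    print(f"t-test p = {t_p:.4f}   Wilcoxon-Mann-Whitney p = {w_p:.4f}")

    Re-run it with different sample sizes and skews and compare the two p-values; that is the whole point of the exercise.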

    With regard to the data you have posted – one of the assumptions of a two sample t-test is that the data samples be independent. Your data strongly suggests you do not have independent samples. As posted, what you have are a group of 7 officials who have tried something using an old and a new method. The smallest unit of independence is the individual official. Therefore, if you want to use a t-test to check this data what you will have to do is use a paired t-test.

    The paired t-test takes the differences within each individual official between the old and the new method and asks the question: is the mean of the DIFFERENCES significantly different from 0? What this will tell you is whether, within the group of officials you have tested, the overall difference in their performances significantly changed – which is to say, whether the distribution of the DIFFERENCES is really different from 0.

    If you want to ask the question – is the new method really better than the old method – you will have to take measurements from a group of officials running the old method, get a second group of officials and have them run just the new method, and then run a two-sample t-test on the two distributions. If, as I suspect, these methods require training you will need to make sure the two independent groups of officials have the same level of training in both methods.
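    A minimal sketch of the two versions of the test in Python (the scores are made-up placeholders for 7 officials):

    import numpy as np
    from scipy import stats

    # Placeholder scores for the same 7 officials under the old and new methods.
    old = np.array([78.0, 82.0, 75.0, 90.0, 68.0, 85.0, 79.0])
    new = np.array([81.0, 85.0, 74.0, 94.0, 72.0, 88.0, 83.0])

    # Paired t-test: are the within-official differences centered on zero?
    t_paired, p_paired = stats.ttest_rel(new, old)

    # Same thing expressed as a one-sample t-test on the differences.
    t_diff, p_diff = stats.ttest_1samp(new - old, popmean=0.0)
    print(f"paired p = {p_paired:.4f} (one-sample test on differences gives {p_diff:.4f})")

    # A two-sample t-test, stats.ttest_ind(group_old, group_new), is only appropriate
    # when the old and new measurements come from two independent groups of officials.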

    #239598

    Robert Butler
    Participant

    As Kaplan put it in his book The Conduct of Inquiry, “When attacking a problem the good scientist will utilize anything that suggests itself as a weapon.” :-)

    #239447

    Robert Butler
    Participant

    If we couple what you said in your last post with an assessment of the graphs you have generated I think you might have the answer to your question.

    As I said previously, the normal probability plots of cycle time in days by department are a very important set of graphs. What they show is that the cycle time data for 4 out of the 5 departments is distinctly bimodal. When you examine the bimodality what you find for Finance, Sales, and Product Plan is a clear break between the distribution of reports with low cycle times and the distribution of reports with high cycle times. The break occurs around 4-5 days. The Prod Dept, probably by virtue of the large report volume relative to the other 3, does have an obvious bimodal distribution with a break around 5 days. One could argue that the Prod Dept plot is tri-modal, with a distribution from 1-3 days, a second from 3-6 days, and a third of 6 days or more; however, given the data we have, I wouldn’t have a problem with calling the Prod Dept data bimodal with a break between low and high cycle time reports at around 5 days.

    The Exploration Department is unique not only because its report cycle time data is not bimodal but also because it appears that only one report has a cycle time of greater than 5 days.

    What the normal probability plots are telling you is that whatever is driving long report cycle times is independent of department. This is very important because it says the problem is NOT department specific; rather, it is some aspect of the report generating process across or external to the departments.

    Now, from your last post you said, “Data for this report is received from the production department. Out of this data, 18 were clean while 50 were dirty…” If I bean count the data points on the normal probability plot for Prod Dept for the times of 5 days or less I get 18. This could be just dumb luck but it bears investigation.

    For each department – find the reports connected with dirty data and the reports connected with clean data – data error type does not matter. If the presence of dirty data is predominantly associated with reports in the high cycle time distribution (as it certainly seems to be with the Prod Dept data) and if it turns out that the Exploration Dept data has no connection (or, perhaps, a single connection with the 6 day cycle time report) then I think you will have identified a major cause of increased report cycle time.

    If this turns out to be true and someone still wants a hypothesis test and a p-value, the check would be to take all of the reports for all of the departments, class them as either arising from clean or dirty data, and then class them as either low or high cycle time depending on which of the two distributions within each department they belong to (which is which will depend on the choice of cut point between the two distributions – I would just make it 5 days for everyone – report cycle time <=5 = low and >5 = high). This will give you a 2×2 table with cycle time (low/high) on one side of the table and data dirty/clean on the other. If the driver of the longer cycle times for reports is actually the difference between clean and dirty data then you should see a significant association.
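    A minimal sketch of that association check in Python (the cell counts below are placeholders; fill in your own bean counts):

    import numpy as np
    from scipy import stats

    # Rows = data quality (clean, dirty); columns = cycle time (low <= 5 days, high > 5 days).
    table = np.array([[16,  2],    # clean: low, high
                      [10, 40]])   # dirty: low, high

    chi2, p, dof, expected = stats.chi2_contingency(table)
    odds_ratio, fisher_p = stats.fisher_exact(table)   # a safer choice when cell counts are small

    print(f"chi-square p = {p:.4f}, Fisher exact p = {fisher_p:.4f}")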

    #239440

    Robert Butler
    Participant

    There’s no need to apologize. You have an interesting problem and I just hope that I will be able to offer some assistance. To that end, I may be the one that needs to apologize. I was re-reading all of your attachments and, in light of what you have in the most recent attachment, I’m concerned that my understanding of your process is incorrect.

    My understanding was you had 5 departments in the CD group and that any one of the 5 could produce any one of the 8 types of reports and could expect to deal with any one of the four types of data errors. In reviewing the normal probability plots of the departments it occurred to me that there is a large difference in data points for the probability plot for the production department when compared to all of the others. This, coupled with the large count for Prod Volume reports is the reason for my concern about a dependence between department and report type.

    My other assumption was that the counts for the types of error occurrence were “large”, which is to say counts in the realm of >25; however, your most recent attachment indicates the boxplots for the types of error are summaries of very small groups of data.

    So, before we go any further – are the assumptions I’ve made above correct or are they just so much fantasy and fiction?
    In particular:
    1. What is the story concerning department and report generation?
    2. Do any of the departments depend on output from the other departments to generate their reports?
    3. What is the actual count for the 4 error categories?
    4. Why is it that the production department has so many more data entries on the normal probability plot than any of the other departments?

    If I am wrong (and I think I am) please back up and give me a correct description of the relationships between departments, report generation, and types of error a given department can expect to encounter.

    One thing to bear in mind, the graphs you have generated are graphs that needed to be generated in order to understand your data so if a change in thinking is needed the graphs will still inform any additional discussion.

    #239431

    Robert Butler
    Participant

    As @Straydog has noted, references to standards in measurement have a very long history. The link below suggests that referring to such checks as MSA began sometime in the 1970s.

    https://www.q-das.com/fileadmin/mediamanager/PIQ-Artikel/History_Measurement-System-Analysis.pdf

    #239427

    Robert Butler
    Participant

    A few general comments/clarifications before we proceed
    1. You said “I guess I was looking at the p-value because I know that would determine the kind of test I run eventually.” I assume this statement is based on something you read – whoever offered it is wrong – p-values have nothing to do with test choices.
    2. When I made the comment about the boxplots and raw data what I meant was the actual graphical display – what you have are just the boxplots. Most plotting packages will allow you to show on the graph both the boxplot and the raw data used to generate the boxplots and it was this kind of graphical presentation I was referencing.
    3. Cycle time is indeed continuous so no problems there and your departments/report types etc. are the categories you want to use to parse the cycle times to look for connections between category classification and cycle time.

    Steps 1- 3

    Overall Data Type

    Your first boxplot of data types – no surprises there – however keep this plot and be prepared to use it in your report – it provides a quick visual justification for focusing on dirty data and not wasting time with DR data

    Error Type

    The graph says the two big time killers are, in order, duplicated data (median 7.1) and aggregated data (it looks like the mean and median are both about 4).

    Report Type
    The biggest consumers of time, in order, are Prod Volume (median 5.995), Balances (median 5.79) and Demand Volume (median 4.7). So, from a strictly univariate perspective of time, these would be the first places to look. However, this approach assumes equality with respect to report volumes. If there are large discrepancies in the report output count then these might not be the big hitters. What you should do is build a double Y graph with the boxplot categories on the X axis, the cycle time on the left side Y axis (as you have it now) and report volume on the right-hand Y axis. This will give you a clear understanding of both time expended and quantity. If you don’t have a double Y plotting capability, rank order the boxplots in order of number of generated reports and insert the count either above or below the corresponding boxplot. Knowing both the median time required and the report volume will identify the major contributors to increased time as a function of report type.
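    If your package can’t do it directly, a double Y plot of this kind is easy to put together in, for example, Python/matplotlib. The cycle times and counts below are placeholders, not your data:

    import matplotlib.pyplot as plt

    # Placeholder cycle-time lists (days) per report type.
    data = {
        "Prod Volume":   [5.8, 6.2, 5.9, 7.1, 4.8],
        "Balances":      [5.5, 6.0, 5.9, 5.6],
        "Demand Volume": [4.2, 4.9, 4.7, 5.1],
    }
    labels = list(data)
    counts = [len(v) for v in data.values()]   # stand-in for report volume

    fig, ax_time = plt.subplots()
    ax_time.boxplot(list(data.values()))
    ax_time.set_xticks(range(1, len(labels) + 1))
    ax_time.set_xticklabels(labels)
    ax_time.set_ylabel("cycle time (days)")

    ax_vol = ax_time.twinx()                   # right-hand Y axis for report volume
    ax_vol.plot(range(1, len(labels) + 1), counts, "o--", color="gray")
    ax_vol.set_ylabel("number of reports")

    plt.tight_layout()
    plt.show()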

    Department Type

    Same plot as before and I have nothing more to add as far as this single plot is concerned (more on this later).

    Normality plots

    These tell you a lot. With the exception of the Exploration department, all of the other departments have a bimodal distribution of their cycle times. Distribution ranges: Finance – 1-4 and 5-7 days; Prod Plan – <1-4 and 6-8 days; Sales – <1-3 and approx. 4.5-5 days. The Prod department may have a tri-modal distribution: 0-3, 3-<6, and 6-9 days.

    Based on your graphs you know the following:
    1. For each department you know the data corresponding to the greatest expenditures of time. For Finance it is the 5-7 day data, for Prod Plan it is the 6-8 day data, for sales it is the 4.5-5 day data, and for prod department it is the 6-9 day data.
    2. You know the two biggest types of error with respect to time loss.
    3. Given your graphical analysis of the report types and volumes you will have identified the top contributors to time loss in this category.

    Your first statistical check:
    Since you are champing at the bit to run a statistical test, here’s your first. Bean count the number of time entries in each of the 4 error types. Note – this is just a count of entries in each category; the actual time is not important.
    1. Call the counts from aggregates and duplicate data error group 1 and the counts from categories and details error group 2. Run a proportions test to make sure that the number of entries in group 1 is greater than the number of entries in error group 2. (This should make whoever is demanding p-values happy.)
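    One way to run that check in Python (assuming a reasonably recent SciPy; the counts are placeholders) is to treat it as a one-sided binomial test on the share of entries falling in group 1:

    from scipy import stats

    # Placeholder counts of time entries per error category.
    n_aggregate, n_duplicate = 22, 31     # error group 1
    n_category, n_detail = 9, 6           # error group 2

    k_group1 = n_aggregate + n_duplicate
    n_total = k_group1 + n_category + n_detail

    # One-sided test: is the share of entries in group 1 greater than 50%?
    result = stats.binomtest(k_group1, n_total, p=0.5, alternative="greater")
    print(f"group 1 share = {k_group1 / n_total:.2f}, p = {result.pvalue:.4f}")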

    Given that there really are more entries in aggregate and duplicate and given that you know from your graphical analysis which of the report types are the big hitters, pareto chart those reports, take the top two or three, and then run your second check by grouping the reports into two categories – the top three (report group 1) and all of the others (report group 2).

    Now run your second check on proportions – a 2×2 table with error groups vs report groups. This will tell you the distribution of error groups across report groups. You may or may not see a significant difference in proportions.

    Either way – what you now want to do is take the two report groups and the two error groups and look at their proportions for the two regions for Finance, Prod Plan, Sales, and Prod Department. What you are looking for is a differentiation between the short and long times for each department as a function of report type and error type. To appease your bosses you can run proportions tests on these numbers as well.

    If the error types and the report types are similar across departments for the longer elapsed times then you will know where to focus your efforts with respect to error reduction and report streamlining. If they are quite different across departments then you will have more work to do. If there isn’t a commonality across departments I would recommend summarizing what you have found and reporting it to the powers that be because this kind of finding will probably mean you will have to spend more time understanding the differences between departments and what specific errors/reports are causing problems in one department vs another.

    #239424

    Robert Butler
    Participant

    The histograms and the boxplots suggest some interesting possibilities. Based on what you have posted I’m assuming your computer package will not let you have a boxplot which includes the raw data. On the other hand, if there is a sub-command to allow this you really need to have a plot that shows both.

    If my understanding of your process is correct, namely that the process flow is Draft Report followed by Cleanse Data, the histograms suggest a couple of possibilities:
    1. The Draft Report (DR) phase is completely independent of the Cleanse Data (CD) phase
    2. The folks in the DR phase are ignoring all of the problems with respect to data and are just dumping everything on the CD group.

    Of course, if it is the other way around – first you cleanse the data and then you draft a report – then what you are looking at is the fact that, overall, the CD group did their job well and DR gets all of the benefits of their work and kudos for moving so quickly.

    My take on the histograms and the comment on the Multivar analysis is that for all intents and purposes both DR and CD have reasonably symmetric distributions. What I see is a very broad distribution for CD. I don’t see where there is any kind of skew to the left worth discussing.

    To me the boxplots indicate two things:
    1. Production Planning has the largest variation in cycle time
    2. Production Department has the highest mean and median times and appears to have some extreme data points on the low end of the cycle time range.

    My next step would be the following:
    1. If you can re-run the boxplots with the raw data do so and see what you see.
    2. Plot the data for all 5 of the CD departments on normal probability paper and see what you can see. These plots will help identify things like multimodality.
    3. The boxplots suggest to me that the first thing to do is look at the product department and product planning.
    4. Go back to the product department data and identify the data associated with the extremely low cycle times and see if there is any way to contrast those data points with what appears to be the bulk of the data from the product department.
    a. Remove those extreme data points and re-run the boxplot comparison across departments just to see what you see.
    5. Take a hard look at what both prod department and product planning do – are they stand alone (I doubt this)? Do they depend on output from other departments before they can begin their work? If they do require data from other departments what is it and what is the quality of that incoming work? Do they (in particular product department) have to do a lot of rework because of mistakes from other departments? etc.

    The reason for the above statements is that the boxplots suggest those two departments have the biggest issues, the first because of greater cycle time and the second because of the broad range. The other thing to keep in mind is that prod planning has cycle times all over the map – why? This goes back to what I said in my first post – if you can figure out why prod planning has such a spread you may be able to find the causes and eliminate them. There is, of course, the possibility that if you do this for the one department what you find may carry over to the other departments as well.
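    For the normal probability plots in step 2, here is a minimal sketch in Python with placeholder cycle times. A kink, or two separate straight segments, in this plot is the visual signature of a bimodal distribution:

    import matplotlib.pyplot as plt
    from scipy import stats

    # Placeholder cycle times for one department.
    cycle_times = [0.8, 1.2, 1.5, 2.1, 2.4, 2.8, 5.9, 6.3, 6.8, 7.4]

    stats.probplot(cycle_times, dist="norm", plot=plt)
    plt.title("Cycle time - normal probability plot")
    plt.show()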

    #239405

    Robert Butler
    Participant

    If your post is an accurate description of your approach to the problem then what you are doing is wrong and you need to stop and take the time to actually analyze your data in a proper statistical manner.

    The first rule in data analysis is plot your data. If you don’t know what your data looks like then you have no idea what any of the tests you are running are telling you.

    So, from the top, I’d recommend doing the following:

    You said, “I have … collected data for 2 main activities within the process for analysis – the cycle times for ‘draft report’ and ‘cleanse data’ (with information on report type, department, error type, etc.). I did a multi-vari analysis which indicates that a large family of variations are from the ‘cleanse data’ activity.”

    Run the following plots:

    First – side by side boxplots of cycle times (be sure to select the option that plots the raw data and not just the boxplots themselves) split by activity. Do the same thing with overlaid histograms of cycle times by activity type.
    What this buys you:
    1. An assurance that the “large family of variations from the cleanse data” really is a large family of variation and not something that is due to one or more of the following: a small group of influential data points; multi-modalities within cleanse data and/or within draft report whose overall effect is to give an impression of large variation within one group when, in fact, there really isn’t; some kind of clustering where one cluster is removed from the rest of the data; etc.

    Second – side by side boxplots within each of the two main categories for cycle time where the boxplots are a function of all of the things you listed (“report type, department, error type, etc.”). What you want to do here is have boxplots on the same graph that are plotted as (draft report (DR) report type A), (cleanse data (CD) report type A), (DR report type B), (CD report type B), etc.
    What this buys you:
    1. An excellent understanding of how the means, medians, and standard deviations of reports, department, error type, etc are changing as a function of DR and CD. I am certain you will find a lot of commonality with respect to means, medians, and standard deviations between DR and CD for various reports, departments, etc. and you will also see where there are major differences between DR and CD for certain reports, departments, etc.

    2. The boxplots will very likely highlight additional issues such as data clustering and extreme data points within and between DR and CD. With this information you will be in a position to do additional checks on your data (why clustering here and not there, what is the story with the extreme data points, etc.)

    3. If the raw data on any given boxplot looks like it is clustered – run a histogram and see what the distribution looks like – if it is multimodal then you will have to review the data and figure out why that is so.

    Third – multivar analysis – I have no idea what default settings you used for running this test but I would guess it was just a simple linear fit. That is a huge and totally unwarranted assumption. So, again – plot the data. This time you will want to run matrix plots. You will want to choose the simple straight line fit command when you do these. The matrix plots will allow you to actually see (or not) if there is trending between one X variable and cycle time and with the inclusion of a linear fit you will also be able to see a) what is driving the fit and b) if a linear fit is the wrong choice.

    1. You will also want to run a matrix plot of all of the X’s of interest against one another. This will give you a visual summary of correlations between your X variables. If there is correlation (which, given the nature of your data, there will be) then that means you will have to remember when you run a regression that your variables are not really independent of one another within the block of data you are using. This means you will have to at least check your variables for co-linearity using Variance Inflation Factors (eigenvalues and condition indices would be better but these things are not available in most of the commercial programs – they are, however, available in R).
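    For the collinearity check, here is a minimal VIF sketch using statsmodels; the predictor names below are placeholders for whatever X’s you end up with:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    # Placeholder predictor matrix; substitute your own X columns.
    rng = np.random.default_rng(0)
    X = pd.DataFrame({
        "report_volume": rng.normal(size=50),
        "error_count":   rng.normal(size=50),
        "dept_load":     rng.normal(size=50),
    })

    X_const = sm.add_constant(X)   # VIFs are normally computed with the intercept included
    vifs = {col: variance_inflation_factor(X_const.values, i)
            for i, col in enumerate(X_const.columns) if col != "const"}
    print(vifs)   # a common rule of thumb treats VIF > 5 or 10 as a collinearity flag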

    Once you have done all of the above you will have an excellent understanding of what your data looks like and what that means with respect to any kind of analysis you want to run.

    For example – You said “I want to know with 95% confidence if the cycle time for cleanse task process is influenced by the source department that prepared the original data file with HO: µD1 = µD2 =µD3 = µD4 = µD5 and HA: At least one department is different from the others. Tests: Based on my A-D normality test done on the cycle time data, I concluded it was not normal and proceeded to test for variance instead of mean by using levene’s test since I have more than 3 samples (i.e.5 departments)”

    Your conclusion that non-normality precludes a test of department means and thus forces you to consider only department variances is incorrect.

    1. The boxplots and histograms will have told you if you need to adjust the data (remove extreme points, take into account multimodality, etc.) before testing the means.

    2. The histograms of the five departments will have shown you just how extreme the cycle time distributions are for each of the 5 departments. My guess is that they will be asymmetric and have a tail on the right side of the distribution. All the Anderson-Darling test is telling you is that your distributions have tails that don’t approximate the tails of a normal distribution – big deal.

    Unfortunately, there are a lot of textbooks, blog sites, internet commentary, published papers which insist ANOVA needs to have data with a normal distribution. This is incorrect – like the t-test, ANOVA can deal with non-normal data – see The Design and Analysis of Industrial Experiments 2nd Edition – Davies pp.51-56 . The big issue with ANOVA is variance homogeneity. If the variances across the means are heterogeneous then what might happen (this will be a function of just how extreme the heterogeneity is) is ANOVA will declare non-significance in the differences of the means when, in fact, there is a difference. Since you have your graphs you will be able to decide for yourself if this is an issue. If heterogeneity is present and if you are really being pressed to make a statement concerning mean differences you would want to run Welch’s test.

    On the other hand, if you do have a situation of variance heterogeneity across departments, what you should do first is forget about looking for differences in means and try to figure out why the variation in cycle time in one or more departments is significantly worse than the rest and since you have your graphs you will know where to look. I guarantee if heterogeneity of the variances is present and if you can resolve this issue the impact on process improvement will be far greater than trying to shift a few means from one level to another.
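    If you want to see how these checks look in practice, here is a minimal sketch in Python with placeholder department samples: Levene’s test for variance homogeneity, the ordinary ANOVA, and a non-parametric cross-check. (Welch’s ANOVA itself is available in packages such as statsmodels or pingouin.)

    from scipy import stats

    # Placeholder cycle-time samples for the five departments.
    d1 = [1.2, 2.3, 3.1, 2.8, 4.0]
    d2 = [2.2, 2.9, 3.5, 4.1, 3.8]
    d3 = [5.1, 6.2, 5.8, 6.9, 7.4]
    d4 = [1.1, 1.9, 6.5, 7.2, 2.4]
    d5 = [2.0, 2.5, 3.0, 3.3, 2.9]

    lev_stat, lev_p = stats.levene(d1, d2, d3, d4, d5)    # variance homogeneity
    f_stat, f_p = stats.f_oneway(d1, d2, d3, d4, d5)      # ordinary one-way ANOVA
    k_stat, k_p = stats.kruskal(d1, d2, d3, d4, d5)       # non-parametric check on location

    print(f"Levene p = {lev_p:.4f}, ANOVA p = {f_p:.4f}, Kruskal-Wallis p = {k_p:.4f}")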

    #239378

    Robert Butler
    Participant

    Please understand I’m not trying to be snarky, condescending, or mean spirited but your post gives the impression that all you have done is dump some data in a machine, hit run, and dumped out a couple of statistics that suggest your data might not exactly fit a normal curve. To your credit it would appear you do know that one of the requirements for a basic process capability study is data that is approximately normal.

    Let’s back up and start again.

    I don’t have Minitab but I do know it has the capability to generate histograms of data and superimpose idealized plots of various distributions on the data histogram.

    Rule number 1: Plot your data and really look at it. One of the worst things you can do is run a bunch of normality tests (or any kind of test for that matter) and not take the time to visually examine your data. One of the many things plotting does is put all of the generated statistics in perspective. If you don’t plot your data you have no way of knowing if the tests you are running are actually telling you anything – they can be easily fooled by as little as one influential data point.

    So, having plotted your data:

    Question 1: What does it look like?
    Question 2: Which ideal overlaid distribution looks to be the closest approximation to what you have?

    Since you are looking at cycle time you know you cannot have values less than zero. My guess is you will probably have some kind of asymmetric distribution with a right-hand tail. If you wish, you can waste a lot of time trying to come up with an exact name for whatever your distribution is, but given your kind of data it is probably safe to say it is approximately log normal, and for what you are interested in doing that really is all you have to worry about.
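    If you want to check the “approximately log normal” impression outside of Minitab, here is a minimal sketch in Python; the generated data is a placeholder for your cycle times:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    # Placeholder cycle-time data; substitute your own measurements.
    cycle_time = np.random.default_rng(3).lognormal(mean=1.0, sigma=0.5, size=200)

    shape, loc, scale = stats.lognorm.fit(cycle_time, floc=0)  # fix location at 0 for time data

    x = np.linspace(cycle_time.min(), cycle_time.max(), 200)
    plt.hist(cycle_time, bins=30, density=True, alpha=0.5)
    plt.plot(x, stats.lognorm.pdf(x, shape, loc=loc, scale=scale), "r-")
    plt.xlabel("cycle time")
    plt.title("Histogram with fitted lognormal overlay")
    plt.show()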

    One caution: if your histogram has a right tail which appears to have “lumps” in it – in other words it is not a “reasonably” smooth tapering tail you will want to go back to your data and identify the data associated with those “lumps”. The reason being that kind of a pattern usually indicates you have multiple modes which means there are things going on in your process which you need to check before doing anything else.

    If we assume all is well with the distribution and visually it really is non-normal then the next thing to do is choose the selection in Minitab which computes process capability for non-normal data. I don’t know the exact string of commands but I do know they exist. I would suggest running a Google search and/or going over to the Minitab website and searching for something like capability calculations for non-normal data.

    #238893

    Robert Butler
    Participant

    I had to look up the definition of IIOT – and the manager in charge of obfuscation proliferation strikes again … :-). If all you have is paper entries then someone somewhere is going to have to sit down and transcribe the data into some kind of electronic spreadsheet.

    I suppose there might be some kind of document scanner that could do this but my guess would be a scanner like that would require uniformity with respect to paper entries – something like a form that is always filled out the same way. Even if you could do that there would have to be a lot of data cleaning because paper records, even when in a form, can and will have all sorts of entries that do not meet the requirements for a given field. For example – say one field is supposed to be a recorded number for some aspect of the process. I guarantee you will get everything from an actual number, for example 10, to anything approximating that entry: 10.0, >10, <10, ten, 10?, na, blank entries, etc.
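    When you do get to the transcription stage, a best-effort cleaning function is usually worth writing up front. Here is a minimal sketch in Python; the example entries mirror the ones above, and note that it silently drops qualifiers such as “>” and “<”, which you may or may not want:

    import re
    import pandas as pd

    def clean_numeric(raw):
        """Best-effort conversion of a hand-transcribed field to a number."""
        if raw is None:
            return None
        text = str(raw).strip().lower()
        match = re.search(r"-?\d+(\.\d+)?", text)   # pull the first number, if any
        return float(match.group()) if match else None

    raw_column = ["10", "10.0", ">10", "<10", "ten", "10?", "na", ""]
    cleaned = pd.Series(raw_column).map(clean_numeric)
    print(cleaned.tolist())   # [10.0, 10.0, 10.0, 10.0, None, 10.0, None, None]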

    There are some ways to approach a data set of this kind. Since the process will have changed over time (and, most likely, the reporting and measurement systems will have as well) the first block of data to examine would be the most recent. Let’s say you have a year’s worth of data for a given process. First talk to the people doing the recording. The focus will be on their memories with respect to times during the last year when the process was functioning properly and times when it wasn’t.

    Using their memories as a guide, go into the paper records and pull samples from those time periods, find someone who can do the data entry, and transcribe the data and analyze what you have. If the memory of the people on the line is any good (and in my experience it is usually very good) your sample will provide you with an excellent initial picture of what the process has done and where it is at the moment.

    #238892

    Robert Butler
    Participant

    This looks like a homework problem so rather than just give you the answers – let’s think about how you would go about addressing your questions.

    Question 1: How do you determine the average effect of A?
    Question 2: How do you determine the average effect of B?
    Question 3: If you have the average effect of A – what is another term for that average?
    Question 4: If you have the average effect of B – what is another term for that average?
    Question 5: How do you know that your average effect of A is independent of your estimate of your average effect of B?
    Question 6. If you wanted to know the effect of the interaction AxB what would you do to your design matrix to generate the column for AxB?
    Question 7: Given the column for AxB how would you determine its average effect?
    Question 8: What is another term for the average effect of AxB?

    If you can answer these questions you will not only have the answers to the questions you have asked but you will have an excellent understanding of the power of experimental design methods over any other kind of approach to assessing a process.

    #238766

    Robert Butler
    Participant

    If that’s the case then the issue should be framed and analyzed in terms of what you mean by on-time delivery.

    1. What do you mean by on-time delivery? – need an operational definition here
    2. What does your customer consider to be on-time delivery? – need an operational definition here too.
    3. What do you need to take into account to adjust the on-time delivery definition for different delivery scenarios?
    a. If it is the physical delivery of a product to a company platform – how are you addressing things like, size of load, distance needed for travel, etc.
    b. If it is a matter of delivering something electronically – how are you adjusting for things like task difficulty, resource allocation to address a problem, etc.

    After you have your definitions and an understanding of how your system is set up for production you should be able to start looking for ways to assess your on-time delivery and also identify ways to improve it.

    #238625

    Robert Butler
    Participant

    Not that normality has anything to do with control charting but, in your case, 100% represents a physical upper bound. If your process is operating “close” to the upper bound then the data should be non-normal – which would be expected. If it is “far enough” away from 100% then I could see where the data might be approximately normal – again not an issue – but if your process is that removed from 100% then it would seem the first priority would be to try to find out why this is the case instead of tracking sub-optimal behavior.

    To your specific question, since 100% is perfection and since you want to try tracking the process with a control chart, what you want is a one-sided control chart where the only interest is in falling below some limit.

    #238524

    Robert Butler
    Participant

    Ok, that answers one of my questions. I still need to know what constitutes a rigorous design. Given your answer to the question of order I’m guessing all rigorous means is that you actually take the time to implement the design and not stop at some point and shift over to a WGP analysis (wonder-guess-putter), however, I would still like to know what you actually mean by “rigorous”.

    To your first question concerning two- and three-level factorial designs or any of the family of composite designs, the biggest issues for running a design are the issues that drove you to wanting to run a design in the first place – time, money, effort. What this typically means is taking the time to define the number of experiments you can afford to run and then constructing a design that will meet your constraints.

    From a political standpoint the biggest issue with running an experimental design is convincing the powers that be that this is indeed the best approach to assessing a problem. You need to remember that simple one-variable-at-a-time experimentation is what they were taught in college, it is what got them to the positions they occupy today and, of course, it does work.

    If you are in a production environment where you will be running the experiments in real time you will have to spend a lot of time not only explaining what you want to do but you will also have to work with the people involved in the production process to come up with ways of running your experiments with minimal disruption to the day-to-day operation. The political aspects of design augmentation can be difficult but, as someone who has run literally hundreds of designs in the industrial setting, I can assure you that the issues can be resolved to everyone’s satisfaction.

    #238508

    Robert Butler
    Participant

    You will have to provide a little more detail before anyone can offer much.
    1. What is a rigorous experimental design?
    2. What are lower and higher orders?

    #238438

    Robert Butler
    Participant

    Given two levels for each factor you have the following:
    Factor A, df = (a-1)
    Factor B, df = (b-1)
    Factor C, df = (c-1)
    Factor D, df = (d-1)

    in each case a = b = c = d = 2

    For each interaction
    AxB, df = (a-1)*(b-1)
    AxC, df = (a-1)*(c-1)
    etc.
    AxBxC, df = (a-1)*(b-1)*(c-1)
    etc.
    and
    AxBxCxD = (a-1)*(b-1)*(c-1)*(d-1)
    Error, df = a*b*c*d*(n-1) where n = number of replicates
    Total df = a*b*c*d*n -1
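    As a quick arithmetic check (plain Python, no data involved), the bookkeeping for a full 2^4 with n = 3 replicates works out like this:

    # Degrees-of-freedom bookkeeping for a full 2^4 factorial with n replicates.
    a = b = c = d = 2   # two levels per factor
    n = 3               # replicates per treatment combination

    df_main = 4 * (a - 1)                              # A, B, C, D
    df_2way = 6 * (a - 1) * (b - 1)                    # 6 two-factor interactions
    df_3way = 4 * (a - 1) * (b - 1) * (c - 1)          # 4 three-factor interactions
    df_4way = (a - 1) * (b - 1) * (c - 1) * (d - 1)    # ABCD
    df_error = a * b * c * d * (n - 1)
    df_total = a * b * c * d * n - 1

    print(df_main, df_2way, df_3way, df_4way, df_error, df_total)   # 4 6 4 1 32 47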

    As an aside – 3 full replicates is gross overkill. The whole point of experimental design is minimum effort for maximum information. Unless you have a very unusual process it is unlikely that 3 and 4 way interactions will mean anything to you physically. Under those circumstances just take the df associated with the three and the 4 way interactions and build a model using only mains and two ways – this dumps all of the df for the higher terms into the error and instead of running 48 experiments you will have run 16.

    The other option, assuming you actually have a process where 3 way and 4 way interactions are physically meaningful and the variables under consideration can all be treated as continuous, is to replicate just one or perhaps two of the design points – this will give you df for error and will also allow you to check the higher order interactions. Given that they can be treated as continuous a better bet would be to just add 2 center design points to your 2^4 design. This will give you a run of 18 experiments, an estimate of all mains, 2,3, and 4 way interactions, an estimate of error, and a check on the possible curvilinear behavior.

    #238179

    Robert Butler
    Participant

    1. Control charts do not require normal data – see page 65 Understanding Statistical Process Control 2nd Edition – Wheeler and Chambers
    2. When you have the kind of count data you have it can be treated as continuous data – see page 3 Categorical Data Analysis 2nd Edition – Agresti
    3. I-MR would be the chart of choice – you should check to make sure your data is not autocorrelated. If it is you will have to take this into account when building the control limits – Wheeler and Chambers have the details.

    #237270

    Robert Butler
    Participant

    DOE is a method for gathering data. Regression (any kind of regression) is a way to analyze that data. As a result it is not a matter of using DOE or regression.

    Experimental designs often have ordinal or yes/no variables as part of the X matrix. The main issue with a design is that you need to be able to control the variable levels in the experimental combinations that make up the design. In your case you said you have the following:

    A. $
    B. Name
    C. Social proof (others have said the same thing but here’s what they found…)
    D. Regular script

    Before anyone could offer anything with respect to the possibility of including these variables in a design you will need to explain what they are. You will also need to explain the scheduling rate measure – is this a patient scheduled yes/no or is it something else.

    However, in the interim, if we assume all of these variables are just yes/no responses then you could code them as -1/1 (no/yes). The problem is that you cannot build patients. Rather, you have to take patients as they come into your hospital(?). What this means is that you have no way of randomizing patients to the combinations of these factors and this, in turn, means you have no way of knowing what variables, external to the study, are confounded with the 4 variables of interest, which means you really won’t know if the significant variables in the final model are really those variables or a mix of unknown lurking variables that are confounded with your 4 chosen model terms.

    What you can do is the following: Set up your 8 point design and see if you have enough patients to populate each of the combinations. Given that this is the case, you can go ahead and build a model and, if the response is scheduled yes/no, the method of choice for data analysis would be logistic regression. The block of data you use will guarantee, for that block of material, your 4 variables of interest are sufficiently independent of one another. What this approach WILL NOT AND CANNOT GUARANTEE, is that the 4 variables of interest actually reflect the effect of just those 4 variables and are not an expression of a correlation with a variety of lurking and unknown variables.

    Another thing to remember is that the coefficients for the variables in the final reduced logistic model are log odds – exponentiated, they become odds ratios – which are not the same as coefficients in a regular regression model and cannot be viewed in the same way.
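    A minimal sketch of that kind of fit in Python/statsmodels follows; the factor names and the randomly generated outcomes are placeholders, purely to show the mechanics:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Placeholder data: four yes/no message factors coded -1/1 and a
    # scheduled (1) / not scheduled (0) outcome for each patient.
    rng = np.random.default_rng(7)
    X = pd.DataFrame(rng.choice([-1, 1], size=(200, 4)),
                     columns=["dollar", "name", "social_proof", "regular_script"])
    y = rng.integers(0, 2, size=200)

    model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
    print(model.summary())
    print(np.exp(model.params))   # exponentiated coefficients = odds ratios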

    If you don’t have enough patients to populate the combinations of an 8 point design you will have to take the block of data you do have and run regression diagnostics on the data to see just how many of the 4 variables of interest exhibit enough independence from one another to permit their inclusion in a multivariable model. All of the caveats listed above will still apply.

    What all of this will buy you is an assessment of the importance of the 4 variables to scheduling with the caveat that you cannot assume the described relationships mirror the relation between the chosen X variables and the response to the exclusion of all other possible X variables that could also impact the response. While not as definitive as a real DOE this approach will allow you to do more than build a series of univariable models for each X and the measured response.

    This entire process is not trivial and if, as I suspect, you are working in a hospital environment I would strongly recommend you talk this over with a biostatistician who has an understanding of experimental design and the diagnostic methods of Variance Inflation Factors and eigenvalue/condition indices.

    #237254

    Robert Butler
    Participant

    I would like to offer a few comments and make a few recommendations with respect to your initial post.

    Statistical methods, machine learning, advanced methods of data analytics and statistics are all tools one can/does use in the analysis phase of six sigma. In this light, the observation “Of course, these techniques would be used after the current standard methodologies.” does not make sense.

    At first I thought your statement in your second question “… experimental design and analysis to optimize waste and process variance to maximize business revenue / business expansion instead of minimizing defect rates?” meant you were talking about waste and variance reduction in a process, but then later you said, “…a business may have the desire to increase some of its process variation to improve overall revenue and shareholder support.” Assuming this was not a typo, the only business of which I’m aware that would view increased process variation as a good thing is the field of cryptography, where increased variability means increased difficulty in code breaking – in every other instance the focus is on variance reduction.

    The overall impression I have of your initial post is you want to gain some basic understanding of analytical methods and their underpinnings. If this is the case then you might want to consider the following:

    1. With respect to connections between machine learning and statistics I think your best bet would be a Google search. I plugged those terms into their engine and a number of peer reviewed papers came up in the first pass. I can’t vouch for the quality/relevance to your question but it would be a start.

    2. For a basic understanding of statistical principles I’d recommend reading Gonick and Smith’s book The Cartoon Guide to Statistics.

    3. For a basic understanding of regression methods (I’m citing the 2nd edition) try Draper and Smith Applied Regression Analysis. For a first pass, read and understand Chapter 1, in particular the very well written description of just what linear regression is and the basic assumptions of the method. You could skip Chapter 2 because that is just Chapter 1 in matrix notation form and then read and memorize Chapter 3 – residual analysis. You might want to augment Chapter 3 with Shayle Searle’s August 1988 paper in The American Statistician concerning data that results in banded plots of residuals. You could follow this with a reading of Chatterjee and Price’s book Regression Analysis by Example.

    4. Experimental design is experimental design and it can be used to address changes in the mean, changes in the variance, or both. To the best of my knowledge there are no special requirements/restriction if one wishes to employ experimental design methods in conjunction with machine learning or data analytics – indeed I have built experimental designs to explicitly guide machine learning efforts.

    A good overview of design methods would be Schmidt and Launsby’s book Understanding Industrial Designed Experiments. It covers all of the types and provides a good discussion of when and where you might want to use the various forms. It also contains an excellent discussion of the Box-Meyers method for using the data from an experimental design to identify sources of process variation.

    5. Data quality: A book that is worth reading in its entirety, particularly in this age of big data, is Belsley, Kuh and Welsch’s book Regression Diagnostics – with the subtitle “Identifying Influential Data and Sources of Collinearity.” The book is not an easy read and you will need some knowledge of matrix algebra/symbol manipulation in order to understand what it is saying.

    The issues it discusses and evaluates are particularly important given the somewhat prevalent attitude which can be summarized as; if I have a big data set I have everything and I don’t need to concern myself with issues that are related to “small” data sets. The attitude has no foundation in anything save blissful ignorance – big data has the same problems as small data and, to keep yourself from going wrong with great assurance, you need to keep this in mind and know what questions to ask. A good recent book which discusses some of these issues in a non-technical manner is Weapons of Math Destruction by Cathy O’Neil. She is a mathematician who has worked with big data and I think her book is a balanced look at the promise and pitfalls of big data analysis.

    #237252

    Robert Butler
    Participant

    @cseider I suppose it could be but if it is I would nominate the question for the scatological award for worst posed homework question of the year.

    #237235

    Robert Butler
    Participant

    As asked your question has no answer because you have not provided any information about your production line. For example, if the process target is 1 ppm and my in control process 3 standard deviation limits around that target are plus/minus .01 ppm the answer would be – not a thing.

    #237199

    Robert Butler
    Participant

    The standard calculation for Cpk requires normality in the data. Your data is naturally non-normal so what you need to do is use the methods for calculating Cpk for non-normal data. Chapter 8 of Bothe’s book Measuring Process Capability has the details of how to do this.

    The basic method consists of plotting the data on normal probability paper and identifying the .135 and 99.865 percentile values (Z = +-3). The difference between these two values is the span for producing the middle 99.73% of the process output. This is the equivalent 6 sigma spread. Use this value for computing the equivalent process capability.
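    A minimal sketch of that percentile calculation in Python (the sample data and specification limits are placeholders; with real data you may prefer fitted-distribution percentiles over raw empirical ones):

    import numpy as np

    def equivalent_capability(data, lsl, usl):
        """Percentile-based capability for non-normal data (Bothe-style):
        the 0.135 and 99.865 percentiles define the equivalent 6 sigma spread,
        and the median is used as the process center."""
        data = np.asarray(data, dtype=float)
        p_low, p_med, p_high = np.percentile(data, [0.135, 50.0, 99.865])
        pp_equiv = (usl - lsl) / (p_high - p_low)
        ppk_equiv = min((usl - p_med) / (p_high - p_med),
                        (p_med - lsl) / (p_med - p_low))
        return pp_equiv, ppk_equiv

    # Placeholder non-normal data and specification limits.
    sample = np.random.default_rng(5).lognormal(mean=1.0, sigma=0.3, size=500)
    print(equivalent_capability(sample, lsl=1.0, usl=7.0))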

    It is my understanding Minitab has included this method in its software. I’m not familiar with Minitab but I’m sure if you check the “help” section of the program you will find the necessary commands. If you can’t find it there you could contact Minitab help or, since this topic has been discussed time and again on this forum, you could use the search engine to locate older posts on this subject. As I recall, one or more of the older posts were made by people familiar with Minitab and they provided instructions on how to find this method in the Minitab package.

    #237099

    Robert Butler
    Participant

    As I said in my first post – the problem is that of sample independence. If your batches consist of something like widgets stamped out on a production line then you would need to first take as many samples as you can at a uniform time interval for a single batch and run a time series analysis to determine the minimum separation in time between samples needed to declare that the samples are truly independent of one another. Then take samples at that time interval, or at some time interval greater than the minimum, look at the number of samples per batch this approach will provide, and use those to compute your Cpk.
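    A minimal sketch of that kind of check in Python; the series is simulated, and with your own within-batch measurements the idea is the same: look at where the autocorrelation dies out.

    import numpy as np

    def lag_autocorrelation(x, max_lag=10):
        """Sample autocorrelation at lags 1..max_lag for equally spaced measurements."""
        x = np.asarray(x, dtype=float) - np.mean(x)
        denom = np.dot(x, x)
        return [np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)]

    # Placeholder: measurements taken at a uniform time interval within one batch.
    series = np.random.default_rng(11).normal(size=100).cumsum() * 0.1 + 10.0

    acf = lag_autocorrelation(series)
    cutoff = 2 / np.sqrt(len(series))   # rough independence band of +/- 2/sqrt(n)
    print([round(r, 2) for r in acf], "cutoff ~", round(cutoff, 2))
    # The first lag at which |r| stays inside the band suggests the minimum sample spacing.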

    If you can’t do this then you will have to take the time to understand the issues associated with control charting and calculation of Cpk in the presence of lack of independence of measurement (auto-correlation). As Wheeler and Chambers note in their book Understanding Statistical Process Control 2nd edition – there is a myth (see pp.80-81) that the data needs to be independent before it can be control charted. As they note- you can use auto-correlated data but there are issues and things you need to keep in mind.

    If the batch is something like a large reactor vessel then the multiple samples per batch will not be independent and auto-correlation is guaranteed.

    As for a Cpk with auto-correlated data you should borrow a copy of Measuring Process Capability by Bothe through inter-library loan and read pages 781-792 – the chapter subheading is 13.3 Processes With Auto-Correlated Measurements. In short, if the batch samples are not independent you will need to do some reading to gain an understanding of the issues before you try to build and interpret a control chart or calculate a Cpk with this kind of data. For both issues, control chart and Cpk, I would recommend the section in the Bothe book as a good starting point.

    0
    #237097

    Robert Butler
    Participant

    Ok, sounds good.

    0
    #237096

    Robert Butler
    Participant

    So, is it a total of 30 samples from 6 batches for a total of 5 samples per batch or is it a case of 30 samples from each of 6 batches for a total of 180 samples?

    0
    #236983

    Robert Butler
    Participant

    To your points
    – It’s not that I don’t recommend trying more than three; rather, it’s that the odds of finding raw material lots (a minimum of 6 if we go with the D-optimal design I mentioned earlier) with the right levels for each of the 6 combinations are poor. If you have prior data on lots of raw materials it would still be worth examining the data to see if there were lots whose overall material properties would have met those six combinations. You might get lucky and you might find your supply chain is such that you could actually run an analysis including the five supplier variables.

    – The biggest problem with running a 2^(5-1) design on a lot of material from each of the three suppliers is that there is no guarantee the lot is representative of the lot-to-lot variation one might expect from each supplier. Based on some rather bitter past experience, I think you would find the following:

    1. The final equations for each of the suppliers will differ with respect to significant in-house terms and/or significant differences in corresponding coefficients.
    2. When you attempt to test the developed equation for a given supplier using a new lot of material from that supplier you will find the predicted correct settings for optimizing the process (based on the old lot) do not apply and do not provide guidance with respect to optimizing.

    If you choose to try running the 2^(5-1) with a lot of material from each supplier try to do the following:

    1. Across suppliers, try to pick the three lots so that two of the supplier variables of interest match the small matrix we talked about previously and let the other three fall where they may.

    2. Build the three models for each of the suppliers. Make sure you build each model using scaled levels (-1 to 1) for the in-house variables.
    a. Take the three models and examine them for term similarity and difference.
    b. Since the models are based on scaled parameters you can compare term coefficients across models directly. This will allow you to easily see not only term similarities and differences but also see what happens to the magnitude of the coefficients for similar terms in each of the models.

    3. Next pool the data from all of the experimental runs. Re-scale all of the in-house variables to a -1 to 1 range. If you were able to find three lots that allowed you to check the two supplier terms add those terms (again scaled to a -1 to 1 range) to the list of parameters you can consider and re-run the model.
    a. Take this model and compare it to the results of the three separate models again looking for similarities with respect to term inclusion and coefficient size.

    4. If you can’t find lots across suppliers that allow you to check some of the supplier variables then take the data from the three runs and assign an additional dummy variable to the data from each fractional factorial run (-1,0,1 will suffice).
    a. Build a model based on all of the data and including the dummy variable which will represent material lots/supplier differences.
    b. Examine this equation in the same manner as above – looking at similarities/differences in model terms between this overall equation and the three separate ones.
    c. If there seems to be “reasonable” agreement between this model and the three separate models (any term in the “big” model is present in at least one of the three separate models and the coefficients of those terms are of the same magnitude and direction) and IF the coefficient for the lots is such that any unit change in that variable can easily be overcome with combinations of changes in the other terms in the model then it would be worth testing the “big” model with a new lot from any one of the suppliers to see how well it can predict the outcome.
    d. If, on the other hand the “big” equation has a “number” of terms (sorry, I can’t be more precise than this) not found in the three separate equations and/or there are large differences in coefficient size or direction this would argue there is more to the issue of suppliers and lots than you expected. It would still be worth checking the equation based on all of the data just to make sure.
    e. The worst case is one in which the final equation based on all of the data has just a few in-house process terms with small coefficients and a term for the lot/supplier variable with a large coefficient whose size is such that no combination of the other variables in the final model could overcome lot changes. In this case about all you can say is that lot-to-lot variation is impacting your process in ways that cannot be overcome with the current group of in-house variables you are using for process control.
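
    As a rough template for steps 2 through 4 above, the sketch below fits a separate model for each supplier and then the pooled model with a lot dummy, so the scaled coefficients can be laid side by side. Everything here is simulated and the ±1 settings are random stand-ins rather than a real 2^(5-1) layout, so treat it only as an outline of the bookkeeping:

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    frames = []
    for lot_code in (-1, 0, 1):                       # dummy for lot/supplier
        X = rng.choice([-1.0, 1.0], size=(16, 5))     # stand-in for the scaled in-house settings
        y = 10 + 2*X[:, 0] - 1.5*X[:, 1] + 3*lot_code + rng.normal(0, 0.5, 16)
        d = pd.DataFrame(X, columns=[f"x{i+1}" for i in range(5)])
        d["lot"] = lot_code
        d["y"] = y
        frames.append(d)
    pooled = pd.concat(frames, ignore_index=True)

    # separate model per supplier - scaled factors make the coefficients directly comparable
    for lot_code, grp in pooled.groupby("lot"):
        fit = sm.OLS(grp["y"], sm.add_constant(grp[[f"x{i+1}" for i in range(5)]])).fit()
        print(f"lot {lot_code:+d}:", fit.params.round(2).to_dict())

    # the "big" model on all of the data with the lot dummy included
    big = sm.OLS(pooled["y"],
                 sm.add_constant(pooled[[f"x{i+1}" for i in range(5)] + ["lot"]])).fit()
    print(big.params.round(2))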

    0
    #236786

    Robert Butler
    Participant

    I’m a statistician – discussions like this are never a bother. :-) As to your plan – there is a problem with the way you have laid out the supplier information – specifically it does not match the example you provided earlier and which I used as a starting point for my post yesterday.

    In the earlier post you gave the following:

    supplier 1 670 Rm and 19% elongation
    supplier 2 640 Rm and 15% elongation
    supplier 3 650 Rm and 19% elongation

    …and I translated the above settings into the following design matrix for the two variables RM and Elongation
    Source        RM    Elongation    Combination
    Supplier 1    +1    +1            (1, 1)
    Supplier 2    -1    -1            (-1, -1)
    Supplier 3    -1    +1            (-1, 1)

    What you have done now is two things

    1. You increased the number of supplier variables from 2 to 5.
    2. You are trying to use just one lot of material from each supplier to fill out the supplier part of the design matrix.

    If you choose to go with 5 supplier variables then, at a minimum and assuming two levels per variable (I just checked with a D-optimal run) you will need to have material from your various suppliers which meets the following conditions WITHIN a single lot of material from a supplier:
    Code: (Level Variable 1, Level Variable 2, Level Variable 3, Level Variable 4, Level Variable 5)
    Combination 1 (1,1,1,1,1)
    Combination 2 (-1, 1,-1,1,-1)
    Combination 3 (1, -1, 1, -1, -1)
    Combination 4 (-1, -1, 1, 1, 1)
    Combination 5 (-1,1,1,-1,-1)
    Combination 6 (1,1,-1,-1,1)
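
    A quick way to convince yourself those six combinations really do support estimates of all five main effects is to check the rank (or the determinant of X'X) of the model matrix:

    import numpy as np

    combos = np.array([
        [ 1,  1,  1,  1,  1],
        [-1,  1, -1,  1, -1],
        [ 1, -1,  1, -1, -1],
        [-1, -1,  1,  1,  1],
        [-1,  1,  1, -1, -1],
        [ 1,  1, -1, -1,  1],
    ], dtype=float)

    X = np.column_stack([np.ones(len(combos)), combos])   # intercept + 5 main effects
    print("rank:", np.linalg.matrix_rank(X))              # 6, so all main effects are estimable
    print("det(X'X):", round(np.linalg.det(X.T @ X), 1))  # non-zero information matrix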

    As I mentioned in the last post – since you have to take what is sent you won’t hit the same low and the same high values for each variable across the combinations – the issue is to make sure that, within a given variable, the range of values you are calling “low” does not cross over into the range of values you call “high”.

    It also now appears you do not have a physical measure for roughness, homogeneity, and conductivity so you now have to treat the measures as a two level rating. This is fine but do not treat the ratings as categorical when you build the design. The ratings do have a rank ordering and, as long as your in house expert is confident he/she can discriminate between good and bad, you can treat those variables in the same way you would treat the other variables when building your design.

    The really big issue would be identifying lots of raw material from the various suppliers whose spec sheets, in combination with your in house ratings, could populate the supplier design matrix I listed above. In my experience it is almost impossible to build a matrix for more than 3 variables at two levels given this method of variable level selection and control (actually lack of control). I’m not saying you can’t do this but you should review your prior spec sheets and your prior in house ratings for past lots of materials across all suppliers just to see if, in the historical data, you have a subset of measures that would have met the requirements of the matrix.

    As for telling the difference between suppliers, my interpretation of your problem was that you wanted to be able to optimize the process regardless of supplier. If you are able to populate the supplier matrix with material across suppliers then the resulting model would describe the behavior of your process with respect to the measured variables only, not supplier identity. Thus, when you received a new lot of material from any supplier you could plug the corresponding values into your predictive equation and then ask the equation to identify the in house process parameter levels that would result in optimum product.

    0
    #236778

    Robert Butler
    Participant

    Addendum to the last post: One thing to keep in mind when your lows and highs have ranges of values is that the “fuzziness” of the -1 and 1 settings can be an issue. If the highest low value and the lowest high value for a given variable are too close together you may find you will lose information with respect to some variable interactions. In your case this would most likely apply to any attempt to check the interactions of the supplier variables with one another or with the in house variables.

    The way you check to see if this is a problem requires scaling the actual low and high values to their actual negative and positive settings, rather than pretending that any low setting is automatically a -1 and any high setting is automatically a +1. The way you do this is to take the max and the min values for a given variable and scale the entire range of that variable from -1 to 1.

    To do this compute A = (Max value + min value)/2 and B = (Max value – min value)/2 and scale each setting according to the formula
    Scaled X = (Actual X – A)/B

    Once you have the actual matrix of X’s expressed as levels ranging between -1 and 1, use this matrix to compute the scaled values for whatever terms you expect to be able to check with the design you have built and then check the resulting matrix to see if you get confounding.

    This check is performed by assigning a simple count variable to each experiment in the design (1 for the first experiment, 2 for the second experiment, etc.) Regress the counts against the model terms as expressed in scaled form and look at the VIF’s for each of the model terms. Those with a VIF > 10 are too highly confounded with other terms and cannot be estimated.

    An even better way to do this is to look at the eigenvalues/condition indices of each of the model terms but to the best of my knowledge this capability is not present in most computer packages.
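
    Here is a minimal sketch of the scaling and the VIF check, using made-up as-run settings for two variables plus their interaction. Note the VIFs depend only on the X matrix, so the sketch computes them directly; regressing a placeholder response such as the run-order counts against the terms, as described above, gives the same VIF values.

    import numpy as np
    import pandas as pd

    # hypothetical as-run settings - the lows and highs were not hit exactly
    raw = pd.DataFrame({"rm":    [668, 671, 638, 642, 652, 649],
                        "elong": [19.2, 18.8, 15.1, 14.9, 18.9, 19.1]})

    def scale_pm1(col):
        a = (col.max() + col.min()) / 2.0
        b = (col.max() - col.min()) / 2.0
        return (col - a) / b                 # scaled X = (actual X - A)/B

    scaled = raw.apply(scale_pm1)
    scaled["rm_x_elong"] = scaled["rm"] * scaled["elong"]

    def vif(df, term):
        # regress the term on the remaining terms; VIF = 1/(1 - R^2)
        y = df[term].to_numpy()
        X = np.column_stack([np.ones(len(df)), df.drop(columns=term).to_numpy()])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r2 = 1 - (y - X @ beta).var() / y.var()
        return 1.0 / (1.0 - r2) if r2 < 1 else float("inf")

    for term in scaled.columns:
        print(f"{term:12s} VIF = {vif(scaled, term):6.2f}")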

    0
    #236768

    Robert Butler
    Participant

    Ok, let’s make that assumption. You have the following:

    supplier 1 670 Rm and 19% elongation
    supplier 2 640 Rm and 15% elongation
    supplier 3 650 Rm and 19% elongation

    Therefore what you have are the following combinations for the two factors Rm and elongation
    Supplier 1 +1 and +1
    Supplier 2 -1 and -1
    Supplier 3 -1 and +1

    The key here is to remember that it is rare, even when you have the capability to actually control the settings of a given variable, to be able to get back to the exact same setting for the “low” or the “high” value. As long as the range of values for some “low” setting does not overlap the range for some “high” setting you can view the values in the manner indicated.

    So what you now have to do is build a matrix of experiments that the D-optimal design can use for construction. For everything at two levels this would be the full 2**7 matrix (I’m saying 2**7 because, using the above example, you only have two supplier variables in addition to your 5 process variables).

    If we assume all you have are the three choices for the suppliers listed above, you would instruct the D-optimal program to only consider those experimental combinations that have those three combinations for the variables Rm and elongation, and have it build a design that will allow an examination of all 7 variables based on the restricted subset. Once you have the results of the design you would run the usual regression analysis and use backward elimination and forward selection with replacement (stepwise) regression methods to construct your reduced model (you would want to run the analysis both ways because D-Optimal designs are rarely perfectly orthogonal).
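
    A minimal sketch of the candidate-set restriction (the actual D-optimal selection from the restricted set would then be handled by whatever DOE package you have available):

    from itertools import product
    import numpy as np

    # full 2**7 candidate set: 5 in-house factors plus Rm and elongation, all coded -1/+1
    full = np.array(list(product([-1, 1], repeat=7)))

    # only three (Rm, elongation) combinations actually exist across the suppliers
    allowed = {(1, 1), (-1, -1), (-1, 1)}
    candidates = np.array([row for row in full
                           if tuple(int(v) for v in row[5:7]) in allowed])

    print(len(full), "unrestricted points ->", len(candidates), "feasible candidate points")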

    Let’s assume the residual analysis indicates all is well with the final model and let’s assume the final model has both supplier terms and process terms. Given that model you can fix the levels of the supplier values at whatever you choose and then run optimization studies on the process variables. This analysis will tell you what you can expect given that you have no control over supplier variables.

    In light of the statement you made in your second post – “We are pretty sure that higher tensile strength and smaller roughness would enhance the process” I think you should be prepared for the possibility that your ability to influence product quality using only in house variables will be limited. However, if this turns out to be the case, because of your efforts you will have real proof that approaching the problem just by looking at in house variables is not enough.

    0
    #236736

    Robert Butler
    Participant

    Since each of the supplier lots are allowed to vary within a range of (separate?) values for tensile strength, elongation, and roughness and since you only want to run an analysis where you are controlling your in house process parameters the best thing to do would be to review your collection of spec sheets from prior lots of raw material and summarize the reported property values. This will give you a picture of the distributions of tensile strength, elongation, and roughness you have accepted in the past. Once you know these distributions you should look over the spec sheets of those material lots which you do have on hand to see where their properties fall relative to prior history.

    If, collectively, the current raw material lots have levels of the three properties which span say 95% of the range of each property of interest you could take that data, ignore the lot source, and attach a complete design matrix for the in house properties to each of the lots and use this matrix of design points as a basis for identifying runs in a D-Optimal design. (If the current supply of raw material does not meet this requirement you will have to put aside some portion of the current lots and wait for your suppliers to send you material that, when combined with your saved samples, does cover an acceptable range of the three variables of interest).

    What you will have will be a design that will allow you to check for the impact of your in house variables as well as the impact of the material property variables. What you will get from the analysis will be a predictive equation involving both the in house variables as well as the material property variables. The point of the predictive equation is NOT to try to control material properties, rather it is to give you an understanding of how well your in house variables can be expected to control your process given that you are NOT controlling material lot properties.

    What you would do with this predictive equation is take the prior measurements from the old spec sheets for material you have already used, set your target and tolerance for the measured response, and then use the equation to identify settings for each of your in house variables that would result in a within-tolerance predicted response. The matrix of values for the in house variables used in the equation would be based on the range of settings you know you can run for each in house variable.

    The matrix of predicted responses will tell you what you need to know with respect to your ability to control product properties given that lot properties will vary and how much control you can or cannot expect given the properties of the incoming raw material.

    Based on your description I don’t think a basic design run on a lot of material from each of the three suppliers would be of much value. If the available lots from these three suppliers just happen to have about the same values with respect to tensile strength, elongation, and roughness then all you will have done is essentially replicate a saturated design 3 times.

    As for treating the suppliers as a noise variable – you can’t. If you are going to try to run a Taguchi design what you need to remember is that a noise variable is a variable you would rather not have to control when running your process. However, when running the design you HAVE to control the variable you have designated as a noise variable because you have to be able to set it to high and low levels in order to meet the design requirements. Based on what you have said, you do not have this level of control.

    0
    #236719

    Robert Butler
    Participant

    I think we need to back up a bit. If I understand your most recent post correctly you get a spec sheet from a supplier which provides tensile strength, elongation, and roughness measurements on each shipped lot. If this is the case then this would argue that you have zero control with respect to independently varying these three measures – for any given lot they are what they are.

    If this is the case and if you want to check your hypothesis that higher tensile strength and smaller roughness would improve the process then you will have to lot select for four different combinations of tensile strength and roughness. Assuming you can do this, what you will need to do is use the methods of computer generated designs (D-Optimal is one that is readily available) to build a design to test this. Of course, if you want to toss elongation into the mix then you would have to lot select on this variable as well to come up with the four combinations of the three variables (-1, -1, 1), (1,-1,-1), (-1,1,-1) and (1,1,1). Given you can do this you would build a full 2**8 matrix, restrict the combinations for tensile strength, elongation, and roughness to the 4 combinations, and use D-optimal to find the minimum number of experiments necessary to test all 8 items.

    The above is, of course, based on a lot of assumptions. We are assuming the spec sheet really reflects measurements on the provided lot of material and isn’t some generic Good Housekeeping Seal of Approval. We are assuming you have the ability to do lot selection/retention. We are assuming that, regardless of what the manufacturer is sending you, it is possible for tensile strength, elongation and roughness to vary independently of one another.

    If you could provide more detail on the above assumptions we’ll see if we can come up with some suggested courses of action.

    0
    #236698

    Robert Butler
    Participant

    Since you are concerned with the ranges of incoming material properties in addition to the things you vary in your process a better approach would be to run a saturated design on the 8 variables

    Tensile Strength: from 600 to 800 N/mm^2
    Elongation: from 20% to 10%
    Roughness: from 2 to 0.5 µm
    Energy Flow (Ampere)
    Upsetting speed (mm/secs)
    Clamping pressure (Bars)
    Anvil speed (mm/secs)
    Upsetting delay (secs)

    You would select random material from all three suppliers to fill in the low and high values for tensile strength, elongation, and roughness and identify your high and low settings for your in house process parameters in the usual way. If you have overlap with respect to supplier raw material quality you could run a couple of replicate points using alternate material. You could run a main effects check in 18 experiments if you just wanted to run a semi-saturated 2 level factorial design or, if you have the computing firepower, you could plug your data matrix into a D-optimal design generator and probably identify a design of about 12 runs.

    0
    #236697

    Robert Butler
    Participant

    You will have to provide more detail before anyone can offer much in the way of advice.
    Questions:
    1. What kinds of questions are on the survey? Are they yes/no, or are they Likert scale (ratings 1-5, 1-10), or are they multiple choice, or are they open-ended please-add-a-comment questions, or is the questionnaire a combination of all of the above?
    2. Regardless of the questions – how have you vetted the questions to make sure you are not biasing the response?
    3. What do you plan to do with the results?
    4. If you have please-add-a-comment type questions – how are you going to approach their analysis?

    If you can answer these questions perhaps I or some other individual may be able to offer suggestions.

    0
    #236678

    Robert Butler
    Participant

    In order to get an answer to your question you will have to read the detailed description of the derivation of the requirements for that particular USP protocol. Somewhere in the description will be statements concerning minimum detectable differences and power along with other things and it will be these assumptions/requirements that result in the minimum requirement of 10 samples.

    0
    #236630

    Robert Butler
    Participant

    If your process is expected to exhibit large variation then you should ask yourself why it is that you are seeing variation that is much lower than you thought it would be. Some things you might want to check:

    1. Where did you get the idea that the process variation is large? Does someone actually have prior data that exhibits large variation or is this just someone’s impression based on who knows what? Also, if prior data does exist – how do you know it is representative of the process?

    2. For the XbarR chart – are the samples actually independent? If you are taking repeated samples from the same lot of production material then there is a very good chance that the samples are repeated measures and not independent. Estimates of variation based on repeated measures will be much less than estimates based on independent samples.

    If your process really is as tight as your initial calculations would indicate then you are in the very fortunate situation of being able to meet customer requirements 100% of the time.

    0
    #236396

    Robert Butler
    Participant

    I guess I don’t understand why you are bothering with Z scores. To rephrase your question – I have an estimated average travel time of 39 minutes to get to work with a sample standard deviation of 7 minutes.

    The question that is being asked is: what kind of reduction in variation of individual travel time must an employee make in order to meet (the absolutely absurd) management target of a maximum of 40 minutes travel time.

    The basic equation to consider is: management time = mean time ± (2*std/sqrt(n))

    Therefore, for one person n = 1. If all employees have a time of 39 minutes then to have 40 minutes travel time the standard deviation would have to drop from 7 minutes to 30 seconds.

    Now, 39 is just the mean of the population. The approximate 95% range of individual travel times – assuming symmetry with respect to the distribution of arrival time – is going to be 39 minutes ± 2*7. Thus, at the extremes you have someone arriving at work after 25 minutes of travel and someone arriving after 53 minutes of travel.

    The poor soul who needs 53 minutes to get to work can reduce the variation in his travel time to 0 and he will still need a time machine at the entrance to the workplace to take him back in time 13 minutes in order to meet the target set by management.

    As for the gal who needs only 25 minutes to get to work, she can stop off for a coffee, do a little window shopping, and still show up at the gate with an elapsed travel time of less than 40 minutes.

    In short, the management target is based on wishful thinking and has no connection to what is happening in the real world.
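
    The arithmetic behind all of this is easy to check:

    mean_t, std_t, target = 39.0, 7.0, 40.0

    # required standard deviation for one employee (n = 1) already at 39 minutes
    required_std = (target - mean_t) / 2.0
    print(f"required std: {required_std} min ({required_std * 60:.0f} seconds)")

    # approximate 95% range of individual travel times
    low, high = mean_t - 2 * std_t, mean_t + 2 * std_t
    print(f"roughly 95% of travel times fall between {low:.0f} and {high:.0f} minutes")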

    1
    #236391

    Robert Butler
    Participant

    That has to be one of the worst homework problem statements I think I’ve ever seen. As written there are all kinds of answers and solutions to the problem.

    1. If the reported standard deviation is the standard deviation of the mean and not just the sample standard deviation then there isn’t much of anything you can do with the statement concerning personal vehicles.
    2. If the standard deviations are the sample standard deviations then we are assuming that somehow employees taking public transportation have some kind of control over how that mode of transportation functions.

    If we assume the standard deviations are sample standard deviations and we assume the employees have total control of traffic conditions and how public transportation functions then I guess you have the makings of a homework problem one could address.

    Questions for you:

    1. What is the basic expression for a confidence limit around a mean?

    2. If we pretend the standard deviations are sample standard deviations and we pretend the average represents the central tendency of a reasonably symmetric time distribution then how would you determine the range of actual arrival times (95%) around the mean?

    3. As the question is written, what is the sample size we need to use to discuss the results of an individual employee?

    4. Given that you have identified the 95% limits for the mean and that you have identified the sample size associated with an individual you should be able to calculate the ranges of variation associated with various employees. I will give you one hint – some of the employees will have to employ time travel in order to meet the requirement.

    2
    #236293

    Robert Butler
    Participant

    If by a batch you mean a single run such as contents of one reactor vessel then the answer is neither because those samples are repeated measures – not independent measures. If you try to build any control chart with that kind of data the control limits will be too narrow because the sample-to-sample variability within a single batch is going to be less (and possibly much less) than batch-to-batch variability.

    As for the second question concerning 30 results from 30 batches – the question you need to answer is this: Is there any obvious subgrouping of the measures? If there isn’t then forcing what amounts to single measures into some artificial grouping is of little value. In all of the cases I ever dealt with where the output was a batch of material I used IMR charts and never had a problem.

    0
    #236212

    Robert Butler
    Participant

    Which one is better depends on many things. If it is preliminary and you have no prior information concerning a possible curvilinear response over the range of variables you are examining then it comes down to a matter of time/money/effort. If we are discussing the example of your other post – 2 variables, then you would be running 6 vs 10 experiments – 2**2 + two center points vs 3**2 and a single replicate of one of the design points.

    One compromise is to run the 2**2 plus the two center points and, if the analysis indicates curvilinear behavior, augment the existing run of experiments with a couple of additional design points to test the curvature with respect to the two variables of interest. To do this you would need to use one of the computer generated design methods such as D-Optimal. You would force in the existing design and allow the program to choose additional runs from all of the possible design points in a 3**2 design.

    0
    #236185

    Robert Butler
    Participant

    Addendum: With respect to the residual plot. I left out a piece – You want to look at the overall pattern of the residuals – not just where the center points fall. If the model is adequate you should see a random distribution of data points above and below the 0 residual line. As part of that assessment you will want to see how the center points behave.

    0
    #236184

    Robert Butler
    Participant

    QUESTION 1: is this already okay?

    Yes, except as written your point #2 sounds like you are just going to look at a model of X1 and X2 and not bother with including the interaction. The full model would be response = fn(X1, X2, X1*X2). What you want to do is run backward elimination on the full model to see what terms remain in the reduced model (the model containing only significant terms).

    QUESTION 2: Do i still need to validate my model using RMSE or R^2 method?

    I’m not sure what you are suggesting here. You test for model adequacy by looking at a plot of the residuals vs. predicted. If the residuals of the replicates on the center points are split by the 0 line then it would strongly indicate that, over the region you have examined, the relationship between the Y and the X’s is linear (straight line). If the two center point residuals plot both above or both below the 0 line then it would suggest there is a curvilinear aspect, which would mean you should consider augmenting what you have done with a couple additional runs to identify the variable(s) driving the curvature.
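
    A minimal sketch of that check, using made-up numbers for a 2^2 design plus two replicated center points (these particular numbers are rigged so the center-point residuals land on the same side of zero, i.e. the curvature case):

    import numpy as np
    import statsmodels.api as sm

    x1 = np.array([-1, 1, -1, 1, 0, 0], dtype=float)
    x2 = np.array([-1, -1, 1, 1, 0, 0], dtype=float)
    y  = np.array([8.2, 11.9, 9.1, 13.0, 12.4, 12.6])   # made-up responses

    X = sm.add_constant(np.column_stack([x1, x2, x1 * x2]))
    fit = sm.OLS(y, X).fit()

    print("coefficients:", fit.params.round(3))
    print("center-point residuals:", fit.resid[-2:].round(3))
    # both center-point residuals come out positive here, which is the hint of
    # curvature; if they straddled zero a straight-line model would look adequate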

    As for model validation – given the residual analysis says everything is OK then you will need to use the regression equation to identify settings of X1 and X2 that will predict the best Y and go out and run that experimental combination and see what you see. If the resulting Y is within the prediction limits of the model then you will have your model validation. If it isn’t then you will have some more work to do.

    As for using RMSE or R^2 to assess the model – don’t. R^2 is an easily fooled statistic and RMSE by itself doesn’t tell you much either.

    There are any number of posts, blogs, even articles that tell you that you must have an R^2 of X amount before the model is of any use. The often cited value for X is 70% or greater. By itself this statement is of no value. If your process is naturally noisy then the odds of your being able to build a model that explains 70% of the variation are very low. Rather than waste time chasing some level of R^2, just test the equation using the method mentioned above. If the results are as indicated above then you have a model that will address your needs with respect to predicting the behavior of your process over the ranges of the X1 and X2 you used.

    As a counter to the R^2 must be such-and-such before the model is useful claim I’ll offer the following: A number of years ago the company I worked for made additives which were used in the hand lotion industry. One of our customers asked if we could run a series of experiments that would give them some level of predictive capability with respect to outcomes. We signed a non-disclosure agreement, they paid for the research and, after a lot of discussion, I built a design that I thought would address their concerns.

    Hand lotions are evaluated by a team of people who have been trained on control compounds to assess various characteristics of lotion. The ratings for the various measures are ordinal and, depending on the characteristic, the rating scales can be either 1-5, 1-7 or 1-10. There were about 10 different characteristics of interest. After they had built the compounds corresponding to each of the experimental runs and got the responses from their evaluation teams, I built separate models for each of the characteristics and then used all of the models together to identify the best trade offs with respect to product properties. None of the models had an R^2 greater than 30% and, because of the ordinal nature of the responses, the prediction errors were rounded to the nearest .5.

    We generated a series of predictions of product properties based on the regression models and the manufacturer went out and built the formulations that matched the predicted combinations. When they gave the confirming runs to their panels to evaluate, every one of the responses was well within the prediction limits associated with each predicted response. The company found the models to be extremely useful and they used them to guide research into ways to further improve their product.

    0
    #236158

    Robert Butler
    Participant

    The issue is this – the basic philosophy of experimental design is if you are going to see a difference in response when you change conditions your best chance of seeing that difference is by comparing the results for the extremes of the variables of interest. This is the point of two level factorial designs. Thus, if you take the extremes of the settings of your two variables and run the 4 point experimental design and replicate just one of those experiments (total 5 experiments) you can build a single model which will include your main effects and the interaction of the two terms.

    You could custom build a 4 level factorial design but the big question is why would you want to do this? A 4 level design means you believe you have responses that would map to a cubic polynomial. Unless you happen to have, within the design space, something like a phase change I can’t think of any measured response of any kind of variable that would plot as a cubic function (and if you do have a phase change then you really need to go back and revisit a number of process issues before considering a design).

    If you believe you have a response that is curvilinear (i.e. not a simple straight line) then you could run a three level design for the two variables of interest. This would give you a total of 9 experiments and would allow you to look at all main effects, all squared terms, and the interaction of the two variables. If the cost per experiment is low and there aren’t any issues with time constraints you could opt for a 9 point run.

    If time/money/effort is an issue then a better approach would be to run the two level factorial design along with a center point (you haven’t mentioned the kinds of variables you are considering but we are assuming they are continuous in some fashion – either interval or ordinal) and then run a replicate of that center point. This would allow you to build a model with the main effects and the interaction and it would allow you to check for curvilinear behavior. If curvature is present you won’t be able to determine which of the two variables are generating this response but you will at least know it exists and you can augment your existing work with a couple of other design points to parse out the variable that is generating this curvilinear behavior.

    2
    #236004

    Robert Butler
    Participant

    I wouldn’t say “People in charge of hiring rarely understand the steps to getting a PhD and typically lump academics into a group of elitists “sitting in a corner studying to take a test”. Rather, it is a matter of how you present the fact of earning your degree to them.

    For example, my first two pieces of paper are in physics. When I went for my first job interview I knew at that time the perception of physicists fell into one of two categories – we were either head-in-the-clouds Albert Einstein wannabes or we just wanted to build bigger and better hydrogen bombs. I did my graduate work in vacuum physics which has all kinds of practical applications but I was aware few people knew this or had any idea vacuum physics is/was very experimental/applied. So, when I went to my first interview I took along one of the vacuum couplings I had made for my apparatus. The coupling required extensive precision machining/thread cutting in different metals as well as knurling on the screw caps.

    Within 2 minutes of the start of the interview the interviewer said, “Well, physics….that’s mostly theoretical isn’t it?” Without another word I pulled out the vacuum coupling, set it on the desk in front of him and said, “I had to build that, and other things just like it, for my research.” The interviewer turned out to be a mechanical engineer. He grabbed the coupling and started turning the caps, feeling for smoothness of thread, firmness of the seal, etc. He looked at me and said, “Well, but you bought the knurled caps right?” I said, ” No, I built the whole thing.” He returned his attention to the coupling and focused so intently on the feel and look of that piece of equipment that I know I could have said something like, “You know, I understand your mother is an unmarried orangutan. ” and I’m sure he would have just nodded his head in agreement. :-)

    From that moment forward I owned that interview. I got a job offer less than a week later.

    Based on your post it sounds like you would want to focus the interviewer attention on some of the green energy projects with Shell or perhaps the projects with the concrete innovation company. The focus would be – what was the problem, how did I solve it, was the solution implemented, how much was it worth – dollars saved, new dollars to the bottom line, process made better, cheaper, faster, etc.

    As a statistician in industry I had to provide a yearly summary of my efforts to management – basically justify my cost/value to the company. To that end, towards the end of the year I would go back to the various engineers I had worked for and ask them for an evaluation of what I had done for them with particular emphasis on dollars saved, new dollars to the bottom line, process made better, cheaper, faster, etc. If it was a failure or if what I had done had not been implemented I would record that and include these things in my summary to management as well. On a typical year I could point to about 3 million dollars to the bottom line that my engineers said I had contributed. Because these estimates came from the people I had worked for, management never questioned the summaries.

    If you haven’t done so already and if you still have contact with the people for whom you did your work you should ask them for this kind of feedback and make sure they understand you want the unvarnished truth and why you want it. Given that you have this information choose a couple of the more impressive projects and make them the body of the text of your job application. In my experience a detailed focus on a few projects along with things like actual value, goes much further with HR types looking over a resume as opposed to just a long list of I-worked-on-this-and-then-I-worked-on-that.

    1
    #236001

    Robert Butler
    Participant

    I agree with @mike-carnell as far as the “Wow” factor is concerned – in my experience there isn’t any. I’ve worked in industry and medicine as a physicist, engineering statistician, and biostatistician (oh yes, and as a black belt) for almost 40 years. Except for including the facts and dates of my three degrees on the initial job application form (four if you count the BB cert), no one has ever once asked about the number of degrees, the levels, or the types.

    In every case it is always just “Can you do this? When can I expect to see what you have done? When will you present your findings?” …and “Please remember when you write your final report that you will need to make sure everyone understands what it is that you have done for them.”

    In a similar vein, during the job interviews the questions are never about the degrees attained. Rather they are of the form: “Given the situation you have just read – how would you approach it?” (presenting you with a written what-if scenario, asking you to quickly read it, and then respond to a battery of questions from your interviewers is common). On those occasions where, in addition to a battery of what-if scenarios, I’m asked to make a presentation, the presentation is in the form of – present the problem, highlight major issues, briefly describe the approach, and summarize the final results – all in a form that an audience consisting of everyone from people on the line up to and including the VP in charge will be able to understand.

    4
    #235999

    Robert Butler
    Participant

    As written your question is meaningless. You will need to give some idea of what it is that you are trying to do. For example, if you are running an analysis of some kind and the outcome is productive yes/no where you have an operational definition of “productive” and you are looking for possible relationships between measured inputs and the yes/no outcome then it is a binomial situation and it has nothing to do with a normal distribution. On the other hand, if it is something else then the answer could be yes/no/maybe.

    1
    #235835

    Robert Butler
    Participant

    Never mind – it always helps to correctly read your own calculator screen – the sum is .004800 not .004880. When you take the square root of .004800 and divide .04 by that you get .577 which rounds up to .58
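
    For the record, the arithmetic:

    import math

    total = 0.002112 + 0.002688
    ratio = 0.04 / math.sqrt(total)
    print(f"sum = {total:.6f}, sqrt = {math.sqrt(total):.6f}, ratio = {ratio:.4f} -> {round(ratio, 2)}")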

    0
    #235834

    Robert Butler
    Participant

    Hmmmmmm…there must be some kind of double precision inside the calculator. If I first sum the two values and take the square root of the sum and do a division of that number into .04 I get the numbers indicated. However, if I just input .00488 and take the square root of that number I get .069857 which when divided into .04 gives .572 which doesn’t round up to .58.

    0
    #235833

    Robert Butler
    Participant

    .002112 + .002688 = .0048800 and the square root of that is .069282. Divide .04 by that and you get .577350 which when rounded up to the level of precision allowed in your calculation is .58

    0
    #235479

    Robert Butler
    Participant

    Just tallying how many times some X variable shows up in a series of regression equations (one for each Y response) isn’t going to address the importance of that X. If we use your example Y1 = X2*X1^2 and Y2 = X2*X1*Y1^2 then, in order for what you are saying to be true, the coefficients for the X’s in both of the models for Y1 and Y2 would have to be equivalent in magnitude – that is a unit change in X1 in Y1 would have the same effect as a unit change in X1 in Y2.

    There is nothing that says this would be the case. Indeed, it is precisely because this is more often than not the situation that one can use the simultaneous equations for Y1 and Y2 to identify regions of “optimum” performance where “optimum” means the best trade off’s with respect to Y1 and Y2 given that both are dependent, to some degree, on the same variables.

    I wouldn’t recommend trying to build a weighting scheme for your X’s. If you do have a situation where the number of X’s is such that you can’t build a screen design to check them then I would just bean count the number of times you think a particular variable is going to matter to the Y’s of interest, pareto chart the results, and go with the top N where N is the number of variables you can comfortably run in a screen design.

    1
    #211433

    Robert Butler
    Participant

    Value Stream Mapping (VSM) is a way to track a process from beginning to end with an eye towards examining the system to look for places for improvement within the process. Your description of what you plan to do isn’t VSM. Indeed, based on your post I’m not sure what it is that you are trying to do.

    My understanding of your post is as follows:

    1. You have two products A and B.
    2. Each product is produced twice each month.
    3. You are proposing to gather work orders corresponding to each of the products.
    4. You are proposing to measure the “amount of raw material” coming from the supplier each month
    5. You are going to take some kind of data from each of the work orders and analyze the numbers in some fashion.
    6. You are planning to look at the averages of these numbers from the work orders and draw conclusions about something.

    As I said this isn’t VSM. I’m not trying to be mean spirited or snarky but if the above is a correct summary of what you are proposing to do I don’t see what you plan to gain from any of this.

    1. You want to measure amounts of raw material from suppliers but give no indication as to why you are doing this. In your description of proposed actions this measure does not seem to have any purpose.
    2. What is it about A and B you want to know?
    3. If we assume taking some averages of data from work orders is meaningful to some aspect of A and B then what are you going to do with these averages?
    a. Compare averages of various work order data from A against that of B?
    b. Is this a situation where data recorded in work orders is known to reflect actual process behavior?
    1. if it is, then how will this relate back to assessing A and B?
    c. As an aside, it should be noted that if you are planning on comparing averages for any aspect of your process you will have to employ statistical methods for means comparisons.

    If you could provide more detail concerning what it is that you are trying to do and address some of the questions I’ve asked then either I or some other poster may be able to offer suggestions.

    0
    #211432

    Robert Butler
    Participant

    I don’t understand what you mean by “setup in Minitab 18.” The six point design is all there is and it would just be a matter of building the appropriate spreadsheet for the X’s and Y responses. If you go with just the six points you can build a model where Y is a function of A, B, AB, and B*B. If you toss in a single replicate of one of the design points you could also check A*B*B.

    You would reduce the model to just the significant terms in the usual way – first plot the data and see what you see, run backward elimination and forward selection with replacement regression and check the final model using residual analysis. If all is well then you would confirm the validity of the final reduced model by using it to make predictions with certain combinations of A and B, run those combinations, and see if your final result was inside the prediction error limits associated with your prediction.
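
    A minimal backward-elimination sketch for this six-point design, with made-up responses (statsmodels OLS; substitute your own data and alpha):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    df = pd.DataFrame({"A": [-1, 1, -1, 1, -1, 1],
                       "B": [-1, -1, 0, 0, 1, 1],
                       "y": [5.1, 7.2, 5.8, 8.4, 6.1, 8.9]})   # made-up responses
    df["AB"] = df["A"] * df["B"]
    df["BB"] = df["B"] ** 2

    def backward_eliminate(data, response, terms, alpha=0.05):
        # drop the least significant term until everything left has p <= alpha
        terms = list(terms)
        while terms:
            fit = sm.OLS(data[response], sm.add_constant(data[terms])).fit()
            worst = fit.pvalues.drop("const").idxmax()
            if fit.pvalues[worst] <= alpha:
                return fit
            terms.remove(worst)
        return None

    fit = backward_eliminate(df, "y", ["A", "B", "AB", "BB"])
    if fit is not None:
        print(fit.params.round(3))   # with these made-up numbers A and B survive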

    If you have more than one Y you would follow the above procedure for each of them and then you would use all of the equations together to predict a couple of optimum conditions (these will most likely be trade offs – most of the time you can’t get all of the Y responses simultaneously at their optimum level for a given level of A and B) and you would test the predictions in the same manner as above.

    0
    #211391

    Robert Butler
    Participant

    Thanks Mike

    1
    #211389

    Robert Butler
    Participant

    Thanks Katie(?), for correcting the design display on my first post. You can remove the second one if you wish.

    0
    #211379

    Robert Butler
    Participant

    The other option is to not bother with any kind of transformation. Since you know the distribution is non-normal for data when the process is in control just use the usual methods for calculating Cpk for non-normal data. It is my understanding that Minitab has this method as part of the program. The method is described in detail in Chapter 8 of Bothe’s book Measuring Process Capability. The chapter title is “Measuring Capability for Non-Normal Variable Data.” Personally, when having to generate a Cpk for non-normal data, it is my method of choice.

    0
    #211378

    Robert Butler
    Participant

    Well, that’s just peachy. What you type isn’t what gets posted. Copy the design with headings into a word document and re-insert the appropriate blanks – you should have 3 columns, experimental number, levels for factor A and levels for factor B.

    0
    #211377

    Robert Butler
    Participant

    Sure, it’s a 2 x 3 design

    Experiment   Factor A   Factor B
    1            -1         -1
    2             1         -1
    3            -1          0
    4             1          0
    5            -1          1
    6             1          1

    1
    #210206

    Robert Butler
    Participant

    Your question sounds like a homework problem. I’ll assume it isn’t and offer the following: If you are doing something for the first time and, based on what you know, you believe what you are doing is going to make a difference then, if it does make a difference it is all value added and if it doesn’t make a difference it is still all value added. In the first instance you will have found a way to do something better/cheaper/faster and in the second instance you will have demonstrated to your satisfaction that approaching the issue from that point will not work. Knowing that something will not work is just as valuable as knowing something does work.

    1
    #210202

    Robert Butler
    Participant

    It sounds like you have a process with a physical lower bound which renders negative values meaningless. In that case you will need to log the data, construct the control limits in log units, and then back transform the relevant statistics (target mean and standard deviation). The control limits will, of necessity, be asymmetric but they will not go below 0.
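
    A minimal individuals-chart sketch with simulated lognormal data showing the log-unit limits and the back-transform (1.128 is the d2 constant for moving ranges of span 2):

    import numpy as np

    rng = np.random.default_rng(11)
    x = rng.lognormal(mean=1.0, sigma=0.4, size=60)   # simulated, bounded-below data

    logs = np.log(x)
    mr = np.abs(np.diff(logs))                        # moving ranges in log units
    center = logs.mean()
    sigma_hat = mr.mean() / 1.128

    lcl_log, ucl_log = center - 3 * sigma_hat, center + 3 * sigma_hat

    # back-transformed limits: asymmetric in original units, but never below zero
    print("LCL:", round(float(np.exp(lcl_log)), 3))
    print("CL :", round(float(np.exp(center)), 3))
    print("UCL:", round(float(np.exp(ucl_log)), 3))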

    0
    #209518

    Robert Butler
    Participant

    There also seems to be something wrong with the front page. At first you could see the listing of forum topics and other items below the header now there is just a blank space. I thought it might be a browser issue so I tried Google – same result. I can still see the sidebar listing of recent topics and I can get to the full list through that but all of that blank space on the front page has to be some kind of problem on your end.

    0
    #209079

    Robert Butler
    Participant

    Well, yes/no/maybe. The issue with the simple Pearson coefficient is that you can get “large” values and still have lack of significance. As for R-squared – no, it is an easily manipulated/fooled statistic and, by itself, tells you very little. There are websites, blogs, even some papers and textbooks that will offer various “rules-of-thumb” with respect to assessing a regression effort on the basis of R2. Usually the rule-of-thumb is that R2 must be some value (.7 is a favorite) or higher before the regression equation is of any value – this is pure rubbish. The question isn’t one of R2, the question is what is the reason for building the regression equation?

    One quick example with respect to the absurdity of insisting on given levels of R2: Many years ago I had to analyze the presence of compounds in a water supply which, when present, indicated cyanide leaching. We ran the analysis over time and found a significant positive trend in the data. The R2 for the regression was .04. The crucial fact of the analysis was that the measured concentrations of these compounds were increasing over time and, based on that regression, we could show that the trend predicted the cyanide level in the water would reach a life-threatening level in about 3+ years’ time. The analysis wound up as part of a court battle between the local government and the firm responsible for the problem. I wasn’t present at the hearing and the firm dragged in some clown who had citations with respect to the “need” for high R2 before one could accept the results of a regression. End result – the firm won and about 4 years later the level of cyanide leaching reached the critical level. The cost of addressing the problem 4 years later was a lot more than it would have been at the time of the analysis.

    If you are going to assess a correlation based on a regression then the only proper way to assess the success of that correlation is to do a regression ANALYSIS which means doing all of the plotting and examining of residuals that are at the heart of a regression ANALYSIS. I would recommend you find a good book on regression analysis and memorize the chapter(s) concerning the correct methods for assessing the value of a regression equation. Whatever you do, don’t make the mistake of judging a regression using only summary statistics like Rsquared.

    0
    #208930

    Robert Butler
    Participant

    Pearson’s correlation coefficient is only useful in the situation where the relationship between two variables is a straight line. If you have a curvilinear relationship between an X and a Y you can still compute a Pearson’s correlation coefficient, but you will find the correlation coefficient is low because the relationship is no longer a straight line. In a similar manner, if you try to fit a simple Y = X regression line to a curvilinear relationship you will find the p-value is most likely not significant. On the other hand, if you fit Y as a function of X and X*X then you will have a significant p-value and, depending on the kind of curvilinear relationship, you might have a significant p-value for both the linear and the quadratic term in the regression.
    Just to check this out try the following simple data set
    x1    y1     y2
    1     1      1
    2     3      2.1
    3     5      2.9
    4     5.5    4.2
    5     6      4.9
    6     5.8    6.2
    7     5.1    6.8
    8     4.9    8
    9     4.0    9.2

    You will find your Pearson’s correlation coefficient for y1 vs x1 is not significant with a value of .519 and a p = .152. If you run a simple regression of y1 = x1 you will get the same p-value. If you run y1 = fn(x and x*x) you will get p-values of <.0001 for both terms. If you run y2 vs x1 you will get a significant Pearson’s correlation coefficient of .998 and p <.0001 and a similar p-value for the regression of y2 vs x1.
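
    Anyone who wants to reproduce those numbers can do so in a few lines:

    import numpy as np
    from scipy import stats
    import statsmodels.api as sm

    x1 = np.arange(1, 10, dtype=float)
    y1 = np.array([1, 3, 5, 5.5, 6, 5.8, 5.1, 4.9, 4.0])
    y2 = np.array([1, 2.1, 2.9, 4.2, 4.9, 6.2, 6.8, 8, 9.2])

    print(stats.pearsonr(x1, y1))    # modest r, non-significant p
    print(stats.pearsonr(x1, y2))    # r near 1, p far below .05

    lin  = sm.OLS(y1, sm.add_constant(x1)).fit()
    quad = sm.OLS(y1, sm.add_constant(np.column_stack([x1, x1 ** 2]))).fit()
    print(lin.pvalues.round(4))      # same story as the Pearson test
    print(quad.pvalues.round(4))     # per the post, the linear and squared terms both go significant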

    By the way, the term “linear regression” only refers to the linearity of the coefficients in the regression equation. Linear regressions can have polynomial terms. Non-linear refers to non-linearity in the regression coefficients and is a very different animal.

    0
    #203154

    Robert Butler
    Participant

    I spent some time checking various reference books looking for something that might help and I’m afraid I don’t have much to offer.

    Two thoughts
    1. We are assuming you are running ANOVA on continuous variables. If you are running categorical variables then linear and interaction are all you can expect to compute.
    2. The level of sophistication of modern statistics packages is such that I doubt you will find anything relating to manual calculations in any of the modern statistical texts for anything other than main effects and interactions.

    I did find a reasonable discussion of the problem and what looks like a guide for generating the necessary equations in Statistical Analysis in Chemistry and the Chemical Industry by Bennett and Franklin – Wiley – 5th printing January 1967 under the chapter sub-heading “Analysis of Variance for Polynomial Regression.” The text and discussion are too long to attempt to summarize on a forum of this type but you should be able to get a copy of this book through inter-library loan.

    If you can’t get this book then you might want to do a search on the web for the same topic – Analysis of Variance for Polynomial Regression – and see if anyone has provided the details.

    0
    #203122

    Robert Butler
    Participant

    Then is it fair to say that it makes sense to characterize the workpiece parameter with high and low values? Which is to say – are lambda, theta, and rho connected or can they be varied independently? If it is the former then you have 4 continuous variables which means you could run a screen check in 8 points with two center points for a total of 10 experiments. If it is the latter then you actually have the situation of a 6 parameter study – lambda, theta, rho, voltage, capacitance, and machine speed. In this case you could also run a screen design in 8 experiments again with two center points for a total of 10 experiments.

    An analysis of the 10 points would allow a check for significance of the factors of interest and pave the way for a design with fewer factors and the inclusion of some possible interaction terms. The analysis of the 10 point designs would also provide an initial predictive equation which could be tested to see if the observed correlations actually described possible causation.

    0
    #203118

    Robert Butler
    Participant

    No, that’s called cheating. We are assuming your upper spec limit is there for a reason. If it isn’t arbitrary then, given your description of events, you are just trying to pretend that some kind of special cause which is driving your process to the upper spec limit (and presumably at some future date will drive your process past that limit) doesn’t matter. The only solution your approach provides is a way to ignore problems with your process for the time being – this is not process improvement.

    0
    #203113

    Robert Butler
    Participant

    What you are asking is this: is there a way to solve my problem before I know what the problem is – and the answer is “no”. The whole point of DMAIC is that of problem identification, investigation, and resolution.

    0
    #203112

    Robert Butler
    Participant

    If this isn’t a homework problem it certainly sounds like one. In either case no one can offer anything because you have not provided enough information. You need to tell us what it is that you are trying to do. For example – confidence interval around the defect size, detection of a shift in defect size due to process change, etc. …and then there is the issue of precision – alpha and beta levels and why you need those at the level you have chosen.

    0
    #203110

    Robert Butler
    Participant

    What kind of a measure is “the workpiece”? My guess would be categorical unless you have some way of quantifying this variable (really small, small, medium, large, very large), (really simple, simple, average, above average, complex) etc. If it is categorical then the smallest design (we are assuming voltage, capacitance, and machining speed can be treated as continuous variables) would be a basic 3 variable saturated design (3 factors in 4 experiments) run for each of the 5 workpiece categories for a total of 20 experiments – add one replicate per group and you would have a total of 25 experiments.

    If, on the other hand, workpiece can be quantified then you could run a 2 level factorial with a couple of replicates at the center point for a total of 18 experiments.

    The thing to bear in mind here is the philosophy of experimental design: if something is going to occur it is most likely to occur at the extremes, so rather than running 5 levels of a continuous variable, choose the minimum and maximum settings and use variations of a two-level factorial design. Of course, if the one variable is categorical then you would run the smaller two-level saturated design at each of the 5 categorical values.

    If you are worried about the possibility of curvature then you would want to include a center point on the factorial and run your replication there. For the case of the saturated designs with the categorical variable this would bring the total number of experiments to 30 and, if all can be treated as continuous then, as previously noted, the total would be 18.
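
    If it helps, a minimal sketch (Python, coded -1/+1 units assumed) of the 18-run case – a two-level factorial in the four continuous variables plus two center-point replicates – would look like this:

        # Minimal sketch (coded -1/+1 units assumed): 2^4 full factorial in the four
        # continuous factors plus two center-point replicates - 16 + 2 = 18 runs.
        import itertools
        import numpy as np

        factorial = np.array(list(itertools.product([-1, 1], repeat=4)))   # 16 runs
        design = np.vstack([factorial, np.zeros((2, 4))])                   # + 2 center points
        print(design.shape)                                                 # (18, 4)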

    0
    #203105

    Robert Butler
    Participant

    Practically anything you care to name that has anything to do with any aspect of the way the process functions. Is this a homework question or are you actually looking at a process? If you are actually working on a process, tell us about it and perhaps I or someone else could offer more specific details.

    0
    #203098

    Robert Butler
    Participant

    Special cause can impact either the mean or the variability or both and changes in operators and equipment are not limited to changes in mean – these variables (along with lots of other things) can impact variation as well.

    0
    #203090

    Robert Butler
    Participant

    Welcome to the club. I wouldn’t waste my time worrying about it.

    0
    #203089

    Robert Butler
    Participant

    What it means is that your data distribution isn’t a normal distribution. Given that your process is going to truncate your output at both the low and the high end this isn’t terribly surprising. From the standpoint of what it means to your customer – nothing. From the standpoint of what it means to you – see Mike Carnell’s post.

    0
    #203081

    Robert Butler
    Participant

    You don’t use the dummy variable designations in the DOE. What you use are the categorical identifiers 1,2,3,4 and, as noted, you cannot fractionate these. What you can do is run a saturated design for each of the categorical levels or, again as noted, if there is some way to rank the categorical identifiers then you would use just the two extreme levels and treat the variable as you would any other two level factor.

    0
    #203073

    Robert Butler
    Participant

    If that is what you want to do then I wouldn’t recommend using either one. If you want to know the relationship between the two then you should look at a regression along with the 95% confidence intervals for individuals. This will give you an estimate of the change in employment status as a function of change in competency.

    Based on what you have said, employment would be the response and competency would be the predictor. The other thing you really should do is get a plotting routine that will allow you to jitter the plotted points. This will give you an excellent visual understanding of the relationship between the two. If you don’t jitter the data points you will just have a graph with a bunch of single data points corresponding to the 1-5 combinations of the two rating scales. If you can’t jitter then you should at least be able to annotate the graph so you can indicate the number of data points at each of the 1-5 combinations.
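
    If your plotting package doesn’t have a built-in jitter option, a minimal sketch of the idea (Python/matplotlib, hypothetical 1-5 ratings) is just adding a small random offset to each plotted point:

        # Minimal sketch (hypothetical 1-5 ratings): jitter the points so the number
        # of observations piled up at each rating combination becomes visible.
        import numpy as np
        import matplotlib.pyplot as plt

        rng = np.random.default_rng(7)
        competency = rng.integers(1, 6, 200)                               # predictor
        employment = np.clip(competency + rng.integers(-1, 2, 200), 1, 5)  # response

        jx = competency + rng.uniform(-0.15, 0.15, 200)   # small random offsets
        jy = employment + rng.uniform(-0.15, 0.15, 200)
        plt.scatter(jx, jy, alpha=0.5)
        plt.xlabel("Competency rating")
        plt.ylabel("Employment rating")
        plt.show()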

    0
    #203070

    Robert Butler
    Participant

    There are a number of answers to your question. Before trying to offer anything more, could you provide more information on what it is that you are trying to do? For example – you say you are trying to look for a correlation between the answers to two questions – does this mean you want to see if one predicts the other, or is it a case of looking for internal consistency, or is it a case of looking for agreement, or something else? Each one of the above objectives will result in a different answer.

    At the moment, my personal guess with respect to what you are trying to do is assess agreement. In that case you would want to look at things like a Kappa statistic and not an assessment of correlation.
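
    If agreement is indeed what you are after, a minimal sketch of a kappa calculation (Python, hypothetical paired answers, using scikit-learn’s cohen_kappa_score) would be:

        # Minimal sketch (hypothetical paired answers): kappa for agreement between
        # the two questions; 1 = perfect agreement, 0 = chance-level agreement.
        from sklearn.metrics import cohen_kappa_score

        q1 = [1, 2, 3, 3, 4, 5, 2, 1, 4, 5]
        q2 = [1, 2, 3, 4, 4, 5, 2, 2, 4, 5]
        print(cohen_kappa_score(q1, q2))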

    0
    #203067

    Robert Butler
    Participant

    So, what are the variables you think matter and what have you done with respect to examining production data to at least justify your variable choice? If you have a list – put together a screening design, run the design, and use the Box-Meyer method to analyze the results to identify the big hitters with respect to variation.

    Oh yes – what does “crores” mean?

    0
    #203065

    Robert Butler
    Participant

    There are two schools of thought concerning ordinal variables – one says you can’t use standard statistical techniques to analyze them and the other says you can. To the best of my knowledge neither school has been proved to be either right or wrong.

    As written it sounds like you have 3 or more ordinal responses and you want to test their correlation with a group of independent variables. The usual methods of regression analysis will work.

    One thing to bear in mind: if you are running a regression analysis with an ordinal variable as the dependent, your residual plots will have a banded structure to them. This is to be expected and it is not a problem.

    0
    #203060

    Robert Butler
    Participant

    Sure – it’s called dummy variables and it is my understanding that there are commands in Minitab which will allow you to do this without having to write separate code.

    Before going there however, you say you have “a needle position” – is there any way to characterize the position in terms of location such as height relative to a baseline or some such thing? If there is then you can treat the variable as continuous.

    Dummy variables – in your case you have 4 settings therefore you re-code the settings as follows:
    Setting 1: D1 = 0, D2 = 0, D3 = 0
    Setting 2: D1 = 1, D2 = 0, D3 = 0
    Setting 3: D1 = 0, D2 = 1, D3 = 0
    Setting 4: D1 = 0, D2 = 0, D3 = 1

    and you run your regression on whatever variables you have in your list and, instead of the variable “setting”, you use D1, D2, and D3. The coefficients for these three variables will be whatever they are, and the way you determine the effect of any of the settings is to plug in the combination of D1, D2, and D3 values associated with a given setting.

    If you want more details the book Regression Analysis by Example by Chatterjee and Price has a good discussion of the method.
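
    If it helps, here is a minimal sketch of the same coding in Python (hypothetical data; pandas builds the dummy columns so that setting 1 is the all-zero reference, just as in the table above):

        # Minimal sketch (hypothetical data): setting 1 is the all-zero reference level,
        # so the intercept is setting 1 and each dummy coefficient is the offset from it.
        import pandas as pd
        import statsmodels.api as sm

        df = pd.DataFrame({
            "setting": [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
            "y":       [5.1, 6.3, 7.0, 8.2, 4.9, 6.1, 7.2, 8.0, 5.0, 6.4, 6.9, 8.3],
        })
        dummies = pd.get_dummies(df["setting"], prefix="D", drop_first=True)  # D_2, D_3, D_4
        X = sm.add_constant(dummies.astype(float))
        model = sm.OLS(df["y"], X).fit()
        print(model.params)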

    0
    #203052

    Robert Butler
    Participant

    Before I go any further please understand I’m not trying to be sarcastic, mean spirited, or condescending. I have two concerns here – first is your effort to finish a master’s thesis (I know how much time and effort one of these things takes – I have a couple of Master’s degrees myself) and second is to provide you with some suggestions concerning a resolution of the problem you are trying to address should you want to try DMAIC.

    With this in mind I would say the following: based on your description of what has been done to date it is not DMAIC and I think trying to recast what has been done by tacking on some afterthought investigations in order to call what was done DMAIC is a waste of time.

    What you can do with what you have: Basically someone did some kind of back-of-the-envelope assessment (“a rough previous analysis”) and decided that the issue was employee satisfaction. I realize you said “this is probably partly due to differences in employee satisfaction” but the actions you have described indicate you and/or the people you are working for believe this is THE issue.

    Based on what you have said it would appear you have the means to test this conjecture and put it to rest one way or the other. You have prior productivity measures, someone decided the issue was employee satisfaction, to this end a shop floor board was introduced and…what happened? You must have productivity measures after the board was introduced … so is there a significant difference in pre/post shop floor board implementation or not? If there isn’t then I wouldn’t hold out much hope for the idea that employee satisfaction is the driver.

    If you can’t do a proper DMAIC I’d recommend trying the above, doing the minimum additional work as per your adviser’s requirements, writing up what you have, getting your degree, and pressing on.

    On the other hand, if you have time to follow DMAIC, I would offer the following:

    1. You said “the process is internalized and presented as SIPOC”. SIPOC is a nice tool for getting people to give some thought to what might go into a process but it is not a substitute for the Define phase. As for “internalized” I don’t know what that means.

    2. You said someone did a “rough previous analysis.” My experience with these kinds of analyses is that they aren’t. Rather they are a reflection of what has been referred to as a B.O.G.G.S.A.T. analysis (Bunch Of Guys & Gals Sitting Around Talking) and all they contain are impressions of WELL-MEANING people who have no intimate day-to-day contact with the process itself.

    To really understand the process you will have to personally walk the line (by yourself, no managers present), take the time to talk to everyone in the working environment (take copious notes while talking), and actually ask them to describe what they are doing as well as ask their opinion concerning issues that might bear on a difference in productivity between the two shifts (this means talking to both first and second shift). There is, of course, more to the Define phase but this is key and it provides a solid foundation for the rest of the Define effort.

    3. You mention “the morning shift is much better than the other shift.” For me, a statement like this screams process, not people. Regardless of employee satisfaction, random chance says there should have been occasions when second was better than first. From the process standpoint, after I had walked the floor and employed the rest of the tools used in the Define phase, I would want to think about issues like the following:
    a. Is there a difference in the makeup of the shifts? For example: is the bulk of the first shift comprised of experienced workers and the bulk of the second shift comprised of new hires? Is second shift in any way considered/treated/perceived as a punishment assignment?
    b. Is there a difference in the supply chain between shifts (raw material deliveries are ready and waiting for first shift but second shift has to go out and pull new material from the store room, restock various aspects of the operation, clean up after first shift before proceeding, etc.)?
    c. With regard to supply chain – are the raw materials provided to first and second shift uniform – that is, same supplier, same material lots, same location in the factory setting, etc.?
    d. Is the break between shifts clean – that is, does the performance of the second shift depend in any way on what was done on the first shift? If so, what is the situation concerning hand-off – and how smooth is it?
    e. Is there any aspect of the production cycle that depends on results from the QC lab? If so, is the lab manned to the same level for both shifts? Is there any difference with respect to response times between first and second shift? Is there any difference in the frequency of sample testing/submission between the two shifts?

    Obviously there are many more issues than what I’ve listed above. What I’ve tried to do is provide you with a quick summary of some of the things I’ve seen in the past that, when discovered, proved to be the real driver between differences in shift performance.

    0
    #203042

    Robert Butler
    Participant

    If your post was, by choice, extremely brief and:
    1. If you have run a survey with meaningful measures
    2. If you have determined production/employee hour is a good measure of the process.
    3. If you have defined the process and really understand how it works
    4. If you have determined the key issues with respect to improving production/hour are all under the control of the employees.

    Then what you have at the moment is a baseline measure. Your next step would be to look at the results of your survey, analyze those findings to identify some aspect of the process to change, make the change, and then repeat your measurements at some later date to see if it made a difference.

    What concerns me is the following:

    1. There is no indication in your post that you have any understanding of the process – no mention of block diagramming, fishbone assessment, or quantification of issues concerning the quality or quantity of product produced, and no indication that the measure of quantity produced per employee hour is of any value.

    a. If the issue is problems with the process itself then all of the employee motivation in the world isn’t going to change a thing and any correlation you might get between some aspect of your survey and quality change would be due to dumb luck.

    b. Quantity produced/employee hour doesn’t seem to me to be of much use. For example, you could ramp output through the roof but if the defect rate increases proportionally then you haven’t done anything – just increased the rate of defect production. You could easily have an improvement with the same level of output/employee hour with a reduction in scrap/defect rate.

    2. If we assume you have defined the process and know it is strictly an issue of employee motivation as defined by you then how did you build the survey, what did you include in it, and why did you choose the measures you did?

    a. Your quick list of motivation measures was, of course, not all-inclusive, but I find it interesting that it didn’t include a single tangible source of motivation (larger percentage pay increases, improvement in benefits, etc.).

    b. Have the questions been constructed in a way that will permit the identification of aspects of the employee experience which can actually be changed?

    If you don’t know the process, if you don’t know the correctness of your measure, and if your questions identify things that can’t be changed then you need to start over.

    0
    #203039

    Robert Butler
    Participant

    You will have to provide some more detail because, as written, your questions are not clear.

    For example:
    1. You say confidence level 95 vs 98. – This can be taken several different ways and each of them will result in a different response.
    a. You mean you want to compute the confidence interval around your measurements of process output and you are debating whether to compute the 95% or the 98% interval. If this is what you mean then most of what you have written concerning 95 vs 98 doesn’t make sense. If what you want is the confidence interval for ordinary process variation then the “usual” choice is plus/minus 3 standard deviations, which corresponds to roughly a 99.7% interval. In this case there isn’t much of an issue with respect to sample size.
    b. You mean you want to be able to detect a shift from some proportion relative to either a proportion of 95% or 98%. To do this you are going to have to state the size of the shift you want to detect and the degree of certainty (the power) that a shift has indeed occurred in order to compute a sample size.
    c. You mean something else entirely.

    2. You said “response distribution – while I have recommended 50%….” – what does this mean? In particular – what do you mean by a “response distribution” and how does 50% relate to this?

    3. You said “for low volume processes where the sample size calc throws a low sample” – as written this makes no sense at all.

    If you could expand on some of your questions and provide clarification perhaps I or someone else could offer some suggestions.

    0
    #203027

    Robert Butler
    Participant

    You’re welcome – what did you find out?

    0
    #203012

    Robert Butler
    Participant

    I haven’t run the calculation but if the answer is .21 then it looks like there is a typo in the multiple choice column – perhaps c) is supposed to be .212.

    0
    #203009

    Robert Butler
    Participant

    So what number did you get? Show us your calculations and we can probably show you where you went astray.

    0
    #203005

    Robert Butler
    Participant

    Yes you can. There are no restrictions on the distributions of the X’s or the Y’s in a regression analysis. As for “never” using dollar values in a regression equation – how would you propose to identify an optimum combination of ingredients for, say, a plastic compound, such that the dollar cost per pound was minimized, other than using the X’s to predict the final dollar cost? In short – perhaps someone has told you this is never done, but whoever offered that opinion is wrong.

    0
    #202997

    Robert Butler
    Participant

    Entering the phrase “what is the meaning of p-value” on Google gives 392,000,000 hits – a quick check of the first 10 entries looks like they have the answers to your questions.

    0
    #202995

    Robert Butler
    Participant

    There is no requirement for data normality with respect to using control charts. – See Wheeler and Chambers Understanding Statistical Process Control 2nd edition pp. 76 Section 4.4 Myths About Shewhart’s Charts – Myth One: it has been said that the data must be normally distributed before they can be placed upon a control chart.

    For calculation of Pp/Cp values for non-normal data see Measuring Process Capability by Bothe – Chapter 8 Measuring Capability for Non-Normal Variable Data.

    For regression analysis – there are no restrictions on the distributions for either the X’s or the Y’s. Applied Regression Analysis 2nd Edition, Draper and Smith pp. 12-28 – or, if you are in a rush – bottom of pp-22-top of pp-23.

    For t-tests and ANOVA – both are robust with respect to non-normal data. The Design and Analysis of Industrial Experiments 2nd Edition Owen L. Davies pp. 51-56

    You should be able to get all of these books through inter-library loan.

    0
    #202994

    Robert Butler
    Participant

    It’s nice but it has a number of faults and it places restrictions where none exist.

    1. The decision tree for variable data normal yes/no is wrong. Both the t-test and ANOVA are robust with respect to non-normal data, so the roadmap with respect to these tests is wrong. The roadmap does correctly identify the fact that Bartlett’s test is very sensitive to non-normality; unfortunately it also implies that Levene’s test should not be used with normal data – which is wrong. Levene’s test can handle normal or non-normal data and it is for this reason that it is the test used by statisticians – I don’t know of anyone in the field who uses or recommends Bartlett’s test anymore (a quick comparison of the two tests is sketched after this list).

    2. The decision tree for X > 1 is just plain wrong. You can use any of the methods on the left side of that diagram when there is only one X. Whether you choose to use logistic/multinomial regression or some kind of MxN chi-square table to address a problem with respect to attribute data will depend on what it is that you want to know.

    3. The warning about unequal variances and “proceed with caution” in the lower left and lower right boxes is nice but it is of no value – in particular it does not tell you what to do or how to check to see if the existence of unequal variances matters.
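
    As promised above, a minimal sketch of the Bartlett/Levene comparison (Python/scipy, hypothetical skewed data):

        # Minimal sketch (hypothetical skewed data): Bartlett's test assumes normality,
        # so its p-values are unreliable here; Levene's (median-centered) is the robust choice.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(3)
        a = rng.lognormal(0, 0.5, 100)
        b = rng.lognormal(0, 0.5, 100)

        print(stats.bartlett(a, b))
        print(stats.levene(a, b, center="median"))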

    0
    #202935

    Robert Butler
    Participant

    As I stated before – the data can be treated as continuous and you can run an analysis on the data using the methods of continuous data analysis.

    The basic calculation for Cpk DOES require data normality which is why there are equivalent Cpk calculations for non-normal, attribute, and other types of data. With your data you will need to look into Cpk calculations for non-normal data – Chapter 8 in Measuring Process Capability by Bothe has the details.

    When testing for mean differences t-tests and ANOVA are robust with respect to non-normality and can be used when the data is extremely non-normal – a good discussion of this issue can be found on pages 51-54 of The Design and Analysis of Industrial Experiments 2nd Edition – Owen Davies.

    Variance issues

    When it comes to testing variance differences the Bartlett’s test is sensitive to non-normality. The usual procedure is to use Levene’s test instead.

    If the VARIANCES are HETEROGENEOUS the t-test has adjustments to allow for this as well. Indeed most of the canned t-test routines in the better statistics packages run an automatic test for this issue and make the adjustments without bothering the investigator with the details.

    Too much HETEROGENEITY in the population variances can cause problems with ANOVA in that if the heterogeneity is too extreme ANOVA will declare a lack of significance in mean differences between populations when one exists. When in doubt over this issue one employs Welch’s test for an examination of mean differences in ANOVA.

    Actually, when in doubt about variance heterogeneity you should do a couple of things: histogram your data by group, compute the respective population variances, run the ANOVA using both the usual test and Welch’s test, and see what you see. If you do this enough you will gain a good visual understanding of just how much heterogeneity is probably going to cause problems. This, in turn, will give you confidence in the results of your calculations.
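
    A minimal sketch of that check (Python, hypothetical data; anova_oneway is assumed to be available in your version of statsmodels):

        # Minimal sketch (hypothetical data): eyeball the group variances, then run the
        # usual one-way ANOVA and Welch's version and compare the calls.
        import numpy as np
        from scipy import stats
        from statsmodels.stats.oneway import anova_oneway

        rng = np.random.default_rng(5)
        g1 = rng.normal(10, 1, 30)
        g2 = rng.normal(10, 1, 30)
        g3 = rng.normal(11, 4, 30)          # noticeably larger spread

        print([round(np.var(g, ddof=1), 2) for g in (g1, g2, g3)])
        print(stats.f_oneway(g1, g2, g3))                     # assumes equal variances
        print(anova_oneway([g1, g2, g3], use_var="unequal"))  # Welch's test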

    Control Charts

    As for control charts – data normality is not an issue. A good discussion of this can be found in Understanding Statistical Process Control 2nd Edition Wheeler and Chambers in Chapter 4 starting on page 76 under the subsection titled “Myths About Shewhart’s Charts.”

    Regression

    While not specifically mentioned in your initial post – data normality is also not an issue when it comes to regression. There are no restrictions on the distributions of the X’s or the Y’s.

    Residuals need to be approximately normal because the tests for regression term significance are based on the t and F tests. But, as noted above – there is quite a bit of latitude with respect to normality approximation. For particulars you should read pages 8-24 of Applied Regression Analysis 2nd Edition by Draper and Smith and Chapter 3 of the same book “The Examination of the Residuals.” For an excellent understanding of the various facets of regression I would also recommend reading Regression Analysis by Example by Chatterjee and Price.

    I would recommend you borrow the books I have listed (the inter-library loan system is your friend) and read the sections I’ve referenced.

    0
    #202928

    Robert Butler
    Participant

    You can treat your data as continuous.

    “Variables are classified as continuous or discrete, according to the number of values they can take. Actual measurement of all variables occurs in a discrete manner, due to precision limitations in measuring instruments. The continuous-discrete classification, in practice, distinguishes between variables that take lots of values and variables that take few values. For instance, statisticians often treat discrete interval variables having a large number of values (such as test scores) as continuous, using them in methods for continuous responses.”
    – From Categorical Data Analysis 2nd Edition – Agresti pp.3

    As for running t-tests – they are robust with respect to non-normality. If you are worried about employing a t-test – run the analysis two ways – Wilcoxon-Mann-Whitney and t-test and see what you get. The chances are very good that you will get the same thing – not the exact same p-values but the same indication with respect to significance.
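
    A minimal sketch of running the analysis both ways (Python/scipy, hypothetical scores):

        # Minimal sketch (hypothetical scores): the two p-values will differ but the
        # significance call will usually be the same.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(11)
        group_a = rng.integers(60, 101, 40)
        group_b = rng.integers(55, 96, 40)

        print(stats.ttest_ind(group_a, group_b))
        print(stats.mannwhitneyu(group_a, group_b, alternative="two-sided"))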

    0
    #202908

    Robert Butler
    Participant

    There are ways to test the significance of a percent change; the problem is that before anyone can offer anything you still need to tell us what you have done and how you want to test the percent change.

    1. You said you have a 3rd column with a percent change – how did you do this computation? For example:
    a. Did you just put the male and female columns side-by-side, take a difference, and compute a percent change using one of the two columns as a reference?
    i. If you did (a), then what was your criterion for pairing the two columns? If you just randomly smashed the two columns together and took a difference then the percentage changes are of little value.
    Or did you use some other method? Regardless, we need to know the computational method you used and what is being compared to what.

    2. You said you want to know if this percentage change is significant. Based on your earlier posts it appears that you do not want to test to see if it is different from 0; therefore, what is the target? That is, what target percent change do you want to use for comparison?

    Example: you have a precious metal recovery system that recovers 99.986% of all of the metal from scrap. You have a new method which you believe will recover 99.990% and you have a target for significant change of .002% – under these circumstances, if the new process delivers as expected, the percent change will be significant.
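
    The arithmetic for that example, just to make the yardstick idea concrete (read the .002% target either as percentage points or as a relative change – the conclusion is the same):

        # Worked arithmetic for the recovery example above.
        old, new, target = 99.986, 99.990, 0.002
        change_points = new - old                   # 0.004 percentage points
        change_pct = (new - old) / old * 100        # ~0.004% relative change
        print(round(change_points, 4), round(change_pct, 4))   # both exceed the 0.002 target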

    0
    #202904

    Robert Butler
    Participant

    @jazzchuck, I could be wrong but I think the OP is after what I commented on in my first post. He/she has, in some manner, computed a percent change and is looking for significance with respect to that difference where the comparison of the generated percentage change is with something external to the data.

    The issue isn’t that of difference between the groups – as you and others have noted, that information is already part of the t-test. As written, the OP does not seem to have anything that would justify pairing individual measures, so a paired comparison would be of no value. In any event the question usually addressed by pairing – is the mean of the differences significantly different from 0 – does not appear to be of interest to the OP.

    I think if the OP provides a description of how they generated the estimate of percent change someone might be able to offer some insight with respect to answering the original question.

    0
    #202892

    Robert Butler
    Participant

    Years ago a company I worked for was getting beat up with respect to OSHA recordable accidents. The company had tried a number of things over the years. Among the ongoing efforts was an annual shoe fitting for safety shoes for everyone regardless of their position. These shoes were expensive and the company not only paid for the shoes but for the time lost to mandatory shoe fitting and inspection.

    The company had a monthly plant wide meeting to discuss company issues. At the meeting where the latest OSHA statistics were reported the presentation consisted of the usual, here’s what we did last year and here is what we did this year. During the question and answer session I noted that all that had been presented was the data for just 2 years and, as the statistician, I offered to look at all of the OSHA data to see if I could perhaps find some trends that might help us better understand the accident situation.

    After the meeting, one of the safety engineers gave me all of the OSHA data as well as all of the non-OSHA accident data for the last 20 years. I ran an analysis and discovered the following:
    1. The nature of the physical work done at the plant had not changed substantially over that 20 year time frame.
    2. In 20 years there had not been a single OSHA or non-OSHA accident related to legs or feet (and for the first 10 of those years the company did not have a mandatory safety shoe policy in place).
    3. During the entire 20 year span the primary OSHA accident consisted of 2nd degree burns to the fingers, hands or lower arms.
    4. During the entire 20 year span the next largest OSHA accident consisted of cuts requiring one or more stitches. Those cuts occurred on either the fingers or the hands and were due overwhelmingly to improper handling of glass tubing.
    5. The overall 20 year trend in OSHA accidents had been down, steeply down, and the “big” difference between the most recent year and the prior year amounted to nothing more than ordinary process variation.

    When I presented my findings several things happened.
    1. The safety shoe program ended.
    2. The entire plant knew what the two biggest issues with respect to OSHA safety were and had been for the last 20 years.
    3. Additional focus/emphasis was placed on wearing proper gloves and lower arm protection when working with anything from the heating ovens.
    4. The company purchased puncture resistant gloves for glass tubing and everyone involved in handling glass tubing received additional safety training.

    The end results:
    1. OSHA recordables dropped to 0 for several years running.
    2. As had been the case for the prior 20 years no one had a foot or lower leg injury.
    3. The cost of the shoe program was eliminated.
    4. The time involved in fitting and safety shoe inspection was used for other purposes.

    0
    #202891

    Robert Butler
    Participant

    Since you only have a single percentage change the only way to assess significance is to have something to use as a yardstick. So the question becomes significant compared to what?

    0
    #202882

    Robert Butler
    Participant

    I’m afraid I don’t have an answer to that question. I’ve never done either. My GUESS would be it wouldn’t matter. The problem is how you are going to present your Cpk in terms of your original distribution. If you run a capability analysis on transformed data the chances are very good you will not be able to back-transform to actual units. What this means is that all of your reporting will be in transformed units, and I do know from personal experience this does not work very well. People don’t understand that you have transformed the data, they don’t understand the Cpk is in terms of the transformed data, and they don’t understand you can’t apply the Cpk to the actual system measurements.

    One of the other problems I have with Box-Cox and Johnson transforms (and that is with all Box-Cox and Johnson transforms) is the issue of the transform parameters changing over time. As you gather more data there is a good chance your transform parameters will change, which means you have to go back, recalculate the new parameters, and then adjust all of your reporting from the past to make it match the present.

    Of course, you may have a process where the addition of more data does not have a major impact on the parameters, but if you don’t then you may find yourself up a well-known tributary without sufficient means of propulsion.
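
    A minimal sketch of the parameter-drift issue (Python/scipy, hypothetical data):

        # Minimal sketch (hypothetical data): the fitted Box-Cox lambda shifts once more
        # data are added, which forces the re-reporting described above.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)
        first_batch = rng.lognormal(1.0, 0.4, 100)
        more_data = rng.lognormal(1.1, 0.6, 100)    # process drifts a bit

        _, lam1 = stats.boxcox(first_batch)
        _, lam2 = stats.boxcox(np.concatenate([first_batch, more_data]))
        print(round(lam1, 3), round(lam2, 3))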

    0
Viewing 100 posts - 1 through 100 (of 2,399 total)