iSixSigma

how is DOE different from Regression

  • #49796

    franky
    Participant

    Just like the subject says….help please!

    0
    #170758

    Kumar
    Participant

    Franky,
    DOE in the form we use has regression embedded in it.  DOE is used to generate data in the most efficient fashion.  You then perform regression on this data to get the transfer function.
    This is the crux; the rest is detail.
    ravi Pandey

    0
    #170767

    Robert Butler
    Participant

    DOE is nothing more than rules for gathering data.  Regression is one of the tools you can use to analyze data thus gathered.  The value of the rules of DOE vs. say happenstance is that data gathered using the rules of DOE provides certain guarantees with respect to the variables you would like to treat as being independent of one another.

    0
    #170768

    Sinnicks
    Participant

    Robert, what is the difference between independent and interacting variables? Cheers,
    Mark

    0
    #170770

    SS
    Member

    DOE needs to follow regression. In regression you check whether certain X’s, and combinations thereof, have an impact on Y and how significant the relationship is. In DOE you take those X’s, see how much impact they have on Y, and through response optimization determine the best possible combinations. My two cents.
    Also in regression both X and Y are continuous however in DOE mostly your factor would be discrete and Y discrete or continuous.
    In regression you cannot check for multicollinearity but in DOE you can…

    0
    #170771

    Sinnicks
    Participant

    SS, did you mean to reply to my post to Robert? If you did, I don’t see how your post answers my question. Mark

    0
    #170772

    Robert Butler
    Participant

     DOE is a method for gathering data. It is not connected to or driven by regression in any way.  Regression is one tool you can use to analyze the data you gathered using the method of DOE. 
      Because it is a series of rules for gathering data, DOE doesn’t check anything.  When you build a DOE you construct it so that when the data is gathered and analyzed the X’s of interest will not exhibit collinearity. To test for multicollinearity (for example, when one or more of the experiments failed to run) you will need to use regression and the methods of regression diagnostics to assess the degree of multicollinearity (confounding) and whether or not it is an issue.
      The rule concerning the need for X and Y to be continuous before running a regression is a very conservative piece of boilerplate that is taught in intro BB courses to make sure you don’t make any egregious mistakes when you are starting out.  In fact X and Y can be either discrete or continuous but you have to know what to do if they are.  The posts below provide some information in this regard.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=59278
    https://www.isixsigma.com/forum/showmessage.asp?messageID=109369
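    The point about checking multicollinearity after a failed run can be illustrated with a minimal sketch. It assumes Python with NumPy rather than the tools used in this thread, a hypothetical 2^3 full factorial, and a hypothetical failure of the run at A = B = C = +1. The variance inflation factor (VIF) is used here as one common regression diagnostic; the post does not name a specific one.

```python
import itertools
import numpy as np

def vif(X):
    """Variance inflation factors: the diagonal of the inverse correlation matrix."""
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# Hypothetical 2^3 full factorial in coded units (-1/+1); columns are A, B, C.
full = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)

# Hypothetical failure: the last run (A = B = C = +1) was never completed.
damaged = full[:-1]

vif_full = vif(full)        # orthogonal design: every VIF equals 1
vif_damaged = vif(damaged)  # slightly above 1: mild, probably harmless, collinearity

print("VIF, full design:   ", vif_full)
print("VIF, one run missing:", vif_damaged)
```

    A VIF of 1 means a column is uncorrelated with the others. Values only slightly above 1, as with a single missing run here, suggest the loss of orthogonality is small; this is a judgment the design alone cannot make without the diagnostic.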
     

    0
    #170775

    Outlier, MDSB
    Participant

    SS,
    Some of what you said is true, but some of it is either wrong or misleading. Your first paragraph is essentially correct.
    Be aware that there is a subtle but important difference between “multicollinearity” and “interactions.” Two inputs can interact to have a greater impact on Y but still be independent of each other.
    Your second paragraph is misleading. In Linear (and Multiple Linear) Regression, both X and Y are continuous. There is also Logistic Regression in which your Y is discrete and your X can be discrete or continuous. In DOE you can have either discrete or continuous inputs or outputs. I see many DOE’s with only continuous factors.
    Regarding your last sentence, you can absolutely check for multicollinearity with multiple regression; in fact you MUST check for it to ensure your regression model is valid. With DOE, when you set up your design structure, it is the orthogonal design that ensures independence of your variables.
    Some other differences between DOE and Regression:
    Regression uses “passive” data, meaning you are analyzing data after it was collected. With DOE your data collection is “active” meaning you plan how and when you will collect your data in advance.
    With DOE you have blocking and other noise treatments to insure the signal you see comes from your factors. With Regression you have limited noise control due to the passive nature of the data collection.
    With DOE you have more control over the measurement system accuracy than with regression.
    Both methods have a goal of creating a predictive equation.
    Both methods could have “lurking variables” meaning some outside factor that you haven’t controlled for is impacting your Y.
    Hope that was helpful.
     
     

    0
    #170776

    Robert Butler
    Participant

      When you say “interacting variables” I’m assuming you mean an interaction involving two or more variables.  If this is the case then there isn’t anything to differentiate.  If you have two variables and you express their settings for each experiment in matrix form then the interaction is just the multiplication of the two columns representing the settings of the two variables.  This column, like the column for each of the two variables, identifies the way in which you are to use the data from the individual experiments to determine the size of the effect of the interaction. 
      The interaction may be independent of the two variables of interest or not -whether it is will depend on the degree of orthogonality between the two variables of interest.
    For example if we have 2 variables and their settings for 4 experiments are
    Exp   A   B   AB
    1    -1  -1    1
    2     1  -1   -1
    3    -1   1   -1
    4     1   1    1

      Then the diagnostics indicate that A and B are orthogonal and thus so is AB relative to A and B. This independence means all three terms can be used in the model.  On the other hand if the matrix is
    Exp   A   B   AB
    1    -1  -1    1
    2     1  -1   -1
    3    -1  -1    1
    4     1   1    1
      Then A and B are not perfectly orthogonal and the diagnostics indicate AB = intercept - A + B, so AB is not independent of A and B. A and B are “reasonably” independent of one another so a model could be examined which only had A and B as “independent” terms.
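    The two matrices above can be checked numerically. The following is a minimal sketch, assuming Python with NumPy (not a tool mentioned in the thread); the helper names `design` and `with_intercept` are made up for illustration. Zero off-diagonal entries in X'X indicate orthogonal columns, and a rank below the number of columns indicates that some term is a linear combination of the others.

```python
import numpy as np

def design(A, B):
    """Columns A, B and their product AB, in coded (-1/+1) units."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return np.column_stack([A, B, A * B])

def with_intercept(X):
    """Prepend a column of ones for the intercept."""
    return np.column_stack([np.ones(len(X)), X])

first = design([-1, 1, -1, 1], [-1, -1, 1, 1])    # the orthogonal matrix
second = design([-1, 1, -1, 1], [-1, -1, -1, 1])  # run 3 has B = -1 instead of +1

gram_first = first.T @ first     # 4 on the diagonal, 0 elsewhere
gram_second = second.T @ second  # non-zero off-diagonal entries

rank_first = np.linalg.matrix_rank(with_intercept(first))    # 4: all terms estimable
rank_second = np.linalg.matrix_rank(with_intercept(second))  # 3: AB is redundant

# Correlation between A and B in the second matrix ("reasonably" independent).
corr_ab_second = np.corrcoef(second[:, 0], second[:, 1])[0, 1]

print(gram_first, rank_first)
print(gram_second, rank_second, round(corr_ab_second, 3))
```

    In the second matrix the rank drops to 3 with four columns (intercept, A, B, AB), which is the numerical counterpart of the statement that AB cannot be estimated separately from the intercept, A, and B.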

    0
    #170795

    Ron
    Member

    Sometimes the best answer is the simplest answer:
    Regression is the study of data taken from past performance, historical data. A DOE is a planned activity with predefined combinations of factors that must be followed in a prescribed random manner.
    Regression is easy and cost effective but limited in its ability to define the key factors.
    A DOE is designed to determine the key factors.

    0
    #170798

    Sinnicks
    Participant

    Robert, the point I’m trying to understand is whether or not two variables can be both independent and interacting. By interacting I mean the variables do not have a linear relationship. PS: You might have already explained my question in terms of the matrices, but I’d find it easier to understand if you could answer my question using the terms I’ve used. Thanks,
    Mark

    0
    #170799

    franky
    Participant

    Outlier—
     
    that was a great explanation…thank you

    0
    #170800

    Robert Butler
    Participant

    Hmmmm….  I guess I’ll need some more details.  If interacting means “two variables do not have a linear relationship” I need to know what you mean by this statement.
    Let’s say we have Y and X1 and X2 where we want to look at the relationship  Y = some function of  X1 and X2. Then your statement could mean any of the following:
    1. X1 and X2 are related in a linear fashion so that
    X1 = b0 + b1*X2. If this is the case then they are linearly related and they are not independent.
    2. X1 and X2 are related in some curvilinear way so that
    X1 = b0 + b1*X2*X2. If this is the case then they are not linearly related (that is, the relationship is not a straight line) and they are not independent.
    3. X1 and X2 are not some function, regardless of type, of one another and, say, Y = b0 + b1*X1*X1 + b2*X2*X2.  Then X1 and X2 are independent and their relationship with Y is not a straight line – that is, it is not linear.

    0
    #170801

    Vargas
    Member

    Robert, can you help me to understand the meaning of orthogonality (as in “degree of orthogonality”)? Thank you, Srini

    0
    #170831

    Sinnicks
    Participant

    Thank you for your patience Robert …
    What I meant by linear is a linear combination of x1 and x2, e.g.:
    y = x1 + x2
    If on the other hand x1 = x2, then the equation above reduces to
    y = X2^2
    and y is not linear.

    I’m now curious about the interaction term: y = x1*x2, which I believe by the same thinking is non-linear. My question is: can x1*x2 exist when x1 and x2 are completely independent – in other words when they are completely orthogonal?
    I’ve thought about this on and off for several years, but the only case I can think of is the unrealistic mathematical case when, on the interaction diagram of y = x1*x2, x2 at level 1 is parallel to the x1 axis and x2 at level 2 is vertical to the y axis. This seems to me to be the only truly independent case for the interaction term x1*x2 and unrepresentative of any process I’ve ever seen.
    If my thinking is correct, then the use of an interaction column is an admission that some variables are probably not truly orthogonal, and of course they should be :-) What do you think? Cheers,
    Mark

    0
    #170835

    Robert Butler
    Participant

      If X1 = X2 then the equation y = X1 + X2 would reduce to y = 2*X1 or y = 2*X2 not y = X2^2.
      As for an interaction being non-linear and independent  -first, if we are talking about a case where X1 and X2 have only two levels and are completely independent (that is orthogonal) then the interaction will still be linear and the interaction will be independent of X1 and X2.  If the variables have 3 levels then the relationships could be either linear or curvilinear.
       You will have to think in terms of matrix arrays in order to understand the issues concerning independence (and orthogonality) of the three (X1, X2, and X1X2). I addressed this in an earlier post to this thread.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=139371
      With respect to orthogonality you should note that two terms are orthogonal to one another if one column cannot be expressed as a linear combination of the other columns in the array.  In the second example in the cited post, the way you do the addition for the array to show the lack of independence of AB relative to A and B is the following:
    Exp   A   B    (intercept - A + B)      AB
    1    -1  -1    1 - (-1) + (-1) =         1
    2     1  -1    1 - (+1) + (-1) =        -1
    3    -1  -1    1 - (-1) + (-1) =         1
    4     1   1    1 - (+1) + (+1) =         1
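    The row-by-row addition above can also be recovered by solving for the coefficients directly. This is a minimal sketch, assuming Python with NumPy (not a tool used in the thread): least squares is asked to express AB in terms of the intercept, A, and B for the second matrix, and an exact fit confirms the dependence.

```python
import numpy as np

# Settings from the second (non-orthogonal) matrix.
A = np.array([-1, 1, -1, 1], dtype=float)
B = np.array([-1, -1, -1, 1], dtype=float)
AB = A * B

# Try to write AB as c0*1 + c1*A + c2*B.
M = np.column_stack([np.ones_like(A), A, B])
coef, _, rank, _ = np.linalg.lstsq(M, AB, rcond=None)

# Largest gap between the fitted combination and the actual AB column.
max_error = np.max(np.abs(M @ coef - AB))

print("coefficients (intercept, A, B):", np.round(coef, 6))
print("largest reconstruction error:  ", max_error)
```

    Coefficients of 1, -1, and 1 with essentially zero error reproduce AB = intercept - A + B. Running the same check on the first (orthogonal) matrix would leave a non-zero error, since there AB is not such a combination.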

    0
    #170838

    Sinnicks
    Participant

    Thanks for correcting my mistaken thinking. I can now see that an interaction can be both linear and independent. However, I’m still left with the difficulty of picturing an independent interaction. Do you agree the independent interaction y = x1*x2 would have to have x2 = low, parallel to the x1 axis, and x2 = high, parallel to the y axis? In the meantime I’ll study your other post. Cheers,
    Mark

    0
    #170840

    Robert Butler
    Participant

    No. Rather than try to describe this, just plot the following:
     A   B    Y
    -1  -1    3
     1  -1    4
    -1   1    6
     1   1   12
    Plot Y against A where you have held B constant (you should have 2 lines, one for B = -1 and one for B = 1) and note that the two lines are not parallel to one another.  The lack of parallel lines when you shift from a low to a high B value is a graphical indication of the existence of an interaction.  If you test these numbers in Minitab (toss in a 5th point, -1, -1, 2.9, just so Minitab will have some error to work with) you will see that A, B, and AB are significant.
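    For readers without a plotting tool to hand, the non-parallel lines can be seen from the slopes alone. This is a minimal sketch, assuming Python with NumPy in place of a hand plot or Minitab; it uses only the four original points, and the helper name `slope_at` is made up for illustration.

```python
import numpy as np

A = np.array([-1, 1, -1, 1], dtype=float)
B = np.array([-1, -1, 1, 1], dtype=float)
Y = np.array([3, 4, 6, 12], dtype=float)

def slope_at(b_level):
    """Slope of Y against A using only the runs where B equals b_level."""
    rows = B == b_level
    return np.polyfit(A[rows], Y[rows], 1)[0]

slope_low = slope_at(-1)   # line for B = -1
slope_high = slope_at(1)   # line for B = +1

# In the model Y = b0 + b1*A + b2*B + b12*A*B the slope against A is b1 + b12*B,
# so half the difference between the two slopes is the interaction coefficient.
b12 = (slope_high - slope_low) / 2

print(slope_low, slope_high, b12)
```

    Equal slopes would give parallel lines and an interaction coefficient of zero. Here the slopes differ (about 0.5 against 3), which is the same information the interaction plot conveys graphically.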

    0
    #170841

    Robert Butler
    Participant

    If we use the posts to this thread as a measure then there seems to be a lot of confusion concerning DOE and regression.  All of the statements below have been offered in this discussion and all of them are either in error or misleading.
     
     “Also in regression both X and Y are continuous however in DOE mostly your factor would be discrete and Y discrete or continuous.”
    “In regression you cannot check for multicollinearity but in DOE you can”
    “Regression uses “passive” data, meaning you are analyzing data after it was collected. With DOE your data collection is “active” meaning you plan how and when you will collect your data in advance.”
    “With DOE you have blocking and other noise treatments to insure the signal you see comes from your factors. With Regression you have limited noise control due to the passive nature of the data collection.”
    “With DOE you have more control over the measurement system accuracy than with regression.”
    “Regression is the study of data taken from past performance ,historical data”
    “Regression is easy and cost effective but limited in its ability to define the key factors”
     
    As I’ve already said DOE is a method for data gathering. Regression is a method of data analysis.
    Some Citations
    “The design of an experiment has been defined very simply as the order in which an experiment is run…Care will be taken to emphasize three important phases of every project: the experiment, the design, and the analysis” – Hicks – Fundamental Concepts in the Design of Experiments – 2nd Edition.
    Note the distinction between design and analysis in the above.
    Factorial Designs “In this field the design and the analysis of the results are closely linked.” – Davies – The Design and Analysis of Industrial Experiments – 2nd Edition  
    Note again the distinction and also note the distinction in the title of the book.
    Regression is a series of algebraic manipulations that can be applied to any kind of data.  This includes happenstance data – that is data that was not gathered following the rules of a design, and design data –data that was gathered following the rules of a DOE.
     Regression does not control anything. Regression cannot control noise but regression analysis can quantify the amount of noise in a block of data. The purpose of regression is to quantify the correlations between one or more variables identified as responses and one or more variables identified as predictors of those responses.
    This may seem like so much statistical nitpicking but lack of clarity with respect to understanding the function of the tools can lead to concluding things that are completely wrong.  Specifically, on a number of occasions I’ve been brought into an effort and done all of the usual things one is supposed to do in the Define and Measure phase.  After careful analysis I’ve reached the point where I thought a design would be appropriate.  I make the proposal for the design and I’m met with a complete refusal.  When I ask why I’m told – “We tried that design analysis stuff once. We ran one and it didn’t identify anything that was any real improvement over what we currently have.”  When I ask to see the analysis of the design the response is along these lines – “The design was the analysis and it didn’t show anything.”  When pressed further they will tell me that after they ran the design they looked at the results from the experiments and since none of the experiments run in the design was any kind of an optimum/improvement they concluded that the design had failed.  What they failed to understand was that the design only prescribed the way in which the data was gathered – their analysis was just the eyeballing of the data…and I would agree that that analysis failed to extract the information buried in the data.
     
      From time to time I find that the data from the design is still available.  Under those circumstances I request permission to actually analyze the data generated by following the rules of the design and every time I was able to use the resultant predictive equations to either identify an optimum or point in the direction where the optimum was most likely to be found. On more than one occasion it has turned out that all of the work on the problem in question had already been done – it was just that no one understood that they had to analyze the data from the design in order to extract the information it contained.

    0
    #170842

    Outlier, MDSB
    Participant

    Mark,
    To me the best way to think about the concept of independent variables interacting is to try to think of a real, easy to understand example.
    Say you were conducting an experiment to determine how to get the best fuel mileage from your car. Y is your MPG and is continuous data. Let’s say that you have found that the speed that you drive is one factor (X1) and tire pressure is another factor (X2). It is pretty easy to wrap your brain around a concept that speed and tire pressure are independent of each other. (Forget for a moment that at highway speeds, tires heat up and that raises the pressure. Assume that the pressure has already normalized.) So we can agree that changing tire pressure does not change speed and speed does not change tire pressure (after the temperature normalizes). But it is completely possible that the right combination of speed and tire pressure can have a greater impact on Y than either factor alone.
    That may not be a perfect example, but it should help you more clearly understand how variables can be independent and still interact to affect the output.

    0
    #170843

    Outlier, MDSB
    Participant

    Well said.
    DOE is a method of planned data gathering. Regression is a method of data analysis.
    I recognize some of my statements as being misleading, probably because I oversimplified. I guess that’s the risk of trying to boil down whole fields of study into a few bullet points.

    0
    #170855

    Anoop
    Participant

    Great answer !! sums it all up.
    you are more likely to use regression analysis in Analyze phase and DOE in Improve phase… I hope I am not oversimplifying there either ??

    0
    #170857

    Iain Hastings
    Participant

    Anoop,
    Ron’s answer is not a great answer and it does not sum things up accurately or appropriately. Similarly your statement is wrong and confused.
    I strongly recommend that both Ron and you take the time to read Robert Butler’s comments on this thread. He has spent considerable effort trying to remove confusion; unfortunately it appears to have been with limited success.

    0
    #170858

    Anoop
    Participant

    “you are more likely to use regression analysis in Analyze phase and DOE in Improve phase… I hope I am not oversimplifying there either ??”
    Why is this incorrect?? I am asking out of genuine curiosity or perhaps confusion…please elaborate. I did read through the previous answers and I still feel the same.

    0
    #170860

    Robert Butler
    Participant

      This modification of your sentence would be reasonable.
    “you are xxxx likely to use regression analysis in the Define and Measure phases when examining historical data.  In the Analyze phase you may find it appropriate to use DOE and when you do you may use regression to analyze the data gathered under these conditions. The use of a DOE in the Improve phase may be necessary to confirm your recommendations before full implementation.  Once again, you will probably use regression methods to analyze the data from the DOE. … I hope I am not oversimplifying there either ??”

    0
    #170861

    Anoop
    Participant

    Thanks Robert.
    That makes sense. I think it depends on the kind of project but I see what you are saying.
    – Anoop

    0
    #170904

    Sinnicks
    Participant

    Once again thank you for your patience … I plotted the data you posted by hand and confirmed the presence of an interaction – non-parallel curves y1 and y2. Just to clarify, the two curves y1 and y2 correspond to the two levels of x2.
    Previously, I referred to these two curves and noted as y1 goes from y = 3 to y = 5, y2 goes from y = 4 to y = 12. This seems to imply the two curves y1 and y2 are highly correlated? Therefore y1 and y2 are not independent. In other words, for y1 and y2 to exist it is necessary to have some dependency between x1 and x2.
    Therefore, it follows if x1 and x2 are truly independent, as they should be: there is no need for a x1*x2 term or column, and we can use saturated designs. Cheers,
    Mark

    0
    #170909

    Sinnicks
    Participant

    Hi, thank you for your participation … I understand your example, but do not see how you can assume independence if there is an interaction. Imagine a black box: the box has two knobs on the fascia – one labeled A and another labeled B. If A and B are independent, I can adjust the process perfectly well using either A or B, or an independent combination thereof.
    If A and B are not independent though, I have to be very careful how I adjust A because the setting of A demands a certain setting of B, due to the AB interaction. What would a reasonable man conclude? The presence of the AB interaction implies a dependency of the A knob on the B knob. Mark

    0
    #170912

    Sinnicks
    Participant

    Outlier, something weird happened to my earlier reply to your kind post. I’ll do my best to remember my earlier reply in case the first one appears and there is some inconsistency in argument :-)
    Imagine a black box with two knobs – one labeled A and the other B. If A and B are independent, I can adjust A or B independently to target the process, or any combination of A and B. However, if A and B are not independent and I can’t set A without deference to B’s setting, due to the interaction between A and B, a reasonable man might conclude A and B are not independent.
    Therefore, the substance of my argument is that the presence of the interaction AB is an acknowledgment that A and B are not independent. Do you think this is a reasonable conclusion? If not, where is the fallacy? Cheers,
    Mark

    0
    #170934

    Robert Butler
    Participant

      The problem is you are trying to make inferences concerning the relationship between X1 and X2 on the basis of Y.  If you go back to the posts below and re-read the definition of orthogonality and note how it is determined you can see that Y does not enter into the discussion or the equations.  Therefore, Y’s behavior has nothing to do with the relationship (or lack thereof) between X1 and X2.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=139371
    https://www.isixsigma.com/forum/showmessage.asp?messageID=139451
      One additional point – you said “This seems to imply the two curves y1 and y2 are highly correlated? Therefore y1 and y2 are not independent.”
      It is true there are two lines but they are not the result of two equations.  They are the result of a single equation and their plot is a function of X1 and X2.  The difference in their slopes is due to the presence of an X1X2 interaction. If it wasn’t there then they would have been two parallel lines.
    Y = 6.23 +1.76*X1 +2.76*X2 +1.23*X1*X2
      To see this just ignore the interaction term in the above and plot the lines that would be the result of
    Y = 6.23 +1.76*X1 +2.76*X2
      If you want to graphically assess the relationship between X1 and X2 you would plot them against each other  i.e. X1 on the X axis and X2 on the Y axis.  The plot, for the orthogonal matrix we have been using for this discussion will look like the four corners of a box.  If you regress X1 on X2 for the perfect matrix (don’t add the single replicate) you will get a slope of 0 and a correlation of 0 – the same would be true for X1 against X1*X2 or any other combination  of the X terms. With the replicate you will get a non-zero slope but it will not be significant and if you had the capability to test the matrix including the replicate you would find that X1, X2, and X1*X2 are orthogonal with respect to one another.
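    Both claims in this post can be checked numerically. This is a minimal sketch, assuming Python with NumPy as a stand-in for the Minitab analysis and using the four points plus the single replicate (-1, -1, 2.9) given earlier in the thread. It fits the single equation with the interaction term and then looks at the correlation between X1 and X2 with and without the replicate.

```python
import numpy as np

# Four design points plus the replicate at (-1, -1).
X1 = np.array([-1, 1, -1, 1, -1], dtype=float)
X2 = np.array([-1, -1, 1, 1, -1], dtype=float)
Y = np.array([3, 4, 6, 12, 2.9])

# One equation: Y = b0 + b1*X1 + b2*X2 + b12*X1*X2
M = np.column_stack([np.ones_like(X1), X1, X2, X1 * X2])
coef, *_ = np.linalg.lstsq(M, Y, rcond=None)

# The relationship between the X's is assessed without any reference to Y.
corr_perfect = np.corrcoef(X1[:4], X2[:4])[0, 1]  # four corners of the box
corr_replicate = np.corrcoef(X1, X2)[0, 1]        # with the extra (-1, -1) run

print("b0, b1, b2, b12:", np.round(coef, 4))
print("corr(X1, X2), perfect matrix:", corr_perfect)
print("corr(X1, X2), with replicate:", round(corr_replicate, 4))
```

    The fitted coefficients agree with the equation quoted above to within about 0.01. The correlation between X1 and X2 is zero for the perfect matrix and small but non-zero once the replicate is added, and neither figure depends on Y.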

    0
    #170939

    Sinnicks
    Participant

    I don’t understand why it’s a problem to make inferences about the relationship between X1 and X2 based on their responses. On the contrary, I believe a problem arises when such inferences are not made.
    As for there not being two equations, I plotted both and I can derive an equation for each. (?)
    Let me give a practical example for the black box I proposed in another recent post. Consider the exposure setting on a camera, which is a function of aperture size and exposure time. Can we really claim the aperture setting and the exposure setting are independent of each other? I don’t think so because I have to change the exposure time depending on what aperture I choose. I suspect you would call it a synergistic effect.
    So what is the origin of the so-called interaction? It arises because of our choice of response – the amount of illumination striking the film. The latter is not a good response because illumination is not proportional to a linear combination of aperture size and the exposure time. In other words, these two variables do not act independently. We can call it an interaction if you like, but to just give something a name does not imply the concept is useful.
    In this case, as in many other cases, the cause of the dependence is inherent in our choice of the response. In other words, the dependency is ‘built-in’ to the response.
    If I change the response to work – the amount of light energy striking the film (illumination squared) – the interaction disappears and each factor – aperture size and exposure time – acts independently of each other :-)
    Mark

    0
    #170946

    Robert Butler
    Participant

      The issue was that of independence of the X’s.  The definition of independence is that of orthogonality and the definition of orthogonality does not involve Y.
        As for the equations – yes, of course you did and you did it by ignoring the second X – that is, you built an equation relating Y and X1 using the two data points where X2 was -1 and then you built the second where X2 was fixed at 1.  In other words, you built two equations where your value of X2, for each of the equations, is a constant.  The question would be: given that X1 and X2 are independent, as defined previously, why do this?  If you analyze your data this way you run the risk of missing the interaction of the two variables and, if this were some kind of venture in efficient production of product, knowledge of that interaction could be the difference between profit and loss.
        One area where interactions are a fact of daily life is the world of applied chemistry. Consider sulphur, saltpeter, and charcoal.  They are three very independent and distinct things.  I can vary their ratios in a mix in any way I choose. Once I have a mix I can apply heat to it and for a number of mixes of the three I will get nothing more than a bunch of smoke and some residual goo. 
       If I put together the right combination and apply heat I’ll blow a hole in the side of my house and they will scrape my remains off of the kitchen ceiling.  That result is not due to a simple linear combination of the effects of the three variables- it is an interaction. Interactions are the heart and soul of thousands of very profitable chemical processes.  Without an interaction these processes wouldn’t exist. 
      I’m confused by your discussion of aperture and exposure time since at the start you seem to be saying they aren’t independent but at the end you seem to be saying they are.  However, I’ll offer the following:
     For aperture and exposure time – there is nothing that says you have to change one when you change the other. I do large format photography and I have complete control over how I make those settings.  The response to my settings will be whatever it is – I may decide I want a particular response and so to get it I will make changes in one or both, but it isn’t the response that is driving the settings; it is my desire to achieve a particular response that will lead me to choose a particular setting…and the only way I can know which setting to choose is because I experimented and varied those settings in some independent way. It is the results of my analysis of those experiments that allowed me to know what kind of a response I could expect when I used a particular combination of aperture and exposure.
     

    0
    #170952

    Sinnicks
    Participant

    The point I tried to make was that the origin of the interaction X1*X2 is a dependency between X1 and X2, because of our choice of response.
    I can’t comment on your excellent chemical example, because I don’t understand enough about chemical reactions, but isn’t there a fourth component – heat? Who can say what happens to the mixture of three elements in the micro-second before the mixture explodes – perhaps all four components suddenly fuse, at which point they are no longer distinct and independent.
    As for the photographic example, you can of course create many different kinds of images – some over-exposed and some under-exposed. I had hoped you would assume I was referring to the correct exposure condition. You can either measure the amount of energy falling on to the paper, or you can run an experiment to determine the correct exposure, as judged by edge acutance and grain size.
    Anyway, thanks for humoring me! Mark

    0
    #170954

    Vallee
    Participant

    Mark, X1 and X2 in your camera example are independent. The increase of one or the decrease of the other does not have an effect on one or the other. Now the manipulation of one or the other can have an effect on the final picture, which can then be put into a formula to get the best picture. If you hold one X constant then of course you will have to increase or decrease another X, if not add another one to the equation, like the type of film you select. I like the sweet tea example… cold tea is not good without dissolved sugar in it. The sugar does not dissolve effectively unless put into hot water first; therefore, temp has an effect on sugar’s ability to dissolve.
    Robert, I usually don’t get involved in your responses because in most cases you make me dig up my old textbooks as a refresher. I just couldn’t help myself.
    HF Chris Vallee

    0
    #170960

    Sinnicks
    Participant

    HF,
    EV = log2(N^2/t)
    Perhaps Robert would be kind enough to explain it to you.
    As for myself, I don’t think debate is dead, and I think there is still plenty to debate about in the world of Six Sigma.
    Mark
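    One way to read the formula quoted above is sketched below, assuming Python and assuming the usual photographic meaning of the symbols (N as the f-number, t as the exposure time in seconds); the two levels chosen for each are hypothetical. Because EV = 2*log2(N) - log2(t), the log-scale response is additive in the two settings, while the light reaching the film, taken here as proportional to t/N^2, is not. The sketch compares the interaction contrast of a 2x2 layout for both responses.

```python
import math

# Hypothetical low/high levels for the two settings.
N_LEVELS = (4.0, 8.0)            # f-number
T_LEVELS = (1 / 125, 1 / 30)     # exposure time in seconds

def ev(N, t):
    """Exposure value as quoted in the post: EV = log2(N^2 / t)."""
    return math.log2(N ** 2 / t)

def light(N, t):
    """Relative light reaching the film, assumed proportional to t / N^2."""
    return t / N ** 2

def interaction_contrast(response):
    """Effect of N at high t minus the effect of N at low t."""
    (n_lo, n_hi), (t_lo, t_hi) = N_LEVELS, T_LEVELS
    effect_at_t_hi = response(n_hi, t_hi) - response(n_lo, t_hi)
    effect_at_t_lo = response(n_hi, t_lo) - response(n_lo, t_lo)
    return effect_at_t_hi - effect_at_t_lo

ev_contrast = interaction_contrast(ev)        # essentially zero: additive response
light_contrast = interaction_contrast(light)  # non-zero: N and t interact

print(ev_contrast, light_contrast)
```

    The contrast shows that whether an interaction term is needed can depend on the scale chosen for the response. It says nothing about whether N and t can be set independently of one another, which is the separate question being argued in the posts above.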

    0
    #170962

    Robert Butler
    Participant

       I think we might be getting bogged down in semantics here but if your point is best summarized by the first sentence of your most recent post – “the origin of the interaction X1*x2 is a dependency between X1 and X2, because of our choice of response.” then maybe we have a way out of this.
      If you know the relationship between the X’s and the Y because of prior work and you have quantified that is some fashion (say a predictive equation) then you can use that equation to identify those combinations of X1 and X2 that will give you a preferred Y. 
      In that kind of after-the-experiments-have-been-analyzed-and-the-equation-built sense, yes Y is driving the choice of X1 and X2 but it isn’t driving the relationship because when you did the initial work to identify the relationship between Y and X1 and X2 you did it by changing X1 and X2 and recording whatever value of Y resulted – not the other way around.
      An aside – heat merely acts to get the reaction going.  If sulphur, saltpeter, and charcoal are not in the proper proportions you can heat the mix until you are the best customer of the local natural gas company and nothing of any import will happen. 
      An examination of the residue after an explosion will show that it contains sulphur, saltpeter, and charcoal – in fact, it is by examining the residue after an explosion has occurred that forensic experts are able to tell what kind of an explosive was used. 
      The reaction is that the charcoal provides fuel to the sulphur and the saltpeter kicks in with a huge dump of oxygen.  In the right ratio the oxygen dump is so overwhelming that what you get is a very fast burning fire. If a fire burns fast enough we call it an explosion.
        

    0
    #170966

    Vallee
    Participant

    Mark,
    I understand the formula; however, it is still an optimized manual process that is affected by the end result needed, not by the dependency of one on the other. The key word here is dependency.
    HF Chris Vallee

    0
    #170968

    Sinnicks
    Participant

    Robert:
    I really appreciate your contributions – I’m just trying to stimulate debate and reasoned argument.
    Yes, I fully understand your position and I really appreciate your willingness to consider my counter arguments to what is essentially Box’s position.
    Your use of the chemical argument for interactions is particularly difficult to overcome from the Taguchi perspective, but rather than dismiss him out of hand, I’ve tried to walk in his shoes and see his world-view.
    As far as I’m concerned, you’re pretty unique amongst posters on this site, and while it is not my intention to belittle anyone else, there are many others who appear to take the view – it’s like this because so-and-so said so. How can these people teach in a classroom where difficult questions are par for the course?
    We now live in a world that seems bereft of reasoned argument and my hope is Mike, Alister and HF Chris will get together and decide what needs to be done to improve Six Sigma.
    I defer to you in this instance and look forward to debating other issues with you in the near future.
    Best wishes,
    Mark (an alias for the Grail fool)

    0
    #170969

    Sinnicks
    Participant

    HF,
    My argument is more easily seen if you change the subject of the formula: for a fixed EV – say the correct exposure – aperture depends on exposure time.
    Cheers,
    Mark

    0
    #170970

    Vallee
    Participant

    Mark,
    Thanks, I actually looked up the formula in relation to photography to see the connection. However, if you look at the sugar example with sweet tea, you are not manually changing the properties of the tea; the temperature of the water is. Thus there is a dependency and an interaction, whereas in your photo example these are optimal settings dependent on the output desired (double exposure, white light, etc.).

    0
    #170980

    Sinnicks
    Participant

    HF,
    I’ve only had a brief introduction to dissolution rates, and I’m not sure how it applies to sugar and tea.
    Does the dissolution rate of sugar in tea depend on an interaction between the temperature of the water and the concentration of tea? I have no idea.
    Even if we consider stirring as a factor, how do we know if it is an interaction or just a linear combination of tea concentration, the amount of sugar, the stirring rate and time, and the water temperature?
    I previously tried to argue that if all these factors are independent, the interaction term will be zero, and the converse – if the process is described by an interaction, then the factors are not independent due to the choice of the response – in this case the dissolution rate of sugar.
    If I put my Taguchi hat on again – since I can’t find an equation describing the dissolution rate as a function of the permeability and precipitation of sugar, I’d use sliding levels to avoid any possible interaction, or try to find a response closer to work done (energy).
    Cheers,
    Mark (an alias for the Grail fool)

    0
    #170981

    Vallee
    Participant

    Mark,
    I grew up in the south. Just try to mix sugar in cold water: no matter how strong the tea is, it just sits on the bottom. No matter how sweet you want the tea to be (the photo quality), there is no optimal mixture (aperture etc.). Now if you take your optimal photo settings you can see a correlation and a possible interaction, but it is not cause and effect, just a controlled optimal correlation. If you don’t change your aperture setting it has no effect at all on the exposure time setting, just on your photo quality. Not sure if Robert would agree but this has always worked conceptually for me.

    0
    #170986

    Robert Butler
    Participant

        I was wondering if the view you were considering was Taguchi’s. If you are trying to understand his point of view and understand the “traditional” view and reconcile the two you will have to spend some time understanding the mathematics and the manipulation of the elements of a design matrix.  In particular, you will need to understand orthogonal in the mathematical sense and you will have to understand how design matrices are manipulated by a computer to extract the measure of the effect of a variable in the design.
      You will also need to understand that an awful lot of the supposed differences in the two methods is nothing more than word play.  To this last point I’ll offer the links below to a couple of posts of mine.  The first is an overview of some of the issues you might want to consider.  The second series of posts are part of a thread.  I’ve listed the posts in the back and forth order in which they appeared – two are mine and one is from one of the individuals with whom I was conversing.  The entire thread was a good discussion and you may want to read all of it. If not, the three listed are the ones that are central to the issue of method comparison.
     https://www.isixsigma.com/forum/showmessage.asp?messageID=108236
    Discussion series:  We were discussing the issues surrounding the choice of a design to analyze a process.  One of the posters asked for an example of the Taguchi approach.  The individual who posted the following offered an example from Roy’s book and I offered the “traditional” approach to the same problem.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=113397
    https://www.isixsigma.com/forum/showmessage.asp?messageID=113415
    https://www.isixsigma.com/forum/showmessage.asp?messageID=113428
      As I mentioned in the thread, from my personal standpoint, as long as you are thinking design and not one-at-a-time or wonder-guess-putter, I don’t care which design you use.  
    One thing to consider when you are comparing the two is this – if Taguchi doesn’t think there is such a thing as an interaction then why hasn’t he made this belief a part of his designs?  If I have a fractional design for an inner array and a full factorial design for an outer array then that full factorial is needed only if you have an interest in chasing the effects of outer array interactions.  Furthermore, the multiplication of the two designs will result in experiments that are necessary only if you have an interest in checking interaction effects between inner and outer array elements. 
      The same issue with respect to inner and outer array elements holds true if both the inner and the outer array are fractional factorials. So if Taguchi really doesn’t believe in interactions then why hasn’t he, or one of his followers, done the obvious and removed those unneeded experiments (costs) from his design matrices?
      As a final thought please don’t think I’m going to go off on some kind of a Taguchi bash.  I don’t do bashing or flaming – they are a waste of time and computer key strokes.  The only reason for offering the above is because you give me the impression you are trying to get a better understanding of issues and concepts and I hope the above will be of some value.

    0
    #171002

    The Grail Fool
    Member

    HF,
    If you grew up in the south you ought to know you put your sugar along with your lemon in your cold tea before you stand your tea in the sun :-)
    I can’t speak to the sugar dissolution problem because it’s outside my experiential comfort zone :-) However, I’ve a vague familiarity with some optical concepts.
    Let me give you another optical example, one I’m sure you’re familiar with.
    The resolving power of an optical lens is given by:
    r = 1.22 Lambda / D
    where Lambda is the wavelength of light and D is the size of the aperture.
    If Lambda and D are independent, we might speculate a relationship as follows:
    r ~ 1/D + Lambda
    Unfortunately, this is clearly wrong because when D = 0 there is still some resolving power. Therefore, r has to comprise an interaction between 1/D and Lambda. In other words, D and Lambda are not independent. Let us now consider the origin of this dependence.
    When light waves impinge on an aperture, the resulting pattern of light intensity is known as interference, the result of which is to increase the blur size in the image plane. In other words, Lambda and D are not independent as we might expect. Once someone chooses a resolving power and wavelength of light, they lose the freedom to choose D, even though there is an obvious button to twiddle. It’s a bit like having five numbers, but only four degrees of freedom with which to calculate standard deviation. Of course one is free to obtain the wrong answer …
    Does this imply there is a correlation and by implication no cause and effect between Lambda and Aperture? Who knows – it depends on your choice of response.
    Cheers,
    The Grail Fool

    0
    #171005

    The Grail Fool
    Member

    HF,
    I thought most people down south put sugar and lemon in their tea before standing it in the sun :-)
    Anyway, I’ve thought of another example that you might be more familiar with – resolving power, r.
    r = 1.22 L/D
    where L = wavelength and D = aperture.
    If you want a particular r to identify a tank at 4,000 meters, then you’ll need a fairly large hole, D, in your cupola. Since you can’t choose a wavelength outside a certain band, you’re stuck with the large hole – even though in principle you can select an aperture of any size.
    The interaction describing the relationship between r and L arises because of interference in the aperture plane and consequential blurring in the image plane; so there is a clear dependency or correlation between D and L, or between r and L.
    Is this cause and effect? I’d say so provided you’ve studied the relationship using DOE. In fact this is one of the main advantages of DOE – because the method does not rely on unbiased happenstance data.
    As far as the camera model is concerned it is more clearly seen if you pose an incorrect relationship – one that relies on the independence of D and t.
    Clearly, the EV does not ~ D + t because when t = 0 the equation predicts a value of EV.
    This is why we need the multiplicative form that corresponds to an interaction. (Not the same as a reaction.)
    Cheers …

    0
    #171087

    Vallee
    Participant

    Mark (The Grail Fool)
    I have not ignored the post; I just need to do a little research on your example and I am leaving for a facilitation for the next 10 days. I would like to understand the concept given in your example before I respond.
    Thanks

    0
    #171090

    The Grail Fool
    Member

    No problem Chris, I look forward to continuing our discussion.
    While researching, you might want to take a look at the following article:
    http://www.geocities.com/jswortham/gunpowder.html
    I should also acknowledge my training under Dr. Taguchi, as I don’t want you to think my arguments are in any way original. I also don’t want to imply I’m sufficiently competent to represent his views.
    Specifically, Dr. Taguchi often recommends trying to find responses closely related to energy in order to minimize interactions.
    (Although Taguchi Methods is often not considered part of Six Sigma, it was in Austin. Someone changed Six Sigma and we ought to get back to what worked as quickly as possible. As has been mentioned in this forum previously, early Six Sigma even included much of TPS – hardly surprising given the number of Japanese engineers who came to Austin.)
    My understanding of Taguchi’s justification for his approach is the question of stability. This is my interpretation – I may get it wrong.
    If the process can be modeled by a linear equation, such as:
    y = x1 + x2, where x1 and x2 are independent, then the error is just:
    dy/dx = dx1/dx + dx2/dx
    On the other hand, if y = x1*x2, then the error will be:
    dy/dx = x1(dx2/dx) + x2(dx1/dx)
    So you can see the potential for error (variation) is much greater with dependent variables. (By dependent I mean the level of one determines the level of the other.) This is why Taguchi recommends trying to avoid setting up processes with interactions.
    From a practical point of view, if a piece of equipment has two knobs A and B, there is no knob labeled AB anyway, which is a distinct disadvantage on the shop floor.
    Good luck with the facilitation ..
    Cheers,
    TGF

    0
    #171091

    Robert Butler
    Participant

      If you are trying to understand the similarities/differences between Taguchi and “traditional” you might find this old post of mine of some value.
    https://www.isixsigma.com/forum/showmessage.asp?messageID=108236
      Interactions:
     I can’t speak to the mathematics you’ve offered concerning Taguchi’s justification for avoiding interactions but I can offer the following concerning the knobs on the factory floor – there doesn’t have to be a knob labeled AB.  You have control over A and you have control over B, and you set them at the levels identified by the regression equation that will take the most advantage of the interaction term.
      One additional thought is this – if Taguchi really doesn’t think there is such a thing as an interaction then why hasn’t he, or one of his followers, reduced his designs to reflect this?  If you take an inner array and an outer array and multiply the two together you get experimental combinations whose only value is that they will provide you with information concerning interaction terms.
     

    0
    #171093

    The Grail Fool
    Member

    The preference for avoiding interactions in the inner array also applies to main effects and noise factors in the outer array. Unfortunately, a linear combination of process settings and locations within processes is as yet unknown :-)

    0
    #171094

    Robert Butler
    Participant

      I’m not sure we’re talking about the same thing. Let’s take a simple example.
     2 process variables (A, B) and two noise factors (P, Q), each at two levels.
     The inner array is
       A   B
      -1  -1
       1  -1
      -1   1
       1   1
     and the outer array is
       P   Q
      -1  -1
       1  -1
      -1   1
       1   1
    If we multiply these together we get the following
       A   B   P   Q
      -1  -1  -1  -1
       1  -1  -1  -1
      -1   1  -1  -1
       1   1  -1  -1
      -1  -1   1  -1
       1  -1   1  -1
      -1   1   1  -1
       1   1   1  -1
      -1  -1  -1   1
       1  -1  -1   1
      -1   1  -1   1
       1   1  -1   1
      -1  -1   1   1
       1  -1   1   1
      -1   1   1   1
       1   1   1   1
     which is identical to a 4 variable 2 level factorial design.  This means that there are experiments in this design which allow for the investigation of AB, PQ, AP, AQ, BP, BQ, and all of the three-way interactions and the four-way interaction as well.
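The crossing described above can be checked mechanically. The following is a minimal Python sketch (the variable names are illustrative and not from the post) that pairs every inner-array run with every outer-array run and confirms the result is the full 2^4 factorial:

```python
from itertools import product

# Inner array (A, B) and outer array (P, Q), as listed in the post.
inner = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
outer = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

# "Multiplying" the arrays: every inner run is paired with every outer run.
crossed = [ab + pq for pq in outer for ab in inner]

# A full two-level factorial in four variables has 2**4 = 16 distinct runs.
full_factorial = set(product((-1, 1), repeat=4))

for run in crossed:
    print(run)
print(len(crossed), set(crossed) == full_factorial)
```

Because the crossed array contains every combination of the four factors, it carries the runs needed to estimate every interaction among A, B, P and Q, which is the point being made about its cost.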
    If you were going to just look at main effects, this design would shrink to the following:
       A   B   P   Q
      -1  -1  -1  -1
       1   1  -1  -1
       1  -1   1  -1
      -1   1   1  -1
       1  -1  -1   1
      -1   1  -1   1
      -1  -1   1   1
       1   1   1   1
      and, you could probably knock this down by a few more if you wanted to run the above through a D-optimal procedure since even this design allows for the investigation of a couple of the two way interactions.
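As a rough check on the eight-run design above, this Python sketch (helper names are hypothetical) verifies that the four main-effect columns are mutually orthogonal and that every run satisfies A*B*P*Q = +1, so each two-way interaction is aliased with another one (AB with PQ, for example). That aliasing is one way to read the remark that the design still allows a couple of two-way interactions to be investigated:

```python
# The eight runs (A, B, P, Q) of the reduced design listed in the post.
half = [
    (-1, -1, -1, -1),
    ( 1,  1, -1, -1),
    ( 1, -1,  1, -1),
    (-1,  1,  1, -1),
    ( 1, -1, -1,  1),
    (-1,  1, -1,  1),
    (-1, -1,  1,  1),
    ( 1,  1,  1,  1),
]

def dot(u, v):
    """Inner product of two columns; zero means the columns are orthogonal."""
    return sum(x * y for x, y in zip(u, v))

cols = list(zip(*half))  # columns A, B, P, Q

# All pairs of main-effect columns should be orthogonal.
main_effects_orthogonal = all(
    dot(cols[i], cols[j]) == 0 for i in range(4) for j in range(i + 1, 4)
)

# Defining relation: the product A*B*P*Q is +1 in every run.
defining = [a * b * p * q for a, b, p, q in half]

# Consequence: the AB column is identical to the PQ column (they are aliased).
ab = [a * b for a, b, p, q in half]
pq = [p * q for a, b, p, q in half]

print(main_effects_orthogonal, set(defining), ab == pq)
```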
      This same issue applies to other multiplications of inner and outer arrays.

    0
    #171098

    Vallee
    Participant

    Robert,
    Maybe I misunderstood his original post. But was he not asking whether variables could be independent and still have an interaction? To put it more simply, you can use a linear regression formula to find an optimal output (the best fit of all weights) but this does not indicate dependency of the x’s. The transformation of one x caused by another, however, would indicate dependency if this was required to get the correct output… also thanks for the update on orthogonal rotation in the other post.  Turns out I did remember something from the old stats days.
    HF Chris Vallee
     

    0
    #171104

    The Grail Fool
    Member

    You’re correct, we’re not talking about the same thing, which is an important point we’ll have to come back to later.
    Your outer/inner array point is also well taken – I performed the same analysis myself in 1986. Surely, before discussing a model, we must first decide what assumptions have been made about the nature of the process – whether they’re valid or not – the responses, the control factors, the noise factors, and any relationship between factors, before we can make such a decision?
    Or do you claim the structure you propose is independent of process?
    In order to clarify the last point, may I suggest we define a process in terms including, but not limited to: chemical components, chemical concentration, mechanical action, time, and temperature.
    (By mechanical action, I’m referring to physical variation within a process.)

    0
    #171105

    TGF
    Member

    HF,
    No, my contention is truly independent variables do not interact. The fact that an interaction exists suggests that what we previously thought of as independent, by putting the variables in columns where they ought to be independent, was mistaken.
    As justification for this I’ve cited Taguchi’s position that interactions by and large depend on our choice of response.
    For example, if instead of using EV as a response we used log EV, the interaction disappears!
    I’ll leave it to others to decide whether this is problematic or not.
    I hope you and Robert are enjoying the discussion. If it’s frustrating – I’ll just go away :-)
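A small numerical illustration of the transformation argument, written as a hedged Python sketch: the factor levels and both response functions below are hypothetical, chosen only to show the arithmetic. For a purely multiplicative response the two-factor interaction contrast is non-zero on the raw scale and vanishes on the log scale; for a response that mixes additive and multiplicative terms the log does not remove it.

```python
import math

# Hypothetical low/high settings for two factors (not from the thread).
levels_x1 = (2.0, 4.0)
levels_x2 = (3.0, 6.0)

def interaction_contrast(f):
    """Two-factor interaction contrast of response f over the 2x2 design."""
    lo1, hi1 = levels_x1
    lo2, hi2 = levels_x2
    return (f(hi1, hi2) - f(hi1, lo2) - f(lo1, hi2) + f(lo1, lo2)) / 2

# Purely multiplicative response: y = x1 * x2.
raw = interaction_contrast(lambda x1, x2: x1 * x2)
logged = interaction_contrast(lambda x1, x2: math.log(x1 * x2))

# Mixed response: y = x1 + x2 + x1 * x2. The log does not make it additive.
mixed_logged = interaction_contrast(lambda x1, x2: math.log(x1 + x2 + x1 * x2))

print(raw, logged, mixed_logged)
```

The sketch says nothing about whether the factors are "independent" in either sense debated in the thread; it only shows when a log transform can and cannot make an interaction term disappear.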

    0
    #171107

    Robert Butler
    Participant

      No frustration here TGF, I’m enjoying this exchange.  The problem would appear to be that of contention versus mathematical definition.   If I have two columns that meet the requirements for orthogonality and if I multiply the two together I can generate a third column which is orthogonal to and independent of the other two.  This is just a fact of mathematics.
      This kind of behavior is not something that is limited to design matrices.  In vector spaces a cross product of  two orthogonal vectors will generate a new vector which is orthogonal to, and independent of, the original two. The i,j,k unit vectors and their relationships are central to much of physics.
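Both statements are easy to verify by hand or in a few lines of code. This is a minimal Python sketch (function names are illustrative) using the standard two-level columns for A and B and the unit vectors i and j:

```python
def dot(u, v):
    """Inner product; zero means the two vectors are orthogonal."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Cross product of two 3-dimensional vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

# Design-matrix columns for a two-level, two-factor design.
A = [-1, 1, -1, 1]
B = [-1, -1, 1, 1]
AB = [a * b for a, b in zip(A, B)]  # element-wise product column

print(dot(A, B), dot(A, AB), dot(B, AB))  # all three are zero

# Unit vectors: i x j = k, and k is orthogonal to both i and j.
i, j = (1, 0, 0), (0, 1, 0)
k = cross(i, j)
print(k, dot(k, i), dot(k, j))
```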
      As for stating that the interaction is dependent on the choice of response I think one should think in terms of the kind of response one gets when one varies variables that are independent of each other.  The relationship of the response to the independent variables will be whatever it is.  In some cases it will be a simple additive linear relationship but in other cases the relationship will be multiplicative.  As I mentioned before, applied chemists are always on the lookout for combinations of independent factors that will result in a multiplicative as opposed to a simple additive effect and they express their findings in terms that reflect this.
      Shifting subjects: I got the impression from one of your earlier posts that you had studied directly under Taguchi.  If this is the case and if you have run a lot of Taguchi designs in industry I would be interested in your thoughts on the following:
      In my career there have been perhaps a dozen or so times when I was called in to look at a process after someone had tried to run a Taguchi design and had come up with not much.  In looking over the actual settings of the design – particularly for the defined noise variables – more often than not I’ve noticed a very poor level of control with respect to the settings for these variables.  By this I mean the following:
      Let’s say I have a noise variable and I have defined its low level to be 5 and its high level to be 15 and I’m only interested in running a two level design with this variable.  What I have seen is that instead of controlling the low level to some reasonably tight region around 5, say 4-6, and the high level to a similar region, say 14-16, what I find in the data sheets is that the low level can be anything from say 2-10 and the high can be anything from 12-20.  In other words I have a “fuzzy” low and a “fuzzy” high setting for the variable.  (This lack of control was not due to the fact that the investigators couldn’t achieve a tighter control; it was just the fact that they were under the impression they didn’t have to.) This, by itself, is no big deal as long as one takes this into account when running the analysis. 
      What I have seen in these cases is that the individual who did the analysis substituted -1 as the code for the low setting and +1 as the code for the high setting regardless of the actual low or high value.  If there really is an interaction present and if you code and analyze your data in this fashion you are almost guaranteed to never detect it.  The proper way to code your data would be to use the actual minimum and maximum settings as -1 and 1 and then scale all of the other actual values to something in between.  If your excursions away from the ideal minimum and maximum are extreme enough you may very well find that the interaction term for the actual X matrix is either not measurable (i.e. it is not independent of the two terms used to generate it) or it is looking for an interaction over a much smaller region than the region you examined and in that region it is not significant.
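The coding described above is a linear rescaling. As a minimal sketch in Python, with made-up settings that fall inside the 2-10 and 12-20 ranges mentioned in the post (the function name and the data are assumptions for illustration only):

```python
def code_levels(values):
    """Scale actual settings so the observed minimum maps to -1 and the maximum to +1."""
    lo, hi = min(values), max(values)
    center = (hi + lo) / 2
    half_range = (hi - lo) / 2
    return [(v - center) / half_range for v in values]

# Hypothetical recorded settings for one noise variable: nominal low = 5, high = 15.
actual = [2, 10, 5, 12, 20, 15]

# Coding by nominal level only, which discards how far each run drifted.
nominal = [-1 if v <= 10 else 1 for v in actual]

# Coding from the actual values, as the post recommends.
coded = code_levels(actual)

for a, n, c in zip(actual, nominal, coded):
    print(a, n, round(c, 3))
```

With these numbers a "low" run recorded at 10 and a "high" run recorded at 12 sit almost on top of each other in coded units, even though the nominal coding treats them as opposite extremes.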
      My question is this – is this lack of control with respect to a noise variable part of the Taguchi philosophy or is it just a case of a biased sample in terms of the few cases I’ve seen?  If it is part of the operating philosophy I could easily understand how a person could come to the conclusion that interactions don’t exist.

    0
    #171108

    TGF
    Member

    Robert,
    I’ve only time to address your first comment today. I’ll give the other some thought tomorrow.
    Yes, I agree, but the origin of the problem is not the orthogonality of the model, it is the assignment of variables that were thought to be independent, but are not.
    If variables are truly orthogonal the magnitude of the interaction vector would be equal to the null vector, or as close to null as could be expected with experimental error.
    Another simple fact of mathematics is if a * b = y then by implication a = (1/b) y and a depends on b, even though y is orthogonal to both a and b. If this is true, then clearly a is not orthogonal to b …
    This is why some procedures use Gram-Schmidt orthogonalization .. to ensure all variables are orthogonal, including cross terms.
    Before we digress to your other comment, I’d like to come back to the justification for taking a factor in the outer array and putting it in the inner array. (Of course, excluding your rather excellent example of how to convert humidity from an uncontrollable noise variable into a quasi-controllable variable.)
    As soon as we discuss that I’d be happy to discuss your other important observation.
    TGF

    0
    #171114

    Vallee
    Participant

    Going with Robert’s mathematical perspective, I would like to see what your true definitions of dependent and independent variables are. Because if I remove the output parameter, the x variables you described in numerous posts do not correlate with each other and have no effect on each other when paired. The dependency then comes from their relation to the output, not their relation to each other.

    0
    #171125

    TGF
    Member

    Agreed …
    The dependency of x2 on x1 and vice versa in the interaction x1*x2 is by virtue of their relationship with the response, y.
    If we change the response y, then the interaction dependency may well disappear, such as when I transform y using the log transformation.
    My previous comment to Robert about ‘structure’ was directed towards the interactions between inner array factors and factors in the outer array. On the surface, this practice gives a false impression of inefficiency and Robert is quite correct to question it.
    By transferring a factor from the outer array to the inner array, something has been assumed about the process.

    0
    #171132

    Robert Butler
    Participant

      The only relationship between an independent X1 and X2 and a dependent Y is the fact that (assuming cause and effect) Y is a result of changes in X1 and X2 whereas if I change Y its change may or may not have anything to do with X1 and X2 – this is the whole reason for randomization of experimental runs.  If you try to couch the discussion in the form that X1*X2 = Y is the same as X1 = Y/X2  and thus this means X1 and X2 aren’t independent then you have the same problem if you say
    X1 +X2 = Y since X1 = Y-X2
      As for transforms, such as the log, rendering an interaction insignificant – I can imagine situations where this could happen; however, this isn’t what I normally see.  What happens is the following: I will run stepwise and backward elimination, identify a possible final model, run the regression diagnostics and find that the residual plots indicate I need to do something to the data to better ensure an accounting of the signal present in the system. 
       If I get a shape in the pattern the first choice is going to be a log of the Y or the X or perhaps both.  If I do this and the residual pattern changes to a picture of a matzo ball then I will be satisfied that the transform was appropriate.  However, when this is done, the coefficients will change but, in general, the significant terms remain.  Which means if the interaction was present without the transform it is still there after the transformation.
      One thought – is there any chance this issue is nothing more than the choice of the word “interaction”?  If this is the case then we can always call the effect what it really is – a multiplicative effect of two independent variables.  If this isn’t the case then it would appear that Taguchi’s position is that the only effect two independent variables can have on any outcome is that of simple addition… and that is contrary to what is known about a large number of physical and chemical processes.
      This also brings up the other question I had.  In the post I wrote about the differences between Taguchi and traditional I quoted the one poster who claimed Taguchi said “When there is no interactions among control factors, DOE is a waste – you can use OFAT and get the same result.” 
       As I said at the time I hadn’t gone back to check this and I must admit I still haven’t, but since you are a practitioner of the Taguchi methods could you check your books and see if a) Taguchi does indeed say this and b) if it has been taken out of context and thus does not really mean what the stand-alone sentence implies.

    0
    #171137

    TGF
    Member

    I thought the whole point about y = X1 + X2 is that X1 can be adjusted to a target without having to re-set X2, and vice versa. By way of contrast, setting an interaction seems much more confusing, because there isn’t an X1*X2 knob on process equipment.
    Thank you for admitting a modification of the response can minimize an interaction effect in some circumstances – I’ve seen many situations where it does work. (Perhaps as many as 1000 experiments.)
    Many companies in Japan, including Toyota, Hitachi, Toshiba, and Fujifilm use Taguchi Methods with great success.
    As for your last comment, I can confirm he holds this view, but I have no idea how to argue this on his behalf, so I’m not even going to try.
    Neither will I argue his lack of randomization.
    Although I’m willing to try and justify his use of an inner and outer array.
    TGF

    0
    #171194

    Robert Butler
    Participant

        Let’s try for a recap.
     
      Assumptions:
    1.      The accepted mathematical definition of independence is to be replaced with a non-mathematical conjecture that somehow Y needs to be part of the definition of independence between two X variables.
    2.      Interactions are an artifact that can be made to disappear by running some kind of transform on the Y’s or the X’s.
     
      This would suggest the following:
     
    I have a standard 2 variable 2 level design in the following form
     
       A   B  AB
      -1  -1   1
       1  -1  -1
      -1   1  -1
       1   1   1
     
    I apply the Taguchi Transform and the AB column vanishes, which means any effect I might attribute to AB vanishes and is somehow folded into the effect of A and B.
     
    Next I decide to run a Taguchi design.  I have 3 process variables and I choose the standard 2 level fractional factorial for these 3 variables.  The matrix is as follows:
     
       A   B   C
      -1  -1   1
       1  -1  -1
      -1   1  -1
       1   1   1
     
    Therefore when I apply the Taguchi Transform the effect of C disappears and is folded into the effect of A and B. Whether I was turning knobs or not is irrelevant – the effect of C is computed in exactly the same way that the AB effect is determined in the first matrix.
     
      If it were true that a transform would always obliterate a column representing an interaction, and thus its effect, it would mean Taguchi could only use full factorial designs to build his inner and outer arrays since fractionation automatically confounds one or more main effect variables with one or more interaction terms.
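The confounding referred to here can be seen directly in the four-run matrix above. In this minimal Python sketch the response values are hypothetical, invented only to show that the computed C effect and the computed AB effect are the same number in this fraction:

```python
# The four runs (A, B, C) of the 3-variable fractional factorial from the post.
runs = [(-1, -1, 1), (1, -1, -1), (-1, 1, -1), (1, 1, 1)]

A = [a for a, b, c in runs]
B = [b for a, b, c in runs]
C = [c for a, b, c in runs]
AB = [a * b for a, b in zip(A, B)]

# In this fraction the C column is identical to the A*B column.
print(C == AB)

# Hypothetical responses for the four runs (illustration only, not real data).
y = [10.0, 14.0, 12.0, 20.0]

def effect(column, response):
    """Average response at +1 minus average response at -1."""
    n_half = len(column) / 2
    return sum(s * r for s, r in zip(column, response)) / n_half

print(effect(C, y), effect(AB, y))  # identical, because the columns are identical
```

Whatever arithmetic removes the AB effect from this matrix must therefore remove the C effect as well, which is the difficulty the post is pointing at.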
     
    I appreciate that Taguchi believes he can make interactions disappear and I would be willing to believe there might be occasions when this could happen (just as I would believe you might be able to make a main effect disappear).  
     
     With experimental design data I also think these occasions are rare – if it was commonplace no one would have wasted all of the effort that has been and is being expended on interactions in the world of statistics and the concept of an interaction would be nothing more than a simple footnote in advanced books on experimental design. 
     
     Given the number of times I’ve had to transform data due to the dictates of the residual plots I’m also quite certain I would have noticed this stated effect on interaction terms.  Admittedly, I haven’t run 1000 experimental designs – I’ve probably averaged around a design a month during my career which would translate into something in the region of 250 – but still, most of them have involved lots of interactions – and I must admit I can’t recall ever seeing this effect.
     
    Rather than belabor the point – what would be of value would be a real example.  Given the number of experiments you have run is there any chance you could provide the raw data from a design where this effect can be observed?
     
    The above should not be interpreted to mean I think Taguchi designs are worthless. I know various and sundry people have had successes using them.  Given that they are just modifications of proven, existing designs this isn’t particularly surprising and, as I mentioned in the post contrasting the two, I don’t care whether you choose to run one or not.  My only concern would be that you base your choice on an understanding of the mathematics and the real similarities and differences and not on some misapprehension concerning something such as the confusion surrounding the word “noise”.

    0
    #171195

    Robert Sutter
    Member

    DOE is an acronym for Design of Experiments. This is a process improvement method where key input variables from the Analyze phase are manipulated in a controlled experiment. The purpose of the experiment is usually to optimize the settings of the key input variables. When a DOE is conducted the statistical analysis of the data can be used to infer causality.
    Regression analysis can be used to analyze the data collected during a DOE. However, regression analysis can also be conducted on data that is not collected from a DOE. In this instance inferences about causality cannot be made.

    0
    #171197

    BTDT
    Participant

    Robert: A great thread. I’d like to make it obvious that people using the term “historical DOE” are, in fact, doing regression and not DOE. Given that, they should note all the caveats in the above thread.
    Cheers, Alastair

    0
    #171207

    TGF
    Member

    I am certainly willing to provide some data, provided I’m satisfied you understand the process. This seems perfectly reasonable because the last thing I want is someone running around half-cocked!
    My justification for caution is as follows:
    There are many practitioners in fields that do not encounter certain problems. In this sense, practitioners are truly the product of their environment.
    For example, in the manufacture of mechanical components, raw materials do not vary as much as they do in, say, the manufacture of rubber seals. (In some mechanical processes you can run a first-off and produce 5,000 components before seeing burrs.) Unfortunately, I know nothing about the aircraft industry and defer to you.
    In some industries the ability to adjust a process is critical, even when you have an assay and physical properties for each batch, which is why Taguchi places a lot of emphasis on the additivity of effects.
    In the semiconductor industry, process engineers by and large have an excellent understanding of how to target the process average; what they don’t tend to have is a good handle on variation within a run. If you have this sort of background, all well and good, but you can hardly blame me for wanting to calibrate our world view, especially as some authors’ examples of Taguchi methods have been puerile to say the least, so I can certainly understand your skepticism.
    On the other hand, several thousand case studies have already been published in the proceedings of the ASI. Yet more are in Crevelling’s and other books. (By the way, I have no connection with ASI and I’m not seeking to promote their views. On the contrary, I’d like to see traditional DOE adopt some of Taguchi’s ideas, which is why I’m visiting the forum. Neither do I consider myself a practitioner of Taguchi Methods, although I’ve used them extensively in the past.) I believe some of Taguchi’s ideas are sufficiently interesting to bring them to the attention of members of the forum.
    By the same token, there is much about Taguchi Methods I don’t agree with, or do not understand. In this spirit, I’m willing to continue our discussion.
    Of the thousand or so experiments I referred to, some were undertaken by myself, others were conducted by other process engineers at Motorola. Every one of the experiments used a log transform, but since many authors have misrepresented Taguchi Methods, or have used overly simplistic examples that would be better served by traditional DOE, I’m not surprised you’re skeptical, which is why I thought to provide a better explanation of Taguchi’s work, bearing in mind my own limitations. (By the way, I’ve provided information concerning a substantial portion of the 1000 or so experiments to a member of this forum to check independently.)
    Please be aware there are those among us who would seek not only to promote themselves, but at the same time undermine the efforts of others. As a consequence there is little information available about the large number of engineers trained in Taguchi Methods prior to 1988, and the large number of Japanese engineers and managers who came to Motorola to help them achieve world-class performance.
    Coming back to your 3-factor example, I don’t see why, if the AB interaction is zero, the estimate of the main effect C would be ‘folded over’ or aliased with A, or B, or with both.
    I also perceive a difference in our approach: I would like knobs to be independent, while you seem to place more emphasis on columns in a factorial.
    Anyway, before we speculate further, isn’t it important to first establish the response/s?
    Unfortunately, before we can discuss the response we need to calibrate our process engineering objectives, and part of that discussion must be the difference between the inner array and the outer array. I keep stressing this because I don’t think it is what you expect :-)

    0
    #171213

    Robert Butler
    Participant

      I think I see the disconnect, and if I’m right then what we are dealing with is nothing more than an issue of words. 
     
      Before I say anything about that let me make one last statement about independence.  I’ve been trying to think of ways to explain the idea and I guess I’ve failed.  About all I can say is that in the design matrix the columns A, B and C, or A, B and AB, are independent of one another, and knobs to turn don’t have anything to do with it.  I would urge you to take the time to understand this idea. If you don’t, you will run the risk of misinterpreting a great many things with respect to the concepts of experimental design.
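    One way to see the independence of the columns, sketched here under the assumption that "independent" means the coded design columns are mutually orthogonal (every pairwise dot product is zero). The 2^2 design and the helper name `dot` are illustrative only.

```python
from itertools import combinations, product

# Full 2^2 factorial in coded units; AB is the elementwise product of A and B.
runs = list(product([-1, 1], repeat=2))
columns = {
    "A": [a for a, b in runs],
    "B": [b for a, b in runs],
    "AB": [a * b for a, b in runs],
}

def dot(u, v):
    """Dot product of two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))

# Pairwise dot products between the design columns.
dots = {(p, q): dot(columns[p], columns[q]) for p, q in combinations(columns, 2)}
for pair, value in dots.items():
    print(pair, value)
```

    Every pair gives zero, and that is a property of the matrix itself: it holds whether the third column is read as the interaction AB or relabelled as a third factor C, regardless of which physical knob is attached to which column.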
     
    Here’s where I think the disconnect may have occurred:
     
      You said that at Motorola everyone is using logs and building regression equations.
     
    Is there any chance the final models look like the following?
     
    #1    Log Y = logX1 +logX2 ?
     
    If the regression equations are in this form then the back transform of the equation is
     
    Y  = X1*X2
     
    And on the original (unlogged) scale that form is nothing but an interaction.  
     
     
      The kind of equations I’m talking about would start out looking like
     
    Y  = b0 +b1*X1 +b2*X1*X2
     
      When I examined the residuals I would find the usual shape.  Under these circumstances a good first choice is to look at a log transform of the Y and rerun the regression.  If the Y was the culprit then the residual plot looks like a shotgun blast and the final model is
     
    Log Y = b0 +b1*X1 +b2*X1*X2.
     
     
     When I asked for an example what I was looking for was a design matrix and a response that when modeled gave a form like
     
    Y  = b0 +b1*X1 +b2*X1*X2
     
    And that when I took a log the model became
     
    LogY  = b0 +b1*X1 +b2*X2
     
      If the log transform model you are considering is that of #1 above then we have resolved the problem – you haven’t eliminated an interaction, you have just taken its log.
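    A small numerical sketch of this point, using made-up factor levels and a response generated exactly as Y = X1*X2 (no noise). In a 2^2 factorial, the coefficient of the coded X1*X2 column is the contrast computed below; the function name `interaction_coef` is a hypothetical helper.

```python
import math

# Hypothetical low/high settings for the two factors (made-up values).
x1_levels = (1.0, 10.0)
x2_levels = (2.0, 20.0)

# Four runs of a 2^2 factorial, as (X1 level index, X2 level index).
runs = [(i, j) for j in (0, 1) for i in (0, 1)]
coded = [(2 * i - 1, 2 * j - 1) for i, j in runs]    # -1 / +1 coding

y = [x1_levels[i] * x2_levels[j] for i, j in runs]   # Y = X1 * X2
log_y = [math.log10(v) for v in y]                   # log Y = log X1 + log X2

def interaction_coef(response):
    """Coefficient of the coded X1*X2 column in a 2^2 factorial fit."""
    return sum(a * b * r for (a, b), r in zip(coded, response)) / len(response)

raw_int = interaction_coef(y)       # interaction term on the original scale
log_int = interaction_coef(log_y)   # interaction term on the log scale
print(raw_int, log_int)
```

    On the original scale the interaction coefficient is clearly non-zero, while on the log scale it vanishes to rounding error. The underlying relationship is the same in both cases: the additive log model is simply the multiplicative (interaction) model written in logs.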
     
    Finally, when you are building a design all you need is evidence that the variables you are interested in changing may have an impact on the response you are trying to measure.  What you need to know are the variables you are interested in checking and what limitations may apply to them with regard to their combinations.

    0
    #171227

    TGF
    Member

    Yes, there is a disconnect, and I don’t hold any hope of being able to discuss any other of Taguchi’s ideas with you.
    I never stated: “You said that at Motorola everyone is using logs and building regression equations.”
    My reference was to the 1,000 or so experiments, a portion of which were run more than twenty years ago! I have no idea what they do now – considering their failures, I can only assume they’ve gone back to TQM.
    In fact, in 1987 or so there were two facilities in Austin, one of which was a research facility. The research facility’s development was under the guidance of an eminent statistician, further supported by a Prof. at Texas University and a statistician based in California.
    The other team, based in a production facility, comprised three process engineers trained in Taguchi Methods and traditional statistics, and a group of TPM operators. (At that time over 50 process and device engineers had been trained in Taguchi Methods and traditional statistics.)
    By chance, these groups both developed photo resist processes for 0.8 micron product. (Laughable these days, but back then quite a challenge.)
    In the end, the research facility had to transfer the photo process from the production facility because their process didn’t cut the mustard.
    I’m now going to ask the moderator to verify my statement. If he can’t, I won’t post again.
    If he does, then I’d like to ask you to be more open minded – and actually respond to some of my questions – we haven’t even discussed the goals of the process, or the responses.
    TGF

    0
    #171237

    annon
    Participant

    RB,
    Just wanted to thank you for continuing to post.  I continue to learn a great deal from your insights.  Thanks for the knowledge sharing.

    0
    #171239

    Robert Sutter
    Member

    Actually you can evaluate multicollinearity in regression using variance inflation factors. A common rule of thumb is that VIFs > 10 indicate serious multicollinearity.
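    A minimal sketch of that check, assuming NumPy is available. The function `vif` is a hand-rolled helper rather than a library routine, and both data sets are made up. It regresses each predictor on the others and reports 1 / (1 - R^2).

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (rows = runs)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])        # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()
        factors.append(float(1.0 / (1.0 - r2)))
    return factors

# Designed data: a coded 2^2 factorial, whose columns are orthogonal.
designed = [[-1, -1], [1, -1], [-1, 1], [1, 1]]

# Happenstance data (made up): the two predictors drift together.
happenstance = [[1, 1.1], [2, 1.9], [3, 3.2], [4, 3.9], [5, 5.1], [6, 6.0]]

vif_designed = vif(designed)
vif_happenstance = vif(happenstance)
print(vif_designed, vif_happenstance)
```

    The designed data give VIFs of 1 (no inflation), while the happenstance data give values far above the rule-of-thumb threshold of 10, which ties back to the earlier point that a design guarantees independent X's and historical data do not.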

    0
    #171240

    Robert Butler
    Participant

    You said:
     
    “Of the thousand or so experiments I referred to, some were undertaken by myself, others were conducted by other process engineers at Motorola. Every one of the experiments used a log transform”
     
    …and that is why I said
     
    “You said that at Motorola everyone is using logs and building regression equations.
     
    Is there any chance the final models look like the following?
     
    #1    Log Y = logX1 +logX2.”
     
     I appreciate this was something of an extrapolation but if I’m transforming all of my X’s and my Y’s and attempting to understand the process it is highly likely that I will do more than just log the data.  If I want to build a model for the data I will most likely use linear regression and if I do the models will be in the form of #1 above.
     
    With regard to your last comment –  “I’d like to ask you to be more open minded – and actually respond to some of my questions – we haven’t even discussed the goals of the process, or the responses.” 
     
    The problem is that the last piece, which has been a recurrent theme in your posts, is directly connected to the issue of understanding the concept of independence.  I will reiterate – the issues concerning the construction of a design focus on the independence of the X’s and the independence of the terms of those X’s (curvilinear, interactions) that you as an investigator might want to check relative to their effect on the process and/or the response.  When you are considering the levels of the X’s the only question you need to ask is if certain combinations will be difficult or impossible to run (e.g. you can’t vary the temperature with every experiment because the thermal mass of the equipment won’t permit it, you can’t combine those chemicals at those levels because they will never form a solid, etc.)  Once you have a design that meets these requirements the only question left is that of time and money.  If the proposed design violates either of these then you make additional modifications to meet these requirements.
     
    The response to a design will be whatever it will be.  If you have done your homework prior to constructing the design there is a good chance that you will find relationships between the response and the things you varied – what those relationships will be is unknown – that’s why you are running the experimental design.  Once you have the information from the design in hand you can apply that information in ways that allow you to achieve the goals of the process and the desired characteristics of the responses.  
     
    The only way you can bring the goals of the process or the responses into a consideration of choices of the levels of the variables in the design or indeed of the design itself is if you ignore the mathematical definition of independence.  Under these circumstances being open minded is tantamount to accepting your definition of independence and discarding the definition of independence upon which all experimental designs, including Taguchi’s, are not only built but also analyzed.

    0