how is DOE different from Regression
 This topic has 69 replies, 16 voices, and was last updated 14 years, 3 months ago by Robert Butler.


April 8, 2008 at 3:32 am #49796
Just like the subject says….help please!

April 8, 2008 at 5:21 am #170758
Franky,
DOE in the form we use has regression embedded in it. DOE is used to generate data in the most efficient fashion. You perform regression on this data to get the transfer function.
This is the crux; the rest is detail.
ravi Pandey

April 8, 2008 at 11:51 am #170767
Robert Butler, Participant @rbutler
DOE is nothing more than rules for gathering data. Regression is one of the tools you can use to analyze data thus gathered. The value of the rules of DOE versus, say, happenstance is that data gathered using the rules of DOE provide certain guarantees with respect to the variables you would like to treat as being independent of one another.

April 8, 2008 at 12:11 pm #170768
Robert,
What is the difference between independent and interacting variables?
Cheers,
Mark

April 8, 2008 at 1:02 pm #170770
DOE needs to follow regression. In regression you check whether certain X’s and combinations thereof have an impact on Y and how significant the relationship is. In DOE you take those X’s, see how much impact they have on Y, and through response optimization determine the best possible combinations. My two cents.
Also in regression both X and Y are continuous; however, in DOE mostly your factor would be discrete and Y discrete or continuous.
In regression you cannot check for multicollinearity but in DOE you can…

April 8, 2008 at 1:28 pm #170771
SS,
Did you mean to reply to my post to Robert? If you did, I don’t see how your post answers my question.
Mark

April 8, 2008 at 2:25 pm #170772
Robert Butler, Participant @rbutler
DOE is a method for gathering data. It is not connected to or driven by regression in any way. Regression is one tool you can use to analyze the data you gathered using the method of DOE.
Because it is a series of rules for gathering data, DOE doesn’t check anything. When you build a DOE you construct it so that, when the data is gathered and analyzed, the X’s of interest will not exhibit collinearity. But in order to test for multicollinearity (for example, when one or more of the experiments failed to run) you will need to use regression and the methods of regression diagnostics to assess the degree of multicollinearity (confounding) and whether or not it is an issue.
The rule concerning the need for X and Y to be continuous before running a regression is a very conservative piece of boilerplate that is taught in intro BB courses to make sure you don’t make any egregious mistakes when you are starting out. In fact X and Y can be either discrete or continuous, but you have to know what to do if they are. The posts below provide some information in this regard.
https://www.isixsigma.com/forum/showmessage.asp?messageID=59278
https://www.isixsigma.com/forum/showmessage.asp?messageID=109369
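
As a rough illustration of the diagnostic step described in the post above, the sketch below builds a two-level, three-factor full factorial, drops one run to mimic a failed experiment, and then computes the correlation matrix and variance inflation factors (VIFs) of the remaining design columns. The 2^3 design, the choice of which run fails, and the use of VIFs as the diagnostic are all assumptions made for illustration; none of them come from the thread.

```python
import itertools
import numpy as np

def vif(design):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(design, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Hypothetical 2^3 full factorial in coded units (-1 = low, +1 = high).
full = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)

# Pretend the last run (A=+1, B=+1, C=+1) failed and produced no data.
damaged = full[:-1]

corr_full = np.corrcoef(full, rowvar=False)
corr_damaged = np.corrcoef(damaged, rowvar=False)
vif_full = vif(full)
vif_damaged = vif(damaged)

print("full design correlations:\n", corr_full.round(3))
print("damaged design correlations:\n", corr_damaged.round(3))
print("VIF full:   ", vif_full.round(3))
print("VIF damaged:", vif_damaged.round(3))
```

A VIF of 1 means a column is uncorrelated with the others. In this assumed case, losing one run pushes the VIFs only slightly above 1, which is the sort of evidence one would weigh when deciding whether the confounding is an issue.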

April 8, 2008 at 2:58 pm #170775
Outlier, MDSB, Participant @Outlier,MDSB
SS,
Some of what you said is true, but some of it is either wrong or misleading. Your first paragraph is essentially correct.
Be aware that there is a subtle but important difference between “multicollinearity” and “interactions.” Two inputs can interact to have a greater impact on Y but still be independent of each other.
Your second paragraph is misleading. In Linear (and Multiple Linear) Regression, both X and Y are continuous. There is also Logistic Regression in which your Y is discrete and your X can be discrete or continuous. In DOE you can have either discrete or continuous inputs or outputs. I see many DOEs with only continuous factors.
Regarding your last sentence, you can absolutely check for multicollinearity with multiple regression; in fact you MUST check for it to ensure your regression model is valid. With DOE, when you set up your design structure, it is the orthogonal design that ensures independence of your variables.
Some other differences between DOE and Regression:
Regression uses “passive” data, meaning you are analyzing data after it was collected. With DOE your data collection is “active” meaning you plan how and when you will collect your data in advance.
With DOE you have blocking and other noise treatments to ensure the signal you see comes from your factors. With Regression you have limited noise control due to the passive nature of the data collection.
With DOE you have more control over the measurement system accuracy than with regression.
Both methods have a goal of creating a predictive equation.
Both methods could have “lurking variables” meaning some outside factor that you haven’t controlled for is impacting your Y.
Hope that was helpful.

April 8, 2008 at 4:28 pm #170776
Robert Butler, Participant @rbutler
When you say “interacting variables” I’m assuming you mean an interaction involving two or more variables. If this is the case then there isn’t anything to differentiate. If you have two variables and you express their settings for each experiment in matrix form then the interaction is just the multiplication of the two columns representing the settings of the two variables. This column, like the column for each of the two variables, identifies the way in which you are to use the data from the individual experiments to determine the size of the effect of the interaction.
The interaction may or may not be independent of the two variables of interest; whether it is will depend on the degree of orthogonality between the two variables of interest.
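
The column-multiplication idea can be sketched in a few lines. The example below is a minimal illustration, assuming coded levels of -1 and +1 and two hypothetical four-run designs (a full 2x2 factorial, and the same design with one B setting changed). The helper names `interaction_columns` and `dot_products` are made up for this sketch.

```python
import numpy as np

def interaction_columns(a, b):
    """Return the columns A, B and AB, where AB is the elementwise product."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return a, b, a * b

def dot_products(a, b):
    """Pairwise dot products of A, B, AB; zero means the pair is orthogonal."""
    a, b, ab = interaction_columns(a, b)
    return {"A.B": float(a @ b), "A.AB": float(a @ ab), "B.AB": float(b @ ab)}

# Assumed full 2x2 factorial in coded units.
orthogonal = dot_products([-1, 1, -1, 1], [-1, -1, 1, 1])

# Same design with run 3 changed so that B stays low (an assumed example).
a2, b2, ab2 = interaction_columns([-1, 1, -1, 1], [-1, -1, -1, 1])
non_orthogonal = dot_products(a2, b2)

# In the altered design AB is an exact linear combination of the other columns.
ab_is_combination = np.array_equal(ab2, 1 - a2 + b2)

print(orthogonal)
print(non_orthogonal)
print(ab_is_combination)
```

In the balanced design every pairwise dot product is zero, so A, B, and AB can all be estimated separately. In the altered design the AB column can be rebuilt from the intercept, A, and B, so it carries no information of its own.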
For example, if we have 2 variables and their settings for 4 experiments are
Exp   A   B  AB
1    -1  -1  +1
2    +1  -1  -1
3    -1  +1  -1
4    +1  +1  +1
then the diagnostics indicate that A and B are orthogonal and thus so is AB relative to A and B. This independence means all three terms can be used in the model. On the other hand, if the matrix is
Exp   A   B  AB
1    -1  -1  +1
2    +1  -1  -1
3    -1  -1  +1
4    +1  +1  +1
then A and B are not perfectly orthogonal and the diagnostics indicate AB = intercept - A + B, so AB is not independent of A and B. A and B are “reasonably” independent of one another, so a model could be examined which had only A and B as “independent” terms.

April 8, 2008 at 7:56 pm #170795
Sometimes the best answer is the simplest answer:
Regression is the study of data taken from past performance (historical data). A DOE is a planned activity with predefined combinations of factors that must be followed in a prescribed random manner.
Regression is easy and cost effective but limited in its ability to define the key factors.
A DOE is designed to determine the key factors.

April 8, 2008 at 8:28 pm #170798
Robert,
The point I’m trying to understand is whether or not two variables can be both independent and interacting. By interacting I mean the variables do not have a linear relationship.
PS: You might have already explained my question in terms of the matrices, but I’d find it easier to understand if you could answer my question using the terms I’ve used.
Thanks,
Mark

April 8, 2008 at 9:13 pm #170799
Outlier,
that was a great explanation…thank you

April 8, 2008 at 9:19 pm #170800
Robert Butler, Participant @rbutler
Hmmmm…. I guess I’ll need some more details. If interacting means “two variables do not have a linear relationship” I need to know what you mean by this statement.
Let’s say we have Y and X1 and X2 where we want to look at the relationship Y = some function of X1 and X2. Then your statement could mean any of the following:
1. X1 and X2 are related in a linear fashion so that
X1 = b0 + b1*X2. If this is the case then they are linearly related and they are not independent.
2. X1 and X2 are related in some curvilinear way so that
X1 = b0 + b1*X2*X2. If this is the case then they are not linearly related (that is, the relationship is not a straight line) and they are not independent.
3. X1 and X2 are not some function, regardless of type, of one another and, say, Y = b0 + b1*X1*X1 + b2*X2*X2. Then X1 and X2 are independent and their relationship with Y is not a straight line; that is, it is not linear.

April 8, 2008 at 9:41 pm #170801
Robert,
Can you help me to understand the meaning of orthogonality (as in “degree of orthogonality”)?
Thank you,
Srini

April 9, 2008 at 10:59 am #170831
Thank you for your patience Robert …
What I meant by linear is a linear combination of x1 and x2, e.g.:
y = x1 + x2
If on the other hand x1 = x2, then the equation above reduces to
y = X2^2 and y is not linear.
---
I’m now curious about the interaction term: y = x1*x2, which I believe by the same thinking is nonlinear.
My question is: can x1*x2 exist when x1 and x2 are completely independent, in other words when they are completely orthogonal?
I’ve thought about this on and off for several years, but the only case I can think of is the unrealistic mathematical case when, on the interaction diagram of y = x1*x2, x2 at level 1 is parallel to the x1 axis and x2 at level 2 is vertical to the y axis.
This seems to me to be the only truly independent case for the interaction term x1*x2 and unrepresentative of any process I’ve ever seen.
If my thinking is correct, then the use of an interaction column is an admission that some variables are probably not truly orthogonal, and of course they should be :)
What do you think?
Cheers,
Mark

April 9, 2008 at 12:40 pm #170835
Robert Butler, Participant @rbutler
If X1 = X2 then the equation y = X1 + X2 would reduce to y = 2*X1 or y = 2*X2, not y = X2^2.
As for an interaction being nonlinear and independent: first, if we are talking about a case where X1 and X2 have only two levels and are completely independent (that is, orthogonal) then the interaction will still be linear and the interaction will be independent of X1 and X2. If the variables have 3 levels then the relationships could be either linear or curvilinear.
You will have to think in terms of matrix arrays in order to understand the issues concerning independence (and orthogonality) of the three (X1, X2, and X1X2). I addressed this in an earlier post to this thread.
https://www.isixsigma.com/forum/showmessage.asp?messageID=139371
With respect to orthogonality you should note that two terms are orthogonal to one another if one column cannot be expressed as a linear combination of the other columns in the array. In the second example in the cited post, the way you do the addition for the array to show the lack of independence of AB relative to A and B is the following:
Exp   A   B   (intercept - A + B)      AB
1    -1  -1   1 - (-1) + (-1)    =    +1
2    +1  -1   1 - (+1) + (-1)    =    -1
3    -1  -1   1 - (-1) + (-1)    =    +1
4    +1  +1   1 - (+1) + (+1)    =    +1

April 9, 2008 at 1:00 pm #170838
Thanks for correcting my mistaken thinking. I can now see that an interaction can be both linear and independent.
However, I’m still left with the difficulty of picturing an independent interaction.
Do you agree the independent interaction y = x1*x2 would have to have x2 = low, parallel to the x1 axis, and x2 = high, parallel to the y axis?
In the meantime I’ll study your other post.
Cheers,
Mark

April 9, 2008 at 1:37 pm #170840
Robert Butler, Participant @rbutler
No. Rather than try to describe this, just plot the following:
 A   B   Y
-1  -1   3
+1  -1   4
-1  +1   6
+1  +1  12
Plot Y against A where you have held B constant (you should have 2 lines, one for B = -1 and one for B = +1) and note that the two lines are not parallel to one another. The lack of parallel lines when you shift from a low to a high B value is a graphical indication of the existence of an interaction. If you test these numbers in Minitab (toss in a 5th point, -1, -1, 2.9, just so Minitab will have some error to work with) you will see that A, B, and AB are significant.

April 9, 2008 at 1:58 pm #170841
Robert Butler, Participant @rbutler
If we use the posts to this thread as a measure then there seems to be a lot of confusion concerning DOE and regression. All of the statements below have been offered in this discussion and all of them are either in error or misleading.
Also in regression both X and Y are continuous; however, in DOE mostly your factor would be discrete and Y discrete or continuous.
In regression you cannot check for multicollinearity but in DOE you can
Regression uses “passive” data, meaning you are analyzing data after it was collected. With DOE your data collection is “active” meaning you plan how and when you will collect your data in advance.
With DOE you have blocking and other noise treatments to ensure the signal you see comes from your factors. With Regression you have limited noise control due to the passive nature of the data collection.
With DOE you have more control over the measurement system accuracy than with regression.
Regression is the study of data taken from past performance, historical data
Regression is easy and cost effective but limited in its ability to define the key factors
As I’ve already said, DOE is a method for data gathering. Regression is a method of data analysis.
Some Citations
“The design of an experiment has been defined very simply as the order in which an experiment is run. Care will be taken to emphasize three important phases of every project: the experiment, the design, and the analysis.” Hicks, Fundamental Concepts in the Design of Experiments, 2nd Edition.
Note the distinction between design and analysis in the above.
“Factorial Designs: In this field the design and the analysis of the results are closely linked.” Davies, The Design and Analysis of Industrial Experiments, 2nd Edition.
Note again the distinction and also note the distinction in the title of the book.
Regression is a series of algebraic manipulations that can be applied to any kind of data. This includes happenstance data (that is, data that was not gathered following the rules of a design) and design data (data that was gathered following the rules of a DOE).
Regression does not control anything. Regression cannot control noise but regression analysis can quantify the amount of noise in a block of data. The purpose of regression is to quantify the correlations between one or more variables identified as responses and one or more variables identified as predictors of those responses.
This may seem like so much statistical nitpicking, but lack of clarity with respect to understanding the function of the tools can lead to concluding things that are completely wrong. Specifically, on a number of occasions I’ve been brought into an effort and done all of the usual things one is supposed to do in the Define and Measure phase. After careful analysis I’ve reached the point where I thought a design would be appropriate. I make the proposal for the design and I’m met with a complete refusal. When I ask why, I’m told: “We tried that design analysis stuff once. We ran one and it didn’t identify anything that was any real improvement over what we currently have.” When I ask to see the analysis of the design the response is along these lines: “The design was the analysis and it didn’t show anything.” When pressed further they will tell me that after they ran the design they looked at the results from the experiments and, since none of the experiments run in the design was any kind of an optimum/improvement, they concluded that the design had failed. What they failed to understand was that the design only prescribed the way in which the data was gathered. Their analysis was just the eyeballing of the data, and I would agree that that analysis failed to extract the information buried in the data.
From time to time I find that the data from the design is still available. Under those circumstances I request permission to actually analyze the data generated by following the rules of the design, and every time I have been able to use the resultant predictive equations to either identify an optimum or point in the direction where the optimum was most likely to be found. On more than one occasion it has turned out that all of the work on the problem in question had already been done; it was just that no one understood that they had to analyze the data from the design in order to extract the information it contained.

April 9, 2008 at 2:00 pm #170842
Outlier, MDSB, Participant @Outlier,MDSB
Mark,
To me the best way to think about the concept of independent variables interacting is to try to think of a real, easy to understand example.
Say you were conducting an experiment to determine how to get the best fuel mileage from your car. Y is your MPG and is continuous data. Let’s say that you have found that the speed that you drive is one factor (X1) and tire pressure is another factor (X2). It is pretty easy to wrap your brain around a concept that speed and tire pressure are independent of each other. (Forget for a moment that at highway speeds, tires heat up and that raises the pressure. Assume that the pressure has already normalized.) So we can agree that changing tire pressure does not change speed and speed does not change tire pressure (after the temperature normalizes). But it is completely possible that the right combination of speed and tire pressure can have a greater impact on Y than either factor alone.
That may not be a perfect example, but it should help you more clearly understand how variables can be independent and still interact to affect the output.

April 9, 2008 at 2:10 pm #170843
Outlier, MDSB, Participant @Outlier,MDSB
Well said.
DOE is a method of planned data gathering. Regression is a method of data analysis.
I recognize some of my statements as being misleading, probably because I oversimplified. I guess that’s the risk of trying to boil down whole fields of study into a few bullet points.

April 9, 2008 at 7:36 pm #170855
Great answer!! Sums it all up.
You are more likely to use regression analysis in Analyze phase and DOE in Improve phase… I hope I am not oversimplifying there either??

April 9, 2008 at 8:19 pm #170857
Iain Hastings, Participant @IainHastings
Anoop,
Ron’s answer is not a great answer and it does not sum things up accurately or appropriately. Similarly your statement is wrong and confused.
I strongly recommend that both Ron and you take the time to read Robert Butler’s comments on this thread. He has spent considerable effort trying to remove confusion; unfortunately, it appears to have been with limited success.

April 9, 2008 at 8:34 pm #170858
“You are more likely to use regression analysis in Analyze phase and DOE in Improve phase… I hope I am not oversimplifying there either??”
Why is this incorrect?? I am asking out of genuine curiosity or perhaps confusion…please elaborate. I did read through the previous answers and I still feel the same.

April 9, 2008 at 8:44 pm #170860
Robert Butler, Participant @rbutler
This modification of your sentence would be reasonable.
“you are xxxx likely to use regression analysis in the define and measure phase when examining historical data. In the Analyze phase you may find it appropriate to use DOE and when you do you may use regression to analyze the data gathered under these conditions. The use of a DOE in Improve phase may be necessary to confirm your recommendations before full implementation. Once again, you will probably use regression methods to analyze the data from the DOE. … I hope I am not oversimplifying there either ??”

April 9, 2008 at 9:03 pm #170861
Thanks Robert.
That makes sense. I think it depends on the kind of project but I see what you are saying.
- Anoop

April 10, 2008 at 7:05 am #170904
Once again thank you for your patience …
I plotted the data you posed by hand and confirmed the presence of an interaction: nonparallel curves y1 and y2.
Just to clarify, the two curves y1 and y2 correspond to the two levels of x2.
Previously, I referred to these two curves and noted as y1 goes from y = 3 to y = 5, y2 goes from y = 4 to y = 12.
This seems to imply the two curves y1 and y2 are highly correlated?
Therefore y1 and y2 are not independent.
In other words, for y1 and y2 to exist it is necessary to have some dependency between x1 and x2.
Therefore, it follows if x1 and x2 are truly independent, as they should be, there is no need for an x1*x2 term or column, and we can use saturated designs.
Cheers,
Mark

April 10, 2008 at 7:28 am #170909
Hi,
Thank you for your participation …
I understand your example, but do not see how you can assume independence if there is an interaction.
Imagine a black box: the box has two knobs on the fascia, one labeled A and another labeled B.
If A and B are independent, I can adjust the process perfectly well using either A or B, or an independent combination thereof.
If A and B are not independent though, I have to be very careful how I adjust A because the setting of A demands a certain setting of B, due to the AB interaction.
What would a reasonable man conclude? The presence of the AB interaction implies a dependency of the A knob on the B knob.
Mark

April 10, 2008 at 7:38 am #170912
Outlier,
Something weird happened to my earlier reply to your kind post.
I’ll do my best to remember my earlier reply in case the first one appears and there is some inconsistency in argument :)
Imagine a black box with two knobs, one labeled A and the other B.
If A and B are independent, I can adjust A or B independently to target the process, or any combination of A and B.
However, if A and B are not independent and I can’t set A without deference to B’s setting, due to the interaction between A and B, a reasonable man might conclude A and B are not independent.
Therefore, the substance of my argument is that the presence of the interaction AB is an acknowledgment that A and B are not independent.
Do you think this is a reasonable conclusion? If not, where is the fallacy?
Cheers,
Mark

April 10, 2008 at 12:25 pm #170934
Robert Butler, Participant @rbutler
The problem is you are trying to make inferences concerning the relationship between X1 and X2 on the basis of Y. If you go back to the posts below and reread the definition of orthogonality and note how it is determined you can see that Y does not enter into the discussion or the equations. Therefore, Y’s behavior has nothing to do with the relationship (or lack thereof) between X1 and X2.
https://www.isixsigma.com/forum/showmessage.asp?messageID=139371
https://www.isixsigma.com/forum/showmessage.asp?messageID=139451
One additional point: you said “This seems to imply the two curves y1 and y2 are highly correlated? Therefore y1 and y2 are not independent.”
It is true there are two lines but they are not the result of two equations. They are the result of a single equation and their plot is a function of X1 and X2. Their slope difference is due to the presence of an X1X2 interaction. If it wasn’t there then they would have been two parallel lines.
Y = 6.23 +1.76*X1 +2.76*X2 +1.23*X1*X2
To see this just ignore the interaction term in the above and plot the lines that would be the result of
Y = 6.23 +1.76*X1 +2.76*X2
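
A quick way to check the two equations above is to refit them. The sketch below is only an illustration under stated assumptions: it treats the four plotted runs plus the extra replicate as coded -1/+1 settings (the coding is an assumption of this sketch) and uses ordinary least squares from NumPy rather than Minitab.

```python
import numpy as np

# Assumed coded settings (X1, X2) and responses Y: four runs plus one replicate.
x1 = np.array([-1.0, 1.0, -1.0, 1.0, -1.0])
x2 = np.array([-1.0, -1.0, 1.0, 1.0, -1.0])
y = np.array([3.0, 4.0, 6.0, 12.0, 2.9])

# Model matrix for Y = b0 + b1*X1 + b2*X2 + b12*X1*X2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b12 = coef

# Slope of Y against X1 at each fixed X2; the two slopes differ by 2*b12.
slope_low = b1 - b12    # X2 = -1
slope_high = b1 + b12   # X2 = +1

print("coefficients:", coef.round(4))
print("slope at X2=-1:", round(slope_low, 4), " slope at X2=+1:", round(slope_high, 4))
```

Under these assumptions the fitted coefficients land close to the rounded values quoted above. Setting `b12` to zero makes the two slopes equal, which is the parallel-lines case described in the post.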
If you want to graphically assess the relationship between X1 and X2 you would plot them against each other, i.e. X1 on the X axis and X2 on the Y axis. The plot, for the orthogonal matrix we have been using for this discussion, will look like the four corners of a box. If you regress X1 on X2 for the perfect matrix (don’t add the single replicate) you will get a slope of 0 and a correlation of 0; the same would be true for X1 against X1*X2 or any other combination of the X terms. With the replicate you will get a nonzero slope but it will not be significant, and if you had the capability to test the matrix including the replicate you would find that X1, X2, and X1*X2 are orthogonal with respect to one another.

April 10, 2008 at 1:00 pm #170939
I don’t understand why it’s a problem to make inferences about the relationship between X1 and X2 based on their responses. On the contrary, I believe a problem arises when such inferences are not made.
As for there not being two equations, I plotted both and I can derive an equation for each. (?)
Let me give a practical example for the black box I proposed in another recent post.
Consider the exposure setting on a camera, which is a function of aperture size and exposure time. Can we really claim the aperture setting and the exposure setting are independent of each other? I don’t think so, because I have to change the exposure time depending on what aperture I choose. I suspect you would call it a synergistic effect.
So what is the origin of the so-called interaction? It arises because of our choice of response: the amount of illumination striking the film. The latter is not a good response because illumination is not proportional to a linear combination of aperture size and the exposure time. In other words, these two variables do not act independently.
We can call it an interaction if you like, but to just give something a name does not imply the concept is useful.
In this case, as in many other cases, the cause of the dependence is inherent in our choice of the response. In other words, the dependency is ‘built-in’ to the response.
If I change the response to work, that is, the amount of light energy striking the film (illumination squared), the interaction disappears and each factor (aperture size and exposure time) acts independently of the other :)
Mark

April 10, 2008 at 2:20 pm #170946
Robert Butler, Participant @rbutler
The issue was that of independence of the X’s. The definition of independence is that of orthogonality and the definition of orthogonality does not involve Y.
As for the equations: yes, of course you did, and you did it by ignoring the second X. That is, you built an equation relating Y and X1 using the two data points where X2 was -1 and then you built the second where X2 was fixed at +1. In other words, you built two equations where your value of X2, for each of the equations, is a constant. The question would be, given that X1 and X2 are independent as defined previously, why do this? If you analyze your data this way you run the risk of missing the interaction of the two variables and, if this were some kind of venture in efficient production of product, knowledge of that interaction could be the difference between profit and loss.
One area where interactions are a fact of daily life is the world of applied chemistry. Consider sulphur, saltpeter, and charcoal. They are three very independent and distinct things. I can vary their ratios in a mix in any way I choose. Once I have a mix I can apply heat to it and for a number of mixes of the three I will get nothing more than a bunch of smoke and some residual goo.
If I put together the right combination and apply heat I’ll blow a hole in the side of my house and they will scrape my remains off of the kitchen ceiling. That result is not due to a simple linear combination of the effects of the three variables; it is an interaction. Interactions are the heart and soul of thousands of very profitable chemical processes. Without an interaction these processes wouldn’t exist.
I’m confused by your discussion of aperture and exposure time since at the start you seem to be saying they aren’t independent but at the end you seem to be saying they are. However, I’ll offer the following:
For aperture and exposure time, there is nothing that says you have to change one when you change the other. I do large format photography and I have complete control over how I make those settings. The response to my settings will be whatever it is. I may decide I want a particular response and so to get it I will make changes in one or both, but it isn’t the response that is driving the settings; it is my desire to achieve a particular response that will lead me to choose a particular setting…and the only way I can know which setting to choose is because I experimented and varied those settings in some independent way. It is the results of my analysis of those experiments that allowed me to know what kind of a response I could expect when I used a particular combination of aperture and exposure.

April 10, 2008 at 3:29 pm #170952
The point I tried to make was that the origin of the interaction X1*X2 is a dependency between X1 and X2, because of our choice of response.
I can’t comment on your excellent chemical example, because I don’t understand enough about chemical reactions, but isn’t there a fourth component, heat? Who can say what happens to the mixture of three elements in the microsecond before the mixture explodes; perhaps all four components suddenly fuse, at which point they are no longer distinct and independent.
As for the photographic example, you can of course create many different kinds of images, some overexposed and some underexposed. I had hoped you would assume I was referring to the correct exposure condition. You can either measure the amount of energy falling on to the paper, or you can run an experiment to determine the correct exposure, as judged by edge acutance and grain size.
Anyway, thanks for humoring me!
Mark

April 10, 2008 at 3:57 pm #170954
Vallee, Participant @HFChrisVallee
Mark,
X1 and X2 in your camera example are independent. The increase of one or the decrease of the other does not have an effect on one or the other. Now the manipulation of one or the other can have an effect on the final picture, which can then be put into a formula to get the best picture. If you hold one X constant then of course you will have to increase or decrease another X, if not add another one to the equation, like the type of film you select. I like the sweet tea example… cold tea is not good without dissolved sugar in it. The sugar does not dissolve effectively unless put into hot water first; therefore, temp has an effect on sugar’s ability to dissolve.
Robert, I usually don’t get involved in your responses because in most cases you make me dig up my old textbooks as a refresher. I just couldn’t help myself.
HF Chris Vallee

April 10, 2008 at 5:32 pm #170960
HF,
EV = log2(N^2/t)
Perhaps Robert would be kind enough to explain it to you.
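
For readers unfamiliar with the formula, a short expansion may help. It assumes the usual photographic meaning of the symbols, which the post does not spell out: N is the f-number (aperture) and t is the exposure time in seconds.

```latex
\mathrm{EV} = \log_2\!\left(\frac{N^2}{t}\right) = 2\log_2 N - \log_2 t
```

Written this way, EV is the sum of one term in N and one term in t, so on the logarithmic scale the two settings contribute additively, whereas the light reaching the film scales with the ratio t/N^2.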
As for myself, I don’t think debate is dead, and I think there is still plenty to debate about in the world of Six Sigma.
Mark

April 10, 2008 at 5:53 pm #170962
Robert Butler, Participant @rbutler
I think we might be getting bogged down in semantics here, but if your point is best summarized by the first sentence of your most recent post, “the origin of the interaction X1*x2 is a dependency between X1 and X2, because of our choice of response,” then maybe we have a way out of this.
If you know the relationship between the X’s and the Y because of prior work and you have quantified that in some fashion (say a predictive equation) then you can use that equation to identify those combinations of X1 and X2 that will give you a preferred Y.
In that kind of after-the-experiments-have-been-analyzed-and-the-equation-built sense, yes, Y is driving the choice of X1 and X2, but it isn’t driving the relationship, because when you did the initial work to identify the relationship between Y and X1 and X2 you did it by changing X1 and X2 and recording whatever value of Y resulted, not the other way around.
An aside: heat merely acts to get the reaction going. If sulphur, saltpeter, and charcoal are not in the proper proportions you can heat the mix until you are the best customer of the local natural gas company and nothing of any import will happen.
An examination of the residue after an explosion will show that it contains sulphur, saltpeter, and charcoal – in fact, it is by examining the residue after an explosion has occurred that forensic experts are able to tell what kind of an explosive was used.
The reaction is that the charcoal provides fuel to the sulphur and the saltpeter kicks in with a huge dump of oxygen. In the right ratio the oxygen dump is so overwhelming that what you get is a very fast burning fire. If a fire burns fast enough we call it an explosion.
April 10, 2008 at 6:52 pm #170966
Vallee (Participant, @HFChrisVallee)
Mark,
I understand the formula; however, it is still an optimized manual process that is affected by the end result needed, not by the dependency of one variable on the other. The key word here is dependency.
HF Chris Vallee
April 10, 2008 at 7:06 pm #170968
Robert:
I really appreciate your contributions. I’m just trying to stimulate debate and reasoned argument.
Yes, I fully understand your position and I really appreciate your willingness to consider my counterarguments to what is essentially Box’s position.
Your use of the chemical argument for interactions is particularly difficult to overcome from the Taguchi perspective, but rather than dismiss him out of hand, I’ve tried to walk in his shoes and see his worldview.
As far as I’m concerned, you’re pretty unique amongst posters on this site, and while it is not my intention to belittle anyone else, there are many others who appear to take the view that it’s like this because so-and-so said so. How can these people teach in a classroom where difficult questions are par for the course?
We now live in a world that seems bereft of reasoned argument, and my hope is Mike, Alister and HF Chris will get together and decide what needs to be done to improve Six Sigma.
I defer to you in this instance and look forward to debating other issues with you in the near future.
Best wishes,
Mark (an alias for the Grail fool)
April 10, 2008 at 7:09 pm #170969
HF,
My argument is more easily seen if you change the subject of the formula: for a fixed EV, say the correct exposure, aperture depends on exposure time.
Cheers,
Mark
April 10, 2008 at 7:27 pm #170970
Vallee (Participant, @HFChrisVallee)
Mark,
Thanks, I actually looked up the formula in relation to photography to see the connection. However, if you look at the sugar example with sweet tea, you are not manually changing the properties of the tea; the temperature of the water is. Thus there is a dependency and an interaction. In your photo example, by contrast, these are optimal settings dependent on the output desired (double exposure, white light, etc.).
April 10, 2008 at 8:55 pm #170980
HF,
I’ve only had a brief introduction to dissolution rates, and I’m not sure how it applies to sugar and tea.
Does the dissolution rate of sugar in tea depend on an interaction between the temperature of the water and the concentration of tea? I have no idea.
Even if we consider stirring as a factor, how do we know if it is an interaction or just a linear combination of tea concentration, the amount of sugar, the stirring rate and time, and the water temperature?
I previously tried to argue that if all these factors are independent, the interaction term will be zero, and the converse: if the process is described by an interaction, then the factors are not independent due to the choice of the response, in this case the dissolution rate of sugar.
If I put my Taguchi hat on again: since I can’t find an equation describing the dissolution rate as a function of the permeability and precipitation of sugar, I’d use sliding levels to avoid any possible interaction, or try to find a response closer to work done (energy).
Cheers,
Mark (an alias for the Grail fool)
April 10, 2008 at 9:19 pm #170981
Vallee (Participant, @HFChrisVallee)
Mark,
I grew up in the south. Just try to mix sugar in cold water: no matter how strong the tea is, it just sits on the bottom. No matter how sweet you want the tea to be (the photo quality), there is no optimal mixture (aperture etc.). Now if you take your optimal photo settings you can see a correlation and a possible interaction, but it is not cause and effect, just a controlled optimal correlation. If you don’t change your aperture setting it has no effect at all on the exposure time setting, just on your photo quality. Not sure if Robert would agree, but this has always worked conceptually for me.
April 11, 2008 at 3:04 am #170986
Robert Butler (Participant, @rbutler)
I was wondering if the view you were considering was Taguchi’s. If you are trying to understand his point of view, understand the “traditional” view, and reconcile the two, you will have to spend some time understanding the mathematics and the manipulation of the elements of a design matrix. In particular, you will need to understand orthogonal in the mathematical sense and you will have to understand how design matrices are manipulated by a computer to extract the measure of the effect of a variable in the design.
You will also need to understand that an awful lot of the supposed differences in the two methods is nothing more than word play. To this last point I’ll offer the links below to a couple of posts of mine. The first is an overview of some of the issues you might want to consider. The second series of posts are part of a thread. I’ve listed the posts in the back and forth order in which they appeared – two are mine and one is from one of the individuals with whom I was conversing. The entire thread was a good discussion and you may want to read all of it. If not, the three listed are the ones that are central to the issue of method comparison.
https://www.isixsigma.com/forum/showmessage.asp?messageID=108236
Discussion series: We were discussing the issues surrounding the choice of a design to analyze a process. One of the posters asked for an example of the Taguchi approach. The individual who posted the following offered an example from Roy’s book and I offered the “traditional” approach to the same problem.
https://www.isixsigma.com/forum/showmessage.asp?messageID=113397
https://www.isixsigma.com/forum/showmessage.asp?messageID=113415
https://www.isixsigma.com/forum/showmessage.asp?messageID=113428
As I mentioned in the thread, from my personal standpoint, as long as you are thinking design and not one-at-a-time or wonder-guess-putter, I don’t care which design you use.
One thing to consider when you are comparing the two is this: if Taguchi doesn’t think there is such a thing as an interaction then why hasn’t he made this belief a part of his designs? If I have a fractional design for an inner array and a full factorial design for an outer array then that full factorial is needed only if you have an interest in chasing the effects of outer array interactions. Furthermore, the multiplication of the two designs will result in experiments that are necessary only if you have an interest in checking interaction effects between inner and outer array elements.
The same issue with respect to inner and outer array elements holds true if both the inner and the outer array are fractional factorials. So if Taguchi really doesn’t believe in interactions then why hasn’t he, or one of his followers, done the obvious and removed those unneeded experiments (costs) from his design matrices?
As a final thought, please don’t think I’m going to go off on some kind of a Taguchi bash. I don’t do bashing or flaming; they are a waste of time and computer key strokes. The only reason for offering the above is because you give me the impression you are trying to get a better understanding of issues and concepts and I hope the above will be of some value.
April 11, 2008 at 8:45 am #171002
The Grail Fool (Member, @TheGrailFool)
HF,
If you grew up in the south you ought to know you put your sugar along with your lemon in your cold tea before you stand your tea in the sun :)
I can’t speak to the sugar dissolution problem because it’s outside my experiential comfort zone :) However, I’ve a vague familiarity with some optical concepts.
Let me give you another optical example, one I’m sure you’re familiar with.
The resolving power of an optical lens is given by:
r = 1.22 Lambda / D
where Lambda is the wavelength of light and D is the size of the aperture.
If Lambda and D are independent, we might speculate a relationship as follows:
r ~ 1/D + Lambda
Unfortunately, this is clearly wrong because when D = 0 there is still some resolving power. Therefore, r has to comprise an interaction between 1/D and Lambda. In other words, D and Lambda are not independent. Let us now consider the origin of this dependence.
When light waves impinge on an aperture, the resulting pattern of light intensity is known as interference, the result of which is to increase the blur size in the image plane. In other words, Lambda and D are not independent as we might expect. Once someone chooses a resolving power and wavelength of light, they lose the freedom to choose D, even though there is an obvious button to twiddle. It’s a bit like having five numbers, but only four degrees of freedom with which to calculate standard deviation. Of course one is free to obtain the wrong answer …
Does this imply there is a correlation and by implication no cause and effect between Lambda and aperture? Who knows; it depends on your choice of response.
Cheers,
The Grail Fool
April 11, 2008 at 10:17 am #171005
The Grail Fool (Member, @TheGrailFool)
HF,
I thought most people down south put sugar and lemon in their tea before standing it in the sun :)
Anyway, I’ve thought of another example that you might be more familiar with: resolving power, r.
r = 1.22 L/D
where L = wavelength and D = aperture.
If you want a particular r to identify a tank at 4,000 meters, then you’ll need a fairly large hole, D, in your cupola. Since you can’t choose a wavelength outside a certain band, you’re stuck with the large hole, even though in principle you can select an aperture of any size.
The interaction describing the relationship between r and L arises because of interference in the aperture plane and consequential blurring in the image plane; so there is a clear dependency or correlation between D and L, or between r and L.
Is this cause and effect? I’d say so, provided you’ve studied the relationship using DOE. In fact this is one of the main advantages of DOE, because the method does not rely on unbiased happenstance data.
As far as the camera model is concerned, it is more clearly seen if you pose an incorrect relationship, one that relies on the independence of D and t.
Clearly, the EV does not ~ D + t because when t = 0 the equation predicts a value of EV.
This is why we need the multiplicative form that corresponds to an interaction. (Not the same as a reaction.)
Cheers …
April 12, 2008 at 2:41 pm #171087
Vallee (Participant, @HFChrisVallee)
Mark (The Grail Fool)
I have not ignored the post; I just need to do a little research on your example, and I am leaving for a facilitation for the next 10 days. I would like to understand the concept given in your example before I respond.
Thanks
April 12, 2008 at 4:21 pm #171090
The Grail Fool (Member, @TheGrailFool)
No problem Chris, I look forward to continuing our discussion.
While researching, you might want to take a look at the following article:
http://www.geocities.com/jswortham/gunpowder.html
I should also acknowledge my training under Dr. Taguchi, as I don’t want you to think my arguments are in any way original. I also don’t want to imply I’m sufficiently competent to represent his views.
Specifically, Dr. Taguchi often recommends trying to find responses closely related to energy in order to minimize interactions.
(Although Taguchi Methods is often not considered part of Six Sigma, it was in Austin. Someone changed Six Sigma and we ought to get back to what worked as quickly as possible. As has been mentioned in this forum previously, early Six Sigma even included much of TPS, which is hardly surprising given the number of Japanese engineers who came to Austin.)
My understanding of Taguchi’s justification for his approach is the question of stability. This is my interpretation; I may get it wrong.
If the process can be modeled by a linear equation, such as:
y = x1 + x2, where x1 and x2 are independent,
then the error is just:
dy/dx = dx1/dx + dx2/dx
On the other hand, if y = x1*x2, then the error will be:
dy/dx = x1(dx2/dx) + x2(dx1/dx)
So you can see the potential for error (variation) is much greater with dependent variables. (By dependent I mean the level of one determines the level of the other.) This is why Taguchi recommends trying to avoid setting up processes with interactions.
From a practical point of view, if a piece of equipment has two knobs A and B, there is no knob labeled AB anyway, which is a distinct disadvantage on the shop floor.
Good luck with the facilitation ..
Cheers,
TGF
April 12, 2008 at 4:46 pm #171091
Robert Butler (Participant, @rbutler)
If you are trying to understand the similarities/differences between Taguchi and “traditional” you might find this old post of mine of some value.
https://www.isixsigma.com/forum/showmessage.asp?messageID=108236
Interactions:
I can’t speak to the mathematics you’ve offered concerning Taguchi’s justification for avoiding interactions but I can offer the following concerning the knobs on the factory floor – there doesn’t have to be a knob labeled AB, you have control over A and you have control over B and you set them at the levels identified by the regression equation that will take the most advantage of the interaction term.
One additional thought is this: if Taguchi really doesn’t think there is such a thing as an interaction then why hasn’t he, or one of his followers, reduced his designs to reflect this? If you take an inner array and an outer array and multiply the two together you get experimental combinations whose only value is that they will provide you with information concerning interaction terms.
April 12, 2008 at 6:54 pm #171093
The Grail Fool (Member, @TheGrailFool)
The preference for avoiding interactions in the inner array also applies to main effects and noise factors in the outer array. Unfortunately, a linear combination of process settings and locations within processes is as yet unknown :)
April 12, 2008 at 9:23 pm #171094
Robert Butler (Participant, @rbutler)
I’m not sure we’re talking about the same thing. Let’s take a simple example.
2 process variables (A, B) and two noise factors (P, Q), each at two levels.
The inner array is
 A  B
-1 -1
 1 -1
-1  1
 1  1
and the outer array is
 P  Q
-1 -1
 1 -1
-1  1
 1  1
If we multiply these together we get the following
 A  B  P  Q
-1 -1 -1 -1
-1 -1  1 -1
-1 -1 -1  1
-1 -1  1  1
 1 -1 -1 -1
 1 -1  1 -1
 1 -1 -1  1
 1 -1  1  1
-1  1 -1 -1
-1  1  1 -1
-1  1 -1  1
-1  1  1  1
 1  1 -1 -1
 1  1  1 -1
 1  1 -1  1
 1  1  1  1
which is identical to a 4 variable 2 level factorial design. This means that there are experiments in this design which allow for the investigation of AB, PQ, AP,AQ, BP, BQ, and all of the three and the 4 way interaction as well.
If you were going to just look at main effects, this design would shrink to the following:
 A  B  P  Q
-1 -1 -1 -1
 1 -1 -1  1
-1  1 -1  1
 1  1 -1 -1
-1 -1  1  1
 1 -1  1 -1
-1  1  1 -1
 1  1  1  1
and you could probably knock this down by a few more if you wanted to run the above through a D-optimal procedure, since even this design allows for the investigation of a couple of the two way interactions.
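The counting argument in this example can be checked with a short sketch. It assumes coded levels of -1 and +1 and, for the reduced design, the generator Q = A*B*P; that generator is an illustrative choice and is not taken from the post.

```python
from itertools import product

# Two-level full factorials in coded units (-1 = low, +1 = high).
inner = list(product([-1, 1], repeat=2))  # process variables A, B
outer = list(product([-1, 1], repeat=2))  # noise factors P, Q

# "Multiplying" the arrays: every inner row is run against every outer row.
crossed = [ab + pq for ab in inner for pq in outer]

# The crossed array contains exactly the runs of a 2^4 full factorial.
is_full_factorial = (
    len(crossed) == 16 and set(crossed) == set(product([-1, 1], repeat=4))
)

# An 8-run half fraction, built with the assumed generator Q = A*B*P.
half = [(a, b, p, a * b * p) for a, b, p in product([-1, 1], repeat=3)]


def dot(u, v):
    """Inner product of two design columns."""
    return sum(x * y for x, y in zip(u, v))


cols = list(zip(*half))  # columns A, B, P, Q

# All four main-effect columns stay mutually orthogonal in the half fraction.
main_effects_orthogonal = all(
    dot(cols[i], cols[j]) == 0 for i in range(4) for j in range(i + 1, 4)
)

# Two way interactions are still estimable, but only as aliased pairs (AB = PQ).
ab_aliased_with_pq = all(a * b == p * q for a, b, p, q in half)

print(is_full_factorial, main_effects_orthogonal, ab_aliased_with_pq)
```

Under these assumptions the 8 runs carry the four main effects plus three aliased pairs of two way interactions, which is consistent with the remark that the design could be trimmed further if only main effects matter.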
This same issue applies to other multiplications of inner and outer arrays.
April 13, 2008 at 12:22 am #171098
Vallee (Participant, @HFChrisVallee)
Robert,
Maybe I misunderstood his original post. But was he not looking for whether variables could be independent and still have an interaction? To put it more simply, you can use a linear regression formula to find an optimal output (the best fit of all weights), but this does not indicate dependency of the x’s. The transformation of one x caused by another, however, would indicate dependency if this was required to get the correct output. Also, thanks for the update on orthogonal rotation in the other post. Turns out I did remember something from the old stats days.
HF Chris Vallee
April 13, 2008 at 9:51 am #171104
The Grail Fool (Member, @TheGrailFool)
You’re correct, we’re not talking about the same thing, which is an important point we’ll have to come back to later.
Your outer/inner array point is also well taken; I performed the same analysis myself in 1986. Surely, before discussing a model, we must first decide what assumptions have been made about the nature of the process (whether they’re valid or not), the responses, the control factors, the noise factors, and any relationship between factors, before we can make such a decision?
Or do you claim the structure you propose is independent of process?
In order to clarify the last point, may I suggest we define a process in terms including, but not limited to: chemical components, chemical concentration, mechanical action, time, and temperature.
(By mechanical action, I’m referring to physical variation within a process.)
April 13, 2008 at 10:01 am #171105
HF,
No, my contention is that truly independent variables do not interact. The fact that an interaction exists suggests that what we previously thought of as independent, by putting the variables in columns where they ought to be independent, was mistaken.
As justification for this I’ve cited Taguchi’s position that interactions by and large depend on our choice of response.
For example, if instead of using EV as a response we used log EV, the interaction disappears!
I’ll leave it to others to decide whether this is problematic or not.
I hope you and Robert are enjoying the discussion. If it’s frustrating, I’ll just go away :)
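The log-response claim can be illustrated with a small sketch of a 2x2 design on the camera example. It takes the unlogged quantity N^2/t as the "raw" response and log2(N^2/t) as the logged one; the f-numbers and exposure times are hypothetical values chosen only for illustration. It shows the algebra for an exactly multiplicative response and says nothing about what happens with real, noisy data.

```python
import math

# Hypothetical low/high settings in coded units (-1 = low, +1 = high).
levels_N = {-1: 4.0, 1: 8.0}           # f-number
levels_t = {-1: 1 / 250, 1: 1 / 60}    # exposure time in seconds

# 2^2 full factorial in (N, t).
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]


def interaction_effect(response):
    """N-by-t interaction effect: mean(y | a*b = +1) - mean(y | a*b = -1)."""
    ys = [response(levels_N[a], levels_t[b]) for a, b in runs]
    return sum(a * b * y for (a, b), y in zip(runs, ys)) / 2


# Multiplicative response: the interaction contrast is clearly nonzero.
raw_interaction = interaction_effect(lambda N, t: N**2 / t)

# Logged response: log2(N^2/t) = 2*log2(N) - log2(t) is additive, so the
# interaction contrast vanishes up to floating-point rounding.
log_interaction = interaction_effect(lambda N, t: math.log2(N**2 / t))

print(raw_interaction, log_interaction)
```

The same runs, with the same independent settings of N and t, give an interaction on one response scale and none on the other, which is the narrow sense in which the interaction "depends on the choice of response".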
April 13, 2008 at 2:58 pm #171107
Robert Butler (Participant, @rbutler)
No frustration here, TGF; I’m enjoying this exchange. The problem would appear to be that of contention versus mathematical definition. If I have two columns that meet the requirements for orthogonality and if I multiply the two together I can generate a third column which is orthogonal to and independent of the other two. This is just a fact of mathematics.
This kind of behavior is not something that is limited to design matrices. In vector spaces a cross product of two orthogonal vectors will generate a new vector which is orthogonal to, and independent of, the original two. The i,j,k unit vectors and their relationships are central to much of physics.
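A minimal sketch of both statements, assuming the usual -1/+1 coding for a two-variable, two-level design: the elementwise product of two orthogonal design columns is orthogonal to both, and the cross product of the unit vectors i and j is the unit vector k.

```python
# Two-level design columns in coded units.
A = [-1, 1, -1, 1]
B = [-1, -1, 1, 1]

# Interaction column: elementwise product of A and B.
AB = [a * b for a, b in zip(A, B)]


def dot(u, v):
    """Inner product; zero means the two vectors are orthogonal."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


# All three design columns are mutually orthogonal.
design_dots = (dot(A, B), dot(A, AB), dot(B, AB))

# The vector-space analogue: i x j = k, orthogonal to both i and j.
i, j = [1, 0, 0], [0, 1, 0]
k = cross(i, j)

print(design_dots, k, dot(k, i), dot(k, j))
```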
As for stating that the interaction is dependent on the choice of response, I think one should think in terms of the kind of response one gets when one varies variables that are independent of each other. The relationship of the response to the independent variables will be whatever it is. In some cases it will be a simple additive linear relationship but in other cases the relationship will be multiplicative. As I mentioned before, applied chemists are always on the lookout for combinations of independent factors that will result in a multiplicative as opposed to a simple additive effect, and they express their findings in terms that reflect this.
Shifting subjects: I got the impression from one of your earlier posts that you had studied directly under Taguchi. If this is the case and if you have run a lot of Taguchi designs in industry I would be interested in your thoughts on the following:
In my career there have been perhaps a dozen or so times when I was called in to look at a process after someone had tried to run a Taguchi design and had come up with not much. In looking over the actual settings of the design, particularly for the defined noise variables, more often than not I’ve noticed a very poor level of control with respect to the settings for these variables. By this I mean the following:
Let’s say I have a noise variable and I have defined its low level to be 5 and its high level to be 15 and I’m only interested in running a two level design with this variable. What I have seen is that instead of controlling the low level to some reasonably tight region around 5, say 4-6, and the high level to a similar region, say 14-16, what I find in the data sheets is that the low level can be anything from, say, 2-10 and the high can be anything from 12-20. In other words I have a “fuzzy” low and a “fuzzy” high setting for the variable. (This lack of control was not due to the fact that the investigators couldn’t achieve a tighter control; it was just the fact that they were under the impression they didn’t have to.) This, by itself, is no big deal as long as one takes this into account when running the analysis.
What I have seen in these cases is that the individual who did the analysis substituted -1 as the code for the low setting and +1 as the code for the high setting regardless of the actual low or high value. If there really is an interaction present and if you code and analyze your data in this fashion you are almost guaranteed to never detect it. The proper way to code your data would be to use the actual minimum and maximum settings as -1 and 1 and then scale all of the other actual values to something in between. If your excursions away from the ideal minimum and maximum are extreme enough you may very well find that the interaction term for the actual X matrix is either not measurable (i.e. it is not independent of the two terms used to generate it) or it is looking for an interaction over a much smaller region than the region you examined and in that region it is not significant.
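The coding issue can be sketched as follows. The recorded noise settings below are hypothetical, chosen to fall inside the "fuzzy" 2-10 and 12-20 ranges mentioned above, and the four-run layout is an assumption made only to keep the example small.

```python
import math


def code(values):
    """Scale actual settings so the observed min maps to -1 and the max to +1."""
    lo, hi = min(values), max(values)
    mid, half_range = (lo + hi) / 2, (hi - lo) / 2
    return [(v - mid) / half_range for v in values]


def pearson(u, v):
    """Pearson correlation between two columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return cov / (su * sv)


A = [-1, 1, -1, 1]               # well-controlled process variable
nominal_noise = [-1, -1, 1, 1]   # what the analyst typed in for the noise factor
actual_noise = [2, 10, 20, 12]   # hypothetical settings actually recorded

coded_noise = code(actual_noise)

# Interaction columns under each coding.
nominal_interaction = [a * n for a, n in zip(A, nominal_noise)]
actual_interaction = [a * n for a, n in zip(A, coded_noise)]

# With the nominal codes the interaction column looks perfectly independent.
r_nominal = pearson(nominal_interaction, nominal_noise)

# With the actual settings it is nearly collinear with the noise column itself.
r_actual = pearson(actual_interaction, coded_noise)

print(round(r_nominal, 3), round(r_actual, 3))
```

In this contrived case the interaction column built from the real settings is almost a copy of the noise main-effect column, which is one way an interaction term can become "not measurable" even though the nominal design says it should be.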
My question is this: is this lack of control with respect to a noise variable part of the Taguchi philosophy or is it just a case of a biased sample in terms of the few cases I’ve seen? If it is part of the operating philosophy I could easily understand how a person could come to the conclusion that interactions don’t exist.
April 13, 2008 at 3:37 pm #171108
Robert,
I’ve only time to address your first comment today. I’ll give the other some thought tomorrow.
Yes, I agree, but the origin of the problem is not the orthogonality of the model; it is the assignment of variables that were thought to be independent, but are not.
If variables are truly orthogonal the magnitude of the interaction vector would be equal to the null vector, or as close to null as could be expected with experimental error.
Another simple fact of mathematics is that if a*b = y then by implication a = y/b, and a depends on b, even though y is orthogonal to both a and b. If this is true, then clearly a is not orthogonal to b …
This is why some procedures use Gram-Schmidt orthogonalization: to ensure all variables are orthogonal, including cross terms.
Before we digress to your other comment, I’d like to come back to the justification for taking a factor in the outer array and putting it in the inner array. (Of course, excluding your rather excellent example of how to convert humidity from an uncontrollable noise variable into a quasi-controllable variable.)
As soon as we discuss that I’d be happy to discuss your other important observation.
TGF
April 13, 2008 at 8:59 pm #171114
Vallee (Participant, @HFChrisVallee)
Going with Robert’s mathematical perspective, I would like to see what your true definitions of dependent and independent variables are. If I remove the output parameter, the x variables you described in numerous posts do not correlate with each other and have no effect on each other when paired. The dependency then comes from their relation to the output, not their relation to each other.
April 14, 2008 at 5:40 am #171125
Agreed …
The dependency of x2 on x1 and vice versa in the interaction x1*x2 is by virtue of their relationship with the response, y.
If we change the response y, then the interaction dependency may well disappear, such as when I transform y using the log transformation.
My previous comment to Robert about ‘structure’ was directed towards the interactions between inner array factors and factors in the outer array. On the surface, this practice gives a false impression of inefficiency and Robert is quite correct to question it.
By transferring a factor from the outer array to the inner array, something has been assumed about the process.
April 14, 2008 at 12:23 pm #171132
Robert Butler (Participant, @rbutler)
The only relationship between an independent X1 and X2 and a dependent Y is the fact that (assuming cause and effect) Y is a result of changes in X1 and X2, whereas if I change Y its change may or may not have anything to do with X1 and X2. This is the whole reason for randomization of experimental runs. If you try to couch the discussion in the form that X1*X2 = Y is the same as X1 = Y/X2, and thus this means X1 and X2 aren’t independent, then you have the same problem if you say X1 + X2 = Y, since X1 = Y - X2.
As for transforms, such as the log, rendering an interaction insignificant: I can imagine situations where this could happen; however, this isn’t what I normally see. What happens is the following: I will run stepwise and backward elimination, identify a possible final model, run the regression diagnostics, and find that the residual plots indicate I need to do something to the data to better ensure an accounting of the signal present in the system.
If I get a shape in the pattern the first choice is going to be a log of the Y or the X or perhaps both. If I do this and the residual pattern changes to a picture of a matzo ball then I will be satisfied that the transform was appropriate. However, when this is done, the coefficients will change but, in general, the significant terms remain. This means that if the interaction was present without the transform it is still there after the transformation.
One thought – is there any chance this issue is nothing more than the choice of the word “interaction”? If this is the case then we can always call the effect what it really is – a multiplicative effect of two independent variables. If this isn’t the case then it would appear that Taguchi’s position is that the only effect two independent variables can have on any outcome is that of simple addition… and that is contrary to what is known about a large number of physical and chemical processes.
This also brings up the other question I had. In the post I wrote about the differences between Taguchi and traditional I quoted the one poster who claimed Taguchi said “When there is no interactions among control factors, DOE is a waste – you can use OFAT and get the same result.”
As I said at the time, I hadn’t gone back to check this and I must admit I still haven’t, but since you are a practitioner of the Taguchi methods could you check your books and see if a) Taguchi does indeed say this and b) if it has been taken out of context and thus does not really mean what the stand alone sentence implies.
April 14, 2008 at 1:12 pm #171137
I thought the whole point about y = X1 + X2 is that X1 can be adjusted to a target without having to reset X2, and vice versa. By way of contrast, setting an interaction seems much more confusing, because there isn’t an X1*X2 knob on process equipment.
Thank you for admitting a modification of the response can minimize an interaction effect in some circumstances. I’ve seen many situations where it does work. (Perhaps as many as 1000 experiments.)
Many companies in Japan, including Toyota, Hitachi, Toshiba, and Fujifilm, use Taguchi Methods with great success.
As for your last comment, I can confirm he holds this view, but I have no idea how to argue this on his behalf, so I’m not even going to try.
Neither will I argue his lack of randomization.
Although I’m willing to try and justify his use of an inner and outer array.
TGF
April 15, 2008 at 4:08 pm #171194
Robert Butler (Participant, @rbutler)
Let’s try for a recap.
Assumptions:
1. The accepted mathematical definition of independence is to be replaced with a nonmathematical conjecture that somehow Y needs to be part of the definition of independence between two X variables.
2. Interactions are an artifact that can be made to disappear by running some kind of transform on the Ys or the Xs.
This would suggest the following:
I have a standard 2 variable 2 level design in the following form
 A  B AB
-1 -1  1
 1 -1 -1
-1  1 -1
 1  1  1
I apply the Taguchi Transform and the AB column vanishes, which means any effect I might attribute to AB vanishes and is somehow folded into the effect of A and B.
Next I decide to run a Taguchi design. I have 3 process variables and I choose the standard 2 level fractional factorial for these 3 variables. The matrix is as follows:
 A  B  C
-1 -1  1
 1 -1 -1
-1  1 -1
 1  1  1
Therefore when I apply the Taguchi Transform the effect of C disappears and is folded into the effect of A and B. Whether I was turning knobs or not is irrelevant: the effect of C is computed in exactly the same way that the AB effect is determined in the first matrix.
If it were true that a transform would always obliterate a column representing an interaction, and thus its effect, it would mean Taguchi could only use full factorial designs to build his inner and outer arrays, since fractionation automatically confounds one or more main effect variables with one or more interaction terms.
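The confounding described here can be verified directly. The sketch below assumes the standard half fraction of a three-variable, two-level design with defining relation I = ABC, so that the C column is identical to the AB product; the response values are hypothetical numbers used only to show that the two contrasts coincide.

```python
from itertools import product

# Full 2^3 factorial in coded units, then keep the half with A*B*C = +1.
full = list(product([-1, 1], repeat=3))
half = [(a, b, c) for a, b, c in full if a * b * c == 1]

# Column for the main effect C and column for the AB interaction.
c_col = [c for _, _, c in half]
ab_col = [a * b for a, b, _ in half]

# In this fraction the two columns are the same column.
c_equals_ab = c_col == ab_col

# Any response gives the same contrast for "C" and for "AB".
y = [12.0, 15.5, 9.0, 20.0]  # hypothetical responses, one per run
effect_c = sum(c * v for c, v in zip(c_col, y)) / 2
effect_ab = sum(ab * v for ab, v in zip(ab_col, y)) / 2

print(len(half), c_equals_ab, effect_c == effect_ab)
```

Because the arithmetic cannot tell C from AB, any procedure that made every interaction column vanish would make this main effect vanish as well.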
I appreciate that Taguchi believes he can make interactions disappear and I would be willing to believe there might be occasions when this could happen (just as I would believe you might be able to make a main effect disappear).
With experimental design data I also think these occasions are rare; if they were commonplace no one would have wasted all of the effort that has been and is being expended on interactions in the world of statistics, and the concept of an interaction would be nothing more than a simple footnote in advanced books on experimental design.
Given the number of times I’ve had to transform data due to the dictates of the residual plots I’m also quite certain I would have noticed this stated effect on interaction terms. Admittedly, I haven’t run 1000 experimental designs; I’ve probably averaged around a design a month during my career, which would translate into something in the region of 250. Still, most of them have involved lots of interactions and I must admit I can’t recall ever seeing this effect.
Rather than belabor the point – what would be of value would be a real example. Given the number of experiments you have run is there any chance you could provide the raw data from a design where this effect can be observed?
The above should not be interpreted to mean I think Taguchi designs are worthless. I know various and sundry people have had successes using them. Given that they are just modifications of proven, existing designs this isnt particularly surprising and, as I mentioned in the post contrasting the two, I dont care whether you choose to run one or not. My only concern would be that you base your choice on an understanding of the mathematics and the real similarities and differences and not on some misapprehension concerning something such as the confusion surrounding the word noise.0April 15, 2008 at 4:24 pm #171195
Robert Sutter, Member, @RobertSutter
DOE is an acronym for Design of Experiments. This is a process improvement method in which key input variables from the Analyze phase are manipulated in a controlled experiment. The purpose of the experiment is usually to optimize the settings of the key input variables. When a DOE is conducted, the statistical analysis of the data can be used to infer causality.
Regression analysis can be used to analyze the data collected during a DOE. However, regression analysis can also be conducted on data that is not collected from a DOE. In this instance inferences about causality cannot be made.
0April 15, 2008 at 5:37 pm #171197
Robert: A great thread. I’d like to make it obvious that people using the term “historical DOE” are, in fact, doing regression and not DOE. Given that, they should note all the caveats in the above thread. Cheers, Alastair
0April 15, 2008 at 8:08 pm #171207
I’m certainly willing to provide some data, provided I’m satisfied you understand the process. This seems perfectly reasonable because the last thing I want is someone running around half-cocked!
My justification for caution is as follows:
There are many practitioners in fields that do not encounter certain problems. In this sense, practitioners are truly the product of their environment.
For example, in the manufacture of mechanical components, raw materials do not vary as much as they do in the manufacture of rubber seals. (In some mechanical processes you can run a first off and produce 5,000 components before seeing burrs.) Unfortunately, I know nothing about the aircraft industry and defer to you.
In some industries the ability to adjust a process is critical – even when you have an assay for each batch and physical properties, which is why Taguchi places a lot of emphasis on the additivity of effects.
In the semiconductor industry, process engineers by and large have an excellent understanding of how to target the process average; what they don’t tend to have a good handle on is variation within a run. If you have this sort of background – all well and good, but you can hardly blame me for wanting to calibrate our world view, especially as some authors’ examples of Taguchi methods have been puerile to say the least – so I can certainly understand your skepticism.
On the other hand, several thousand case studies have already been published in the proceedings of the ASI. Yet more are in Crevelling’s and other books. (By the way, I have no connection with ASI and I’m not seeking to promote their views.) On the contrary, I’d like to see traditional DOE adopt some of Taguchi’s ideas, which is why I’m visiting the forum. Neither do I consider myself a practitioner of Taguchi Methods, although I’ve used them extensively in the past.
I believe some of Taguchi’s ideas are sufficiently interesting to bring them to the attention of members of the forum. By the same token, there is much about Taguchi Methods I don’t agree with, or do not understand. In this spirit, I’m willing to continue our discussion.
Of the thousand or so experiments I referred to, some were undertaken by myself, others were conducted by other process engineers at Motorola. Every one of the experiments used a log transform, but since many authors have misrepresented Taguchi Methods, or have used too simplistic examples, which would be better served by traditional DOE, I’m not surprised you’re skeptical, which is why I thought to provide a better explanation of Taguchi’s work – bearing in mind my own limitations.
(By the way, I’ve provided information concerning a substantial portion of the 1000 or so experiments to a member of this forum to check independently.)
Please be aware there are those among us who would seek not only to promote themselves, but at the same time undermine the efforts of others. As a consequence there is little information available about the large number of engineers trained in Taguchi Methods prior to 1988, and the large number of Japanese engineers and managers who came to Motorola to help them achieve world-class performance.
Coming back to your 3-factor example, I don’t see why, if the AB interaction is zero, the estimate of the main effect C would be ‘folded over’ or aliased with A, or B, or with both.
I also perceive a difference in our approach – I would like knobs to be independent, while you seem to place more emphasis on columns in a factorial.
Anyway, before we speculate further, isn’t it important to first establish the response/s?
Unfortunately, before we can discuss the response we need to calibrate our process engineering objectives, and part of that discussion must be the difference between the inner array and the outer array.
I keep stressing this because I don’t think it is what you expect :)
0April 15, 2008 at 9:44 pm #171213
Robert Butler, Participant, @rbutler
I think I see the disconnect and if I’m right then what we are dealing with is nothing more than an issue of words.
Before I say anything about that let me make one last statement about independence. I’ve been trying to think of ways to explain the idea and I guess I’ve failed. About all I can say is that in the design matrix the columns A, B and C, or A, B and AB, are independent, and knobs to turn don’t have anything to do with it. I would urge you to take the time to understand this idea. If you don’t, you will run the risk of misinterpreting a great many things with respect to the concepts of experimental design.
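A small sketch of what independence of columns can look like in practice. It assumes that "independent" here means the columns of the design matrix are mutually orthogonal, and it uses a full two-level factorial in A and B with -1/+1 coding. The helper `dot` and the dictionary `pairwise` are names invented for this example.

```python
from itertools import combinations, product

def dot(u, v):
    """Sum of elementwise products of two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))

# Full 2^2 factorial, levels coded -1 / +1.
runs = list(product((-1, 1), repeat=2))

columns = {
    "A": [a for a, _ in runs],
    "B": [b for _, b in runs],
    "AB": [a * b for a, b in runs],  # interaction column, computed not "turned"
}

# Dot product of every pair of columns; zero means the pair is orthogonal.
pairwise = {
    (n1, n2): dot(columns[n1], columns[n2])
    for n1, n2 in combinations(columns, 2)
}

for pair, value in pairwise.items():
    print(pair, value)
```

Every pairwise dot product is zero, including those involving AB, even though AB is not a knob anyone sets directly. That is one way to read the statement that knobs have nothing to do with it.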
Here’s where I think the disconnect may have occurred:
You said that at Motorola everyone is using logs and building regression equations.
Is there any chance the final models look like the following?
#1 Log Y = logX1 +logX2 ?
If the regression equations are in this form then the back transform of the equation is
Y = X1*X2
And the linear form is nothing but an interaction.
The kind of equations I’m talking about would start out looking like
Y = b0 +b1*X1 +b2*X1*X2
When I examined the residuals I would find the usual shape. Under these circumstances a good first choice is to look at a log transform of the Y and rerun the regression. If the Y was the culprit then the residual plot looks like a shotgun blast and the final model is
Log Y = b0 +b1*X1 +b2*X1*X2.
When I asked for an example what I was looking for was a design matrix and a response that when modeled gave a form like
Y = b0 +b1*X1 +b2*X1*X2
And that when I took a log the model became
LogY = b0 +b1*X1 +b2*X2
If the log transform model you are considering is that of #1 above, then we have resolved the problem – you haven’t eliminated an interaction, you have just taken its log.
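The back transform of form #1 can be checked numerically. This is only a sketch: it assumes unit coefficients and no intercept, as #1 is written, and the function names `model_log_form` and `back_transform` are hypothetical.

```python
import math

def model_log_form(x1, x2):
    """Form #1 with assumed unit coefficients: log Y = log X1 + log X2."""
    return math.log(x1) + math.log(x2)

def back_transform(log_y):
    """Undo the log to recover Y on its original scale."""
    return math.exp(log_y)

# Arbitrary positive settings, chosen only for illustration.
settings = [(2.0, 3.0), (0.5, 8.0), (10.0, 10.0)]

for x1, x2 in settings:
    y = back_transform(model_log_form(x1, x2))
    # The back-transformed Y matches the product X1*X2 (up to rounding),
    # i.e. the additive log model is an interaction on the original scale.
    print(x1, x2, round(y, 6), x1 * x2)
```

The model is additive in the logs but multiplicative in the original units, so the X1*X2 term has been re-expressed, not removed.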
Finally, when you are building a design all you need is evidence that the variables you are interested in changing may have an impact on the response you are trying to measure. What you need to know are the variables you are interested in checking and what limitations may apply to them with regard to their combinations.
0April 16, 2008 at 8:06 am #171227
Yes, there is a disconnect, and I don’t hold any hope of being able to discuss any other of Taguchi’s ideas with you.
I never stated: “You said that at Motorola everyone is using logs and building regression equations.”
My reference was to the 1,000 or so experiments, a portion of which were run more than twenty years ago! I have no idea what they do now – considering their failures, I can only assume they’ve gone back to TQM.
In fact, in 1987 or so there were two facilities in Austin, one of which was a research facility. The research facility development was under the guidance of an eminent statistician, further supported by a Prof. at Texas University and a statistician based in California.
The other team, based in a production facility, comprised three process engineers trained in Taguchi Methods and traditional statistics, and a group of TPM operators. (At that time over 50 process and device engineers had been trained in Taguchi Methods and traditional statistics.)
By chance, these groups both developed photo resist processes for 0.8 micron product. (Laughable these days, but back then quite a challenge.)
In the end, the research facility had to transfer the photo process from the production facility because their process didn’t cut the mustard.
I’m now going to ask the moderator to verify my statement. If he can’t, I won’t post again.
If he does, then I’d like to ask you to be more open minded – and actually respond to some of my questions – we haven’t even discussed the goals of the process, or the responses.
TGF
0April 16, 2008 at 12:22 pm #171237RB,
Just wanted to thank you for continuing to post. I continue to learn a great deal from your insights. Thanks for the knowledge sharing.0April 16, 2008 at 1:09 pm #171239
Robert Sutter, Member, @RobertSutter
Actually, you can evaluate multicollinearity in regression using variance inflation factors. As a common rule of thumb, VIFs > 10 indicate multicollinearity.
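As a rough sketch of the VIF check, assume a model with only two predictors, where the VIF of each predictor reduces to 1 / (1 - r^2) with r the correlation between them. The data below are invented purely for illustration, and `pearson_r` and `vif_two_predictors` are hypothetical helper names.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF for either predictor in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Designed data: a 2^2 factorial, so the two columns are orthogonal.
designed_x1 = [-1, -1, 1, 1]
designed_x2 = [-1, 1, -1, 1]

# Made-up happenstance data in which the two predictors drift together.
happenstance_x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
happenstance_x2 = [1.1, 1.9, 3.2, 3.9, 5.1]

vif_designed = vif_two_predictors(designed_x1, designed_x2)
vif_happenstance = vif_two_predictors(happenstance_x1, happenstance_x2)

print("designed VIF:", vif_designed)
print("happenstance VIF above 10:", vif_happenstance > 10)
```

The designed columns give a VIF of 1, the smallest possible value, while the correlated happenstance columns land well above the rule-of-thumb threshold. With more than two predictors the VIF of each one comes from regressing it on all the others, which this sketch does not attempt.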
0April 16, 2008 at 1:23 pm #171240
Robert Butler, Participant, @rbutler
You said:
“Of the thousand or so experiments I referred to, some were undertaken by myself, others were conducted by other process engineers at Motorola. Every one of the experiments used a log transform”
and that is why I said
“You said that at Motorola everyone is using logs and building regression equations.
Is there any chance the final models look like the following?
#1 Log Y = logX1 +logX2.”
I appreciate this was something of an extrapolation, but if I’m transforming all of my Xs and my Ys and attempting to understand the process it is highly likely that I will do more than just log the data. If I want to build a model for the data I will most likely use linear regression, and if I do the models will be in the form of #1 above.
With regard to your last comment – “I’d like to ask you to be more open minded – and actually respond to some of my questions – we haven’t even discussed the goals of the process, or the responses.”
The problem is that the last piece, which has been a recurrent theme in your posts, is directly connected to the issue of understanding the concept of independence. I will reiterate: the issues concerning the construction of a design focus on the independence of the Xs and the independence of the terms of those Xs (curvilinear, interactions) that you as an investigator might want to check relative to their effect on the process and/or the response. When you are considering the levels of the Xs the only question you need to ask is whether certain combinations will be difficult or impossible to run (e.g. you can’t vary the temperature with every experiment because the thermal mass of the equipment won’t permit it, you can’t combine those chemicals at those levels because they will never form a solid, etc.). Once you have a design that meets these requirements the only question left is that of time and money. If the proposed design violates either of these then you make additional modifications to meet these requirements.
The response to a design will be whatever it will be. If you have done your homework prior to constructing the design there is a good chance that you will find relationships between the response and the things you varied. What those relationships will be is unknown; that’s why you are running the experimental design. Once you have the information from the design in hand you can apply that information in ways that allow you to achieve the goals of the process and the desired characteristics of the responses.
The only way you can bring the goals of the process or the responses into a consideration of choices of the levels of the variables in the design, or indeed of the design itself, is if you ignore the mathematical definition of independence. Under these circumstances being open minded is tantamount to accepting your definition of independence and discarding the definition of independence upon which all experimental designs, including Taguchi’s, are not only built but also analyzed.