# Choice of Experiment

- This topic has 37 replies, 5 voices, and was last updated 7 years, 4 months ago by Robert Butler.

March 17, 2013 at 10:23 am #54335

Justin M (Guest, @justinkd)

Hello all,

I am a student looking to design an experiment with 7 continuous factors at 3 levels. In your experience, what method would produce the fewest runs?

Is it possible to use a Taguchi array at 2 levels with center points added?

March 17, 2013 at 5:00 pm #194873

Robert Butler (Participant, @rbutler)

The bare-bones minimum would be 10 experiments – a saturated 2^3 design (7 factors in 8 runs) with two center points. This will buy you main effects and a check on curvilinearity, but it won’t tell you which factor(s) is/are responsible for curvature. What you haven’t told us is what it is that you are seeking.

1. Do you care about curvilinear behavior for all of the variables or not or perhaps only some of them or????

2. Do you care about interactions? If so which ones? All 2 way? some 2 way? every possible interaction? or???

If you want all curvilinear and all mains and no interactions then you are looking at something in the neighborhood of 16-17 experiments which you could identify using a D or A optimal routine and if you are looking for interactions then the count goes up from there.
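The run counts quoted here follow from counting the coefficients each model has to estimate, since a design needs at least one run per coefficient. A minimal sketch of that arithmetic, assuming a polynomial model in 7 coded factors (the helper name `min_runs` is made up for illustration):

```python
from math import comb

def min_runs(k, quadratic=False, two_way=False):
    """Smallest number of runs that can estimate the model (one per coefficient)."""
    terms = 1 + k                 # intercept + k linear (main) effects
    if quadratic:
        terms += k                # one squared term per factor
    if two_way:
        terms += comb(k, 2)       # every two-factor interaction
    return terms

print(min_runs(7))                                # mains only: 8
print(min_runs(7, quadratic=True))                # mains + squared terms: 15
print(min_runs(7, quadratic=True, two_way=True))  # full second-order model: 36
```

With 15 coefficients for mains plus squared terms, a design of 16-17 runs leaves only a run or two for estimating error, which is consistent with the figure given above.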

March 18, 2013 at 7:02 am #194877

Justin M (Participant, @justinkd)

1. Curvilinear yes

2. Interactions not so important. I’m doing a manufacturing process so I don’t think there should be any.

Which method is that for the 16-17?

And are center points added to a Taguchi orthogonal array acceptable? With an L8 array and 3 center points it comes to 11 experiments.

March 18, 2013 at 7:50 am #194884

Chris Seider (Participant, @cseider)

Consider just doing a 2-level factorial design with center points, using a 1/8 fractional design. I’ll let Robert B. keep working with you if you insist on a Taguchi design, but I find Taguchi less easy to use, and it takes good prior knowledge of interactions to design the best Taguchi design around them.

March 18, 2013 at 8:37 am #194887

Robert Butler (Participant, @rbutler)

L8 is just a standard 2^3 factorial design with a different name. If you run it saturated (7 factors in 8 experiments) and run a replication of two center points then you would have a total of 10 experiments. As mentioned before, this buys you a measure of linear effects and a test on the possibility of curvature but not the ability to state which factor exhibits curvature.

I’ll second Chris’s comment concerning interactions – the fact of a manufacturing process does not guarantee a lack of interaction. Indeed, in my experience interactions are the norm. If, for a first look, you wish to ignore that possibility that is, of course, your call and it may well be one driven by the facts of time and money but I would not make the assumption that they are absent.

In the absence of additional information the most efficient design (fewest experiments) would be to build one using a D-optimal routine. As noted I’m guessing it would be around 16-17 runs.

March 18, 2013 at 9:34 am #194891

Robert Butler (Participant, @rbutler)

Ok, a quick check indicates a D-optimal design of 17 points with an additional 2 center points (for replication) will give you a design that meets the usual VIF and condition index criteria and gives you what you said you wanted – all mains, all curvilinear, and center point replication.
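A check of this kind can be sketched with NumPy. The 19 runs are copied from the design listed in this post; the `vifs` helper and the choice to regress each term on all the others are assumptions of this sketch, not the routine actually used for the check described here.

```python
import numpy as np

# 19-run design from this post (coded levels -1, 0, +1); last two rows are center points.
design = np.array([
    [ 1,  1,  0,  1,  0, -1, -1],
    [ 1,  1, -1, -1, -1,  0,  0],
    [ 0,  1,  1,  0, -1, -1,  0],
    [ 1, -1, -1,  0,  0, -1,  1],
    [-1,  1, -1,  0,  0,  1,  1],
    [ 1, -1,  0,  0,  1,  1,  0],
    [-1,  1,  0,  1, -1,  1,  0],
    [-1,  1,  1, -1,  1,  1, -1],
    [-1, -1,  0, -1, -1, -1,  1],
    [-1,  0,  0,  0, -1,  0, -1],
    [ 0,  0,  0, -1,  0,  1,  0],
    [ 0, -1, -1,  1, -1,  1, -1],
    [ 0,  1,  0,  1,  1,  0,  1],
    [ 1,  0,  1,  1, -1,  1,  1],
    [-1,  0, -1,  1,  1, -1,  0],
    [-1, -1,  1,  1,  0,  0,  0],
    [ 0,  0,  1, -1,  1,  0,  1],
    [ 0,  0,  0,  0,  0,  0,  0],
    [ 0,  0,  0,  0,  0,  0,  0],
], dtype=float)

def vifs(terms):
    """VIF of each column: 1 / (1 - R^2) from regressing it on the others plus an intercept."""
    n, p = terms.shape
    out = []
    for j in range(p):
        y = terms[:, j]
        others = np.column_stack([np.ones(n), np.delete(terms, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        ss_res = np.sum((y - others @ beta) ** 2)
        ss_tot = np.sum((y - y.mean()) ** 2)
        out.append(ss_tot / ss_res)          # identical to 1 / (1 - R^2)
    return np.array(out)

terms = np.column_stack([design, design ** 2])            # 7 linear + 7 squared columns
model_matrix = np.column_stack([np.ones(len(design)), terms])
print("model matrix rank:", np.linalg.matrix_rank(model_matrix), "of", model_matrix.shape[1])
print("VIFs:", np.round(vifs(terms), 2))
```

A common rule of thumb treats VIFs above roughly 10 as a sign of troublesome collinearity; which cut-off was applied in the check above is not stated in the thread.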

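| Obs | Var1 | Var2 | Var3 | Var4 | Var5 | Var6 | Var7 |
|-----|------|------|------|------|------|------|------|
| 1 | 1 | 1 | 0 | 1 | 0 | -1 | -1 |
| 2 | 1 | 1 | -1 | -1 | -1 | 0 | 0 |
| 3 | 0 | 1 | 1 | 0 | -1 | -1 | 0 |
| 4 | 1 | -1 | -1 | 0 | 0 | -1 | 1 |
| 5 | -1 | 1 | -1 | 0 | 0 | 1 | 1 |
| 6 | 1 | -1 | 0 | 0 | 1 | 1 | 0 |
| 7 | -1 | 1 | 0 | 1 | -1 | 1 | 0 |
| 8 | -1 | 1 | 1 | -1 | 1 | 1 | -1 |
| 9 | -1 | -1 | 0 | -1 | -1 | -1 | 1 |
| 10 | -1 | 0 | 0 | 0 | -1 | 0 | -1 |
| 11 | 0 | 0 | 0 | -1 | 0 | 1 | 0 |
| 12 | 0 | -1 | -1 | 1 | -1 | 1 | -1 |
| 13 | 0 | 1 | 0 | 1 | 1 | 0 | 1 |
| 14 | 1 | 0 | 1 | 1 | -1 | 1 | 1 |
| 15 | -1 | 0 | -1 | 1 | 1 | -1 | 0 |
| 16 | -1 | -1 | 1 | 1 | 0 | 0 | 0 |
| 17 | 0 | 0 | 1 | -1 | 1 | 0 | 1 |
| 18 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

March 18, 2013 at 11:44 am #194898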

Justin M (Participant, @justinkd)

I was thinking the center points were to avoid using a three-level experiment.

What does the 2-level 1/8 fractional factorial do for me? 17 experiments seems high for me.

March 18, 2013 at 11:57 am #194899

Robert Butler (Participant, @rbutler)

No, you can use center points with any group of continuous variables. As for the 1/8th fractional factorial – it guarantees that all of your main effects are clear of all two-way interactions. It’s insurance in case there are unknown interactions present. If you use the saturated design then your main effects are confounded with the interactions.
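The difference between the two designs can be checked directly by testing whether any main-effect column is correlated with any two-factor interaction column. The sketch below assumes one standard set of generators for each design (E=ABC, F=BCD, G=ACD for the 16-run fraction; D=AB, E=AC, F=BC, G=ABC for the saturated 8-run design); the function names are illustrative only.

```python
from itertools import combinations, product

def two_level_design(n_base, generators):
    """Full 2^n_base factorial plus generated columns (each a product of base columns)."""
    runs = []
    for base in product([-1, 1], repeat=n_base):
        extra = []
        for gen in generators:
            value = 1
            for idx in gen:
                value *= base[idx]
            extra.append(value)
        runs.append(list(base) + extra)
    return runs

def mains_clear_of_two_way(runs):
    """True if every main-effect column is orthogonal to every two-factor interaction column."""
    k = len(runs[0])
    for m in range(k):
        for i, j in combinations(range(k), 2):
            if sum(r[m] * r[i] * r[j] for r in runs) != 0:
                return False
    return True

# 1/8 fraction of 2^7: 16 runs, resolution IV.
sixteen = two_level_design(4, [(0, 1, 2), (1, 2, 3), (0, 2, 3)])
# Saturated design: 7 factors in 8 runs, resolution III.
eight = two_level_design(3, [(0, 1), (0, 2), (1, 2), (0, 1, 2)])

print(len(sixteen), mains_clear_of_two_way(sixteen))  # 16 True
print(len(eight), mains_clear_of_two_way(eight))      # 8 False
```

In the 8-run design the column for the fourth factor is literally the product of the first two, so its estimated effect cannot be separated from that interaction.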

March 18, 2013 at 12:03 pm #194900

Robert Butler (Participant, @rbutler)

Well, so much for the edit command – it kicks you over to some website asking questions about a rock group. Anyway, the first sentence in my last post is wrong. Center points in a two-level design will NOT tell you anything about curvilinear effects due to a specific factor; they will ONLY tell you that such curvature exists. If you want to identify factors associated with curvature you will need to run the individual factors at three levels.
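The reason is visible in the model matrix: in a two-level design with center points, every squared column equals 1 at the factorial points and 0 at the center, so all seven squared terms are the same column. A small sketch, assuming the saturated 8-run design plus 2 center points discussed earlier:

```python
from itertools import product

# Saturated two-level design for 7 factors in 8 runs, plus 2 center points.
runs = []
for a, b, c in product([-1, 1], repeat=3):
    runs.append([a, b, c, a * b, a * c, b * c, a * b * c])
runs += [[0] * 7, [0] * 7]

# Squared column for each of the 7 factors.
squared = [[row[j] ** 2 for row in runs] for j in range(7)]

print(squared[0])                                  # [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(all(col == squared[0] for col in squared))   # True: the squared terms are indistinguishable
```

Only the combined curvature (the gap between the center-point average and the factorial-point average) can be estimated from such a design.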

March 18, 2013 at 12:53 pm #194901

Justin M (Participant, @justinkd)

Yes, this makes sense; otherwise there would be no point to a 3-level experiment ever. Hmm, I’ll discuss this with my lecturer and see what I will be allowed to do. I see no point in doing a 3-level if I am so limited in the number of runs.

What software has D-optimal? I checked JMP and Minitab and found none.

March 18, 2013 at 1:29 pm #194902

Katie Barry (Keymaster, @KatieBarry)

March 18, 2013 at 2:29 pm #194903

Robert Butler (Participant, @rbutler)

I don’t know Minitab but a quick search on the web brought me right back to an old post (back in 2006) here on iSixSigma which said with respect to D-optimal…

Yes, you can do this in Minitab.

To select design points to achieve an optimal design using D-optimality or distance-based optimality:

1. Create the full design in Minitab. Choose Stat > DOE > Response Surface > Create Response Surface Design to create a new design or choose Stat > DOE > Modify Design to augment a factorial design already in your worksheet.

2. Have Minitab select an optimal design. With the full design in your worksheet, choose Stat > DOE > Response Surface > Select Optimal Design. Minitab will give you an optimal subset of runs based on the criteria you select.

If you need further help with this, please contact our free technical support line at 814-231-2682.

Thank you.

Minitab Technical Support

March 19, 2013 at 4:44 am #194907

MBBinWI (Participant, @MBBinWI)

@justinkd – one thing not mentioned thus far is that the number of runs, while mathematically permissible, may not be practically useful. Depending upon the capability of your measurement system, you may or may not get adequate data for your analysis. Now, some might argue that if the effect isn’t sufficient to overcome the variability in your measurement system, then it’s probably not significant enough to worry about, but I would counter that I’ve seen enough measurement systems that are so bad that they cloud all results.

So, before you look to perform the minimal number of experimental runs permissible in any design to obtain the info you desire, first conduct a measurement system analysis and ensure that the data obtained is going to be sufficiently precise and accurate so as to allow you to make proper inferences from your DOE.

March 20, 2013 at 9:24 am #194912

Chris Seider (Participant, @cseider)

March 20, 2013 at 9:59 am #194913

MBBinWI (Participant, @MBBinWI)

@cseider – Atlas Shrugged is timeless (and very timely, to boot!)

March 21, 2013 at 8:06 am #194915

Justin M (Participant, @justinkd)

Hello again

Sorry for the late response. So I’m supposed to do more testing using software initially, then do 3D printing of the low, mid and high results as a sort of validation.

So far I’ve used an L27 to only give me main effects for 7 factors at 3 levels.

When trying to create the worst and best situations I’m using means (attached), but I’m not obtaining the least time. Could this be due to an interaction?

March 21, 2013 at 8:29 am #194916

Robert Butler (Participant, @rbutler)

If your response is time and you’ve run a series of 27 experiments (which sounds like it allows you to test mains and curvilinear for all 7 factors) then the question is: what kind of a model do you get after running stepwise regression (both backward elimination and forward selection with replacement)?

Once you have that you can take the reduced model and run predictions with it to identify the combination of factors that gives you the minimum time.
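The two steps described here (reduce the model, then predict over the factor settings) can be sketched with NumPy. Everything below is illustrative: the 27-run orthogonal array is built for the example and is not necessarily the column assignment used in the thread, the response is made up, and a fixed partial-F threshold of 4 stands in for the p-value tests a package such as Minitab applies.

```python
import numpy as np
from itertools import product

def fit(A, y):
    """Least-squares coefficients for model matrix A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def backward_eliminate(T, y, f_to_stay=4.0):
    """Return indices of the columns of T that survive backward elimination."""
    keep = list(range(T.shape[1]))
    while keep:
        A = np.column_stack([np.ones(len(y)), T[:, keep]])
        beta = fit(A, y)
        resid = y - A @ beta
        sigma2 = resid @ resid / (len(y) - A.shape[1])
        se2 = sigma2 * np.diag(np.linalg.pinv(A.T @ A))
        partial_f = beta[1:] ** 2 / se2[1:]        # t^2 for each non-intercept term
        weakest = int(np.argmin(partial_f))
        if partial_f[weakest] >= f_to_stay:
            break
        keep.pop(weakest)                          # drop the weakest term and refit
    return keep

def terms(x):
    """Linear and squared columns for coded factor settings x (n x 7)."""
    return np.column_stack([x, x ** 2])

# Illustrative 27-run, 7-factor, 3-level orthogonal array (coded -1, 0, +1).
design = np.array([[a, b, c, (a + b) % 3, (a + c) % 3, (b + c) % 3, (a + b + c) % 3]
                   for a, b, c in product(range(3), repeat=3)], dtype=float) - 1

# Made-up response: only x1, x1^2 and x2 matter, plus a little noise.
rng = np.random.default_rng(0)
y = 100 - 20 * design[:, 0] + 15 * design[:, 0] ** 2 - 10 * design[:, 1] + rng.normal(0, 1, 27)

keep = backward_eliminate(terms(design), y)
beta = fit(np.column_stack([np.ones(27), terms(design)[:, keep]]), y)

# Predict over every combination of the three levels and pick the smallest response.
grid = np.array(list(product([-1.0, 0.0, 1.0], repeat=7)))
pred = np.column_stack([np.ones(len(grid)), terms(grid)[:, keep]]) @ beta
best = grid[np.argmin(pred)]
print("terms kept (0-6 linear, 7-13 squared):", keep)
print("settings with lowest predicted response:", best)
```

Factors that drop out of the reduced model do not affect the prediction, so the settings printed for them are arbitrary.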

March 21, 2013 at 8:38 am #194917

Justin M (Participant, @justinkd)

Okay, I’ll read up on that now.

March 21, 2013 at 7:55 pm #194920

Justin M (Participant, @justinkd)

Here’s what I got using Minitab. So P represents sort of the significance of the factor and S is the deviation from the predicted value.

I did a general regression and got

Total Minutes = 1837.15 - 186.111 Slice Height - 167.833 Road Width - 102.222 Raster Angle - 102.278 Number of Contours - 107.167 Air Gap - 58.7222 STL Tolerance - 52.2778 STL Angle

March 22, 2013 at 5:44 am #194922

Robert Butler (Participant, @rbutler)

It’s interesting that, in light of your belief that there might be curvilinear effects present, none of them turned out to be significant. What does the residual analysis of this model show? If the residual plots indicate no missed trends then your model is suggesting that, over the range of the variables of interest, the minimum time you can expect is to be found at the maximum setting of the variables in the equation.
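That conclusion follows from the signs alone: a linear model is minimized by pushing each factor toward the end of its range that its slope favors. A minimal sketch using the coefficients from the fitted equation quoted in the previous post (the level labels "low" and "high" are placeholders, since the actual coding of the levels is not given in the thread):

```python
# Coefficients from the fitted equation for Total Minutes.
coefficients = {
    "Slice Height": -186.111,
    "Road Width": -167.833,
    "Raster Angle": -102.222,
    "Number of Contours": -102.278,
    "Air Gap": -107.167,
    "STL Tolerance": -58.7222,
    "STL Angle": -52.2778,
}

def minimizing_settings(coefs, low="low", high="high"):
    """For a linear model, a negative slope is minimized at the high level, a positive one at the low level."""
    return {name: (high if slope < 0 else low) for name, slope in coefs.items()}

for name, level in minimizing_settings(coefficients).items():
    print(f"{name}: {level}")
```

Every slope is negative, so every factor lands at its high level. This only holds if the linear-only model is adequate, which is what the residual analysis is meant to establish.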

March 22, 2013 at 12:49 pm #194932

Justin M (Participant, @justinkd)

So the random scatter means that a linear regression is fine then.

March 22, 2013 at 2:31 pm #194934

Robert Butler (Participant, @rbutler)

If the scatter were random then it would suggest the regression line has adequately fit the data, but the scatter isn’t random. The plot suggests you have 3 tails wagging the dog. The three rightmost data points are having an undue influence on your regression line and are dragging the fit away from the bulk of the data. Given their effect I’m surprised that the stepwise routine didn’t allow at least one of your squared terms into the model.

You should do the following:

1. Identify the runs corresponding to those three points and re-run the analysis with those three experiments excluded and see what you get.

2. Find out everything you can about how they were run with an eye to looking for things that may have changed in your process when those points were being run.

2a. To that end – if you kept a record of the time at which each experiment was run – plot the residuals against time and see if there are any interesting patterns with respect to the overall experimental effort and those three points in particular.

3. Run a residual analysis on the model without those points present (by the way I would recommend an entry and stay p-value of .05 instead of .15) and see if it looks random.

3a. Look at the predictions of the model based on the reduced data set and see if they point in a different direction.

There are other things to do but try these and we’ll see what we see.

March 23, 2013 at 5:59 am #194935

Robert Butler (Participant, @rbutler)

I was rereading this string of posts this morning and it appears we might be talking past one another. In your 8:06 AM post for 21 March you provided a means plot which gives a very strong impression of a curvilinear effect due to slice height, and then in the body of the post you state “So far I’ve used an L27 to only give me main effects for 7 factors at 3 levels”, all of which gives the impression that you have not included curvilinear effects in your regression model building efforts. Is this the case or is it just a matter of the fog of internet communication?

March 23, 2013 at 10:47 am #194937

Chris Seider (Participant, @cseider)

You should look at the 4 in 1 residual plots.

March 23, 2013 at 12:51 pm #194940

Justin M (Participant, @justinkd)

Oh, I should be looking at a curvilinear regression instead then.

March 23, 2013 at 2:20 pm #194941

Robert Butler (Participant, @rbutler)

Linear regression does not refer to the shape of the line; it refers to linearity in the parameters. A polynomial fit – that is, a model with squared or higher terms – is still a linear regression.

You chose to run an L27. I don’t know the aliasing structure of this design but a quick Google search indicates it will allow a check of up to 13 variables for both main and squared effects. Therefore your initial model for stepwise regression should have included the terms x1 through x7 and x1^2 through x7^2.
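Both points can be seen in a few lines of NumPy: a squared term is just one more column in the model matrix, and ordinary least squares handles it like any other. This is a sketch on made-up numbers; `second_order_matrix` is a hypothetical helper name.

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0, -1.0, 0.0, 1.0])
y = 5.0 - 2.0 * x + 3.0 * x ** 2          # curved in x, but linear in the coefficients

# Model matrix with columns 1, x, x^2.
A = np.column_stack([np.ones_like(x), x, x ** 2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coef, 6))                   # intercept, linear, squared coefficients

def second_order_matrix(X):
    """Intercept + linear + squared columns (no interactions) for an n x k factor array."""
    return np.column_stack([np.ones(len(X)), X, X ** 2])

# For 27 runs and 7 factors this gives the 14 terms plus the intercept.
print(second_order_matrix(np.zeros((27, 7))).shape)   # (27, 15)
```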

March 23, 2013 at 4:40 pm #194942

Justin M (Participant, @justinkd)

Well, not sure what to do now, but here’s the stepwise run at 0.05 instead.

March 23, 2013 at 6:06 pm #194943

Robert Butler (Participant, @rbutler)

The comment concerning the entry and stay value of .05 was a general observation. If you look at your first run, all of the parameters were significant at less than .05 so, in this instance, there won’t be any difference between your first and second output.

What isn’t clear to me is the definition of the initial model you have been testing in Minitab. It should be a 14 term model; all of the simple linear effects (7 variables) and all of the squares of those variables (7 variables). If you have only been testing models with nothing but linear terms then before you do anything else, you need to first test a model with all of the terms your design permits. If you have done this and your reported model is the final result then you need to do the various things I mentioned in the 22 March 2:31 PM post above.

March 23, 2013 at 6:26 pm #194944

Justin M (Participant, @justinkd)

It looks something like this

March 23, 2013 at 6:45 pm #194945

Robert Butler (Participant, @rbutler)

Ok, so all you are doing is running C1-C7 against Y. What you need to do is build the squared terms. I did a quick lookup on Google and for regression it looks like when you open the window C1-C7 will be listed in the leftmost window and the response will be listed in a window labeled “Response”. It looks like you type in the names of your variables and then, to get the squared terms (quadratic effects), you type in the name of the variable followed by an asterisk followed by the variable name. For example: Slice Height*Slice Height would be the square term for slice height.

The web site I’m looking at is this one

http://www.minitab.com/en-US/training/articles/articles.aspx?id=8596&langType=1033

The actual mechanics of doing this probably aren’t exactly as I’ve guessed but this is what you will need to do in order to set up a model with all of your linear and quadratic terms so that you can test it.

March 23, 2013 at 7:27 pm #194947

Justin M (Participant, @justinkd)

I think this might be it; otherwise I can separate just the squared term.

March 24, 2013 at 5:28 am #194949

Robert Butler (Participant, @rbutler)

Is the model you provided above the result of backward elimination or forward selection with replacement using all 14 terms, or is it just a model where you only used the terms slice height and slice height*slice height? If it isn’t the end result of stepwise regression then you need to run that and see what you get. The reason for the question is that if you look at your main effect plot it looks like there should be at least one other term in that model and perhaps as many as three more.

If you’re not sure about stepwise you might want to review how it’s done. You could check the site below for guidance.

http://sites.stat.psu.edu/~lsimon/stat501wc/sp05/minitab/stepwise.html

If what you have is the final result of a stepwise effort then it is easy to see who your three culprits are with respect to tails wagging the dog.

You should still run the residual plots as noted by Chris and myself. You should also:

1. Re-run the analysis without including those three points.

2. Find out everything you can about them since they are having an undue influence on your results.

March 24, 2013 at 10:13 am #194954

Justin M (Participant, @justinkd)

With the 3 points removed, here is the stepwise and the regression.

March 24, 2013 at 11:06 am #194955

Robert Butler (Participant, @rbutler)

As you can see – removing those three points changes everything. Note the difference in the reported statistics for your 7 term regression in your most recent post when compared to your 7 term regression in the post of 21 March @7:55. Also notice the difference in the residual pattern between this post and the post of 22 March @12:49.

The headings in your reports concerning the stepwise indicate you have not allowed the machine to look at all 14 terms with all of the data or with the reduced data set. You need to do both of these things and, since the three data points are so influential, you need to investigate them to see if you can determine why they are so different.

It will be interesting to see what happens in both of these instances and compare them to the results of a model based on the reduced data set.

March 31, 2013 at 12:38 pm #195001

Justin M (Participant, @justinkd)

Hello again,

I re-did all my experiments on software. What is typically used in this type of DOE? I was using a means plot for level optimisation and an S/N ratio for reduced variability. Anything else I should include? ANOVA maybe?

March 31, 2013 at 6:35 pm #195002

Robert Butler (Participant, @rbutler)

I’m not trying to be offensive but your most recent post gives the impression that you didn’t bother to do any of the things I and others recommended. It isn’t a matter of “what is typically used in this type of DOE”; it is a matter of what you want to do with the results from any kind of DOE.

You give the impression that you want to use the results to optimize the process. Therefore you want something that is going to give you an estimate of the direction and magnitude of change in an output given a change to one or more inputs. The most efficient way to do this is to build a regression equation and, once you have confirmed that the equation is doing an adequate job of fitting your data (using all the methods of residual analysis that were discussed previously), use it to make predictions concerning your mean responses. For issues concerning variability you can use either your S/N ratios or the Box-Meyer method to assess the impact of variable level settings on output variation. You then combine the results from the assessment of means and the assessment of variation to identify the variable level settings which will give you the best tradeoff between mean levels and output variation.
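For the S/N route, the form usually applied to a response that should be as small as possible (such as build time) is the Taguchi smaller-the-better ratio. The sketch below assumes that form is the relevant one here; the thread does not say which S/N definition was used, and the replicate values are hypothetical.

```python
from math import log10

def sn_smaller_is_better(values):
    """Taguchi signal-to-noise ratio for a response to be minimized: -10 * log10(mean(y^2))."""
    return -10 * log10(sum(v * v for v in values) / len(values))

# Hypothetical replicate times (minutes) at two candidate settings with the same mean.
steady = sn_smaller_is_better([100, 100, 100])
spread = sn_smaller_is_better([80, 100, 120])
print(round(steady, 3), round(spread, 3))
```

With equal means, the setting with more spread gets the lower (worse) ratio, which is how the S/N ratio folds variation into the comparison.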

As it appears to stand, you still need to run the analysis with the 14 terms, identify the reduced model, run the residual analysis, address any issues concerning influential data points, etc., and press on from there.

April 1, 2013 at 10:55 am #195012

Justin M (Participant, @justinkd)

I’m sorry, I got caught up in some other work for a time. I have a new experiment of 18 runs now using the same orthogonal array. I’m doing both of the things you described: estimating the direction and magnitude of change in an output, and assessing variability. My problem is judging whether it’s a good fit; what I have read is that this is subjective, with no clear indicator to go by. Attached is the stepwise regression and a residual plot.

April 2, 2013 at 5:07 am #195017

Robert Butler (Participant, @rbutler)

Some questions and some observations:

1. If you are using the same design (the L27) how is it that you only have 18 runs and not 27?

You have repeatedly insisted you are interested in having three levels of your variables and you have stated you have 7 variables. As I’ve said before, this would mean an initial model with 14 terms – the 7 linear effects and the 7 squared terms. Your printout says you have only constructed a model with 7 terms and your residual plot suggests an overlooked curvilinear effect – notice how the predicted low and high values all have residuals greater than 0 and the mid-range predicted values all have residuals lower than 0.

If you take that residual plot and “stack” it up against the Y axis you would get a bar chart that looks like the one in the lower left hand corner – essentially a bimodal distribution – again confirming that you have a curvilinear effect of some kind.
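The residual pattern described here is what a linear-only fit produces when the true response is curved. A small sketch on made-up data (one factor at three levels, with a real squared effect that the model leaves out):

```python
import numpy as np

x = np.array([-1.0, 0.0, 1.0] * 3)
y = 10.0 - 5.0 * x + 4.0 * x ** 2           # made-up response with genuine curvature

A = np.column_stack([np.ones_like(x), x])   # linear-only model, squared term omitted
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ coef
residuals = y - fitted

for level in (-1.0, 0.0, 1.0):
    mask = x == level
    print(level, np.round(fitted[mask][0], 3), np.round(residuals[mask][0], 3))
```

The lowest and highest fitted values (the two ends of the range) carry positive residuals and the middle fitted value a negative one, and stacking those residuals gives two separate clumps rather than a single bell shape.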

So, as before, the question is: what kind of a reduced model do you get when you start with an initial model of the form

Response = fn(x1, ..., x7, x1^2, ..., x7^2)?

Your model with linear effects only (that is, just x1-x7) suggests that slice height, road width, and air gap are your big hitters and that the greater their value, the lower your predicted response.

One additional thing: If the observation order is, in fact, your run order and if you randomized that run order and if you have run a backward elimination on the 14 term model indicated above and if none of the curvilinear terms tested significant then that plot is telling you that you have an unaccounted for process factor that was varying over time.
