2k or CCD
 This topic has 17 replies, 5 voices, and was last updated 4 years, 7 months ago by MBBinWI.


February 19, 2017 at 6:07 am #55575
Brendans14a (Participant, @Brendans14a)
Hi all,
I currently run 2k DoEs fairly often, usually with 3 or 4 factors.
I always include centerpoints in the designs so I can evaluate for curvature (if required).
This gives rough estimates of the parameter levels needed to hit a target response.
It works well when a confirmatory run at the optimum parameter settings is carried out.
I have been looking at CCDs (central composite designs) to replace the 2k DoEs, and from my research I don’t see why not. The plus side with a CCD is the option of axial points, or any variation around the design I see fit.
My question is this:
Has anyone a good knowledge of 2k versus CCD?
What are the pros and cons of each? Many thanks.
Brendan.
February 19, 2017 at 6:37 pm #200582
MBBinWI (Participant, @MBBinWI)
@Brendans14a – I assume by “2k” you mean a two-level full or fractional factorial design. By adding centerpoints you can evaluate for curvature, and you can even use this information to create a surface plot. The advantage of a CCD is that the “star points” extend outside the region of the corner points, which provides better understanding over the entire design space than a Box-Behnken type design. Generally speaking, if you have good information that the optimal solution is in the middle of the design space, use a Box-Behnken design. If you don’t know, or suspect that the optimal solution might be closer to the outside of the design space, use a CCD.
Good luck.
February 20, 2017 at 6:39 am #200586
Chris Seider (Participant, @cseider)
I would advise against jumping straight to a CCD. Confirm the cause-and-effect relationships with the experimental design first; you can always augment with CCD points later if all factors prove statistically significant, etc.
February 20, 2017 at 2:27 pm #200597
MBBinWI (Participant, @MBBinWI)
@cseider – I assumed that if the number of factors was already down to 3 or 4, a screening design had already been run to weed out the inconsequential factors.
February 20, 2017 at 4:50 pm #200598
Chris Seider (Participant, @cseider)
Hey, my friend and colleague… I wasn’t considering your post when I gave my advice. :)
You give good advice, sir.

February 21, 2017 at 12:19 am #200599
Sergey (Participant, @ssobolev)
Building on what was already said:
If you run DOEs on a regular basis, you might anticipate curvature during planning based on your experience. If curvature is highly expected, you may go directly to response surface designs. A Box-Behnken design can save you some trials (for 3 factors, 15 trials versus 20 in a CCD). The big assumption is that the optimal settings sit inside the cube for Box-Behnken. This design also eliminates the most extreme points (which are often not feasible, like +1, +1, +1). I believe it works well, but I haven’t come across a good investigation of CCD vs. BB. Any ideas/links on that?
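Those run counts are easy to check by enumeration. A minimal Python sketch of the two designs for k = 3 (the axial distance α is left symbolic since only the counts matter here, and the 6 and 3 center-point counts are common software defaults, not universal):

```python
from itertools import combinations, product

def ccd_points(k, n_center=6):
    """Central composite design: 2^k corner points, 2k axial points, plus centers."""
    corners = list(product([-1, 1], repeat=k))
    axial = []
    for i in range(k):
        for sign in (-1, 1):
            pt = [0.0] * k
            pt[i] = sign  # stands in for +/- alpha; only the count matters here
            axial.append(tuple(pt))
    centers = [tuple([0.0] * k)] * n_center
    return corners + axial + centers

def bbd_points(k=3, n_center=3):
    """Box-Behnken design: each pair of factors at +/-1, the rest at 0, plus centers."""
    runs = []
    for i, j in combinations(range(k), 2):
        for si, sj in product([-1, 1], repeat=2):
            pt = [0.0] * k
            pt[i], pt[j] = si, sj
            runs.append(tuple(pt))
    runs += [tuple([0.0] * k)] * n_center
    return runs

print(len(ccd_points(3)), len(bbd_points(3)))  # 20 15
```

Dropping the eight corner points in favor of twelve edge midpoints is what lets the BBD avoid the extreme (+1, +1, +1)-type runs.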
February 21, 2017 at 5:55 am #200602
Robert Butler (Participant, @rbutler)
I’m not aware of any experimental design that assumes the optimal settings are inside the hypergeometric region of the X matrix. The usual procedure (at least my usual procedure) is to build the design and run the analysis; the resulting equations will point in the direction of the optimum settings. Sometimes these settings are inside the hypercube, but more often the equations point to a region external to the initial design region. Once the direction has been identified, I build a second (smaller) design with the variable settings determined by the equations from the initial design.
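A minimal sketch of that “let the equations point the way” step, using an invented 2^2 design with made-up responses (every number here is an assumption for illustration, not a recommendation):

```python
import numpy as np

# Hypothetical 2^2 design in coded units with invented responses.
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
y = np.array([40.0, 55.0, 35.0, 52.0])

# Fit the first-order model y = b0 + b1*x1 + b2*x2 by least squares.
A = np.column_stack([np.ones(len(X)), X])
b0, b1, b2 = np.linalg.lstsq(A, y, rcond=None)[0]

# The fitted slopes (b1, b2) give the direction of steepest ascent; here the
# direction points outside the original cube, suggesting where to center a
# smaller follow-up design.
step = np.array([b1, b2]) / max(abs(b1), abs(b2))  # scaled so the largest move is 1
next_center = np.zeros(2) + 2 * step               # e.g. two steps out from center
print(b1, b2, next_center)
```

With these numbers the fitted slopes come out to (8, −2), so the path moves up in x1 and slightly down in x2, past the cube face at x1 = +1.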
February 21, 2017 at 12:32 pm #200605
Brendans14a (Participant, @Brendans14a)
Folks,
Many thanks for your advice, all taken on board.
My typical DoE approach is as follows:
1) Run a screening DoE to investigate main effects and identify potential significant factors.
2) Run a 2k full factorial DoE on significant factors from 1) with centerpoints to investigate main and interaction effects.
3) If curvature is detected, augment the 2k to a CCD (not face-centered points, as the process is not restricted by limits), or use a BBD if I suspect the optimum is between the factor highs and lows, to investigate quadratic effects.
4) Check R^2 and ensure it is at least 80%. This is simply my opinion, as the nature of the studies I carry out should be predictable; hence, high R^2 values should go without saying. Anyone have any other opinions on this hypothesis? Do any of you rely on R^2 for models?
4) If no curvature is observed, determine the optimum factor levels from the 2k DoE and run a confirmatory study at 95% CL & 95% R (i.e. approx. 60 samples for a binomial distribution with c = 0) to prove the factor levels are correct.

On another note, which some might have opinions on:
What I like about response surfaces is the contour plots. A centerpoint of a (large enough) plateau on a contour plot, to me, indicates an ideal spot to pick factor setpoints, as it is most likely to reduce variation in the response of interest. Would this be a fair statement? It is my own personal take on contour plots.

Cheers folks.
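As an aside, the “approx. 60 samples” in the second 4) follows from the success-run formula for zero allowed failures, n ≥ ln(1 − C)/ln(R); a quick sketch:

```python
import math

def zero_failure_sample_size(confidence=0.95, reliability=0.95):
    # Success-run theorem with c = 0 failures: pass n samples with none
    # failing, i.e. R^n <= 1 - C, so n >= ln(1 - C) / ln(R), rounded up.
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

print(zero_failure_sample_size())  # 59
```

For C = R = 0.95 this gives 59 samples, consistent with the roughly 60 quoted above.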
February 21, 2017 at 12:51 pm #200606
Robert Butler (Participant, @rbutler)
If you are looking for a region of minimum variance, you will want to run a Box-Meyers analysis. This takes the results from your design and analyzes the output in terms of variance minimization instead of mean maximization. When you have the results of this analysis, you can look at the equations describing mean shifts and the equations describing variance shifts, and identify the settings that give you the best trade-off between variance minimization and mean maximization.
February 21, 2017 at 1:01 pm #200607
Brendans14a (Participant, @Brendans14a)

February 21, 2017 at 1:54 pm #200608
Robert Butler (Participant, @rbutler)
Here you go.
If variance as a response from a DOE is your concern, then the most efficient approach to the problem is the Box-Meyers method.
1. Use a resolution 4 or higher design to avoid confounding interaction and dispersion effects.
[Note: you can do this with a Res III design, but you are then in the realm of gambling. If Res III is all you can get, go ahead and run this analysis, but keep in mind that you are confounding interaction and dispersion effects.]
2. Fit the best model to the data (that is, Y = f(X1, X2, X3, etc.))
3. Compute the residuals
4. For each level of each factor, compute the standard deviation of the residuals.
5. Compute the difference between the two standard deviations for each factor (that is the standard deviation for all of the + levels and the standard deviation of all of the – levels)
6. Rank the differences in #5
7. Use a Pareto chart, a normal probability plot, or the ratio of the log of the variance of the + to the variance of the – for each factor to determine the important dispersion effects.
For purposes of control
1. For factors with important dispersion effects, determine the settings for least variation, and set the location-effect factors to optimize the response average.
2. If a factor for minimizing variance is in conflict with a factor for optimizing location use a tradeoff study to determine which setting is most crucial to product quality.
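A rough Python sketch of steps 2–7 above, assuming a coded ±1 design matrix X (one column per factor) and residuals from an already-fitted model; the numbers are invented purely for illustration:

```python
import numpy as np

# Replicated 2^2 design in coded units and residuals from a fitted model.
X = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
              [-1, -1], [1, -1], [-1, 1], [1, 1]], dtype=float)
resid = np.array([0.2, -0.1, 0.3, -0.2, 0.1, -0.3, 0.4, -0.1])

dispersion = {}
for j in range(X.shape[1]):
    # Step 4: standard deviation of the residuals at each level of the factor.
    s_plus = resid[X[:, j] > 0].std(ddof=1)
    s_minus = resid[X[:, j] < 0].std(ddof=1)
    # Step 7 variant: log of the variance ratio as the dispersion statistic.
    dispersion[f"factor_{j + 1}"] = np.log(s_plus**2 / s_minus**2)

# Steps 5-6: rank factors by the magnitude of their dispersion effect.
ranked = sorted(dispersion, key=lambda k: abs(dispersion[k]), reverse=True)
print(ranked)
```

Here the Pareto chart of step 7 is replaced by a simple sort on the log variance ratio; a factor with a ratio far from zero has residual spread that depends on its level.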
From Understanding Industrial Designed Experiments, 4th Edition, sections 5-28 to 5-32, Schmidt and Launsby, 1997.

February 21, 2017 at 11:27 pm #200609
Sergey (Participant, @ssobolev)
Some comments regarding your guidance list:
In 3), it is important to note that a CCD can be built by augmenting the 2k, but a BBD will require completely new trials. In that overall scheme, a BBD seems impractical.

In your first 4), you mention only R^2. The target for it depends on the industry and may vary from 50 to 100%. More importantly, do not rely on R^2 alone! The significance of the model and of all factors and their interactions shows up in the p-values of the ANOVA table, and Lack of Fit, Pure Error, and Total Error are important for model adequacy. Use residual analysis (graphical checks may be enough: a normal probability plot, a histogram, and plots of residuals vs. fits and run order), looking for points more than two standard deviations from the fit. Based on all these, decide whether your model is OK.
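The residual checks mentioned here can be sketched in a few lines; the observations and fitted values below are made up for illustration:

```python
import numpy as np

# Invented observed responses and the corresponding fitted values from a model.
y = np.array([10.2, 11.1, 9.8, 10.5, 12.0, 10.9])
fits = np.array([10.0, 11.0, 10.0, 10.4, 11.5, 11.0])

resid = y - fits
standardized = (resid - resid.mean()) / resid.std(ddof=1)

# Flag any points lying more than two standard deviations from the fit.
flagged = np.where(np.abs(standardized) > 2)[0]
print(flagged)
```

In practice the same standardized residuals would also feed the normal probability plot, the histogram, and the residuals-vs.-fits and run-order plots.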
Hope it helps.
February 22, 2017 at 1:01 pm #200616
Brendans14a (Participant, @Brendans14a)
This is great information, and I can only assume it took you some time to acquire this knowledge. It is not completely foreign language to me, but before I put it into practice I have decided to purchase the book you mentioned.
I like to delve in and be 100% sure of what I do; therefore, reading it for myself is the way.

This is an obvious mistake on my part: how could I possibly augment a 2k with a BBD? They have different base designs which are not comparable. Thanks for pointing that out; it could have been embarrassing.
On the R^2: this is a fair point. I am guessing it is possible to have an R^2 value lower than 50% but a p-value for the lack-of-fit statistic greater than the significance level (i.e. fail to reject the null hypothesis), meaning the model fits the data gathered, but the variation in the factors making up the model is not severe (high or low) enough to yield a high R^2.
I may be raving a bit here.
B.
February 22, 2017 at 2:01 pm #200618
Robert Butler (Participant, @rbutler)
Check interlibrary loan first before deciding you have to make a purchase. Essentially, what I’ve done is copy the how-to instructions from the book.
February 22, 2017 at 6:41 pm #200623
MBBinWI (Participant, @MBBinWI)
@rbutler – I doubt that I’m teaching you anything new, but it is typical that a Box-Behnken type DOE is done when the optimal solution is thought to be within the design space and not close to the corner points, as it uses only edge-midpoint and center points. The central composite design, on the other hand, uses “star” points that lie outside the corner points, thus providing some information on the design space at and beyond the extremes of the corner points. Any design can be extrapolated beyond the corners (with less certainty the further away you get), but because the CCD extends the experimental space, it also expands the area of interpolation.
February 22, 2017 at 7:14 pm #200624
Robert Butler (Participant, @rbutler)
Some additional thoughts on the following points:
“4) Check R^2 and ensure it is at least 80%. This is simply my opinion as the nature of the studies I carry out should be predictable. Hence, high R^2 values should go without saying. Anyone have any other opinions around this hypothesis? Do any of you rely on the R^2 for models?”

“4) If no curvature is observed, determine optimum factor levels from 2k DoE and run a confirmatory study at the 95% CL & 95% R (i.e. approx. 60 samples for binomial distribution (c=0)) to prove factor levels are correct.”

Short version for the first #4: by itself, no; opinion by itself, no; other opinions, yes, but again, by itself, no.
Short version for the second #4: you need to do more in the first #4 before trying anything in the second #4.

First #4: You have missed a critical part of the practice of regression analysis: graphical analysis of the various relations between the individual X’s and the Y, as well as evaluation of the residuals. You can’t properly judge the value of a regression if you don’t run a thorough residual analysis.
For example, you can have a fantastic R2 which, once you check the residual pattern, proves to be of no value; the reverse is also true. I don’t know of any statistician who would pass judgment on a regression just by looking at R2. For proof, try running a simple regression on the following x,y data pairs: (1, .15), (1.1, .2), (2, .22), (1.4, .3), (1.1, .31), (1.6, .17), (1.3, .2), (2.1, .11), (1.8, .14), (8, .9).
With all 10 points you will get R2 = .85 and p < .0001. Now run it again without the last x,y pair – poof!! R2 = .17. A check of the residuals in the first case would indicate that the whole regression was dependent on that 10th x,y pair. Essentially, the entire data set consists of two points: a fuzzy one in the region of X from 1 to 2, and another data point at X = 8.
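That example can be reproduced directly; for a simple linear regression, R2 is just the squared correlation coefficient:

```python
import numpy as np

# The ten x,y pairs quoted above; the tenth pair (8, .9) is the influential point.
x = np.array([1, 1.1, 2, 1.4, 1.1, 1.6, 1.3, 2.1, 1.8, 8.0])
y = np.array([.15, .2, .22, .3, .31, .17, .2, .11, .14, .9])

def r_squared(x, y):
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    r = np.corrcoef(x, y)[0, 1]
    return r * r

print(round(r_squared(x, y), 2))            # with all ten points: ≈ 0.85
print(round(r_squared(x[:-1], y[:-1]), 2))  # tenth pair dropped: ≈ 0.17-0.18
```

One point far from the cluster inflates the correlation; plotting the data or checking the residuals exposes it immediately, while R2 alone does not.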
As for high R2 going without saying – again, not true, even if we ignore cases with influential data points. The fact of the matter is that many things you have to measure and examine with the methods of regression are naturally noisy, and there isn’t much you can do about it. As before, you have to plot the data and look at your regression in light of your plots.
Many years ago I ran an analysis of water contaminants. The R2 was 4%; however, the trend line was in the direction of increasing contamination over time, and the slope of the regression line was significantly different from 0. What the analysis showed was that if the pollution level of the contaminant (cyanide) continued to increase at the rate identified by my regression, the water would be rendered useless for consumption. The analysis was ignored, and about 3 years later the contamination reached the point where the water was not fit for human or animal consumption.
February 23, 2017 at 5:56 am #200629
Chris Seider (Participant, @cseider)
Remember that one reason for a lower R2 might be a poor measurement device.