Problem with Minitab: Lack of Fit


This topic contains 6 replies, has 2 voices, and was last updated by  Robert Butler 1 year ago.



    Hello everyone,
    I’m writing because I need help with Minitab. I’ve run a central composite design (CCD) and analyzed it in Minitab. These are the ANOVA results:

    Source                 P-Value
    Model                  0,000
      Linear               0,000
        A                  0,000
        B                  0,000
        C                  0,000
      Square               0,000
        A*A                0,798
        B*B                0,041
        C*C                0,018
      2-Way Interaction    0,000
        AB                 0,000
        AC                 0,000
        BC                 0,000
    Lack-of-Fit            0,000
    Pure Error

    The p-value for lack of fit is less than 5%. Is this a problem? Does it only mean that the regression model is not linear? Is it not a problem in my case, since I’ve performed a CCD and have the quadratic terms, or is it a problem?

    Thank you very much.


    Robert Butler

    I’m not trying to be mean-spirited or snarky, but you appear to have done nothing more than build a central composite design, dump the results into a program and hit run. In short, you, or rather the machine, took some data and ran a regression.

    What you, as an investigator, need to do is run regression analysis and that requires far more than just looking at some summary statistics on a printout.

    The first order of business is to plot the data. You should first look at a plot of residuals against predicted values to see if there are any important trends/shapes. If all is well, and given that you have a significant lack of fit it probably isn’t, then your residual plot should look like a shotgun blast with no obvious trending/clustering/patterns.

    Next you should look at the residuals vs the levels of the independent variables and ask the same questions.

    Finally, you should look at the residuals in time order of experimental run to see if there are any trends. Pay particular attention to the way your replicate responses plot over time.
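    The three residual views above can be backed up with a few numbers. What follows is only a minimal sketch in plain Python, not a replacement for looking at the plots. The record layout (`order`, `level`, `actual`, `fitted`) and the demo data are hypothetical, and only one factor is tracked to keep it short.

```python
from collections import defaultdict
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def residual_checks(runs):
    """runs: list of dicts with keys 'order', 'level', 'actual', 'fitted' (assumed layout)."""
    runs = sorted(runs, key=lambda r: r["order"])
    resid = [r["actual"] - r["fitted"] for r in runs]
    fitted = [r["fitted"] for r in runs]

    # 1. Residuals vs predicted: |residual| growing with the fitted value hints at a funnel.
    funnel = pearson(fitted, [abs(e) for e in resid])

    # 2. Residuals vs factor level: a mean far from 0 at one level hints at a missing term.
    by_level = defaultdict(list)
    for r, e in zip(runs, resid):
        by_level[r["level"]].append(e)
    level_means = {lvl: sum(v) / len(v) for lvl, v in sorted(by_level.items())}

    # 3. Residuals in run order: a lag-1 correlation far from 0 hints at a time pattern.
    drift = pearson(resid[:-1], resid[1:])

    return {"funnel": funnel, "level_means": level_means, "drift": drift}

# Made-up data: the error is always 10% of the fitted value, alternating in sign.
demo = [
    {"order": i + 1, "level": (-1, 0, 1)[i % 3], "fitted": 10.0 * (i + 1),
     "actual": 10.0 * (i + 1) * (1.1 if i % 2 == 0 else 0.9)}
    for i in range(12)
]
checks = residual_checks(demo)
print(checks)
```

    In this made-up data the size of the residual is tied to the fitted value, so the funnel number comes out close to 1. A real "shotgun blast" would give values near 0 for both the funnel and the run-order number.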

    If you have trends/shapes/clustering/influential data points/etc. then you will need to go back and investigate. In order to understand what your graphs are telling you, you will need to get a good book on regression analysis and read the chapters on residual analysis.

    A couple of references – Regression Analysis – Draper and Smith
    Introduction to Linear Regression Analysis – Montgomery

    You should be able to get these books through inter-library loan.



    Thank you for the response. The residuals seem good, and in the model summary I have:
    R-sq 98,55%
    R-sq(adj) 98,13%
    R-sq(pred) 97,71%


    Robert Butler

    The summary stats look great – the plots don’t. The normal plot, the residuals against predicted and the time order all appear to have some issues.

    1. The three extreme points to the right on the residual plot could be the tail wagging the dog – what does the analysis look like with those removed?
    2. The residual plot gives the sense of trending toward a funnel shape and given the size of the fitted values this isn’t too surprising. What happens if you log the Y value and re-run the analysis?
    3. As you move from 0 out to around 1,000,000 or so, notice how, within a given interval, the residuals cluster mostly above the 0 line or mostly below it.
    4. Look at the observation order – something happened between about 10 and 30 – what’s the story on those data points?
    5. The normal probability plot looks good if all you look at is the overall trend but what bothers me is the lack of scatter above and below the normal line – particularly those points to the extreme right – they are all below the line – who are they and what happens when you run the analysis without them? Are they the ones with the worst fit?
    6. Which points have the worst fit? Find those and look at them and then re-run the analysis with those points removed and see what you see – again with graphs as well as summary statistics.
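    On item 2, one rough way to see why logging Y can help with a funnel is to compare the spread of replicate responses at different response levels, before and after the transform. This is a minimal sketch with invented replicate values (not the data from this thread), where the scatter is always about 10% of the level.

```python
from math import log10
from statistics import mean, stdev

# Hypothetical replicate responses at three design points (invented for illustration).
replicates = {
    "low":  [900.0, 1000.0, 1100.0],
    "mid":  [9000.0, 10000.0, 11000.0],
    "high": [90000.0, 100000.0, 110000.0],
}

def spread_by_point(groups, transform=lambda y: y):
    """Return {point: (mean, standard deviation)} of the transformed replicates."""
    out = {}
    for name, ys in groups.items():
        t = [transform(y) for y in ys]
        out[name] = (mean(t), stdev(t))
    return out

def spread_ratio(groups, transform=lambda y: y):
    """Largest replicate SD divided by the smallest; near 1 means roughly constant spread."""
    sds = [sd for _, sd in spread_by_point(groups, transform).values()]
    return max(sds) / min(sds)

raw_ratio = spread_ratio(replicates)          # spread grows with the level
log_ratio = spread_ratio(replicates, log10)   # spread is about the same at every level
print(f"raw SD ratio: {raw_ratio:.1f}, log10 SD ratio: {log_ratio:.2f}")
```

    If the raw ratio is large and the logged ratio is close to 1, the scatter is proportional to the response, and that is the situation where re-running the analysis on log(Y) tends to flatten the funnel.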



    I’ve tried all of the suggestions, in particular:
    - I’ve checked all the responses (to confirm I entered the right values);
    - I’ve checked observations 10 to 30 in the observation order plot;
    - I’ve deleted the points on the right of the residual plot and the normal probability plot;
    - I’ve fitted these models: linear terms only, linear + interaction, linear + square, and full quadratic.

    In all these cases the lack of fit is significant, and R-sq etc. decrease.

    Maybe the CCD I chose is wrong; this is a face-centered CCD. What do you think?

    Thanks for your time.



    Maybe I should increase the model order?


    Robert Butler

    Ok, so who are the bad boys? That is, which points are driving the significance for lack-of-fit? What you need to do is look at the difference between actual and predicted for these points and identify them on the plots you have provided. Once you have identified them you need to decide if the poor fit to those points matters from the standpoint of what you are trying to do.
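    One way to find those points is to split the residual sum of squares into pure error (scatter among replicates at the same settings) and lack of fit (distance from each design point's mean response to its prediction), then rank the design points by their share of the lack-of-fit part. A minimal sketch in plain Python follows; the point labels, the responses and the fitted values are typed in by hand for illustration and do not come from a real fit.

```python
from collections import defaultdict

def lack_of_fit(runs, n_params):
    """runs: iterable of (design_point, actual, predicted); n_params: model terms incl. intercept."""
    groups = defaultdict(list)
    fitted = {}
    for point, actual, predicted in runs:
        groups[point].append(actual)
        fitted[point] = predicted  # same settings always give the same prediction

    ss_pe = 0.0      # pure error: replicate scatter around each design point's mean
    contrib = {}     # lack-of-fit contribution of each design point
    for point, ys in groups.items():
        y_bar = sum(ys) / len(ys)
        ss_pe += sum((y - y_bar) ** 2 for y in ys)
        contrib[point] = len(ys) * (y_bar - fitted[point]) ** 2

    n_runs = sum(len(ys) for ys in groups.values())
    df_lof = len(groups) - n_params   # distinct design points minus model terms
    df_pe = n_runs - len(groups)      # runs minus distinct design points
    if df_lof <= 0 or df_pe <= 0 or ss_pe == 0.0:
        raise ValueError("need replicates with some scatter and more design points than model terms")

    ss_lof = sum(contrib.values())
    f_stat = (ss_lof / df_lof) / (ss_pe / df_pe)
    worst = sorted(contrib.items(), key=lambda kv: kv[1], reverse=True)
    return {"ss_lof": ss_lof, "ss_pe": ss_pe, "df": (df_lof, df_pe),
            "f_stat": f_stat, "worst": worst}

# Hypothetical runs: point "C" is a replicated center point; a 2-term model is assumed.
demo_runs = [
    ("A", 10.0, 10.0),
    ("B", 21.0, 20.0),
    ("C", 29.0, 30.0), ("C", 30.0, 30.0), ("C", 31.0, 30.0),
    ("D", 44.0, 40.0),
    ("E", 50.0, 50.0),
]
result = lack_of_fit(demo_runs, n_params=2)
print(result["f_stat"], result["worst"][:3])
```

    The sketch stops at the F ratio. Turning it into a p-value needs an F distribution with the two degrees of freedom returned in `df`, for example `scipy.stats.f.sf(f_stat, df_lof, df_pe)` if SciPy is available. Note that when the replicates are very tight, the pure error is tiny, so even physically small differences between actual and predicted can produce a significant lack of fit.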

    If they have something in common (same level of one of the variables for example) then there might be something worth investigating by doing something like dropping them from the data set and re-running the analysis.

    On the other hand, if the difference between actual and predicted doesn’t matter from a physical standpoint then I’d note the fact of the poor fit to those points in my report and get on with whatever it is that I was trying to do.

    At the end of the day the real test of your equation is its ability to predict, within its margin of error, the performance of your system. If your confirmation runs demonstrate the value of the model to permit control/optimization/whatever then you’re done and you can move on to other things.
