iSixSigma

Regression and Residual Analysis for the Non-Statistician …

  • #53433

    Kevin Clay
    Participant

    Do any of the Six Sigma gurus or statisticians have a fun and educational example of why we use residual analysis to determine model adequacy in regression? I want to use it in my Green Belt class. I am having a difficult time getting this particular topic across to my students.

    Regards,
    Kevin …
    http://www.sixsigmadsi.com

    #190083

    Robert Butler
    Participant

    I’d recommend just making up a simple example and naming the X and Y after variables your audience knows.

    You might want to try a couple of different versions:

    1. Set up about 20 points so that the relationship between X and Y is curvilinear but the simple linear fit of Y to X is still significant. Run the simple linear model, show everyone the summary statistics, then show a plot of the residuals against the predicted values and highlight the curve. Refit with a squared term and let everyone see what happens to the summary statistics as well as the residual plot.

    2. Build another data set that consists of 19 closely grouped points and one extreme point. Run a simple linear regression, note the significance of the slope, the high R², etc., and then show the residual plot. The single extreme point will stand out like a sore thumb. Re-run the regression with the single extreme point removed and show everyone how all of the summary statistics collapse.

    #190084

    Kevin Clay
    Participant

    Do you know of any existing data set that I can use? Maybe something in the Minitab sample data folder?

    Regards,
    Kevin …
    http://www.sixsigmadsi.com

    #190085

    Dr Shaik
    Participant

    Residual analysis is one of the most important validation steps when fitting a regression equation. After running the process, take the actual values and their estimated values, compute the differences between them, plot those differences, and check whether they are consistent or not.
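
    The steps described above (fit, take actual minus estimated, inspect the differences) can be sketched in a few lines of Python. This is only an illustration: the five data points are made up, and the helper names `fit_line` and `residuals` are hypothetical, not taken from Minitab or any other package.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(x, y, b0, b1):
    """Residual = actual value minus the value estimated by the fitted line."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


# Made-up illustration data (an assumption, not from the thread).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
res = residuals(x, y, b0, b1)

# Print the residual next to its fitted value; in practice these pairs
# are what gets plotted as "residuals versus predicted".
for xi, ri in zip(x, res):
    print("fitted = %6.3f   residual = %+6.3f" % (b0 + b1 * xi, ri))
```

    With an intercept in the model the residuals always sum to zero, so the sum says nothing about adequacy; it is the pattern in the plot that matters.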

    #190087

    Robert Butler
    Participant

    I’m not trying to be offensive but I find your reluctance to take the time to pull out a simple piece of graph paper and spend a few minutes making up the two data sets I recommended to be very disturbing.

    You say you are teaching a Green Belt class and you made reference to Minitab – this would suggest you have at least a Black Belt and some abilities with respect to using Minitab. Even if the assumption of a BB is incorrect the idea that you are teaching a class and cannot do this does not reflect well on your abilities either as a practitioner or as a teacher.

    Below are two data sets which will do what I mentioned – it took me all of 5 minutes to make them up and check them via my computer package. As a homework assignment – tell me what you find interesting about the residuals of the first data set for both the linear and the curvilinear models. Hint – it provides further ammo with respect to the need to look at the actual residual plots.

    x1 y1 – linear-curvilinear problem
    1 1
    1 1.1
    1 1.2
    3 1.1
    3 1.2
    3 1.3
    5 1.4
    5 1.5
    5 1.45
    7 1.2
    7 1.3
    7 1.35
    8 1.3
    8 1.35
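
    For readers without Minitab, the first data set can be checked with a short Python sketch. It fits the linear and the squared-term models by solving the normal equations, then prints R² and the mean residual at each x level for both fits. The function names (`solve`, `fit_poly`, `predict`, `r_squared`) are hypothetical helpers written for this illustration, and plain least squares is assumed to be what the "computer package" did.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


def fit_poly(x, y, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns coefficients [b0, b1, ...] in increasing power of x."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve(a, b)


def predict(coef, xi):
    return sum(c * xi ** p for p, c in enumerate(coef))


def r_squared(y, res):
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum(r * r for r in res)
    return 1.0 - sse / sst


# First data set from the post above (linear-curvilinear problem).
x1 = [1, 1, 1, 3, 3, 3, 5, 5, 5, 7, 7, 7, 8, 8]
y1 = [1, 1.1, 1.2, 1.1, 1.2, 1.3, 1.4, 1.5, 1.45,
      1.2, 1.3, 1.35, 1.3, 1.35]

summary = {}
for degree, label in [(1, "linear"), (2, "quadratic")]:
    coef = fit_poly(x1, y1, degree)
    res = [yi - predict(coef, xi) for xi, yi in zip(x1, y1)]
    summary[label] = r_squared(y1, res)
    print("%s fit: R^2 = %.3f" % (label, summary[label]))
    # Mean residual per x level is a crude text stand-in for the plot.
    for level in sorted(set(x1)):
        group = [r for xi, r in zip(x1, res) if xi == level]
        print("  x = %d   mean residual = %+.3f" % (level, sum(group) / len(group)))
```

    The per-level means are only a substitute for the picture; plotting the individual residuals against the predicted values is still the thing to show a class.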

    x1 y1 – influential data point problem
    1 1.1
    1 2
    1 1.5
    2 2
    2 1.7
    2 1.2
    3 1.3
    3 1.6
    3 1.8
    7 4.2
    7 4
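
    The effect of the extreme points in the second data set can be sketched the same way. The code below fits a straight line to all of the rows, then refits after dropping the rows at x = 7. Treating the x = 7 rows as the extreme points is an assumption made for this illustration, and `fit_line` is a hypothetical helper, not a Minitab function.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1*x.
    Returns (b0, b1, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    r2 = (sxy * sxy) / (sxx * syy)
    return b0, b1, r2


# Second data set from the post above (influential data point problem).
x2 = [1, 1, 1, 2, 2, 2, 3, 3, 3, 7, 7]
y2 = [1.1, 2, 1.5, 2, 1.7, 1.2, 1.3, 1.6, 1.8, 4.2, 4]

full = fit_line(x2, y2)

# Assumption: the rows at x = 7 are the extreme points to be removed.
kept = [(xi, yi) for xi, yi in zip(x2, y2) if xi < 7]
trimmed = fit_line([p[0] for p in kept], [p[1] for p in kept])

print("all points:    slope = %.3f   R^2 = %.3f" % (full[1], full[2]))
print("x = 7 removed: slope = %.3f   R^2 = %.3f" % (trimmed[1], trimmed[2]))
```

    Comparing the two printed lines shows how much of the apparent fit rests on the rows at x = 7.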

    #190088

    Kevin Clay
    Participant

    Robert … Thank you for your input and comments. I think my requirements were misinterpreted because I responded too quickly. I was looking for an extensive data set with a story behind it (like a file in Minitab’s sample data sets), something that is not generic (although I appreciate your work). I am presently using a generic data set much like the ones you created.

    Regards,
    Kevin …

    #190089

    Jonathon Andell
    Participant

    Think of residual analysis as a way to extract the maximum information possible from the data. If any of the readily available residual plots shows a non-random pattern, then something in the process is likely to be causing that pattern. Often the nature of the pattern gives strong “hints” as to what the cause is, and that leads to increased understanding of what makes the process tick. The cause may be as simple as a sampling hiccup, or residual analysis could lead to rather significant understanding of a lurking X variable.

    As to the “adequacy” of the model: while there is a statistical component to a model’s adequacy, there also is a practical component. If the prediction equation significantly improves your ability to predict and/or control the process outcomes, the practical aspect may outweigh the statistical.
