iSixSigma

Correlation v Causation


This topic contains 9 replies, has 6 voices, and was last updated by  annon 12 years, 1 month ago.

Viewing 10 posts - 1 through 10 (of 10 total)
  • #48116

    annon
    Participant

    Are the following statements correct:
    The only difference between claiming causation and claiming correlation is how the data was derived.  Happenstance data is less preferable, allowing us to claim correlation and, at best, prediction.  Experimentally derived data allows us to claim causation and, hence, influence.
    Thanks!

    #161132

    Severino
    Participant

    I would think WHAT data was analyzed is equally important.  If you run an experiment and find correlation, then your logic seems to imply that it must therefore be causation.  However, there may be another factor with an equally strong or stronger correlation that is the actual cause, or the result could come from an interaction.  Collecting the data in a controlled manner will not, on its own, distinguish correlation from causation.  You have to make sure you are analyzing the correct input variables.
     
    If you assume that this has been done correctly, then I think I might be inclined to agree with your statement…. but then again, who the f am I?

    #161138

    Erik L
    Participant

    Agree with the assessment.  One of the best tools we have for strengthening a causal claim is randomization in how we assign units to the various levels we are experimenting with.  Especially in transactional areas, randomly assigning units and assessing the results via multiple linear regression is one way to begin to approach the power that DOE brings.  It won’t have all of the elements that make DOE the ultimate tool for establishing causation, but it’s better than just taking observational data.  Additionally, consistency of the result across multiple studies could also boost confidence in any transfer functions established.
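    To make the randomization point concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the hidden trait `skill`, the true effect of 2.0, and the rule that skilled units opt in when assignment is left to happenstance. It also compares simple group means rather than fitting a multiple linear regression, which is a deliberate simplification.

```python
import random
import statistics


def estimated_effect(randomize, n=4000, true_effect=2.0, seed=1):
    """Difference in mean outcome between treated and untreated units."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        skill = rng.gauss(0.0, 1.0)  # hidden trait (hypothetical), never recorded
        if randomize:
            gets_treatment = rng.random() < 0.5  # coin flip, independent of skill
        else:
            gets_treatment = skill > 0.0  # happenstance: skilled units opt in
        # The outcome depends on the treatment AND on the hidden trait.
        outcome = true_effect * gets_treatment + 3.0 * skill + rng.gauss(0.0, 1.0)
        (treated if gets_treatment else control).append(outcome)
    return statistics.mean(treated) - statistics.mean(control)


randomized = estimated_effect(randomize=True)
happenstance = estimated_effect(randomize=False)
print("true effect:           2.00")
print(f"randomized estimate:   {randomized:.2f}")
print(f"happenstance estimate: {happenstance:.2f}")
```

    With random assignment the hidden trait is balanced across the two groups on average, so the estimate should land near the true effect. With self-selection the trait rides along with the treatment and inflates the apparent effect.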
    Regards,
    Erik
     

    #161140

    Millie Tent
    Participant

    Nothing about statistics is intended to free the user from engaging his/her brain during application. 
    You can absolutely prove through a designed experiment that people with large ears are better at math than are people with small ears. 

    #161144

    Robert Butler
    Participant

    The short answer is no. Regardless of the data, correlation does not guarantee causation and (at the risk of starting this whole debate over again) causation by itself does not guarantee correlation.
    Rather than repost what was said about the last part of the above statement, you can read the thread started by the post below:
    https://www.isixsigma.com/forum/showmessage.asp?messageID=113585
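    One way to see how causation can exist without correlation is a small sketch where Y is completely determined by X, yet the linear (Pearson) correlation is essentially zero. The relationship `y = x**2` and the symmetric grid of X values are assumptions chosen for illustration only; they are not taken from the linked thread.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# X settings symmetric about zero; Y is fully caused by X with no noise.
xs = [i / 10 for i in range(-10, 11)]
ys = [x ** 2 for x in xs]
r_full = pearson(xs, ys)

# Same cause and effect, but only the upper half of the X range is studied.
upper_xs = [x for x in xs if x >= 0]
upper_ys = [x ** 2 for x in upper_xs]
r_upper = pearson(upper_xs, upper_ys)

print(f"r over the full range:  {r_full:.3f}")
print(f"r over the upper half:  {r_upper:.3f}")
```

    Over the symmetric range the positive and negative halves cancel and the linear correlation vanishes, while over the upper half alone it is strong. The same cause and effect relationship gives very different correlations depending on the range of X that was examined.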
    Data from an organized effort of inquiry, such as a design, holds the promise of developing correlations between X’s and Y’s where the trend associated with one X will not be confounded with the trend associated with another, nor with a trend that could be associated with an unknown/uncontrolled variable. However, that association (correlation) cannot be viewed as causation until it is confirmed with further independent experimentation.
    Correlations based on happenstance data have more problems that must be addressed, but even if these problems are resolved, the resulting correlation would, like its designed counterpart, require external confirmation before it could be viewed as describing a cause and effect relationship.
    If further experimentation confirms the trend described by the correlation, the resulting equation can be used for purposes of prediction/control, and one can take the view that, over the range of the variables used to build the equation, the correlation is providing a useful description of the cause and effect relationship between the X’s and the Y’s.
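    The confirmation step and the "over the range of the variables" caveat can be sketched roughly as follows. The process here is entirely hypothetical: a response that is linear in X up to 10 and flat beyond it, with invented noise. The fit is an ordinary least-squares line on a three-level design, checked with independent confirmation runs inside the range and one run well outside it.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(3)


def run_process(x):
    """Hypothetical process: linear up to x = 10, flat beyond, plus noise."""
    return 5.0 + 2.0 * min(x, 10.0) + rng.gauss(0.0, 0.5)


# Designed runs: three levels of X, four replicates each.
design_xs = [0.0, 5.0, 10.0] * 4
design_ys = [run_process(x) for x in design_xs]
intercept, slope = fit_line(design_xs, design_ys)


def predict(x):
    return intercept + slope * x


# Independent confirmation runs at new settings INSIDE the design range.
confirm_errors = [abs(run_process(x) - predict(x)) for x in (2.5, 7.5)]

# One run far OUTSIDE the range used to build the equation.
extrapolation_error = abs(run_process(20.0) - predict(20.0))

print(f"fitted equation: y = {intercept:.2f} + {slope:.2f} * x")
print("confirmation errors inside the range:", [round(e, 2) for e in confirm_errors])
print(f"error when extrapolating to x = 20:   {extrapolation_error:.2f}")
```

    In this toy setup the equation holds up under confirmation inside the range and fails badly outside it, which is why the cause and effect reading is limited to the range of the variables used to build the equation.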

    #161146

    annon
    Participant

    Wow…gaining and losing IQ points all in one post!  Just kidding.  Thanks for the info and the thread.  So then am I correct in saying:

    Data collected through proper experimental design will invoke the required assumption of independence and provide bolder inferential spaces, resulting in a larger and more robust experimental range.
    Regardless of the original data type used in the analysis, causation is tendered only after controlled experimentation has validated the predictive model and only within the confines of the predictive ranges.
    Hence, experimentally derived data is highly preferable to begin with and downright required when validating your model.

    #161148

    annon
    Participant

    Thanks all.  Always informative.

    #161149

    annon
    Participant

    l.

    #161875

    Lee
    Participant

    To separate it out in my mind, I go back to a study I read about once (do not recall where anymore).  The study was of the birth rate in Chicago and was many years ago.  They studied all sorts of factors, and even took into account the 9 month lag time.  What they found was that the birth rate was highly correlated with the thickness of the ice on the Great Lakes.  Now we all know (I hope) that ice does not *cause* babies.  The causal factor was that as the lakes froze over, the shipping rates dropped, more men were home, …
    Just thought I’d share what I use as a memory aid to remember to be very cautious of correlation results, especially if one does not understand the basic principles behind what is being studied.
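    A rough simulation of this kind of situation, for anyone who wants to play with it. None of the numbers come from the study; the variables `winter`, `ice`, `men_home` and `births` and all coefficients are made up, and the 9 month lag is ignored. The sketch assumes winter severity drives both ice thickness and the number of men at home, and that births depend only on men at home. It then uses the standard first-order partial correlation formula to ask what is left of the ice/births correlation once men at home is accounted for.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(11)
ice, men_home, births = [], [], []
for _ in range(5000):
    winter = rng.gauss(0.0, 1.0)             # winter severity (hypothetical units)
    ice.append(winter + rng.gauss(0.0, 0.5))  # ice thickness tracks the winter
    home = winter + rng.gauss(0.0, 0.5)       # so does the number of men at home
    men_home.append(home)
    births.append(home + rng.gauss(0.0, 0.5))  # births depend on men at home only

r_ice_births = pearson(ice, births)
r_partial = partial_corr(r_ice_births, pearson(ice, men_home), pearson(births, men_home))

print(f"ice vs births, raw correlation:         {r_ice_births:.2f}")
print(f"ice vs births, men at home held fixed:  {r_partial:.2f}")
```

    In this made-up setup the raw correlation between ice and births is strong, but it should drop to roughly zero once men at home is held fixed. Of course, that check is only possible if you know enough about the process to think of measuring men at home in the first place.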

    #161887

    annon
    Participant

    Thanks!

