# Correlation v Causation


• #48116

annon
Participant

Are the following statements correct:
The only difference between claiming causation and claiming correlation is how the data were derived.  Happenstance data are less preferable: they allow us to claim correlation and, at best, prediction.  Experimentally derived data allow us to claim causation and, hence, influence.
Thanks!

#161132

Severino
Participant

I would think WHAT data was analyzed is equally important.  If you run an experiment and find correlation, then your logic seems to imply that it must therefore be causation.  However, there may exist another factor with an equally strong or stronger correlation which might be the cause, or the effect could be the result of an interaction.  Collecting the data in a controlled manner will not, by itself, distinguish correlation from causation.  You have to make sure you are analyzing the correct input variables.

If you assume that this has been done correctly, then I think I might be inclined to agree with your statement…. but then again, who the f am I?

#161138

Erik L
Participant

Agree with the assessment.  One of the best tools we have for strengthening a claim of causation is randomization in how we assign units to the various levels we are experimenting with.  This is one way, especially in transactional areas, that we can begin to approach the power of DOE: assign the units at random and assess the results via multiple linear regression.  It won’t have all of the elements that make DOE the ultimate tool for establishing causation, but it’s better than just taking observational data.  Additionally, consistency of the result across multiple studies can also strengthen confidence in any transfer functions established.
Regards,
Erik
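
For anyone who wants to see the randomization point in numbers, here is a minimal simulation sketch in Python. Everything in it is made up for illustration: the hidden factor, the assumed true effect of 1.0, and the function name `simulate` are assumptions, not anything from this thread. It also uses a plain difference in group means rather than the multiple linear regression mentioned above, just to keep it short.

```python
import random
from statistics import mean

TRUE_EFFECT = 1.0  # assumed effect of the treatment on the outcome (hypothetical)


def simulate(randomized, n=10_000, seed=0):
    """Return the estimated treatment effect as a difference in group means."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)  # lurking factor that also drives the outcome
        if randomized:
            gets_treatment = rng.random() < 0.5  # coin flip, unrelated to hidden
        else:
            gets_treatment = hidden > 0  # units self-select based on hidden
        outcome = 2.0 * hidden + TRUE_EFFECT * gets_treatment + rng.gauss(0, 1)
        (treated if gets_treatment else control).append(outcome)
    return mean(treated) - mean(control)


observational_estimate = simulate(randomized=False)
randomized_estimate = simulate(randomized=True)
print(f"observational estimate: {observational_estimate:.2f}")
print(f"randomized estimate:    {randomized_estimate:.2f}")
```

Under these assumed numbers, the self-selected comparison should land well above the true effect, because the hidden factor is mixed into the group difference, while the randomized comparison should land close to 1.0.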

#161140

Millie Tent
Participant

Nothing about statistics is intended to free the user from engaging his/her brain during application.
You can absolutely prove through a designed experiment that people with large ears are better at math than are people with small ears.

#161144

Robert Butler
Participant

The short answer is no. Regardless of the data, correlation does not guarantee causation and (at the risk of starting this whole debate over again) causation by itself does not guarantee correlation.
Rather than repost what was said about the last part of the above statement you can read the thread started by the post below:
https://www.isixsigma.com/forum/showmessage.asp?messageID=113585
Data from an organized effort of inquiry, such as a design, holds the promise of developing correlations between X’s and Y’s where the trending associated with one X will not be confounded with the trend associated with another nor with a trend that could be associated with an unknown/uncontrolled variable. However, that association (correlation) cannot be viewed as causation until it is confirmed with further independent experimentation.
Correlations based on happenstance data have more problems that must be addressed, but even if these problems are resolved, the resulting correlation would, like its designed counterpart, require external confirmation before it could be viewed as describing a cause and effect relationship.
If further experimentation confirms the trending described by the correlation the resulting equation can be used for purposes of prediction/control and one can take the view that, over the range of the variables used to build the equation, the correlation is providing a useful description of the cause and effect relationship between the X’s and the Y’s.
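
One common textbook illustration of "causation does not guarantee correlation" (not necessarily the argument made in the linked thread) is a response that depends entirely on X but not linearly. The sketch below is Python with a made-up process `response(x) = x ** 2`; the function names and the data points are assumptions for illustration only. It also shows why the range of the variables matters: the same process looks uncorrelated over a symmetric range and strongly correlated over half of it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def response(x):
    """Hypothetical process in which Y is completely determined by X."""
    return x ** 2


full_range = [-1.0, -0.5, 0.0, 0.5, 1.0]  # symmetric about zero
upper_half = [0.0, 0.5, 1.0]              # only the non-negative settings

r_full = pearson_r(full_range, [response(x) for x in full_range])
r_half = pearson_r(upper_half, [response(x) for x in upper_half])
print(f"full range: r = {r_full:.2f}")
print(f"upper half: r = {r_half:.2f}")
```

Over the symmetric range the linear correlation vanishes even though X fully causes Y; over the upper half it is strongly positive. Any equation fitted on one range says little about the other.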

#161146

annon
Participant

Wow…gaining and losing IQ points all in one post!  Just kidding.  Thanks for the info and the thread.  So then am I correct in saying:

Data collected through proper experimental design will invoke the required assumption of independence and provide bolder inferential spaces, resulting in a larger and more robust experimental range.
Regardless of the original data type used in the analysis, causation is tendered only after controlled experimentation has validated the predictive model and only within the confines of the predictive ranges.
Hence, experimentally derived data is highly preferable to begin with and downright required when validating your model.

#161148

annon
Participant

Thanks all.  Always informative.

#161149

annon
Participant

l.

#161875

Lee
Participant

To separate it out in my mind, I go back to a study I read about once (do not recall where anymore).  The study was of the birth rate in Chicago and was many years ago.  They studied all sorts of factors, and even took into account the 9 month lag time.  What they found was that the birth rate was highly correlated with the thickness of the ice on the Great Lakes.  Now we all know (I hope) that ice does not *cause* babies.  The causal factor was that as the lakes froze over, the shipping rates dropped, more men were home, …
Just thought I’d share what I use as a memory aid to remember to be very cautious of correlation results, especially if one does not understand the basic principles behind what is being studied.
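
A rough way to see this in numbers is a partial correlation: correlate ice and births after removing what the intermediate variable explains. The Python sketch below uses purely synthetic data, not the study's data; the variable names (`ice`, `men_home`, `births`), the noise levels, and the chain ice → men at home → births are all assumptions made up to mirror the story.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson_r(xs, ys), pearson_r(xs, zs), pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(42)
n = 5_000
# Synthetic, standardized quantities (hypothetical): thicker ice keeps more
# men at home, and births (lag already removed) depend only on men at home.
ice = [rng.gauss(0, 1) for _ in range(n)]
men_home = [i + rng.gauss(0, 0.5) for i in ice]
births = [m + rng.gauss(0, 0.5) for m in men_home]

r_ice_births = pearson_r(ice, births)
r_given_men_home = partial_r(ice, births, men_home)
print(f"ice vs births:                    r = {r_ice_births:.2f}")
print(f"ice vs births, given men at home: r = {r_given_men_home:.2f}")
```

In this made-up setup the raw correlation between ice and births is strong, but it should shrink to roughly zero once men at home is accounted for, which is the sort of check that only becomes possible when you understand the mechanism well enough to measure the right variable.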

#161887

annon
Participant

Thanks!
