A correlation is when two (or more) things share a relationship with one another. Consider the graph below:
This graph compares people’s experience of nausea with their chocolate consumption (I’ve made this up). As you can see, whenever chocolate consumption is high, feelings of nausea are high, and when chocolate consumption is low, nausea is low. This is called a positive correlation. Now consider the next graph:
This graph compares the degree of exercise someone participates in with the degree of heart disease that person has (once again, made up). But this time, when the degree of exercise is low, the degree of heart disease is high, and when the degree of exercise is high, the degree of heart disease is low. This is called a negative correlation.
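To put a number on "positive" and "negative", here is a minimal sketch that computes the Pearson correlation coefficient, a common measure that runs from -1 (perfect negative correlation) to +1 (perfect positive correlation). The data values and variable names are my own inventions for illustration, in the same spirit as the made-up graphs; they are not taken from any study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient r, a value between -1 and +1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y move together around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each one moves on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical, made-up numbers (assumptions for illustration only).
chocolate = [1, 2, 3, 4, 5, 6]          # bars eaten per day
nausea = [1, 1, 2, 4, 5, 6]             # self-reported nausea, 0-10
exercise = [0, 1, 2, 3, 4, 5]           # hours per week
heart_disease = [9, 8, 6, 5, 3, 2]      # severity score, 0-10

r_positive = pearson(chocolate, nausea)
r_negative = pearson(exercise, heart_disease)

print("chocolate vs nausea:", round(r_positive, 2))       # close to +1
print("exercise vs heart disease:", round(r_negative, 2)) # close to -1
```

A value near +1 or -1 only says the two things move together closely. It says nothing yet about why they do.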
From these graphs, it may be tempting to conclude that excess chocolate consumption causes nausea and a lack of exercise causes heart disease. But not so fast! Consider the next graph that compares ice cream sales with shark attacks (which is not made up!):
If the first two graphs were enough to conclude that excess chocolate causes nausea and a lack of exercise causes heart disease, then we would have to conclude that ice cream causes shark attacks. But that seems crazy! Things can be correlated with one another even though neither caused the other. This is why a rule in science and philosophy is that correlation does not entail causation.
Although this is correct, it is a mistake to dismiss causation altogether. What is usually required is more information, or a consideration of the most probable reasons for the correlation. After all, if there is a causal relationship, then there must be a correlation. It is for this reason that if research (done correctly, of course) finds no correlation between two things, then we can say that this implies no causation between the two.
Walter Sinnott-Armstrong and Robert Fogelin (2013) put forward four possibilities when a correlation between two things, A and B, is discovered:
- A caused B
- B caused A
- Something else (or several things) caused both
- The correlation is a coincidence
The correlation between ice cream sales and shark attacks can be explained by considering these possibilities. When considering whether shark attacks cause ice cream sales or vice versa, we first need to know which happened first: to claim that A caused B, A must come before B. Second, we need a mechanism, an explanation of how one brought about the other. Without knowledge of which happened first, and with no conceivable mechanism, a causal relationship between the two seems out.
Coincidence becomes increasingly unlikely as the size and scope of a strong correlation grow, so coincidence seems unlikely here. What about something else causing them both? When do people eat more ice cream? In the summer. When do people get attacked by sharks? Swimming at the beach. And when do we typically go to the beach? In the summer. So, a likely explanation for the correlation is that both increase during summer and decrease in the winter.
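The "something else caused both" explanation can be sketched as a small simulation. Everything here is an assumption made up for illustration: the temperature range, the numbers linking temperature to ice cream sales and shark attacks, and the noise levels. In this toy world ice cream has no effect on sharks at all, yet the two still end up strongly correlated because temperature drives both. Once the straight-line effect of temperature is subtracted from each, the correlation should all but vanish.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient r, a value between -1 and +1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def residuals(ys, xs):
    """What is left of ys after removing its best straight-line fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical model: temperature is the common cause of both quantities.
temperature = [rng.uniform(5, 35) for _ in range(1000)]          # degrees C
ice_cream = [10 * t + rng.gauss(0, 30) for t in temperature]     # daily sales
shark_attacks = [0.2 * t + rng.gauss(0, 1) for t in temperature] # attack index

# Raw correlation: strong, even though neither causes the other.
r_raw = pearson(ice_cream, shark_attacks)

# Correlation after taking temperature out of both: close to zero.
r_controlled = pearson(residuals(ice_cream, temperature),
                       residuals(shark_attacks, temperature))

print("raw correlation:", round(r_raw, 2))
print("after accounting for temperature:", round(r_controlled, 2))
```

Real data is rarely this tidy, and a straight-line adjustment is only one of many ways to account for a common cause, but the pattern is the point: a strong correlation that disappears once the shared cause is taken into account.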
Returning to the correlation between nausea and chocolate consumption: given that the graph shows such a strong positive correlation, a coincidence, although possible, seems unlikely. Could something else be causing the nausea and the chocolate consumption? Perhaps the chocolate is past its expiry date. Could the nausea be causing the increase in chocolate consumption? That seems wrong. We do not typically crave chocolate when feeling nauseous; in fact, we’d probably avoid it. The increase in chocolate could cause the nausea, because eating any food in excess can bring about nausea. That appears to be the most probable situation, unless we can demonstrate the nausea came before the chocolate consumption. And this can be done with further testing and controlled trials.
So, the message to take from this is that correlation alone does not prove causation. However, it is just as much of a mistake to dismiss causation altogether. After all, assuming coincidence or other factors is assuming some kind of cause. Hence, when faced with a correlation, the next step is to consider the potential causes and test for them. Science has many ways of doing this, but for our introductory purposes, simply considering the four possibilities and ruling them out or rendering them improbable is a good start.
Take a look at the graph showing the negative correlation between exercise and heart disease, and consider the four possibilities to judge the most probable cause.
I highly recommend ‘What is this thing called science?’ by A. F. Chalmers if you wish to learn more about this topic, and also ‘Understanding arguments’ by Walter Sinnott-Armstrong and Robert Fogelin. If you purchase these books via the link below, you are supporting this website. Thank you.
What is this thing called science?