A Data Science Central Community
Another good article by Ajit Jaokar.
"Correlation does not equal causation" is a mantra drilled into every data scientist from an early age.
That’s fine. But very few talk about the follow-on question:
How exactly do you determine causation?
This problem is compounded because most books and examples are based on standard datasets (e.g., Boston, Iris). These examples do not discuss causation because the features chosen are already assumed to be causal (e.g., the factors affecting house prices are included because they are taken to be causal). So, if we start from the beginning, without simplified examples, how do you know whether a particular variable is a causal variable?
Firstly, causality cannot be determined from data alone. Data gives us correlation; to determine causation, we need to perform an experiment or a controlled study.
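To make the point concrete, here is a minimal simulation sketch. It is not from the original article, and every name and number in it (the hidden variable z, the coefficients, the sample size, the helper functions pearson and simulate) is an assumption chosen for illustration. In this toy world, x never affects y. Both are driven by a hidden third variable, z. Passively observed data still shows a strong correlation between x and y, while an "experiment" in which x is assigned at random shows essentially none.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, randomize=False, seed=0):
    """Return corr(x, y) in a toy world where x has NO effect on y.

    Illustrative assumption: a hidden variable z drives y, and also
    drives x unless the experimenter assigns x at random.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                # hidden common cause
        if randomize:
            x = rng.gauss(0, 1)            # experiment: x assigned independently of z
        else:
            x = z + rng.gauss(0, 0.5)      # observation: x tracks z
        y = 2 * z + rng.gauss(0, 0.5)      # y depends on z only, never on x
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


observational_r = simulate(randomize=False)
experimental_r = simulate(randomize=True)
print(f"observational correlation: {observational_r:.2f}")
print(f"experimental correlation:  {experimental_r:.2f}")
```

The observational correlation comes out strongly positive even though x has no causal influence on y. Once x is assigned at random, its link to z is broken and the correlation falls to roughly zero. That is the reason an experiment or controlled study can support a causal claim where passively collected data cannot.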