Causal Inference Primer
The goal of causal inference is to determine the existence or non-existence of a causal relation between two sets of variables (causes and outcomes).
Before performing causal inference, one has to determine the universe in which it is to be performed.
This universe encapsulates the potential causes and outcomes we are interested in.
This universe can be described by a probabilistic graphical model (PGM) or a structural causal model (SCM).
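Probabilistic Graphical Models (PGMs): A PGM is represented by a directed graph $G = (V, E)$ together with a set $\mathcal{P}$, where each node $v \in V$ is a random variable. An edge $(u, v) \in E$ represents the dependence of $v$ on $u$. $\mathcal{P}$ represents the set of all conditional probabilities $P(v \mid \mathrm{pa}(v))$, where $\mathrm{pa}(v)$ denotes the parents of $v$ in $G$.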
Structural Causal Models (SCMs): In a PGM each node is treated as outputting a probability distribution over the random variable it represents; a node in an SCM, on the other hand, outputs a sample from the distribution of its random variable. An SCM can be represented by a triplet $(X, F, N)$, where $X$ is the set of random variables, $F$ is a set of deterministic functions, and $N$ is a set of exogenous (external) noise variables. Sampling from a node $x_i \in X$ can then be written as $x_i = f_i(\mathrm{pa}(x_i), n_i)$, with $f_i \in F$, $n_i \in N$, and $\mathrm{pa}(x_i)$ the parents of $x_i$.
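To make the sampling view of an SCM concrete, here is a minimal sketch. The three-node graph, the linear functions, the standard-normal noise, and the helper name `sample_scm` are all assumptions chosen for illustration; they do not come from a particular library.

```python
import random

def sample_scm(functions, order, rng):
    """Draw one joint sample by visiting the nodes in topological order."""
    values = {}
    for name in order:
        noise = rng.gauss(0.0, 1.0)  # exogenous noise (assumed standard normal)
        # Each node is a deterministic function of its parents and its own noise.
        values[name] = functions[name](values, noise)
    return values

# Hypothetical SCM with edges w -> u and w -> v.
functions = {
    "w": lambda parents, noise: noise,
    "u": lambda parents, noise: 2.0 * parents["w"] + noise,
    "v": lambda parents, noise: -1.0 * parents["w"] + noise,
}
order = ["w", "u", "v"]  # parents must come before their children

sample = sample_scm(functions, order, random.Random(0))
print(sample)
```

All randomness enters through the noise terms, so fixing the seed of the random generator reproduces the same sample.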
Correlation, Independence and Causation
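Dependence: Two random variables $u$ and $v$ are independent iff
$$P(u, v) = P(u)\,P(v),$$
or else they are dependent.
Correlation: Two random variables $u$ and $v$ are correlated iff
$$\mathbb{E}[uv] \neq \mathbb{E}[u]\,\mathbb{E}[v];$$
if $\mathbb{E}[uv] = \mathbb{E}[u]\,\mathbb{E}[v]$ then $u$ and $v$ are not correlated.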
Causation: Random variable $u$ is said to cause $v$ if there is a directed edge (or, more generally, a directed path) from $u$ to $v$ in the data-generating process (represented as a PGM or SCM). We will later define measures to identify the strength of this causal effect.
Correlation, dependence, and causation are not equivalent. In short,
- Two independent variables are uncorrelated, but two uncorrelated variables may not be independent.
- Correlation does not imply causation.
- The lack of correlation does not imply the lack of causation.
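The points above can be illustrated with a small simulation. This is a sketch under assumed choices: $x$ is standard normal and $y = x^2$, so $y$ is fully determined (and caused) by $x$, yet the two are uncorrelated. The helper `pearson` is written here for illustration.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(20000)]
y = [xi ** 2 for xi in x]  # y is a deterministic function of x

corr_xy = pearson(x, y)

# Dependence shows up once we compare y across regions of x.
near = [yi for xi, yi in zip(x, y) if abs(xi) <= 1.0]
far = [yi for xi, yi in zip(x, y) if abs(xi) > 1.0]
mean_near = sum(near) / len(near)
mean_far = sum(far) / len(far)

print(f"corr(x, y) = {corr_xy:.3f}")
print(f"mean of y when |x| <= 1: {mean_near:.3f}; when |x| > 1: {mean_far:.3f}")
```

The correlation is close to zero because the relation is symmetric around $x = 0$, while the two conditional means differ clearly, which would not happen if $x$ and $y$ were independent.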
Confounders, colliders and mediators
- Statistical dependence between two random variables can exist even when they are not causally related.
- This section analyzes causal and statistical dependence between two variables in all possible atomic situations; these can then be composed to determine causal or statistical dependence in more complex graphs.
D-separation is the process of determining dependence between two variables in a larger PGM given a set of observed nodes.
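As a rough sketch of how such a check can be automated, the function below uses one standard equivalent criterion: restrict the graph to the ancestors of the nodes involved, connect ("marry") parents that share a child, drop edge directions, delete the observed nodes, and test whether the two variables are still connected. The function name `d_separated` and the parent-dictionary encoding are assumptions made for this example, not a library API.

```python
def d_separated(parents, x, y, observed):
    """Return True if x and y are d-separated given the observed nodes.

    parents maps each node to the set of its parents; nodes without
    parents may be omitted from the dictionary.
    """
    # 1. Keep only x, y, the observed nodes, and all of their ancestors.
    keep, stack = set(), [x, y, *observed]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # 2. Moralize: link each child to its parents and co-parents to each
    #    other, ignoring edge direction.
    neighbours = {node: set() for node in keep}
    for child in keep:
        ps = sorted(parents.get(child, ()))
        for i, p in enumerate(ps):
            neighbours[child].add(p)
            neighbours[p].add(child)
            for q in ps[i + 1:]:
                neighbours[p].add(q)
                neighbours[q].add(p)

    # 3. Search from x without passing through observed nodes.
    blocked = set(observed)
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False  # an open path exists
        for nb in neighbours[node]:
            if nb not in seen and nb not in blocked:
                seen.add(nb)
                stack.append(nb)
    return True

# The three atomic structures analyzed in this section.
confounder = {"u": {"w"}, "v": {"w"}}   # u <- w -> v
collider = {"w": {"u", "v"}}            # u -> w <- v
mediator = {"w": {"u"}, "v": {"w"}}     # u -> w -> v

for name, graph in [("confounder", confounder), ("collider", collider), ("mediator", mediator)]:
    print(name,
          "| w observed:", d_separated(graph, "u", "v", {"w"}),
          "| w unobserved:", d_separated(graph, "u", "v", set()))
```

A result of `True` corresponds to a closed path and `False` to an open one, so the printed values can be compared directly with the six cases in this section.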
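Observed Confounder
Structure: $u \leftarrow w \rightarrow v$, with $w$ observed.
Since $w$ is observed, $u$ and $v$ are conditionally independent given $w$: $P(u, v \mid w) = P(u \mid w)\,P(v \mid w)$.
As the structure shows, there is no directed path between $u$ and $v$, so they are not causally related either.
The path $u \leftarrow w \rightarrow v$ is said to be closed (due to independence).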
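Unobserved Confounder
Structure: $u \leftarrow w \rightarrow v$, with $w$ not observed.
Here $u$ and $v$ are not independent, and because $w$ is not observed it cannot be conditioned on to make them independent. They are statistically dependent but not causally related.
The path $u \leftarrow w \rightarrow v$ is said to be open (due to dependence).
Because it induces this non-causal dependence, $w$ is called a confounder.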
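The two confounder cases can be checked numerically. The sketch below assumes a binary confounder $w$ and linear effects with standard-normal noise; these modelling choices and the helper `pearson` are illustrative assumptions only.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
samples = []
for _ in range(20000):
    w = rng.randint(0, 1)              # binary confounder
    u = 2.0 * w + rng.gauss(0.0, 1.0)  # u depends on w only
    v = 2.0 * w + rng.gauss(0.0, 1.0)  # v depends on w only
    samples.append((w, u, v))

# w unobserved: pool all samples together.
marginal = pearson([u for _, u, _ in samples], [v for _, _, v in samples])

# w observed: look only inside the stratum w == 0.
stratum = [(u, v) for w, u, v in samples if w == 0]
conditional = pearson([u for u, _ in stratum], [v for _, v in stratum])

print(f"corr(u, v) with w unobserved: {marginal:.3f}")
print(f"corr(u, v) given w == 0:      {conditional:.3f}")
```

The pooled correlation is clearly positive even though neither variable appears in the other's equation, and it vanishes (up to sampling noise) once $w$ is held fixed.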
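Observed Collider
Structure: $u \rightarrow w \leftarrow v$, with $w$ observed.
Observing a collider produces the explaining-away effect: once $w$ is known, learning $u$ changes what we believe about $v$.
$u$ and $v$ are therefore not independent given $w$, although, as the structure shows, there is no causal relation between them.
The path $u \rightarrow w \leftarrow v$ is open.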
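Unobserved Collider
Structure: $u \rightarrow w \leftarrow v$, with $w$ not observed.
$u$ and $v$ are independent.
There is no causal dependence between them.
The path $u \rightarrow w \leftarrow v$ is closed.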
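A short simulation can illustrate both collider cases. It assumes independent standard-normal $u$ and $v$ and a collider $w = u + v$ plus small noise; "observing" $w$ is approximated by keeping only the samples where $w$ falls in a narrow window. These choices and the helper `pearson` are assumptions for illustration.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
samples = []
for _ in range(20000):
    u = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)               # drawn independently of u
    w = u + v + 0.1 * rng.gauss(0.0, 1.0)  # collider: both u and v point into w
    samples.append((u, v, w))

# w unobserved: use every sample.
marginal = pearson([u for u, _, _ in samples], [v for _, v, _ in samples])

# w observed: keep only samples whose w lies close to zero.
selected = [(u, v) for u, v, w in samples if abs(w) < 0.5]
conditional = pearson([u for u, _ in selected], [v for _, v in selected])

print(f"corr(u, v) with w unobserved:  {marginal:.3f}")
print(f"corr(u, v) given |w| < 0.5:    {conditional:.3f}")
```

Within the selected samples a large $u$ must be offset by a small $v$ to keep $w$ near zero, which is the explaining-away effect showing up as a strong negative correlation.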
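Observed Mediator
Structure: $u \rightarrow w \rightarrow v$, with $w$ observed.
$u$ and $v$ are independent conditioned on $w$.
The path $u \rightarrow w \rightarrow v$ is closed: with $w$ fixed, changing $u$ no longer changes $v$.
Therefore, given $w$, $u$ and $v$ are independent and no causal effect flows from $u$ to $v$.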
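Unobserved Mediator
Structure: $u \rightarrow w \rightarrow v$, with $w$ not observed.
The path $u \rightarrow w \rightarrow v$ is open, so $u$ and $v$ are not independent.
A change in $u$ changes the distribution (or value) of $w$, which propagates to $v$. So $u$ causally affects $v$.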
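The mediator cases can be sketched in the same way. The example assumes binary $u$ and $w$, with $w$ copying $u$ 80% of the time, and $v = 2w$ plus standard-normal noise. The function `draw`, its optional `u` argument for setting $u$ by hand, and the helper `pearson` are hypothetical names introduced for this illustration.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)

def draw(u=None):
    """One sample from u -> w -> v; pass u to set it by hand instead of sampling it."""
    if u is None:
        u = rng.randint(0, 1)
    w = u if rng.random() < 0.8 else 1 - u  # w copies u 80% of the time
    v = 2.0 * w + rng.gauss(0.0, 1.0)       # v depends on u only through w
    return u, w, v

n = 20000
samples = [draw() for _ in range(n)]

# w unobserved: the path u -> w -> v is open.
marginal = pearson([u for u, _, _ in samples], [v for _, _, v in samples])

# w observed: inside the stratum w == 1 the path is closed.
stratum = [(u, v) for u, w, v in samples if w == 1]
conditional = pearson([u for u, _ in stratum], [v for _, v in stratum])

# Setting u by hand shifts the distribution of v, so u causally affects v.
mean_v_u0 = sum(draw(u=0)[2] for _ in range(n)) / n
mean_v_u1 = sum(draw(u=1)[2] for _ in range(n)) / n

print(f"corr(u, v) with w unobserved: {marginal:.3f}")
print(f"corr(u, v) given w == 1:      {conditional:.3f}")
print(f"mean of v when u is set to 0: {mean_v_u0:.3f}; when set to 1: {mean_v_u1:.3f}")
```

Unlike the confounder case, the dependence seen with $w$ unobserved is causal here: forcing $u$ to a different value changes the average of $v$.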