Principle:Pyro ppl Pyro Causal Effect Estimation
| Knowledge Sources | |
|---|---|
| Domains | Causal Inference, Variational Autoencoders, Treatment Effect Estimation |
| Last Updated | 2026-02-09 09:00 GMT |
Overview
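Causal effect estimation with VAEs, as in the Causal Effect Variational Autoencoder (CEVAE), combines variational autoencoders with causal inference to estimate individual treatment effects from observational data. It assumes a fixed causal structure and jointly learns a latent representation of the confounders together with the treatment and outcome models.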
Description
Estimating the causal effect of a treatment (e.g., a drug, a policy) on an outcome is one of the fundamental problems in science and decision-making. In randomized controlled trials, treatment assignment is independent of confounders, making causal estimation straightforward. But in observational data, treatment assignment may depend on confounders (variables that affect both treatment choice and outcome), leading to biased estimates.
The CEVAE (Causal Effect Variational Autoencoder) addresses this challenge by:
- Assuming latent confounders: There exist latent variables z that causally influence both the treatment assignment t and the outcome y. These confounders are not directly observed but can be inferred from proxy variables x.
- Generative model: Specifies the causal structure:
  - z ~ prior (latent confounder)
  - t | z ~ treatment assignment model (how confounders affect treatment)
  - y | t, z ~ outcome model (how treatment and confounders affect outcome)
  - x | z ~ proxy model (how confounders manifest in observed covariates)
- Inference network: A neural network that infers the latent confounder z from observed (x, t, y), enabling estimation of the counterfactual: "What would the outcome have been under a different treatment?"
- Treatment effect estimation: Once z is inferred, the individual treatment effect (ITE) is estimated as:
  ITE(x) = E[y | t=1, z(x)] - E[y | t=0, z(x)]
The key insight is that by jointly learning the latent confounders and the outcome model, CEVAE can adjust for confounding that is not directly observed and would bias naive estimates, provided the proxies x carry enough information to recover z.
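The ITE formula above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `encoder` and `f_y` are hypothetical stand-ins for trained networks (simple closed-form functions here), and the encoder conditions on x alone, matching the z(x) notation of the formula rather than the full (x, t, y) input.

```python
import random

# Hypothetical "fitted" components, chosen by hand for illustration only.
def encoder(x):
    """Return the mean and std of an approximate posterior over z given x."""
    return 0.9 * x, 0.2

def f_y(t, z):
    """Expected outcome given treatment t and latent confounder z."""
    return z + t * (1.0 + 0.5 * z)

def estimate_ite(x, num_samples=5000, seed=0):
    """Monte Carlo estimate of E[y | t=1, z(x)] - E[y | t=0, z(x)]."""
    rng = random.Random(seed)
    mu, sigma = encoder(x)
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(mu, sigma)            # sample the inferred confounder
        total += f_y(1, z) - f_y(0, z)      # difference of predicted outcomes
    return total / num_samples

print("ITE estimate at x = 2.0:", estimate_ite(2.0))
```

In this toy setup the treatment effect depends on z, so individuals with different covariates receive different ITE estimates.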
Usage
Use causal effect estimation with VAEs when:
- Estimating treatment effects from observational data where confounding is suspected.
- Latent confounders exist that are not directly measured but have observable proxies.
- Individual-level treatment effect estimates are needed (e.g., personalized medicine).
- The relationship between confounders, treatment, and outcome is complex and nonlinear.
- Standard causal methods are insufficient, e.g., propensity scoring because the relevant confounders are unmeasured, or instrumental variables because no valid instrument is available.
Theoretical Basis
Causal framework (potential outcomes):
# For each individual i:
# Y_i(0): potential outcome under control (t=0)
# Y_i(1): potential outcome under treatment (t=1)
# Fundamental problem: we observe only one of Y_i(0), Y_i(1)
# Individual Treatment Effect:
# ITE_i = Y_i(1) - Y_i(0) (unobservable for any individual)
# Average Treatment Effect:
# ATE = E[Y(1) - Y(0)]
# Conditional ATE:
# CATE(x) = E[Y(1) - Y(0) | X = x]
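To make the potential-outcomes notation concrete, the following sketch simulates both potential outcomes for each individual, which is only possible in synthetic data. The data-generating process (a single covariate x and a true effect of 1 + 0.5 * x) is an assumption made up for this example.

```python
import random

def simulate_potential_outcomes(n, seed=0):
    """Simulate toy individuals for whom both Y(0) and Y(1) are known."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)               # observed covariate
        y0 = x + rng.gauss(0.0, 0.1)          # Y_i(0): outcome under control
        y1 = y0 + 1.0 + 0.5 * x               # Y_i(1): true ITE is 1 + 0.5 * x
        t = 1 if rng.random() < 0.5 else 0    # randomized assignment
        y_obs = y1 if t == 1 else y0          # only one outcome is ever observed
        rows.append({"x": x, "y0": y0, "y1": y1, "t": t, "y_obs": y_obs})
    return rows

def ate(rows):
    """Average treatment effect E[Y(1) - Y(0)]."""
    return sum(r["y1"] - r["y0"] for r in rows) / len(rows)

def cate(rows, predicate):
    """Conditional ATE over the subgroup whose covariate satisfies predicate."""
    subset = [r for r in rows if predicate(r["x"])]
    return sum(r["y1"] - r["y0"] for r in subset) / len(subset)

rows = simulate_potential_outcomes(20000)
print("ATE:", ate(rows))
print("CATE for x > 0:", cate(rows, lambda x: x > 0))
```

With real data only `t` and `y_obs` would be available, which is why the ITE has to be estimated through a model rather than computed directly.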
CEVAE generative model:
# Causal graph: z -> {t, y, x}; t -> y
# Generative process:
# z ~ Normal(0, I) # latent confounder
# t | z ~ Bernoulli(sigmoid(f_t(z))) # treatment assignment
# y | t, z ~ Normal(f_y(t, z), sigma_y) # outcome
# x | z ~ p(x | z; theta_x) # proxy observations
# f_t, f_y: neural networks parameterizing the conditional distributions
# theta_x: parameters of the proxy model
# Joint distribution:
# p(x, t, y, z) = p(z) * p(t | z) * p(y | t, z) * p(x | z)
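The generative process and the factorized joint density can be sketched with scalar variables in plain Python. The linear functions `f_t`, `f_y`, `f_x` and the noise scales below are hypothetical placeholders for the neural networks and proxy model, and x is taken to be a single Gaussian proxy for simplicity.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def normal_logpdf(v, mean, std):
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2

# Hypothetical stand-ins for the networks f_t, f_y and the proxy model.
def f_t(z):
    return 1.5 * z

def f_y(t, z):
    return 2.0 * z + 1.0 * t

def f_x(z):
    return 0.8 * z

SIGMA_Y = 0.5   # assumed outcome noise scale
SIGMA_X = 0.3   # assumed proxy noise scale

def sample_joint(rng):
    """Ancestral sampling following the graph z -> {t, y, x}; t -> y."""
    z = rng.gauss(0.0, 1.0)                           # z ~ Normal(0, 1)
    t = 1 if rng.random() < sigmoid(f_t(z)) else 0    # t | z ~ Bernoulli
    y = rng.gauss(f_y(t, z), SIGMA_Y)                 # y | t, z ~ Normal
    x = rng.gauss(f_x(z), SIGMA_X)                    # x | z ~ Normal
    return x, t, y, z

def log_joint(x, t, y, z):
    """log p(x, t, y, z) = log p(z) + log p(t|z) + log p(y|t,z) + log p(x|z)."""
    p_t = sigmoid(f_t(z))
    log_p_t = math.log(p_t) if t == 1 else math.log(1.0 - p_t)
    return (normal_logpdf(z, 0.0, 1.0) + log_p_t
            + normal_logpdf(y, f_y(t, z), SIGMA_Y)
            + normal_logpdf(x, f_x(z), SIGMA_X))

rng = random.Random(0)
x, t, y, z = sample_joint(rng)
print("sample:", (x, t, y, z), "log joint:", log_joint(x, t, y, z))
```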
Inference and treatment effect estimation:
# Recognition model (encoder):
# q(z | x, t, y) = Normal(mu_enc(x, t, y), sigma_enc(x, t, y))
# ELBO:
# L = E_q[log p(x | z) + log p(t | z) + log p(y | t, z) + log p(z) - log q(z | x, t, y)]
# After training, estimate treatment effects:
# For a new individual with covariates x*:
# 1. Infer confounder: z* ~ q(z | x*, ...)
# 2. Predict potential outcomes:
# Y*(1) = f_y(t=1, z*)
# Y*(0) = f_y(t=0, z*)
# 3. ITE* = Y*(1) - Y*(0)
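The ELBO above can be estimated by Monte Carlo with reparameterized samples from the encoder. The sketch below does this for a single data point in a scalar toy model; the linear model inside `log_joint` and the encoder outputs passed to `elbo_estimate` are assumptions for illustration, not trained values.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def normal_logpdf(v, mean, std):
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((v - mean) / std) ** 2

def log_joint(x, t, y, z):
    """log p(z) + log p(t|z) + log p(y|t,z) + log p(x|z) for a hypothetical toy model."""
    p_t = sigmoid(1.5 * z)
    log_p_t = math.log(p_t) if t == 1 else math.log(1.0 - p_t)
    return (normal_logpdf(z, 0.0, 1.0) + log_p_t
            + normal_logpdf(y, 2.0 * z + 1.0 * t, 0.5)
            + normal_logpdf(x, 0.8 * z, 0.3))

def elbo_estimate(x, t, y, mu_enc, sigma_enc, num_samples=5000, seed=0):
    """Average of log p(x, t, y, z) - log q(z | x, t, y) over z drawn from q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z = mu_enc + sigma_enc * rng.gauss(0.0, 1.0)   # reparameterized draw from q
        total += log_joint(x, t, y, z) - normal_logpdf(z, mu_enc, sigma_enc)
    return total / num_samples

x, t, y = 0.8, 1, 3.0
good = elbo_estimate(x, t, y, mu_enc=1.0, sigma_enc=0.3)    # q close to the posterior
bad = elbo_estimate(x, t, y, mu_enc=10.0, sigma_enc=0.3)    # q far from the posterior
print("ELBO (good encoder):", good)
print("ELBO (bad encoder):", bad)
```

An encoder whose output sits near the true posterior yields a much higher ELBO than one far from it, which is the signal that training exploits.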
Addressing confounding bias:
# Without adjustment (naive):
# E[Y | T=1] - E[Y | T=0] -- biased if T depends on confounders
# With latent confounder adjustment:
# CATE(x) = E_z[E[Y|T=1,z] - E[Y|T=0,z] | x]
# = integral (f_y(1, z) - f_y(0, z)) * q(z | x) dz
# This estimate is unbiased if all of the following hold:
# 1. z captures all confounders (no unmeasured confounding given z)
# 2. The model is correctly specified
# 3. Positivity: P(T=t | z) > 0 for all z and t
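The contrast between the naive and the adjusted estimate can be checked on simulated data. In the sketch below the confounder z is observed directly, which is an assumption that makes the demonstration possible; CEVAE would have to infer z from proxies instead. The adjustment is a simple stratification over bins of z, swapped in for the integral over q(z | x), and the true effect is fixed at 1.0.

```python
import math
import random
from collections import defaultdict

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def simulate(n, seed=0):
    """Confounded toy data: z raises both the treatment probability and the outcome."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        t = 1 if rng.random() < sigmoid(2.0 * z) else 0
        y = 1.0 * t + 2.0 * z + rng.gauss(0.0, 0.5)   # true treatment effect is 1.0
        data.append((z, t, y))
    return data

def mean(values):
    return sum(values) / len(values)

def naive_estimate(data):
    """E[Y | T=1] - E[Y | T=0], ignoring the confounder."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return mean(treated) - mean(control)

def adjusted_estimate(data, bin_width=0.25):
    """Average within-stratum differences over bins of z, weighted by bin size."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, t, y in data:
        strata[math.floor(z / bin_width)][t].append(y)
    total, weight = 0.0, 0
    for arms in strata.values():
        if not arms[0] or not arms[1]:
            continue   # positivity fails in this stratum, so it is skipped
        n_k = len(arms[0]) + len(arms[1])
        total += n_k * (mean(arms[1]) - mean(arms[0]))
        weight += n_k
    return total / weight

data = simulate(50000)
naive = naive_estimate(data)
adjusted = adjusted_estimate(data)
print("naive:", naive)
print("adjusted:", adjusted)
```

The naive difference is far above the true effect because treated individuals tend to have larger z, while the stratified estimate stays close to 1.0. Strata that contain only treated or only control individuals are dropped, which is where a positivity violation would show up in practice.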