Principle:Pyro ppl Pyro Importance Sampling Inference

From Leeroopedia
Revision as of 17:24, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Pyro_ppl_Pyro_Importance_Sampling_Inference.md)


Knowledge Sources
Domains: Monte Carlo Methods, Bayesian Inference, Importance Sampling
Last Updated: 2026-02-09 09:00 GMT

Overview

Importance sampling approximates expectations under a target distribution by drawing samples from a proposal distribution and reweighting them. It underlies weighted posterior approximation, amortized inference, and model evidence estimation.

Description

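Importance sampling is a fundamental Monte Carlo technique for estimating expectations under a distribution p(z|x) (the target) using samples from a different distribution q(z) (the proposal). Each sample receives an importance weight w(z) = p(x, z) / q(z). Because p(x, z) = p(z|x) p(x), this weight equals the ratio p(z|x) / q(z) up to the constant factor p(x), so it corrects for the mismatch between proposal and target. The proposal must place non-zero density wherever the target does; otherwise the weights are undefined in regions the target can reach.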

In probabilistic programming, importance sampling has several key variants:

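Basic importance sampling: Draw samples from the prior or from a guide distribution, weight each sample, and use the weighted samples to approximate the posterior. When the proposal is the prior, the weight p(x, z) / q(z) reduces to the likelihood p(x|z); with a guide, it is the full ratio of joint density to guide density. This is simple but can be inefficient when the proposal and the posterior differ substantially, because a few samples then carry almost all of the weight.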
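A minimal sketch of basic importance sampling with the prior as proposal, written with only the Python standard library rather than Pyro. The toy model (z ~ N(0, 1), x | z ~ N(z, 1)) and the names `normal_logpdf` and `posterior_mean_by_prior_sampling` are assumptions made for illustration; the model is chosen because its posterior mean, x / 2, is known in closed form and can be compared with the estimate.

```python
import math
import random


def normal_logpdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def posterior_mean_by_prior_sampling(x_obs, num_samples=20000, seed=0):
    """Self-normalized importance sampling estimate of E[z | x]."""
    rng = random.Random(seed)
    # Proposal = prior z ~ N(0, 1), so p(x, z) / q(z) reduces to p(x | z).
    zs = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    log_ws = [normal_logpdf(x_obs, z, 1.0) for z in zs]
    # Subtract the largest log weight before exponentiating to avoid underflow.
    max_log_w = max(log_ws)
    ws = [math.exp(lw - max_log_w) for lw in log_ws]
    total = sum(ws)
    # The shift by max_log_w cancels in the self-normalized ratio.
    return sum(w * z for w, z in zip(ws, zs)) / total


estimate = posterior_mean_by_prior_sampling(x_obs=1.0)
print(estimate)  # analytic posterior mean for this toy model is 0.5
```

In a model where the prior is far from the posterior, the same code still runs but most weights become negligible, which is the inefficiency described above.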

Compiled Sequential Importance Sampling (CSIS): An amortized approach where a neural network is trained to produce good proposal distributions. The network takes observed data as input and outputs proposal parameters for each sample site in the program. This "compiles" an inference artifact that can perform fast importance sampling on new data without retraining.
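The compile-then-infer pattern can be sketched on a toy problem. This is not CSIS as implemented in Pyro: as a stated assumption, the neural network is replaced by a linear-Gaussian proposal q(z | x) = N(a*x + b, s^2), whose maximum-likelihood fit on simulated pairs has a closed form (least squares plus residual variance). The model (z ~ N(0, 1), x | z ~ N(z, 1)) and the names `compile_proposal` and `infer` are hypothetical.

```python
import math
import random


def normal_logpdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def compile_proposal(num_pairs=20000, seed=0):
    """Fit q(z | x) = N(a*x + b, s^2) to pairs simulated from the joint p(x, z)."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, 1.0) for _ in range(num_pairs)]  # z ~ p(z)
    xs = [rng.gauss(z, 1.0) for z in zs]                   # x ~ p(x | z)
    mean_x = sum(xs) / num_pairs
    mean_z = sum(zs) / num_pairs
    cov_xz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / num_pairs
    var_x = sum((x - mean_x) ** 2 for x in xs) / num_pairs
    # Maximizing the average log q(z | x) gives least squares for (a, b)
    # and the residual standard deviation for s.
    a = cov_xz / var_x
    b = mean_z - a * mean_x
    s = math.sqrt(sum((z - (a * x + b)) ** 2 for x, z in zip(xs, zs)) / num_pairs)
    return a, b, s


def infer(x_obs, proposal, num_samples=2000, seed=1):
    """Importance sampling on a new observation using the compiled proposal."""
    a, b, s = proposal
    rng = random.Random(seed)
    mean_q = a * x_obs + b
    zs = [rng.gauss(mean_q, s) for _ in range(num_samples)]
    # log w = log p(z) + log p(x | z) - log q(z | x)
    log_ws = [normal_logpdf(z, 0.0, 1.0) + normal_logpdf(x_obs, z, 1.0)
              - normal_logpdf(z, mean_q, s) for z in zs]
    max_log_w = max(log_ws)
    ws = [math.exp(lw - max_log_w) for lw in log_ws]
    total = sum(ws)
    w_bar = [w / total for w in ws]
    post_mean = sum(w * z for w, z in zip(w_bar, zs))
    ess = 1.0 / sum(w * w for w in w_bar)  # effective sample size
    return post_mean, ess


proposal = compile_proposal()           # done once, before any data is seen
post_mean, ess = infer(1.0, proposal)   # reused for each new observation
print(proposal, post_mean, ess)
```

For this model the exact posterior is N(x / 2, 1 / 2), so a well-fitted proposal has a close to 0.5 and s close to 0.707, and the effective sample size stays near the number of samples drawn. The compilation cost is paid once; `infer` can then be called on any number of observations.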

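Reweighted Wake-Sleep (RWS): An algorithm that combines ideas from variational inference and importance sampling. It trains a generative model together with a recognition network (guide) by alternating between:

  • Wake phase: For observed data, draw samples from the recognition network, compute their importance weights, and use the weighted samples to update the model. The same weighted samples can also be used to update the recognition network.
  • Sleep phase: Draw latent variables and "fantasy" data from the model and use them to update the recognition network.

The wake phase uses self-normalized importance weights over several samples. Standard wake-sleep corresponds to a single unweighted sample, so the reweighted updates track the marginal likelihood more closely.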
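The importance-weighted guide update can be sketched as follows. This is an illustrative toy under stated assumptions, not Pyro's RWS implementation: the model is fixed (z ~ N(0, 1), x | z ~ N(z, 1)), the guide is q(z | x) = N(mu, 1) with a single learnable mean `mu` for one fixed observation, and the names `log_joint` and `wake_phase_guide_update` are hypothetical. Particles are drawn from the guide, weighted by p(x, z) / q(z | x), self-normalized, and then used as constant weights in a gradient step on the weighted log-density of the guide.

```python
import math
import random


def normal_logpdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def log_joint(x_obs, z):
    """log p(x, z) for the toy model z ~ N(0, 1), x | z ~ N(z, 1)."""
    return normal_logpdf(z, 0.0, 1.0) + normal_logpdf(x_obs, z, 1.0)


def wake_phase_guide_update(x_obs, num_steps=2000, num_particles=20,
                            lr=0.02, seed=0):
    """Stochastic ascent on sum_k w_bar_k * log q(z_k | x; mu)."""
    rng = random.Random(seed)
    mu = -2.0         # deliberately poor initial guide mean
    guide_std = 1.0   # kept fixed to keep the sketch short
    for _ in range(num_steps):
        zs = [rng.gauss(mu, guide_std) for _ in range(num_particles)]
        log_ws = [log_joint(x_obs, z) - normal_logpdf(z, mu, guide_std)
                  for z in zs]
        max_log_w = max(log_ws)
        ws = [math.exp(lw - max_log_w) for lw in log_ws]
        total = sum(ws)
        w_bar = [w / total for w in ws]
        # d/dmu log N(z; mu, std) = (z - mu) / std^2; weights are constants.
        grad = sum(wb * (z - mu) / guide_std ** 2 for wb, z in zip(w_bar, zs))
        mu += lr * grad
    return mu


learned_mu = wake_phase_guide_update(x_obs=1.0)
print(learned_mu)  # the exact posterior mean for this toy model is 0.5
```

The update pulls the guide mean toward the self-normalized estimate of the posterior mean, so `mu` drifts from its poor starting value toward 0.5 and then fluctuates around it.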

Usage

Use importance sampling-based inference when:

  • You need unbiased estimates of the model evidence (marginal likelihood) for model comparison.
  • The model has a tractable joint density but intractable posterior.
  • You want inference that becomes exact as the number of samples grows (unlike VI, whose error is limited by the chosen variational family).
  • You want fast amortized inference with CSIS on simulator-style models, where the same model is applied to many datasets.
  • You are training deep generative models with RWS and want a bound tighter than the standard ELBO.

Theoretical Basis

Basic importance sampling:

# Target: p(z | x) = p(x, z) / p(x)
# Proposal: q(z)
# Importance weight: w(z) = p(x, z) / q(z)

# Estimate of E_{p(z|x)}[f(z)]:
# Draw z_1, ..., z_N ~ q(z)
# Unnormalized weights: w_i = p(x, z_i) / q(z_i)
# Self-normalized weights: w_bar_i = w_i / sum_j w_j

# E_{p(z|x)}[f(z)] approx sum_i w_bar_i * f(z_i)

# Model evidence estimate:
# p(x) approx (1/N) * sum_i w_i

# Effective sample size:
# ESS = (sum_i w_i)^2 / sum_i w_i^2
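The evidence and effective-sample-size formulas above are usually evaluated in log space, because raw weights underflow easily. The following is a minimal sketch under assumptions: the toy model is z ~ N(0, 1), x | z ~ N(z, 1) with the prior as proposal, and `log_sum_exp` and `evidence_and_ess` are hypothetical helper names. For this model the exact evidence is p(x) = N(x; 0, 2), which gives a reference value.

```python
import math
import random


def normal_logpdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def evidence_and_ess(x_obs, num_samples=20000, seed=0):
    """Return (log evidence estimate, effective sample size)."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]   # z_i ~ q = prior
    log_ws = [normal_logpdf(x_obs, z, 1.0) for z in zs]      # log w_i
    # log p(x) approx log((1/N) * sum_i w_i)
    log_evidence = log_sum_exp(log_ws) - math.log(num_samples)
    # ESS = (sum_i w_i)^2 / sum_i w_i^2, computed from log weights
    log_ess = 2.0 * log_sum_exp(log_ws) - log_sum_exp([2.0 * lw for lw in log_ws])
    return log_evidence, math.exp(log_ess)


log_evidence, ess = evidence_and_ess(x_obs=1.0)
exact_log_evidence = normal_logpdf(1.0, 0.0, math.sqrt(2.0))
print(log_evidence, exact_log_evidence, ess)
```

The average of the weights is an unbiased estimate of p(x), but its logarithm is biased downward for finite N. The ESS ranges from 1 (one sample dominates) to N (all weights equal), so a value far below N signals a poorly matched proposal.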

CSIS (Compiled Sequential Importance Sampling):

# Learn a proposal q(z | x; phi), where phi are the parameters of a neural network

# Training (compilation):
# For many simulated (x, z) pairs from the joint p(x, z):
#   Maximize E_{p(x,z)}[log q(z | x; phi)]
# This is equivalent to minimizing E_{p(x)}[KL(p(z|x) || q(z|x; phi))],
# i.e. the KL divergence averaged over observations simulated from the model

# Inference:
# Given new observation x*:
# Draw z_1, ..., z_N ~ q(z | x*; phi)
# Compute weights: w_i = p(x*, z_i) / q(z_i | x*; phi)
# Return weighted samples {(z_i, w_bar_i)}

Reweighted Wake-Sleep:

# Wake phase (given an observation x; updates model p and guide q):
# Draw z_1, ..., z_K ~ q(z | x; phi)
# Compute importance weights: w_k = p(x, z_k; theta) / q(z_k | x; phi)
# Normalize: w_bar_k = w_k / sum_j w_j
# Update theta to maximize: sum_k w_bar_k * log p(x, z_k; theta)
# Update phi to maximize:   sum_k w_bar_k * log q(z_k | x; phi)
# (importance-weighted maximum likelihood; the weights are treated as constants)

# Sleep phase (update guide q):
# Draw z ~ p(z; theta), then x ~ p(x | z; theta)
# Update phi to maximize: log q(z | x; phi)
# (standard maximum likelihood on fantasy data)

# The wake-phase model update follows the gradient of the K-sample bound,
# which is at least as tight as the standard ELBO:
# log p(x) >= E[log (1/K * sum_k w_k)]        (IWAE bound)
#          >= E_q[log p(x,z) - log q(z|x)]    (standard ELBO, K = 1)
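The ordering of the two bounds can be checked numerically on a toy problem. This is a sketch under assumptions: the model is z ~ N(0, 1), x | z ~ N(z, 1), the proposal is the prior rather than a learned guide, and `multi_sample_bound` is a hypothetical name. With one particle the quantity is the standard ELBO; with several particles it is the multi-sample bound; the exact log evidence is log N(x; 0, 2).

```python
import math
import random


def normal_logpdf(value, mean, std):
    """Log density of a univariate normal distribution."""
    return (-0.5 * math.log(2.0 * math.pi) - math.log(std)
            - 0.5 * ((value - mean) / std) ** 2)


def multi_sample_bound(x_obs, num_particles, num_repeats=4000, seed=0):
    """Monte Carlo estimate of E[log((1/K) * sum_k w_k)], prior as proposal."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_repeats):
        # With q = prior, log w_k = log p(x | z_k).
        log_ws = [normal_logpdf(x_obs, rng.gauss(0.0, 1.0), 1.0)
                  for _ in range(num_particles)]
        m = max(log_ws)
        log_mean_w = (m + math.log(sum(math.exp(lw - m) for lw in log_ws))
                      - math.log(num_particles))
        total += log_mean_w
    return total / num_repeats


elbo = multi_sample_bound(1.0, num_particles=1)   # standard ELBO (K = 1)
iwae = multi_sample_bound(1.0, num_particles=5)   # multi-sample bound (K = 5)
log_px = normal_logpdf(1.0, 0.0, math.sqrt(2.0))  # exact log evidence
print(elbo, iwae, log_px)
```

For this model the ELBO can also be computed by hand: it equals log p(x) minus KL(q || p(z|x)), roughly -1.92 against a log evidence of roughly -1.52. The five-particle estimate should fall between the two, up to Monte Carlo noise.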
