Principle:Pyro ppl Pyro Execution Trace
| Knowledge Sources | |
|---|---|
| Domains | Probabilistic Programming, Program Analysis, Bayesian Inference |
| Last Updated | 2026-02-09 09:00 GMT |
Overview
An execution trace is a data structure that records every stochastic choice, observed value, and associated log-probability encountered during a single forward execution of a probabilistic program.
Description
When a probabilistic program executes, it makes a sequence of stochastic choices at named sample sites. Each site has an associated distribution, a sampled (or observed) value, and metadata such as log-probability, whether the site is observed, and any scaling factors.
The execution trace captures this entire record as a directed graph (or ordered dictionary) of trace nodes. Each node corresponds to a single effectful operation and contains:
- name: A unique string identifier for the site.
- type: Whether the node is a sample site or a param site.
- fn: The distribution object from which the value was drawn.
- value: The tensor value that was sampled or observed.
- log_prob: The log-probability (or log-density) of the value under the distribution, computed lazily.
- is_observed: A boolean flag indicating whether this site was conditioned on data.
- scale: A multiplicative factor applied to the log-probability (used for data subsampling).
- mask: A boolean tensor for selectively zeroing out log-probability contributions.
- cond_indep_stack: A record of the conditional independence context (plates) surrounding the site.
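The node fields listed above can be made concrete with a small recorder. The following is a minimal plain-Python sketch, not Pyro's own implementation: `ToyTrace`, the toy `Normal` class, and the two-site `model` are hypothetical names introduced only for illustration, and fields such as `mask` and `cond_indep_stack` are left out for brevity. In Pyro itself, a trace is obtained with `pyro.poutine.trace(model).get_trace(...)`.

```python
import math
import random
from collections import OrderedDict


class Normal:
    """Toy univariate normal, standing in for a real distribution object."""

    def __init__(self, loc, scale):
        self.loc = loc
        self.scale = scale

    def sample(self, rng):
        return rng.gauss(self.loc, self.scale)

    def log_prob(self, value):
        z = (value - self.loc) / self.scale
        return -0.5 * z * z - math.log(self.scale) - 0.5 * math.log(2.0 * math.pi)


class ToyTrace:
    """Ordered record of sample sites (hypothetical; not Pyro's Trace class)."""

    def __init__(self, seed=0):
        self.nodes = OrderedDict()
        self.rng = random.Random(seed)

    def sample(self, name, fn, obs=None, scale=1.0):
        if name in self.nodes:
            raise ValueError(f"duplicate site name: {name}")
        # Observed sites take the supplied value instead of drawing one.
        value = obs if obs is not None else fn.sample(self.rng)
        self.nodes[name] = {
            "name": name,
            "type": "sample",
            "fn": fn,
            "value": value,
            "is_observed": obs is not None,
            "scale": scale,
        }
        return value

    def compute_log_prob(self):
        # Log-probabilities are filled in lazily, only when requested.
        for node in self.nodes.values():
            if "log_prob" not in node:
                node["log_prob"] = node["scale"] * node["fn"].log_prob(node["value"])

    def log_prob_sum(self):
        self.compute_log_prob()
        return sum(node["log_prob"] for node in self.nodes.values())


def model(trace, data):
    z = trace.sample("z", Normal(0.0, 1.0))       # latent site
    trace.sample("x", Normal(z, 0.5), obs=data)   # observed site


trace = ToyTrace(seed=0)
model(trace, data=1.2)
total = trace.log_prob_sum()
```

After the run, `trace.nodes` holds one entry per site in execution order, and each entry gains a `log_prob` key only once `compute_log_prob` has been called.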
Traces serve multiple roles in inference:
- ELBO computation: The trace of the model and the trace of the guide are compared site-by-site to compute the Evidence Lower Bound.
- Importance weighting: The log-weight of a trace proposed by the guide is the sum of model log-probabilities over all sites minus the sum of guide log-probabilities over the guide's sites.
- Gradient estimation: Trace structure determines which terms contribute to score function or pathwise gradient estimators.
- Diagnostics: Inspecting traces reveals the shape, support, and dependency structure of a model.
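As an illustration of the importance-weighting role, the sketch below computes a log-weight per trace and then normalizes the weights. It is a plain-Python sketch under stated assumptions: the per-site log-probabilities are made-up numbers, each trace is reduced to a dictionary from site name to log-probability, and the helper names `log_weight` and `normalize` are hypothetical.

```python
import math


def log_weight(model_log_probs, guide_log_probs):
    """Log importance weight of one trace: model terms minus guide terms."""
    return sum(model_log_probs.values()) - sum(guide_log_probs.values())


def normalize(log_weights):
    """Turn log-weights into normalized weights using a stable log-sum-exp."""
    m = max(log_weights)
    log_norm = m + math.log(sum(math.exp(lw - m) for lw in log_weights))
    return [math.exp(lw - log_norm) for lw in log_weights]


# Made-up (model, guide) log-probabilities for three traces.
# The model has a latent site "z" and an observed site "x"; the guide has only "z".
traces = [
    ({"z": -1.2, "x": -0.7}, {"z": -0.9}),
    ({"z": -0.4, "x": -2.1}, {"z": -1.1}),
    ({"z": -2.0, "x": -0.3}, {"z": -1.5}),
]

log_weights = [log_weight(m, g) for m, g in traces]
weights = normalize(log_weights)
```

Working in log space and subtracting the maximum before exponentiating avoids underflow when log-weights are very negative.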
When dependency information is tracked, the trace forms a directed acyclic graph whose edges encode the data-flow dependencies between sample sites, which enables graph-based variance reduction techniques.
Usage
Use execution traces when:
- Implementing or debugging inference algorithms that require access to individual sample sites and their log-probabilities.
- Computing custom loss functions that compare model and guide traces.
- Performing posterior predictive checks by replaying traces with modified values.
- Analyzing the dependency structure of a probabilistic program to apply Rao-Blackwellization.
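Replaying a trace with modified values, as in the posterior predictive use case above, can be sketched without any library. This is a minimal plain-Python illustration in which `run`, `model`, and the `sample` callback are hypothetical names; Pyro provides this behavior through its `pyro.poutine.replay` handler, which this sketch only imitates.

```python
import random


def run(model, replay=None, seed=0):
    """Run `model` and record its sites.

    Values present in `replay` are reused; all other sites are sampled fresh.
    """
    rng = random.Random(seed)
    replay = replay or {}
    trace = {}

    def sample(name, loc, scale):
        value = replay[name] if name in replay else rng.gauss(loc, scale)
        trace[name] = value
        return value

    model(sample)
    return trace


def model(sample):
    z = sample("z", 0.0, 1.0)
    sample("x", z, 0.1)  # x depends on z


first = run(model, seed=1)
# Replay with z pinned to a modified value; only x is drawn fresh.
second = run(model, replay={"z": 2.0}, seed=2)
# Replaying every recorded value reproduces the original trace exactly.
third = run(model, replay=first, seed=99)
```

Because `x` is centered on whatever value `z` takes, pinning `z` during replay changes the distribution from which `x` is drawn.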
Theoretical Basis
A probabilistic program P defines a joint distribution over latent variables z = (z_1, ..., z_n) and observations x:
# Joint distribution factorization from trace
p(x, z) = product over sites s in trace:
    p_s(value_s | parents(s))
# where each site s has:
#   value_s: the sampled or observed value
#   parents(s): values of sites that s depends on
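Taking logarithms turns the product into a sum, which is why per-site log-probabilities stored in a trace can simply be added. The notation below ($v_s$ for the value at site $s$, $\mathrm{pa}(s)$ for its parents) and the two-site Gaussian example are introduced here for illustration only.

```latex
\log p(x, z) = \sum_{s \in \text{trace}} \log p_s\bigl(v_s \mid \mathrm{pa}(s)\bigr)

% Hypothetical two-site example: z \sim \mathcal{N}(0, 1), \quad x \mid z \sim \mathcal{N}(z, \sigma^2)
\log p(x, z) = \log \mathcal{N}(z;\, 0,\, 1) + \log \mathcal{N}(x;\, z,\, \sigma^2)
```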
The trace records the full factorization. For variational inference with guide q:
# Single-sample ELBO estimate from paired model/guide traces
# (latent values are drawn by the guide and reused in the model)
ELBO estimate = sum over latent sites s:
    model_trace[s].log_prob - guide_trace[s].log_prob
+ sum over observed sites s:
    model_trace[s].log_prob
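The site-by-site sum above can be checked on a conjugate toy problem. The sketch below is plain Python under stated assumptions: the model is z ~ Normal(0, 1), x ~ Normal(z, 1) with x observed, the guide is chosen to be the exact posterior Normal(x/2, sqrt(1/2)), and the traces are reduced to hypothetical dictionaries holding only `is_observed` and `log_prob`. With an exact guide, the single-sample estimate equals the log evidence for any value of z.

```python
import math


def normal_log_prob(value, loc, scale):
    z = (value - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def elbo_estimate(model_nodes, guide_nodes):
    """Single-sample ELBO estimate from paired traces keyed by site name."""
    total = 0.0
    for name, node in model_nodes.items():
        total += node["log_prob"]
        if not node["is_observed"]:
            # Each latent model site is paired with the same-named guide site.
            total -= guide_nodes[name]["log_prob"]
    return total


x_obs = 1.0
z = 0.3  # stands in for a value drawn from the guide and reused in the model

model_nodes = {
    "z": {"is_observed": False, "log_prob": normal_log_prob(z, 0.0, 1.0)},
    "x": {"is_observed": True, "log_prob": normal_log_prob(x_obs, z, 1.0)},
}
guide_nodes = {
    "z": {"is_observed": False,
          "log_prob": normal_log_prob(z, x_obs / 2.0, math.sqrt(0.5))},
}

elbo = elbo_estimate(model_nodes, guide_nodes)
# Marginally, x ~ Normal(0, sqrt(2)), so this is the exact log evidence.
log_evidence = normal_log_prob(x_obs, 0.0, math.sqrt(2.0))
```

For a guide that is not the exact posterior, the estimate varies with the sampled z and its expectation falls below `log_evidence`.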
The graph structure of the trace enables Rao-Blackwellization:
# Graph-based variance reduction
# For each non-reparameterizable site z_i, the score-function gradient term
# only needs the costs downstream of z_i:
grad_i = grad log q(z_i) * (sum over j in downstream(i): cost_j)
# where downstream(i) = sites reachable from i in the trace graph
#       cost_j = the ELBO term contributed by site j
The trace graph is constructed by tracking tensor provenance: site j is downstream of site i if the value of z_i flows (through any computation) into the distribution parameters of site j.
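The downstream sum can be sketched as a reachability computation over the dependency graph. This plain-Python example makes several assumptions that are not from the text above: the graph is given as a hypothetical adjacency dictionary, the per-site costs are made-up numbers, and a site counts as downstream of itself.

```python
def downstream(edges, start):
    """Return the sites reachable from `start`, including `start` itself (an assumption)."""
    seen = set()
    stack = [start]
    while stack:
        site = stack.pop()
        if site in seen:
            continue
        seen.add(site)
        stack.extend(edges.get(site, ()))
    return seen


def downstream_costs(edges, costs):
    """For each site, sum the costs of every site downstream of it."""
    return {site: sum(costs[j] for j in downstream(edges, site)) for site in costs}


# Hypothetical dependency graph: z1 -> z2 -> x, and z3 -> x.
edges = {"z1": ["z2"], "z2": ["x"], "z3": ["x"]}
# Made-up per-site cost terms.
costs = {"z1": -1.0, "z2": -2.0, "z3": -0.5, "x": -3.0}

dc = downstream_costs(edges, costs)
```

Here the cost at `z3` never enters the sum for `z1` or `z2`, because `z3` is not reachable from either; dropping such terms is what reduces the variance of the score-function estimator.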