Principle:Pyro ppl Pyro Markov Dependency
Metadata
| Field | Value |
|---|---|
| Principle ID | Pyro_ppl_Pyro_Markov_Dependency |
| Title | Markov Dependency Structure |
| Project | Pyro (pyro-ppl/pyro) |
| Domains | Sequential_Models, Discrete_Inference |
| Implementation | Pyro_ppl_Pyro_Pyro_Markov |
| Repository | https://github.com/pyro-ppl/pyro |
Summary
Markov Dependency is the principle of declaring finite-history dependency structure in sequential probabilistic models to enable memory-efficient inference. By annotating loops with Markov dependency information, Pyro's inference engine can safely discard old states during enumeration, reducing what would be exponential memory complexity to polynomial complexity.
Motivation
Sequential probabilistic models -- such as Hidden Markov Models (HMMs), linear dynamical systems, and time-series models -- have a natural temporal structure where each state depends only on a finite number of previous states. Without explicit Markov annotations, when Pyro enumerates discrete variables in a sequence of length T with K possible values per step, it would need to track all K^T possible configurations simultaneously, leading to exponential memory consumption.
The Markov dependency declaration tells Pyro's enumeration machinery that it only needs to maintain a sliding window of recent states. For a first-order Markov model (history=1), this reduces memory from O(K^T) to O(T * K^2), making inference on long sequences tractable.
Core Concepts
Markov Property
A sequence of random variables X_1, X_2, ..., X_T satisfies the first-order Markov property if:
- P(X_t | X_1, ..., X_{t-1}) = P(X_t | X_{t-1})
More generally, an n-th order Markov property means:
- P(X_t | X_1, ..., X_{t-1}) = P(X_t | X_{t-n}, ..., X_{t-1})
In Pyro, this is expressed by setting history=n in pyro.markov.
Memory Arena Analogy
pyro.markov functions as a statistical memory management arena. Just as a memory arena in systems programming allows efficient allocation and deallocation of memory blocks, the Markov context manager allows Pyro's inference engine to efficiently manage which sample sites are "in scope" for dependency tracking:
- When entering a new Markov step, the current step's sites become visible
- Sites from steps older than the
historywindow are dropped from scope - This scoping mechanism tells the
EnumMessengerwhich enumeration dimensions can be contracted (summed out) at each step
Complexity Analysis
| Configuration | Memory Complexity | Description |
|---|---|---|
| No Markov annotation | O(K^T) | All possible sequences tracked simultaneously |
pyro.markov(history=1) |
O(T * K^2) | Only pairwise transitions tracked |
pyro.markov(history=2) |
O(T * K^3) | Triplet dependencies tracked |
pyro.markov(history=n) |
O(T * K^(n+1)) | (n+1)-gram dependencies tracked |
The keep Parameter
The keep parameter controls branching behavior:
keep=False(default): When exiting a Markov context, the frame is dropped from the stack. This means neighboring branches at the same level are independent (conditioned on shared ancestors). This is the common case for sequential models.keep=True: Frames remain replayable after exiting. This allows neighboring branches at the same level to depend on each other. Useful for tree-structured models where sibling nodes share information.
How It Works
The Markov dependency mechanism operates through the MarkovMessenger:
- When iterating over a
pyro.markov-wrapped range, the messenger enters and exits a context for each iteration step - On each step, it maintains a stack of sets of site names that have been sampled
- When a new sample site is encountered, the messenger annotates its
inferdict with a_markov_scopecounter that records which sites from recent history (within thehistorywindow) are in scope - The
EnumMessengerreads these scope annotations to determine which enumeration dimensions can be safely contracted at each step - After exiting a step (with
keep=False), the oldest frame is popped from the stack, freeing the associated enumeration dimensions
Example
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, TraceEnum_ELBO, config_enumerate
from pyro.optim import Adam
# First-order Hidden Markov Model
@config_enumerate
def hmm_model(data, hidden_dim=5):
# Transition and emission parameters
transition = pyro.sample(
"transition",
dist.Dirichlet(torch.ones(hidden_dim, hidden_dim)).to_event(1)
)
emission_locs = pyro.sample(
"emission_locs",
dist.Normal(torch.arange(float(hidden_dim)), 1.0).to_event(1)
)
# Initial state
state = 0
# pyro.markov declares first-order Markov structure
for t in pyro.markov(range(len(data))):
state = pyro.sample(
"state_{}".format(t),
dist.Categorical(transition[state]),
infer={"enumerate": "parallel"}
)
pyro.sample(
"obs_{}".format(t),
dist.Normal(emission_locs[state], 1.0),
obs=data[t]
)
# Without pyro.markov: memory O(5^T) -- infeasible for T > 10
# With pyro.markov(history=1): memory O(T * 25) -- tractable for any T
Second-Order Markov Model
# Second-order HMM: state depends on two previous states
for t in pyro.markov(range(len(data)), history=2):
# Now state_t can depend on state_{t-1} AND state_{t-2}
state = pyro.sample(
"state_{}".format(t),
dist.Categorical(transition_fn(prev_state, prev_prev_state))
)
Relationship to Other Principles
- Pyro_ppl_Pyro_Enumeration_Configuration -- Markov annotations are most impactful when combined with enumeration. Without enumeration, Markov scoping has less effect.
- Pyro_ppl_Pyro_Tensor_Variable_Elimination -- The TVE algorithm exploits Markov structure to determine optimal contraction orderings, contracting out old enumeration dimensions at each step.
- Pyro_ppl_Pyro_Discrete_Posterior_Decoding -- The forward-backward and Viterbi algorithms used by
infer_discreterely on Markov structure for efficient computation.
Related Pages
Implemented By
References
- Pyro HMM tutorial: https://pyro.ai/examples/hmm.html
- Obermeyer et al., "Tensor Variable Elimination for Plated Factor Graphs", 2019 (https://arxiv.org/abs/1902.03210)