Principle:Pyro ppl Pyro Markov Dependency

Metadata

Field	Value
Principle ID	Pyro_ppl_Pyro_Markov_Dependency
Title	Markov Dependency Structure
Project	Pyro (pyro-ppl/pyro)
Domains	Sequential_Models, Discrete_Inference
Implementation	Pyro_ppl_Pyro_Pyro_Markov
Repository	https://github.com/pyro-ppl/pyro

Summary

Markov Dependency is the principle of declaring finite-history dependency structure in sequential probabilistic models to enable memory-efficient inference. By annotating loops with Markov dependency information, Pyro's inference engine can safely discard old states during enumeration, reducing what would be exponential memory complexity to polynomial complexity.

Motivation

Sequential probabilistic models -- such as Hidden Markov Models (HMMs), linear dynamical systems, and time-series models -- have a natural temporal structure where each state depends only on a finite number of previous states. Without explicit Markov annotations, when Pyro enumerates discrete variables in a sequence of length T with K possible values per step, it would need to track all K^T possible configurations simultaneously, leading to exponential memory consumption.

The Markov dependency declaration tells Pyro's enumeration machinery that it only needs to maintain a sliding window of recent states. For a first-order Markov model (history=1), this reduces memory from O(K^T) to O(T * K^2), making inference on long sequences tractable.

Core Concepts

Markov Property

A sequence of random variables X_1, X_2, ..., X_T satisfies the first-order Markov property if:

P(X_t | X_1, ..., X_{t-1}) = P(X_t | X_{t-1})

More generally, an n-th order Markov property means:

P(X_t | X_1, ..., X_{t-1}) = P(X_t | X_{t-n}, ..., X_{t-1})

In Pyro, this is expressed by setting history=n in pyro.markov.

Memory Arena Analogy

pyro.markov functions as a statistical memory management arena. Just as a memory arena in systems programming allows efficient allocation and deallocation of memory blocks, the Markov context manager allows Pyro's inference engine to efficiently manage which sample sites are "in scope" for dependency tracking:

When entering a new Markov step, the current step's sites become visible
Sites from steps older than the history window are dropped from scope
This scoping mechanism tells the EnumMessenger which enumeration dimensions can be contracted (summed out) at each step

Complexity Analysis

Configuration	Memory Complexity	Description
No Markov annotation	O(K^T)	All possible sequences tracked simultaneously
`pyro.markov(history=1)`	O(T * K^2)	Only pairwise transitions tracked
`pyro.markov(history=2)`	O(T * K^3)	Triplet dependencies tracked
`pyro.markov(history=n)`	O(T * K^(n+1))	(n+1)-gram dependencies tracked

The keep Parameter

The keep parameter controls branching behavior:

keep=False (default): When exiting a Markov context, the frame is dropped from the stack. This means neighboring branches at the same level are independent (conditioned on shared ancestors). This is the common case for sequential models.
keep=True: Frames remain replayable after exiting. This allows neighboring branches at the same level to depend on each other. Useful for tree-structured models where sibling nodes share information.

How It Works

The Markov dependency mechanism operates through the MarkovMessenger:

When iterating over a pyro.markov-wrapped range, the messenger enters and exits a context for each iteration step
On each step, it maintains a stack of sets of site names that have been sampled
When a new sample site is encountered, the messenger annotates its infer dict with a _markov_scope counter that records which sites from recent history (within the history window) are in scope
The EnumMessenger reads these scope annotations to determine which enumeration dimensions can be safely contracted at each step
After exiting a step (with keep=False), the oldest frame is popped from the stack, freeing the associated enumeration dimensions

Example

import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, TraceEnum_ELBO, config_enumerate
from pyro.optim import Adam

# First-order Hidden Markov Model
@config_enumerate
def hmm_model(data, hidden_dim=5):
    # Transition and emission parameters
    transition = pyro.sample(
        "transition",
        dist.Dirichlet(torch.ones(hidden_dim, hidden_dim)).to_event(1)
    )
    emission_locs = pyro.sample(
        "emission_locs",
        dist.Normal(torch.arange(float(hidden_dim)), 1.0).to_event(1)
    )

    # Initial state
    state = 0
    # pyro.markov declares first-order Markov structure
    for t in pyro.markov(range(len(data))):
        state = pyro.sample(
            "state_{}".format(t),
            dist.Categorical(transition[state]),
            infer={"enumerate": "parallel"}
        )
        pyro.sample(
            "obs_{}".format(t),
            dist.Normal(emission_locs[state], 1.0),
            obs=data[t]
        )

# Without pyro.markov: memory O(5^T) -- infeasible for T > 10
# With pyro.markov(history=1): memory O(T * 25) -- tractable for any T

Second-Order Markov Model

# Second-order HMM: state depends on two previous states
for t in pyro.markov(range(len(data)), history=2):
    # Now state_t can depend on state_{t-1} AND state_{t-2}
    state = pyro.sample(
        "state_{}".format(t),
        dist.Categorical(transition_fn(prev_state, prev_prev_state))
    )

Relationship to Other Principles

Pyro_ppl_Pyro_Enumeration_Configuration -- Markov annotations are most impactful when combined with enumeration. Without enumeration, Markov scoping has less effect.
Pyro_ppl_Pyro_Tensor_Variable_Elimination -- The TVE algorithm exploits Markov structure to determine optimal contraction orderings, contracting out old enumeration dimensions at each step.
Pyro_ppl_Pyro_Discrete_Posterior_Decoding -- The forward-backward and Viterbi algorithms used by infer_discrete rely on Markov structure for efficient computation.

Related Pages

Implemented By

Implementation:Pyro_ppl_Pyro_Pyro_Markov

References

Pyro HMM tutorial: https://pyro.ai/examples/hmm.html
Obermeyer et al., "Tensor Variable Elimination for Plated Factor Graphs", 2019 (https://arxiv.org/abs/1902.03210)

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment