Principle:Pyro ppl Pyro Coalescent Process

Knowledge Sources	The Coalescent; Gene Genealogies and the Coalescent Process; Bayesian Inference of Phylogeny with Pyro
Domains	Population Genetics, Phylogenetics, Bayesian Inference
Last Updated	2026-02-09 09:00 GMT

Overview

Kingman's coalescent is a stochastic process that models the genealogical history of a sample of individuals by tracing lineages backward in time until they coalesce into a common ancestor.

Description

The coalescent process is the foundational model in population genetics for describing how a sample of gene copies descends from a common ancestor. Working backward in time from the present, pairs of lineages merge (coalesce) at random times whose rate depends on the number of remaining lineages and the effective population size.

Given a sample of n lineages from a population of effective size N_e, the process operates as follows:

Start with n lineages at the present.
While there are k > 1 lineages remaining, the time until the next coalescence event is exponentially distributed with rate C(k,2) / N_e, where C(k,2) = k(k-1)/2 is the number of possible pairs.
At each coalescence, a uniformly random pair of lineages merges into one ancestral lineage.
The process terminates when a single lineage (the most recent common ancestor) remains.
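
The steps above can be sketched as a short simulation. This is a minimal illustration using only the Python standard library, not Pyro's implementation; the function name `simulate_coalescent` and the event-tuple layout are assumptions made for this example.

```python
import random
from math import comb


def simulate_coalescent(n, n_e, rng):
    """Simulate Kingman's coalescent for n lineages with constant size n_e.

    Returns a list of (time, merged_pair, new_label) events, with time
    measured backward from the present.
    """
    lineages = list(range(n))   # labels of the currently active lineages
    next_label = n              # label given to the next ancestral lineage
    time = 0.0
    events = []
    while len(lineages) > 1:
        k = len(lineages)
        rate = comb(k, 2) / n_e          # C(k,2) / N_e
        time += rng.expovariate(rate)    # exponential waiting time
        a, b = rng.sample(lineages, 2)   # uniformly random pair
        lineages.remove(a)
        lineages.remove(b)
        lineages.append(next_label)
        events.append((time, (a, b), next_label))
        next_label += 1
    return events


events = simulate_coalescent(n=5, n_e=1000.0, rng=random.Random(0))
for t, pair, parent in events:
    print(f"t={t:10.2f}  merge {pair} -> {parent}")
```

A sample of n lineages always produces n - 1 coalescence events, and the time of the last event is the time to the most recent common ancestor.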

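The coalescent is useful because it provides a probability model for genealogies given population parameters; combined with a mutation model, this yields a likelihood for observed genetic data. By modeling the tree of coalescent times, and with suitable extensions of the basic model, one can perform Bayesian inference over: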

Effective population size (constant or time-varying)
Migration rates between subpopulations
Selection coefficients acting on genetic variants
Demographic history (population bottlenecks, expansions)

In Pyro, the coalescent is represented as a distribution over coalescent times, which can be used as a prior in hierarchical Bayesian models of genetic data.

Usage

Use the coalescent process when:

Modeling genealogical relationships among sampled individuals from a population.
Inferring effective population size or demographic parameters from genetic sequence data.
Building phylogenetic models where branch lengths are governed by population-genetic processes.
Combining coalescent priors with mutation models for full Bayesian phylogenetics.

Theoretical Basis

The waiting time between coalescent events follows an exponential distribution:

# Coalescent waiting times
# Given k lineages and effective population size N_e:

# Rate of coalescence (any pair):
lambda_k = C(k, 2) / N_e = k * (k - 1) / (2 * N_e)

# Waiting time until next coalescence:
T_k ~ Exponential(rate=lambda_k)

# Expected waiting time:
E[T_k] = 2 * N_e / (k * (k - 1))
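
As a worked example of the formulas above, the following sketch tabulates the rate and expected waiting time for each lineage count. The helper names and the value N_e = 1000 are illustrative assumptions.

```python
from math import comb


def coalescent_rate(k, n_e):
    """Rate lambda_k = C(k,2) / N_e while k lineages remain."""
    return comb(k, 2) / n_e


def expected_waiting_time(k, n_e):
    """Mean of Exponential(rate=lambda_k), i.e. 1 / lambda_k."""
    return 1.0 / coalescent_rate(k, n_e)


n_e = 1000.0  # illustrative effective population size
for k in range(5, 1, -1):
    print(k, coalescent_rate(k, n_e), expected_waiting_time(k, n_e))
```

Because the rate grows quadratically in k, early coalescences happen quickly and the final interval with k = 2 has the longest expected duration (N_e in this parameterization).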

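The coalescent times for n samples are characterized by the independent waiting times (T_n, T_{n-1}, ..., T_2), so the joint log-density is a sum of exponential log-densities, one per interval: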

# Joint density of coalescent times
# t = (t_n, t_{n-1}, ..., t_2) where t_k is the waiting time with k lineages

log p(t | N_e) = sum over k=2 to n:
    log(lambda_k) - lambda_k * t_k

# where lambda_k = k*(k-1) / (2*N_e)

# Total tree height (time to MRCA):
T_MRCA = sum over k=2 to n: t_k
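
The joint log-density and tree height can be evaluated directly from these definitions. The sketch below is a plain-Python illustration under the constant-size model; the function names and the example waiting times are hypothetical, and this is not Pyro's `log_prob` implementation.

```python
from math import log


def coalescent_log_density(waiting_times, n_e):
    """Sum of log(lambda_k) - lambda_k * t_k over the intervals.

    waiting_times maps each lineage count k (2..n) to its waiting time t_k.
    """
    total = 0.0
    for k, t_k in waiting_times.items():
        rate = k * (k - 1) / (2.0 * n_e)
        total += log(rate) - rate * t_k
    return total


def expected_tmrca(n, n_e):
    """Sum of the expected waiting times E[T_k] for k = 2..n."""
    return sum(2.0 * n_e / (k * (k - 1)) for k in range(2, n + 1))


# Hypothetical waiting times for a sample of n = 4 lineages.
waiting_times = {4: 150.0, 3: 300.0, 2: 900.0}
print(coalescent_log_density(waiting_times, n_e=1000.0))
print(sum(waiting_times.values()))   # observed tree height T_MRCA
print(expected_tmrca(4, 1000.0))     # expected tree height
```

Summing the expected waiting times telescopes to E[T_MRCA] = 2 * N_e * (1 - 1/n), which approaches 2 * N_e as the sample grows.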

For a variable population size N_e(t), the coalescent rate becomes time-dependent:

# Variable population size coalescent
# Intensity function while k lineages remain:
Lambda(s, t) = integral from s to t of: C(k,2) / N_e(u) du

# Probability of no coalescence in [s, t]:
P(no event in [s,t]) = exp(-Lambda(s, t))
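
When N_e(u) has no convenient antiderivative, the intensity can be approximated numerically. The sketch below uses a trapezoidal rule; the exponential demographic model, its parameters, and the function names are assumptions chosen for illustration.

```python
from math import comb, exp


def intensity(k, n_e_fn, s, t, steps=10_000):
    """Trapezoidal approximation of the integral of C(k,2)/N_e(u) over [s, t]."""
    h = (t - s) / steps
    pairs = comb(k, 2)
    total = 0.5 * (pairs / n_e_fn(s) + pairs / n_e_fn(t))
    for i in range(1, steps):
        total += pairs / n_e_fn(s + i * h)
    return total * h


def prob_no_coalescence(k, n_e_fn, s, t):
    """P(no event in [s, t]) = exp(-Lambda(s, t))."""
    return exp(-intensity(k, n_e_fn, s, t))


# Hypothetical demographic model: the population was smaller in the past,
# so N_e shrinks exponentially as time u runs backward from the present.
N0, GROWTH = 1000.0, 0.001


def shrinking_backward(u):
    return N0 * exp(-GROWTH * u)


p_var = prob_no_coalescence(3, shrinking_backward, 0.0, 200.0)
p_const = prob_no_coalescence(3, lambda u: N0, 0.0, 200.0)
print(p_var, p_const)
```

A smaller ancestral population raises the coalescent rate, so the probability of no coalescence over the same interval is lower than under the constant-size model.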

Related Pages

Implementation:Pyro_ppl_Pyro_CoalescentTimes
