Principle:Pyro ppl Pyro Coalescent Process

From Leeroopedia
Revision as of 18:03, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Pyro_ppl_Pyro_Coalescent_Process.md)


Knowledge Sources
Domains: Population Genetics, Phylogenetics, Bayesian Inference
Last Updated: 2026-02-09 09:00 GMT

Overview

Kingman's coalescent is a stochastic process that models the genealogical history of a sample of individuals by tracing lineages backward in time until they coalesce into a common ancestor.

Description

The coalescent process is the foundational model in population genetics for describing how a sample of gene copies relates to a common ancestor. Working backward in time from the present, pairs of lineages merge (coalesce) at random times whose rate depends on the number of remaining lineages and the effective population size.

Given a sample of n lineages from a population of effective size N_e, the process operates as follows:

  1. Start with n lineages at the present.
  2. While there are k > 1 lineages remaining, the time until the next coalescence event is exponentially distributed with rate C(k,2) / N_e, where C(k,2) = k(k-1)/2 is the number of possible pairs.
  3. At each coalescence, a uniformly random pair of lineages merges into one ancestral lineage.
  4. The process terminates when a single lineage (the most recent common ancestor) remains.
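
The four steps above can be sketched as a short simulation. This is a minimal illustration using only the Python standard library, not Pyro's implementation; the function name `simulate_coalescent`, the integer lineage labels, and the parameter values are assumptions made for the example.

```python
import random

def simulate_coalescent(n, n_e, seed=None):
    """Simulate one realization of the constant-size coalescent.

    Returns a list of (k, waiting_time, merged_pair) tuples, one per
    coalescence event, ordered from the present backward in time.
    """
    rng = random.Random(seed)
    lineages = list(range(n))   # step 1: n lineages at the present
    next_label = n              # label given to the next ancestral lineage
    events = []
    while len(lineages) > 1:    # step 4: stop when one lineage remains
        k = len(lineages)
        rate = k * (k - 1) / (2.0 * n_e)   # step 2: C(k,2) / N_e
        wait = rng.expovariate(rate)       # exponential waiting time
        a, b = rng.sample(lineages, 2)     # step 3: uniformly random pair
        lineages.remove(a)
        lineages.remove(b)
        lineages.append(next_label)        # the merged ancestral lineage
        next_label += 1
        events.append((k, wait, (a, b)))
    return events

# Hypothetical example values: 5 sampled lineages, N_e = 1000.
events = simulate_coalescent(n=5, n_e=1000.0, seed=0)
t_mrca = sum(wait for _, wait, _ in events)  # total time back to the MRCA
```

A sample of n lineages always produces exactly n - 1 coalescence events, and the waiting times tend to grow as k shrinks because fewer pairs remain to merge.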

The coalescent is useful because it provides a probability model for genealogies given population parameters; combined with a mutation model, this yields a likelihood for observed genetic data. By modeling the tree of coalescent times, one can perform Bayesian inference over:

  • Effective population size (constant or time-varying)
  • Migration rates between subpopulations
  • Selection coefficients acting on genetic variants
  • Demographic history (population bottlenecks, expansions)
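
As a simple illustration of likelihood-based inference, consider the idealized case where the waiting times are observed directly and N_e is constant. Setting the derivative of the log-likelihood of independent exponential waiting times to zero gives the closed-form estimate N_e = (1 / (n - 1)) * sum over k of C(k,2) * t_k. The sketch below assumes this simplified setting; in practice the times are latent and are inferred jointly with other parameters. The function name `mle_effective_size` and the observed values are hypothetical.

```python
def mle_effective_size(waiting_times):
    """Closed-form maximum-likelihood estimate of a constant N_e.

    waiting_times maps k (number of lineages) to the waiting time t_k
    spent with k lineages, for k = 2..n.
    """
    n_events = len(waiting_times)  # n - 1 coalescence events
    scaled = sum(k * (k - 1) / 2.0 * t_k for k, t_k in waiting_times.items())
    return scaled / n_events

# Hypothetical waiting times (in generations) for a sample of n = 4 lineages.
observed = {4: 150.0, 3: 400.0, 2: 900.0}
n_e_hat = mle_effective_size(observed)
print(n_e_hat)
```

Each term C(k,2) * t_k is an independent exponential variable with mean N_e, so the estimate is just their average.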

In Pyro, the coalescent is represented as a distribution over coalescent times (for example, pyro.distributions.CoalescentTimes), which can be used as a prior in hierarchical Bayesian models of genetic data.

Usage

Use the coalescent process when:

  • Modeling genealogical relationships among sampled individuals from a population.
  • Inferring effective population size or demographic parameters from genetic sequence data.
  • Building phylogenetic models where branch lengths are governed by population-genetic processes.
  • Combining coalescent priors with mutation models for full Bayesian phylogenetics.

Theoretical Basis

While k lineages remain, the waiting time until the next coalescent event follows an exponential distribution whose rate is proportional to the number of lineage pairs:

# Coalescent waiting times
# Given k lineages and effective population size N_e:

# Rate of coalescence (any pair):
lambda_k = C(k, 2) / N_e = k * (k - 1) / (2 * N_e)

# Waiting time until next coalescence:
T_k ~ Exponential(rate=lambda_k)

# Expected waiting time:
E[T_k] = 2 * N_e / (k * (k - 1))
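
The rate and expected waiting time above translate directly into Python. The following sketch also compares the analytical mean with a Monte Carlo estimate; the helper names and the value N_e = 1000 are assumptions chosen for illustration.

```python
import random
from math import comb

def coalescent_rate(k, n_e):
    """Rate lambda_k = C(k, 2) / N_e at which some pair among k lineages coalesces."""
    return comb(k, 2) / n_e

def expected_waiting_time(k, n_e):
    """E[T_k] = 1 / lambda_k = 2 * N_e / (k * (k - 1))."""
    return 1.0 / coalescent_rate(k, n_e)

n_e = 1000.0             # hypothetical effective population size
rng = random.Random(1)   # fixed seed so the run is reproducible
for k in (5, 4, 3, 2):
    # Draw many waiting times T_k ~ Exponential(rate=lambda_k).
    draws = [rng.expovariate(coalescent_rate(k, n_e)) for _ in range(20000)]
    empirical = sum(draws) / len(draws)
    print(f"k={k}  E[T_k]={expected_waiting_time(k, n_e):.1f}  sample mean={empirical:.1f}")
```

The final interval (k = 2) has the smallest rate and therefore the longest expected wait, which is why the last coalescence typically dominates the tree height.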

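For n samples, the timing of the coalescent tree (independent of its topology) is summarized by the waiting times (T_n, T_{n-1}, ..., T_2). These are mutually independent, so the joint log-density is a sum over intervals: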

# Joint density of coalescent times
# t = (t_n, t_{n-1}, ..., t_2) where t_k is the waiting time with k lineages

log p(t | N_e) = sum over k=2 to n:
    log(lambda_k) - lambda_k * t_k

# where lambda_k = k*(k-1) / (2*N_e)

# Total tree height (time to MRCA):
T_MRCA = sum over k=2 to n: t_k
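
A minimal sketch of the joint log-density and the tree height follows, assuming the waiting times are stored in a dictionary keyed by the number of lineages k. The function names are hypothetical. The expected tree height sums the per-interval means and simplifies to 2 * N_e * (1 - 1/n).

```python
import math

def coalescent_log_density(waiting_times, n_e):
    """Sum of log(lambda_k) - lambda_k * t_k over the intervals k = 2..n.

    waiting_times maps k to the waiting time t_k spent with k lineages.
    """
    total = 0.0
    for k, t_k in waiting_times.items():
        lam = k * (k - 1) / (2.0 * n_e)   # lambda_k
        total += math.log(lam) - lam * t_k
    return total

def tree_height(waiting_times):
    """Time to the most recent common ancestor: the sum of all waiting times."""
    return sum(waiting_times.values())

def expected_tree_height(n, n_e):
    """E[T_MRCA] = sum over k=2..n of 2*N_e / (k*(k-1)) = 2*N_e*(1 - 1/n)."""
    return sum(2.0 * n_e / (k * (k - 1)) for k in range(2, n + 1))

# Hypothetical waiting times for n = 4 lineages.
times = {4: 150.0, 3: 400.0, 2: 900.0}
print(coalescent_log_density(times, n_e=1000.0))
print(tree_height(times), expected_tree_height(4, 1000.0))
```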

For a variable population size N_e(t), the coalescent rate becomes time-dependent:

# Variable population size coalescent
# Cumulative intensity over [s, t] while k lineages remain:
Lambda_k(s, t) = integral from s to t of: C(k,2) / N_e(u) du

# Probability of no coalescence in [s, t], given k lineages at time s:
P(no event in [s,t]) = exp(-Lambda_k(s, t))
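
The intensity integral rarely has a closed form for arbitrary N_e(u), so one option is to approximate it numerically. The sketch below uses a midpoint rule and checks it against a case that does have a closed form: a hypothetical exponential-growth model N_e(u) = N0 * exp(-g * u), with u measured backward in time. The function names, the model, and the parameter values are all assumptions for illustration.

```python
import math

def intensity(k, n_e_fn, s, t, steps=10_000):
    """Midpoint-rule approximation of the integral of C(k,2) / N_e(u) over [s, t]."""
    pairs = k * (k - 1) / 2.0
    h = (t - s) / steps
    return sum(pairs / n_e_fn(s + (i + 0.5) * h) for i in range(steps)) * h

def prob_no_coalescence(k, n_e_fn, s, t):
    """exp(-Lambda_k(s, t)): probability that k lineages survive [s, t] unmerged."""
    return math.exp(-intensity(k, n_e_fn, s, t))

# Hypothetical demographic model: the population shrinks going backward in time.
N0, g = 1000.0, 0.001

def growing(u):
    return N0 * math.exp(-g * u)

lam = intensity(3, growing, 0.0, 500.0)
# Closed form for this model: C(3,2) / N0 * (exp(g*t) - exp(g*s)) / g with s = 0.
closed = 3.0 / N0 * (math.exp(g * 500.0) - 1.0) / g
print(lam, closed, prob_no_coalescence(3, growing, 0.0, 500.0))
```

With a constant N_e(u) the integral reduces to C(k,2) * (t - s) / N_e, which recovers the exponential waiting time of the constant-size model.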
