Principle:Pyro ppl Pyro Directional Statistics
| Knowledge Sources | |
|---|---|
| Domains | Directional Statistics, Circular Data, Spatial Statistics |
| Last Updated | 2026-02-09 09:00 GMT |
Overview
Directional statistics provides probability distributions defined on circles, spheres, and other manifolds where standard Euclidean distributions are inappropriate.
Description
Many real-world phenomena produce angular or directional data: wind directions, protein dihedral angles, time-of-day patterns, compass headings, and orientations on the celestial sphere. Standard distributions like the Gaussian are unsuitable for such data because they do not respect the periodic topology of circular or spherical domains.
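A small numerical illustration of why Euclidean summaries fail on angles: the two compass headings 350 and 10 degrees both point almost due north, yet their arithmetic mean points due south. The sketch below contrasts this with the usual circular mean (the angle of the summed unit vectors). The function names are illustrative helpers, not part of Pyro.

```python
import math

def arithmetic_mean_deg(angles_deg):
    """Plain Euclidean mean; ignores that 0 and 360 degrees coincide."""
    return sum(angles_deg) / len(angles_deg)

def circular_mean_deg(angles_deg):
    """Mean direction: angle of the sum of unit vectors, mapped to [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

headings = [350.0, 10.0]  # two headings close to north
print("arithmetic mean:", arithmetic_mean_deg(headings))  # points south
print("circular mean:  ", circular_mean_deg(headings))    # points north, up to rounding
```

Because of floating-point rounding, the circular mean may print as a value just above 0 or just below 360; both represent the same direction.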
Directional statistics addresses this by defining distributions on the appropriate manifolds:
Von Mises distribution (circular normal): The most widely used circular distribution, analogous to the Gaussian on the real line. It is defined on the circle, with angles taken modulo 2*pi (represented here in [0, 2*pi)), and has two parameters:
- mu: the mean direction
- kappa: the concentration parameter (higher kappa means more concentrated around mu)
Bivariate von Mises distribution (sine model): Extends the von Mises to pairs of angles (e.g., protein backbone dihedral angles phi and psi), so the sample space is a torus. The sine model variant couples the two angles through a product-of-sines term, sin(phi - mu_1) * sin(psi - mu_2), weighted by a parameter rho that captures how the two circular variables co-vary.
Sine-skewed distributions: A general construction that introduces asymmetry (skewness) into any symmetric circular or toroidal distribution. Given a base distribution on the circle that is symmetric about mu, the sine-skewed version multiplies the density by a factor (1 + skewness * sin(theta - mu)) with |skewness| <= 1. The factor tilts the distribution in one direction; the bound on the skewness keeps the density nonnegative, and the symmetry of the base distribution preserves normalization.
These distributions are essential in structural biology (protein structure prediction), meteorology (wind patterns), ecology (animal movement), and any domain where periodicity is fundamental to the data.
Usage
Use directional distributions when:
- Modeling angular data such as wind direction, compass bearing, or phase angles.
- Working with protein dihedral angles (Ramachandran plots) in structural biology.
- Analyzing periodic time-series data (e.g., time-of-day, day-of-year effects).
- Building models where variables inherently live on circles or tori.
- Needing to capture asymmetric patterns in circular data (sine-skewed variants).
Theoretical Basis
The von Mises distribution on the circle:
# Von Mises density on [0, 2*pi)
p(theta | mu, kappa) = exp(kappa * cos(theta - mu)) / (2 * pi * I_0(kappa))
# where I_0(kappa) is the modified Bessel function of the first kind, order 0
# mu in [0, 2*pi): mean direction
# kappa >= 0: concentration (kappa=0 gives the uniform distribution, kappa->inf concentrates to a point mass at mu)
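The density above can be checked numerically with the standard library alone. This is a minimal sketch, not how Pyro evaluates the distribution: `bessel_i0`, `von_mises_pdf`, and `integrate_circle` are hypothetical helper names, and the truncated power series for I_0 is only assumed adequate for moderate kappa (roughly up to a few tens).

```python
import math

def bessel_i0(x, terms=60):
    """I_0(x) from its power series: sum over k of ((x/2)^2)^k / (k!)^2."""
    q = (x / 2.0) ** 2
    total, term = 0.0, 1.0          # term starts at the k=0 value
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)  # ratio between consecutive series terms
    return total

def von_mises_pdf(theta, mu, kappa):
    """Von Mises density exp(kappa*cos(theta-mu)) / (2*pi*I_0(kappa))."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def integrate_circle(f, n=2000):
    """Equal-weight quadrature over one period [0, 2*pi)."""
    h = 2.0 * math.pi / n
    return h * sum(f(i * h) for i in range(n))

# The density should integrate to about 1 over one full turn.
print(integrate_circle(lambda t: von_mises_pdf(t, mu=1.0, kappa=4.0)))
# With kappa = 0 the density is flat at 1 / (2*pi).
print(von_mises_pdf(0.3, mu=1.0, kappa=0.0), 1.0 / (2.0 * math.pi))
```

Equal-weight sums converge very quickly for smooth periodic integrands, which is why a plain grid is enough here.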
The bivariate von Mises (sine model):
# Sine model for bivariate circular data
p(phi, psi | mu_1, mu_2, kappa_1, kappa_2, rho) =
C(kappa_1, kappa_2, rho)^{-1}
* exp(kappa_1 * cos(phi - mu_1)
+ kappa_2 * cos(psi - mu_2)
+ rho * sin(phi - mu_1) * sin(psi - mu_2))
# C is the normalizing constant
# rho controls the dependence between phi and psi; it is a real-valued coupling strength, not a correlation coefficient confined to [-1, 1]
# When rho=0, phi and psi are independent von Mises
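To make the role of the normalizing constant concrete, the sketch below evaluates the exponential term of the sine model and approximates C by brute-force summation on a grid over the torus. This is an illustrative shortcut under stated assumptions, not the method a library such as Pyro uses; `sine_model_kernel`, `normalizing_constant`, and `sine_model_pdf` are hypothetical names.

```python
import math

def sine_model_kernel(phi, psi, mu1, mu2, k1, k2, rho):
    """Unnormalized sine-model density (the exp(...) term only)."""
    return math.exp(
        k1 * math.cos(phi - mu1)
        + k2 * math.cos(psi - mu2)
        + rho * math.sin(phi - mu1) * math.sin(psi - mu2)
    )

def normalizing_constant(k1, k2, rho, n=200):
    """Approximate C(k1, k2, rho) on an n-by-n grid over [0, 2*pi)^2.

    C does not depend on mu1 or mu2 (shifting the angles leaves the
    integral over a full period unchanged), so both are set to 0 here.
    """
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sine_model_kernel(i * h, j * h, 0.0, 0.0, k1, k2, rho)
    return total * h * h

def sine_model_pdf(phi, psi, mu1, mu2, k1, k2, rho, n=200):
    """Normalized density; recomputes C on every call, fine for a sketch."""
    c = normalizing_constant(k1, k2, rho, n)
    return sine_model_kernel(phi, psi, mu1, mu2, k1, k2, rho) / c

# Coupling the angles changes the normalizing constant.
print(normalizing_constant(1.0, 2.0, 0.0))
print(normalizing_constant(1.0, 2.0, 1.0))
```

With rho = 0 the kernel factorizes, so the grid estimate should agree with the product of the two univariate von Mises constants, (2*pi*I_0(kappa_1)) * (2*pi*I_0(kappa_2)).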
The sine-skewed construction:
# Sine-skewed distribution
# Given a base symmetric circular distribution f(theta | mu):
p_skew(theta | mu, lambda) = f(theta | mu) * (1 + lambda * sin(theta - mu))
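# lambda in [-1, 1]: skewness parameter; the bound keeps 1 + lambda * sin(theta - mu) >= 0
# lambda=0 recovers the symmetric base distribution
# This construction preserves normalization because f is symmetric about mu
# while sin(theta - mu) is antisymmetric about mu:
# integral of sin(theta - mu) * f(theta | mu) dtheta = 0 for symmetric f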
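The following sketch applies the construction to a von Mises base and checks the normalization claim numerically. It also includes a reflection-based sampler: draw theta from the base, keep it with probability (1 + lambda * sin(theta - mu)) / 2, and otherwise reflect it about mu. All function names are hypothetical, the I_0 series is assumed adequate only for moderate kappa, and the sampler relies on the standard library's `random.vonmisesvariate` for base draws.

```python
import math
import random

def bessel_i0(x, terms=60):
    """I_0(x) from its power series (assumed adequate for moderate x)."""
    q = (x / 2.0) ** 2
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= q / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, mu, kappa):
    """Symmetric base density on the circle."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def sine_skewed_pdf(theta, mu, kappa, lam):
    """Base density times the skewing factor 1 + lam * sin(theta - mu)."""
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1] for a nonnegative density")
    return von_mises_pdf(theta, mu, kappa) * (1.0 + lam * math.sin(theta - mu))

def sample_sine_skewed(mu, kappa, lam, rng):
    """Draw from the base, then keep the draw or reflect it about mu."""
    theta = rng.vonmisesvariate(mu, kappa)
    if rng.random() < 0.5 * (1.0 + lam * math.sin(theta - mu)):
        return theta
    return (2.0 * mu - theta) % (2.0 * math.pi)

def integrate_circle(f, n=2000):
    """Equal-weight quadrature over one period [0, 2*pi)."""
    h = 2.0 * math.pi / n
    return h * sum(f(i * h) for i in range(n))

# Skewing should leave the total mass at about 1.
print(integrate_circle(lambda t: sine_skewed_pdf(t, mu=1.0, kappa=2.0, lam=0.8)))
rng = random.Random(0)
print([round(sample_sine_skewed(1.0, 2.0, 0.8, rng), 3) for _ in range(5)])
```

The reflection step works because the base density takes the same value at theta and at its mirror image 2*mu - theta, so the two branches together contribute f(theta) * (1 + lambda * sin(theta - mu)) at each theta.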
For the bivariate case, sine-skewing generalizes to:
# Bivariate sine-skewed distribution
p_skew(phi, psi) = f(phi, psi) * (1 + lambda_1 * sin(phi - mu_1)
+ lambda_2 * sin(psi - mu_2))
# with constraint: |lambda_1| + |lambda_2| <= 1 (keeps the skewing factor, and hence the density, nonnegative)
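As a rough check of the bivariate construction, the sketch below applies the skewing factor to the unnormalized sine-model kernel and compares total mass before and after skewing on a grid. Equal totals indicate that the normalizing constant of the base carries over unchanged. The helper names and the parameter values are illustrative assumptions only.

```python
import math

def skew_factor(phi, psi, mu1, mu2, lam1, lam2):
    """Bivariate sine-skewing factor; enforces |lam1| + |lam2| <= 1."""
    if abs(lam1) + abs(lam2) > 1.0:
        raise ValueError("need |lam1| + |lam2| <= 1 for a nonnegative density")
    return 1.0 + lam1 * math.sin(phi - mu1) + lam2 * math.sin(psi - mu2)

def sine_model_kernel(phi, psi, mu1, mu2, k1, k2, rho):
    """Unnormalized sine-model density, symmetric under joint reflection
    of both angles about (mu1, mu2)."""
    return math.exp(
        k1 * math.cos(phi - mu1)
        + k2 * math.cos(psi - mu2)
        + rho * math.sin(phi - mu1) * math.sin(psi - mu2)
    )

def grid_integral(f, n=120):
    """Equal-weight quadrature over the torus [0, 2*pi)^2."""
    h = 2.0 * math.pi / n
    return h * h * sum(f(i * h, j * h) for i in range(n) for j in range(n))

# Hypothetical parameter values chosen only for the demonstration.
mu1, mu2, k1, k2, rho = 0.5, -1.0, 2.0, 1.5, 0.8
lam1, lam2 = 0.6, -0.4  # satisfies |lam1| + |lam2| <= 1

base_mass = grid_integral(
    lambda a, b: sine_model_kernel(a, b, mu1, mu2, k1, k2, rho)
)
skew_mass = grid_integral(
    lambda a, b: sine_model_kernel(a, b, mu1, mu2, k1, k2, rho)
    * skew_factor(a, b, mu1, mu2, lam1, lam2)
)
print(base_mass, skew_mass)  # the two totals should agree closely
```

The totals match because the base kernel is unchanged when both angles are reflected about their means, while each sine term in the skewing factor flips sign under that reflection.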