Principle:Avhz RustQuant Descriptive Statistics

From Leeroopedia


Knowledge Sources
Domains: Statistics, Data_Analysis
Last Updated: 2026-02-07 21:00 GMT

Overview

Descriptive statistics provide summary measures for numerical data, including central tendency (mean), dispersion (variance, standard deviation), shape (skewness, kurtosis), and order statistics (median, percentiles, range).

Description

Descriptive statistics in RustQuant are implemented through the Statistic trait, which is generic over floating-point types and implemented for Vec<f64>. This trait provides a comprehensive set of statistical measures for analyzing numerical data.

Central tendency measures:

  • Arithmetic mean: the sum of values divided by the count
  • Geometric mean: the nth root of the product of n values, useful for growth rates
  • Harmonic mean: the reciprocal of the arithmetic mean of reciprocals, useful for averaging rates
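The three means above can be sketched in plain Rust as follows. This is a minimal illustration, not RustQuant's implementation: the function names are hypothetical, the functions take slices rather than going through the Statistic trait, and the geometric mean is computed through logarithms (an assumption made here to avoid overflow in the raw product).

```rust
/// Arithmetic mean: the sum of values divided by the count.
fn arithmetic_mean(xs: &[f64]) -> f64 {
    assert!(!xs.is_empty(), "mean of an empty slice is undefined");
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Geometric mean: the nth root of the product of n values.
/// Computed as exp(mean(ln x)), which is algebraically the same
/// but does not overflow for long inputs. Requires positive values.
fn geometric_mean(xs: &[f64]) -> f64 {
    assert!(!xs.is_empty(), "mean of an empty slice is undefined");
    assert!(xs.iter().all(|&x| x > 0.0), "values must be positive");
    let log_sum: f64 = xs.iter().map(|x| x.ln()).sum();
    (log_sum / xs.len() as f64).exp()
}

/// Harmonic mean: n divided by the sum of reciprocals.
fn harmonic_mean(xs: &[f64]) -> f64 {
    assert!(!xs.is_empty(), "mean of an empty slice is undefined");
    assert!(xs.iter().all(|&x| x != 0.0), "values must be non-zero");
    let reciprocal_sum: f64 = xs.iter().map(|x| 1.0 / x).sum();
    xs.len() as f64 / reciprocal_sum
}
```

For positive data that are not all equal, the three results are ordered harmonic < geometric < arithmetic, which makes a quick sanity check when comparing implementations.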

Dispersion measures:

  • Sample variance: uses Bessel's correction (divides by n-1) for unbiased estimation
  • Population variance: divides by n, appropriate when the data represent the entire population rather than a sample
  • Sample standard deviation: square root of sample variance
  • Population standard deviation: square root of population variance
  • Covariance: measures the joint variability of two vectors
  • Correlation: normalized covariance (Pearson correlation coefficient)
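The dispersion measures listed above differ mainly in their divisor and in whether one or two vectors are involved. The sketch below shows this with standalone functions; the names are hypothetical and the code is an assumption about how such measures are typically written, not a copy of RustQuant's source.

```rust
fn mean(xs: &[f64]) -> f64 {
    assert!(!xs.is_empty(), "empty input");
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Sum of squared deviations from the mean, shared by both variances.
fn sum_sq_dev(xs: &[f64]) -> f64 {
    let m = mean(xs);
    xs.iter().map(|x| (x - m).powi(2)).sum()
}

/// Sample variance: Bessel's correction, divides by n - 1.
fn sample_variance(xs: &[f64]) -> f64 {
    assert!(xs.len() >= 2, "need at least two elements");
    sum_sq_dev(xs) / (xs.len() as f64 - 1.0)
}

/// Population variance: divides by n.
fn population_variance(xs: &[f64]) -> f64 {
    sum_sq_dev(xs) / xs.len() as f64
}

/// Sample covariance of two equally long vectors (divides by n - 1).
fn sample_covariance(xs: &[f64], ys: &[f64]) -> f64 {
    assert_eq!(xs.len(), ys.len(), "vectors must have equal length");
    assert!(xs.len() >= 2, "need at least two elements");
    let (mx, my) = (mean(xs), mean(ys));
    let cross: f64 = xs.iter().zip(ys).map(|(x, y)| (x - mx) * (y - my)).sum();
    cross / (xs.len() as f64 - 1.0)
}

/// Pearson correlation: covariance divided by the product of the
/// two sample standard deviations.
fn correlation(xs: &[f64], ys: &[f64]) -> f64 {
    let sx = sample_variance(xs).sqrt();
    let sy = sample_variance(ys).sqrt();
    sample_covariance(xs, ys) / (sx * sy)
}
```

Because the correlation divides by both standard deviations, it is undefined (NaN in this sketch) when either vector is constant.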

Shape measures:

  • Skewness: measures asymmetry of the distribution, using the adjusted Fisher-Pearson formula with the n/((n-1)(n-2)) correction factor
  • Kurtosis: measures tail heaviness, computed as the excess kurtosis (subtracting 3) using the bias-corrected formula
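As a rough illustration of the two shape measures, the sketch below standardizes each observation with the sample standard deviation and applies the correction factors described above. The function names are hypothetical and the minimum-length checks (three elements for skewness, four for kurtosis) are assumptions that follow from the denominators, not documented RustQuant behaviour.

```rust
/// Returns the mean and the sample standard deviation (n - 1 divisor).
fn mean_and_sample_std(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let ss: f64 = xs.iter().map(|x| (x - mean).powi(2)).sum();
    (mean, (ss / (n - 1.0)).sqrt())
}

/// Adjusted skewness: n / ((n-1)(n-2)) times the sum of cubed z-scores.
/// Yields NaN for constant data, where the standard deviation is zero.
fn skewness(xs: &[f64]) -> f64 {
    assert!(xs.len() >= 3, "skewness needs at least three elements");
    let n = xs.len() as f64;
    let (mean, s) = mean_and_sample_std(xs);
    let sum_cubed: f64 = xs.iter().map(|x| ((x - mean) / s).powi(3)).sum();
    n / ((n - 1.0) * (n - 2.0)) * sum_cubed
}

/// Bias-corrected excess kurtosis: scaled sum of fourth-power z-scores
/// minus the correction term 3(n-1)^2 / ((n-2)(n-3)).
fn excess_kurtosis(xs: &[f64]) -> f64 {
    assert!(xs.len() >= 4, "kurtosis needs at least four elements");
    let n = xs.len() as f64;
    let (mean, s) = mean_and_sample_std(xs);
    let sum_fourth: f64 = xs.iter().map(|x| ((x - mean) / s).powi(4)).sum();
    let scale = n * (n + 1.0) / ((n - 1.0) * (n - 2.0) * (n - 3.0));
    let correction = 3.0 * (n - 1.0).powi(2) / ((n - 2.0) * (n - 3.0));
    scale * sum_fourth - correction
}
```

A positive skewness indicates a longer right tail; a negative excess kurtosis indicates lighter tails than a normal distribution.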

Order statistics and quantiles:

  • Minimum and Maximum: extreme values
  • Median: the middle value (or average of two middle values for even-length vectors)
  • Percentile and Quantile: values at a given proportion through the sorted distribution, using linear interpolation between adjacent values
  • Interquartile range: difference between the 75th and 25th percentiles
  • Range: difference between maximum and minimum
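Linear interpolation between adjacent sorted values can follow several conventions. The sketch below assumes the position q·(n-1) in the sorted data (the same convention as NumPy's default), which may differ from the one RustQuant uses. All function names are hypothetical.

```rust
/// Returns a sorted copy of the input; total_cmp gives a total order on f64.
fn sorted(xs: &[f64]) -> Vec<f64> {
    let mut v = xs.to_vec();
    v.sort_by(|a, b| a.total_cmp(b));
    v
}

/// Quantile for q in [0, 1], interpolating linearly between the two
/// sorted values that bracket the fractional position q * (n - 1).
fn quantile(xs: &[f64], q: f64) -> f64 {
    assert!(!xs.is_empty(), "empty input");
    assert!((0.0..=1.0).contains(&q), "q must lie in [0, 1]");
    let v = sorted(xs);
    let pos = q * (v.len() - 1) as f64;
    let lo = pos.floor() as usize;
    let hi = pos.ceil() as usize;
    let frac = pos - lo as f64;
    v[lo] + frac * (v[hi] - v[lo])
}

/// Median: the 0.5 quantile. For even lengths this averages the two
/// middle values.
fn median(xs: &[f64]) -> f64 {
    quantile(xs, 0.5)
}

/// Interquartile range: 75th percentile minus 25th percentile.
fn interquartile_range(xs: &[f64]) -> f64 {
    quantile(xs, 0.75) - quantile(xs, 0.25)
}

/// Range: maximum minus minimum.
fn range(xs: &[f64]) -> f64 {
    let v = sorted(xs);
    v[v.len() - 1] - v[0]
}
```

Expressing the median as the 0.5 quantile keeps the even-length and odd-length cases in one code path: for four values the position is 1.5, halfway between the two middle elements.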

The methods validate their input: most operations panic on an empty vector, and standard deviation requires at least two elements.
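The panic-on-invalid-input behaviour can be pictured with a small sketch. Both functions below are hypothetical: the first mirrors the panicking style described above, and the second is one possible wrapper for callers who prefer to handle short inputs without a panic. Neither is taken from RustQuant.

```rust
/// Panicking style: an assertion guards the precondition.
fn sample_std_dev(xs: &[f64]) -> f64 {
    assert!(
        xs.len() >= 2,
        "sample standard deviation requires at least two elements"
    );
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let ss: f64 = xs.iter().map(|x| (x - mean).powi(2)).sum();
    (ss / (n - 1.0)).sqrt()
}

/// Non-panicking wrapper: checks the length first and returns None
/// instead of reaching the assertion.
fn try_sample_std_dev(xs: &[f64]) -> Option<f64> {
    if xs.len() < 2 {
        None
    } else {
        Some(sample_std_dev(xs))
    }
}
```

When data come from an external source, checking the length before calling a panicking statistic (as the wrapper does) keeps a short or empty series from aborting the whole computation.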

Usage

Use descriptive statistics when summarizing return distributions, computing risk metrics, performing data quality checks, or preprocessing data for further analysis. In quantitative finance, these measures are essential for portfolio analysis, risk assessment, and model calibration.

Theoretical Basis

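Arithmetic mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Geometric mean: $\bar{x}_g = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$

Harmonic mean: $\bar{x}_h = \frac{n}{\sum_{i=1}^{n} 1/x_i}$

Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

Covariance: $\mathrm{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

Pearson correlation: $r = \frac{\mathrm{Cov}(X, Y)}{s_X s_Y}$

Skewness (adjusted): $g_1 = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^3$

Excess kurtosis (adjusted): $g_2 = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$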
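As a worked illustration of these formulas, take the small data set x = (1, 2, 3, 4, 5), chosen here purely for convenience, with n = 5:

```latex
\begin{align*}
\bar{x} &= \tfrac{1}{5}(1 + 2 + 3 + 4 + 5) = 3 \\
s^2 &= \tfrac{1}{4}\left(4 + 1 + 0 + 1 + 4\right) = 2.5 \\
g_1 &= \frac{5}{4 \cdot 3} \sum_{i=1}^{5} \left(\frac{x_i - 3}{s}\right)^3 = 0
      \quad \text{(the cubed deviations cancel by symmetry)} \\
g_2 &= \frac{5 \cdot 6}{4 \cdot 3 \cdot 2} \cdot \frac{16 + 1 + 0 + 1 + 16}{2.5^2}
       - \frac{3 \cdot 4^2}{3 \cdot 2}
     = 1.25 \cdot 5.44 - 8 = -1.2
\end{align*}
```

The zero skewness reflects the symmetry of the data, and the negative excess kurtosis reflects tails that are lighter than those of a normal distribution.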

Related Pages

Implemented By
