Implementation:Online ml River Rust Stats
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Statistics, Rust, Performance_Optimization |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Rust-powered statistical estimators providing high-performance alternatives to Python implementations.
Description
Implements performance-critical statistical functions in Rust and exposes them to Python through PyO3 bindings. The estimators cover quantiles, exponentially weighted mean and variance, interquartile range (IQR), kurtosis, peak-to-peak, and skewness, plus rolling-window variants of the quantile and IQR. The statistical algorithms come from the watermill library, and bincode handles serialization. The aim is a significant speedup over pure Python implementations for compute-intensive statistics.
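To make concrete what an exponentially weighted estimator computes, here is a minimal pure-Python sketch. It is not the watermill or River code: the class names `EWMeanSketch` and `EWVarSketch` are hypothetical, and two choices are assumptions made for illustration, namely seeding the mean with the first observation and deriving the variance as the weighted mean of squares minus the squared weighted mean.

```python
class EWMeanSketch:
    """Exponentially weighted mean: mean <- alpha * x + (1 - alpha) * mean."""

    def __init__(self, alpha: float) -> None:
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.mean = 0.0
        self._seen = False  # assumption: the first value seeds the mean

    def update(self, x: float) -> None:
        if not self._seen:
            self.mean = float(x)
            self._seen = True
        else:
            self.mean = self.alpha * x + (1.0 - self.alpha) * self.mean

    def get(self) -> float:
        return self.mean


class EWVarSketch:
    """Exponentially weighted variance as E_w[x^2] - E_w[x]^2 (assumed form)."""

    def __init__(self, alpha: float) -> None:
        self._mean = EWMeanSketch(alpha)
        self._sq_mean = EWMeanSketch(alpha)

    def update(self, x: float) -> None:
        self._mean.update(x)
        self._sq_mean.update(x * x)

    def get(self) -> float:
        return self._sq_mean.get() - self._mean.get() ** 2


ew_mean = EWMeanSketch(alpha=0.5)
ew_var = EWVarSketch(alpha=0.5)
for value in [1.0, 2.0, 3.0]:
    ew_mean.update(value)
    ew_var.update(value)
print(ew_mean.get(), ew_var.get())
```

Each update is O(1) in time and memory, which is why this family of statistics suits streaming data; the Rust version performs the same kind of constant-time update without Python interpreter overhead in the inner arithmetic.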
Usage
No explicit opt-in is needed: importing River's stats module is enough, and River selects the Rust implementations when they are available. The Rs* classes can be used directly, but this is typically unnecessary because River's API handles the dispatch transparently. The Rust backend is particularly beneficial in high-frequency streaming scenarios.
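If it is useful to confirm that the compiled extension is present in a given environment, a check along the following lines may help. This is a sketch, not part of River's API: the helper name `extension_available` is hypothetical, and the module path `river.stats._rust_stats` is taken from the `#[pyclass(module = ...)]` attributes listed under Signature.

```python
import importlib.util


def extension_available(name: str) -> bool:
    """Return True if the named module can be located, False otherwise."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (for example `river`) is not installed.
        return False


# Module path assumed from the pyclass attributes in the Signature section.
print(extension_available("river.stats._rust_stats"))
```

Note that `find_spec` imports the parent packages of a dotted name, so running this check has the side effect of importing `river.stats` when River is installed.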
Code Reference
Source Location
- Repository: Online_ml_River
- File: rust_src/lib.rs
Signature
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsQuantile { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsEWMean { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsEWVar { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsIQR { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsKurtosis { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsPeakToPeak { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsSkew { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsRollingQuantile { ... }
#[pyclass(module = "river.stats._rust_stats")]
pub struct RsRollingIQR { ... }
Import
# Typically imported automatically through River
from river import stats
# Rust implementation used transparently
quantile = stats.Quantile(0.5)
I/O Contract
| Struct | Methods | Description |
|---|---|---|
| RsQuantile | new(q), update(x), get() | Quantile estimation |
| RsEWMean | new(alpha), update(x), get() | Exponentially weighted mean |
| RsEWVar | new(alpha), update(x), get() | Exponentially weighted variance |
| RsIQR | new(q_inf, q_sup), update(x), get() | Interquartile range |
| RsKurtosis | new(bias), update(x), get() | Kurtosis (tailedness measure) |
| RsPeakToPeak | new(), update(x), get() | Range (max - min) |
| RsSkew | new(bias), update(x), get() | Skewness (asymmetry measure) |
| RsRollingQuantile | new(q, window_size), update(x), get() | Rolling quantile |
| RsRollingIQR | new(q_inf, q_sup, window_size), update(x), get() | Rolling IQR |
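Every struct in the table follows the same construct, `update(x)`, `get()` contract. The sketch below expresses that contract as a Python `Protocol` and implements the simplest entry, peak-to-peak, in pure Python for illustration. The names `RunningStat`, `PeakToPeakSketch`, and `feed` are hypothetical, and returning `0.0` before any update is an assumption, not documented behavior of `RsPeakToPeak`.

```python
import math
from typing import Iterable, Protocol


class RunningStat(Protocol):
    """The shared shape of the estimators: feed values, then read the result."""

    def update(self, x: float) -> None: ...

    def get(self) -> float: ...


class PeakToPeakSketch:
    """Running range (max - min) in O(1) memory."""

    def __init__(self) -> None:
        self._min = math.inf
        self._max = -math.inf

    def update(self, x: float) -> None:
        self._min = min(self._min, x)
        self._max = max(self._max, x)

    def get(self) -> float:
        if self._min > self._max:
            return 0.0  # assumption: no data seen yet
        return self._max - self._min


def feed(stat: RunningStat, values: Iterable[float]) -> float:
    """Push every value through the estimator and return its final state."""
    for value in values:
        stat.update(value)
    return stat.get()


print(feed(PeakToPeakSketch(), [3.0, -1.0, 7.0]))
```

Because the contract is uniform, a helper such as `feed` works with any estimator that exposes `update` and `get`, regardless of whether it is backed by Rust or Python.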
Usage Examples
import time
import numpy as np
from river import stats
# The Rust implementation is used automatically
quantile = stats.Quantile(0.95)
# Time 100k updates (a single measurement, not a comparison against pure Python)
data = np.random.normal(0, 1, 100000)
start = time.perf_counter()
for x in data:
    quantile.update(x)
elapsed = time.perf_counter() - start
print(f"Processed 100k samples in {elapsed:.3f}s")
print(f"95th percentile: {quantile.get():.4f}")
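# Exponentially weighted statistics
# The weight is passed positionally because the keyword name differs across
# River versions (alpha in older releases, fading_factor in newer ones)
ew_mean = stats.EWMean(0.1)
ew_var = stats.EWVar(0.1)
for x in [1, 2, 3, 4, 5]:
    ew_mean.update(x)
    ew_var.update(x)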
print(f"EW Mean: {ew_mean.get():.2f}")
print(f"EW Variance: {ew_var.get():.2f}")
# Rolling statistics
rolling_q = stats.RollingQuantile(q=0.5, window_size=10)
for i in range(20):
    rolling_q.update(i)
    # Report only once the window holds 10 values
    if i >= 9:
        print(f"Rolling median at step {i}: {rolling_q.get():.1f}")
# Distribution shape measures
kurtosis = stats.Kurtosis()
skew = stats.Skew()
for x in np.random.normal(0, 1, 1000):
    kurtosis.update(x)
    skew.update(x)
print(f"Kurtosis: {kurtosis.get():.4f}")
print(f"Skewness: {skew.get():.4f}")
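To sanity-check the rolling median example above without the Rust backend, a pure-Python reference can be built from the standard library. This is a sketch under stated assumptions: the function name `rolling_medians` is hypothetical, and `statistics.median` averages the two middle values of an even-sized window, which may not match the interpolation convention used by `stats.RollingQuantile`, so small differences between the two are possible.

```python
from collections import deque
from statistics import median


def rolling_medians(values, window_size):
    """Return the median of each full window of `window_size` values."""
    window = deque(maxlen=window_size)  # oldest value drops out automatically
    results = []
    for value in values:
        window.append(value)
        if len(window) == window_size:
            results.append(median(window))
    return results


reference = rolling_medians(range(20), window_size=10)
for step, value in enumerate(reference, start=9):
    print(f"Reference median at step {step}: {value:.1f}")
```

This reference recomputes the median from scratch for each window, which is fine for verification but too slow for high-frequency streams; it is meant only as a cross-check.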