Implementation:Online ml River Stats Skew

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Statistics
Last Updated	2026-02-08 16:00 GMT

Overview

Skew computes the running skewness of a data stream using Welford's algorithm.

Description

This statistic measures the asymmetry of a probability distribution around its mean. It maintains the third standardized moment incrementally, updating the estimate with each observation instead of storing the stream. Positive skewness indicates a longer right tail (most values concentrated on the left), while negative skewness indicates a longer left tail (most values concentrated on the right). The computation is delegated to the Rust-backed RsSkew class for performance, and both biased and unbiased estimators are supported.
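
The incremental computation described above can be illustrated with a minimal pure-Python sketch. It is not River's Rust implementation: the class name RunningSkew and its attributes are hypothetical, the update uses the standard one-pass formulas for central moments, and the bias correction is assumed to be the adjusted Fisher-Pearson factor.

```python
import math


class RunningSkew:
    """Hypothetical sketch of a one-pass skewness estimator (not River's code)."""

    def __init__(self, bias=False):
        self.bias = bias
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.m3 = 0.0  # running sum of cubed deviations from the mean

    def update(self, x):
        n_prev = self.n
        self.n += 1
        delta = x - self.mean
        delta_n = delta / self.n
        term = delta * delta_n * n_prev
        self.mean += delta_n
        # m3 is updated before m2 because its update needs the previous m2.
        self.m3 += term * delta_n * (self.n - 2) - 3 * delta_n * self.m2
        self.m2 += term

    def get(self):
        if self.m2 == 0:
            return 0.0  # undefined for constant or empty data; report 0
        g1 = math.sqrt(self.n) * self.m3 / self.m2 ** 1.5
        if not self.bias and self.n > 2:
            # Assumed correction: adjusted Fisher-Pearson coefficient.
            return math.sqrt(self.n * (self.n - 1)) / (self.n - 2) * g1
        return g1


# Usage: feed values one at a time and read the estimate at any point.
skew = RunningSkew(bias=True)
for value in [1, 2, 2, 3, 3, 3, 4, 4, 5, 10, 15]:
    skew.update(value)
print(f"Running skew: {skew.get():.4f}")
```

Only four numbers (count, mean, and two moment sums) are kept between updates, which is what makes this kind of estimator suitable for unbounded streams.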

Usage

Use Skew when you need to understand the asymmetry of streaming data distributions. Common applications include detecting data distribution shifts, identifying non-normal distributions, financial risk analysis (return distributions), quality control, and feature engineering where distribution shape is informative. A skewness near zero suggests a roughly symmetric distribution, while a large positive or negative value indicates that extreme values lie mostly on one side of the mean.

Code Reference

Source Location

Repository: Online_ml_River
File: river/stats/skew.py

Signature

class Skew(stats.base.Univariate):
    def __init__(self, bias=False):
        super().__init__()
        self.bias = bias
        self._skew = _rust_stats.RsSkew(bias)

Import

from river import stats

I/O Contract

Inputs

Name	Type	Required	Description
x	numbers.Number	Yes	Value to update the statistic with
bias	bool	No (init)	If False, the estimate is corrected for statistical bias (default: False)
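
To make the effect of the bias flag concrete, the two estimators can be written out as below. This is a sketch under an assumption: that the correction follows the common adjusted Fisher-Pearson convention (the one used by scipy.stats.skew), which the source does not state explicitly. The symbols m_k, g_1 and G_1 are introduced here for illustration.

```latex
% m_k: k-th sample central moment of the n observations seen so far
m_k = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^k

% bias=True: plain third standardized moment
g_1 = \frac{m_3}{m_2^{3/2}}

% bias=False (assumed convention, defined for n > 2)
G_1 = \frac{\sqrt{n(n-1)}}{n-2} \, g_1
```

Under this convention the correction factor is greater than 1 and approaches 1 as n grows, so the two estimators differ noticeably only on short streams.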

Outputs

Name	Type	Description
get()	float	Current skewness estimate (close to 0 for symmetric data)

Usage Examples

from river import stats
import numpy as np

# Unbiased skewness
np.random.seed(42)
X = np.random.normal(loc=0, scale=1, size=10)

skew = stats.Skew(bias=False)
for x in X:
    skew.update(x)
    print(f"Skew: {skew.get():.4f}")

# Output (running skewness after each update; the "Skew: " prefix is omitted):
# 0.0000
# 0.0000
# -1.4802
# 0.5127
# 0.7803
# 1.0561
# 0.5058
# 0.3478
# 0.4537
# 0.4123

# Biased skewness
skew_biased = stats.Skew(bias=True)
for x in X:
    skew_biased.update(x)
    print(f"Biased Skew: {skew_biased.get():.4f}")

# Detecting right-skewed distribution
right_skew = stats.Skew()
# Data concentrated on left, long right tail
for x in [1, 2, 2, 3, 3, 3, 4, 4, 5, 10, 15]:
    right_skew.update(x)

print(f"Right-skewed data: {right_skew.get():.4f}")
# Positive value indicates right skew

# Detecting left-skewed distribution
left_skew = stats.Skew()
# Data concentrated on right, long left tail
for x in [1, 5, 10, 10, 11, 11, 11, 12, 12, 13]:
    left_skew.update(x)

print(f"Left-skewed data: {left_skew.get():.4f}")
# Negative value indicates left skew

# Symmetric distribution (normal-like)
symmetric_skew = stats.Skew()
for x in np.random.normal(0, 1, 1000):
    symmetric_skew.update(x)

print(f"Symmetric distribution skew: {symmetric_skew.get():.4f}")
# Close to 0 for symmetric distribution

# Comparing skewness of different features
feature_a_skew = stats.Skew()
feature_b_skew = stats.Skew()

# Feature A: evenly spaced values 0..99 (symmetric, so skew is approximately 0)
for x in range(100):
    feature_a_skew.update(x)

# Feature B: exponentially distributed
for x in np.random.exponential(2, 100):
    feature_b_skew.update(x)

print(f"Uniform distribution skew: {feature_a_skew.get():.4f}")
print(f"Exponential distribution skew: {feature_b_skew.get():.4f}")

Related Pages

Environment:Online_ml_River_Python_Runtime_Environment
