Implementation:Online ml River Stats Var

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Statistics
Last Updated 2026-02-08 16:00 GMT

Overview

Var computes the running variance of a data stream using Welford's algorithm.

Description

This statistic computes variance, the spread of data around its mean, incrementally as observations arrive. Welford's method updates a running mean and a running sum of squared deviations, which avoids the cancellation error of the naive sum-of-squares formula. The implementation supports weighted observations, includes a revert method so it can be used in rolling windows, and provides batch update capabilities. The ddof parameter sets the delta degrees of freedom: ddof=1 (the default) gives the unbiased sample variance, and ddof=0 gives the population variance.
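The incremental update and its inverse can be sketched in plain Python. This is a minimal illustration of Welford's method under stated assumptions, not River's source: the class name WelfordVar and the attributes n, mean and S are hypothetical, and the handling of an emptied window (resetting to zero) is a choice made for this sketch.

```python
class WelfordVar:
    """Hypothetical stand-in for a running variance (not River's code)."""

    def __init__(self, ddof=1):
        self.ddof = ddof
        self.n = 0.0     # total weight seen so far
        self.mean = 0.0  # running weighted mean
        self.S = 0.0     # running weighted sum of squared deviations

    def update(self, x, w=1.0):
        mean_old = self.mean
        self.n += w
        self.mean += (w / self.n) * (x - mean_old)
        # Welford's identity uses both the old and the new mean.
        self.S += w * (x - mean_old) * (x - self.mean)

    def revert(self, x, w=1.0):
        """Remove a previously added observation (rolling windows)."""
        mean_old = self.mean
        self.n -= w
        if self.n <= 0:
            # Assumption of this sketch: an emptied window resets the state.
            self.n = self.mean = self.S = 0.0
            return
        self.mean -= (w / self.n) * (x - mean_old)
        self.S -= w * (x - mean_old) * (x - self.mean)

    def get(self):
        denom = self.n - self.ddof
        return self.S / denom if denom > 0 else 0.0


var = WelfordVar()
for x in [3, 5, 4, 7, 10, 12]:
    var.update(x)
print(round(var.get(), 6))  # 12.566667
```

Because revert is the exact algebraic inverse of update, a fixed-size window can be maintained by calling update on the incoming value and revert on the value that falls out of the window.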

Usage

Use Var to measure the variability of streaming data. Common applications include monitoring data quality, detecting anomalies, statistical process control, understanding feature distributions, and normalization (standardization requires the mean and standard deviation). It also serves as a building block for derived statistics such as standard deviation and standard error.
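As an illustration of the anomaly-detection and standardization use cases, the sketch below scores each value against the mean and standard deviation of the values seen before it. It is written in plain Python so that it stands alone; the function name zscores, the threshold of 3.0 and the warm-up of 5 observations are assumptions chosen for this example, not recommendations from the library.

```python
import math


def zscores(stream, threshold=3.0, warmup=5):
    """Yield (x, z, is_anomaly) using statistics of earlier values only."""
    n, mean, S = 0, 0.0, 0.0
    for x in stream:
        # Score only once enough history exists to trust the estimate.
        if n >= warmup and S > 0:
            std = math.sqrt(S / (n - 1))  # sample standard deviation
            z = (x - mean) / std
        else:
            z = 0.0
        yield x, z, abs(z) > threshold
        # Welford update with the new observation.
        n += 1
        delta = x - mean
        mean += delta / n
        S += delta * (x - mean)


readings = [10.1, 10.2, 9.9, 10.0, 10.1, 15.0]
flags = [is_anomaly for _, _, is_anomaly in zscores(readings)]
print(flags)  # [False, False, False, False, False, True]
```

Scoring before updating keeps the anomalous value from inflating the variance it is judged against. Without the warm-up, the first few scores are computed from very few points and tend to raise false alarms.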

Code Reference

Source Location

Signature

class Var(stats.base.Univariate):
    def __init__(self, ddof=1) -> None:
        self.ddof = ddof
        self.mean = stats.Mean()
        self._S = 0

Import

from river import stats

I/O Contract

Inputs

Name | Type | Required | Description
x | numbers.Number | Yes | Value to update the statistic with
w | float | No | Weight for the observation (default: 1.0)
ddof | int | No (init) | Delta Degrees of Freedom, set in the constructor (default: 1 for sample variance)

Outputs

Name | Type | Description
get() | float | Current variance (0.0 if n <= ddof)
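Read together with the constructor, the contract suggests the following estimator. The symbols W_n (total weight), \bar{x}_n (weighted mean) and S_n (weighted sum of squared deviations) are notation introduced here for illustration; this is a reading of the documented behaviour rather than a quotation from the source.

```latex
W_n = \sum_{i=1}^{n} w_i, \qquad
\bar{x}_n = \frac{1}{W_n} \sum_{i=1}^{n} w_i x_i, \qquad
S_n = \sum_{i=1}^{n} w_i \,(x_i - \bar{x}_n)^2, \qquad
\widehat{\sigma}^2_n = \frac{S_n}{W_n - \mathrm{ddof}}
```

With unit weights, W_n equals the number of observations n, so ddof=1 divides by n - 1 and ddof=0 divides by n.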

Usage Examples

from river import stats

# Basic running variance
X = [3, 5, 4, 7, 10, 12]
var = stats.Var()

for x in X:
    var.update(x)
    print(f"Value: {x}, Variance: {var.get():.6f}")

# Output:
# Value: 3, Variance: 0.000000
# Value: 5, Variance: 2.000000
# Value: 4, Variance: 1.000000
# Value: 7, Variance: 2.916667
# Value: 10, Variance: 7.700000
# Value: 12, Variance: 12.566667

# Rolling variance
from river import utils

X = [1, 4, 2, -4, -8, 0]
rvar = utils.Rolling(stats.Var(ddof=1), window_size=3)

for x in X:
    rvar.update(x)
    print(f"Value: {x}, Rolling Var: {rvar.get():.6f}")

# Output:
# Value: 1, Rolling Var: 0.000000
# Value: 4, Rolling Var: 4.500000
# Value: 2, Rolling Var: 2.333333
# Value: -4, Rolling Var: 17.333333
# Value: -8, Rolling Var: 25.333333
# Value: 0, Rolling Var: 16.000000

# Computing standard deviation from variance
import math

variance = stats.Var()
data = [2, 4, 4, 4, 5, 5, 7, 9]

for x in data:
    variance.update(x)

print(f"Variance: {variance.get():.4f}")
print(f"Std Dev: {math.sqrt(variance.get()):.4f}")

# Monitoring data variability
quality_var = stats.Var()

# Normal operation
for x in [10.1, 10.2, 9.9, 10.0, 10.1]:
    quality_var.update(x)

print(f"Normal operation variance: {quality_var.get():.4f}")

# With anomaly
quality_var.update(15.0)
print(f"With anomaly variance: {quality_var.get():.4f}")

# Weighted variance
weighted_var = stats.Var()
weighted_var.update(10, w=1.0)
weighted_var.update(20, w=2.0)
weighted_var.update(30, w=3.0)
print(f"Weighted variance: {weighted_var.get():.4f}")

# Population vs sample variance
pop_var = stats.Var(ddof=0)  # Population variance
sample_var = stats.Var(ddof=1)  # Sample variance

data = [1, 2, 3, 4, 5]
for x in data:
    pop_var.update(x)
    sample_var.update(x)

print(f"Population variance: {pop_var.get():.4f}")
print(f"Sample variance: {sample_var.get():.4f}")
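The unweighted examples can be cross-checked against Python's standard library, which is handy when validating a streaming pipeline against a batch computation. The sketch below uses only the statistics and math modules. For [1, 2, 3, 4, 5] the mean is 3 and the sum of squared deviations is 10, so the population variance is 10/5 and the sample variance is 10/4. For the standard deviation example the sum of squared deviations is 32, so the sample variance is 32/7.

```python
import math
import statistics

# Population vs sample variance for [1, 2, 3, 4, 5].
data = [1, 2, 3, 4, 5]
pop = statistics.pvariance(data)    # divides by n      (ddof=0)
sample = statistics.variance(data)  # divides by n - 1  (ddof=1)

# Standard deviation example: sample variance, then its square root.
data2 = [2, 4, 4, 4, 5, 5, 7, 9]
var2 = statistics.variance(data2)
std2 = math.sqrt(var2)

print(f"{pop:.4f} {sample:.4f} {var2:.4f} {std2:.4f}")
# 2.0000 2.5000 4.5714 2.1381
```

A streaming estimate with the matching ddof should agree with these batch values up to floating-point rounding.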
