Implementation:Online ml River Stats MAD


Knowledge Sources
Domains Online_Learning, Statistics
Last Updated 2026-02-08 16:00 GMT

Overview

MAD computes the Median Absolute Deviation of a data stream incrementally.

Description

This statistic is the median of the absolute differences between each data point and the overall median of the data. It is a robust measure of statistical dispersion that is less sensitive to outliers than the standard deviation. In a stream the overall median is not known in advance, so the implementation updates both the median of the data and the median of the absolute deviations online; the result therefore approximates the batch MAD rather than matching it exactly.
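
For reference, the batch definition described above can be written in a few lines of standard-library Python. This is a minimal sketch for comparison only; `batch_mad` is a hypothetical helper name and is not part of River.

```python
from statistics import median


def batch_mad(values):
    """Exact (batch) MAD: the median of |x - median(values)|."""
    center = median(values)
    return median(abs(x - center) for x in values)


X = [4, 2, 5, 3, 0, 4]
print(batch_mad(X))  # 1.0
```

Here the overall median is 3.5, the absolute deviations are 0.5, 1.5, 1.5, 0.5, 3.5 and 0.5, and their median is 1.0. An online estimator sees the data one value at a time, so its intermediate values need not agree with this batch figure.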

Usage

Use MAD when you need a robust measure of variability that is resistant to outliers. This is particularly useful in anomaly detection, outlier identification, and data quality assessment where extreme values should not overly influence the measure of spread. MAD is often preferred over standard deviation when working with skewed or heavy-tailed distributions.
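
As an illustration of the anomaly-detection use case, the sketch below flags outliers with the commonly used modified z-score, 0.6745 * |x - median| / MAD. It is a batch calculation with the standard library rather than River; the helper name `mad_outliers` and the cutoff of 3.5 are assumptions chosen for the example and should be tuned for real data.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`."""
    center = median(values)
    mad = median(abs(x - center) for x in values)
    if mad == 0:
        # No spread to scale by; this sketch simply flags nothing.
        return []
    # 0.6745 rescales MAD so the score is comparable to a z-score
    # when the data are roughly normal.
    return [x for x in values if 0.6745 * abs(x - center) / mad > threshold]


print(mad_outliers([1, 2, 3, 4, 5, 100]))  # [100]
```

Because both the center and the scale are medians, the single extreme value barely moves them, which is why it stands out so clearly.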

Code Reference

Source Location

Signature

class MAD(quantile.Quantile):
    def __init__(self):
        super().__init__(q=0.5)
        self.median = quantile.Quantile(q=0.5)
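
The signature shows two median trackers: the object itself and the inner `median` attribute. The sketch below illustrates how such a two-median scheme could work. It is not River's code: `TwoMedianMAD` is a hypothetical class, it stores every value to compute exact running medians instead of using a quantile estimator, and the update order (data median first, then the deviation) is an assumption.

```python
from bisect import insort
from statistics import median


class TwoMedianMAD:
    """Hypothetical illustration of an online MAD; not River's implementation."""

    def __init__(self):
        self._values = []      # sorted raw observations
        self._deviations = []  # sorted absolute deviations

    def update(self, x):
        x = float(x)
        # Assumption: refresh the data median first, then measure
        # the new point's deviation against that updated median.
        insort(self._values, x)
        center = median(self._values)
        insort(self._deviations, abs(x - center))

    def get(self):
        return median(self._deviations) if self._deviations else 0.0


mad = TwoMedianMAD()
history = []
for x in [4, 2, 5, 3, 0, 4]:
    mad.update(x)
    history.append(mad.get())

print(history)  # [0.0, 0.5, 1.0, 0.75, 1.0, 0.75]
```

Each deviation is measured against the median known at the time the point arrived and is never revised, so the final value (0.75) differs from the batch MAD of the same data (1.0). River's own estimator can produce different intermediate values again, since it tracks the medians with `quantile.Quantile` rather than storing the data.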

Import

from river import stats

I/O Contract

Inputs

Name  Type            Required  Description
x     numbers.Number  Yes       Value to update the statistic with

Outputs

Name   Type   Description
get()  float  Current median absolute deviation

Usage Examples

from river import stats

# Create MAD statistic
mad = stats.MAD()

X = [4, 2, 5, 3, 0, 4]

for x in X:
    mad.update(x)
    print(f"Value: {x}, MAD: {mad.get()}")

# Output:
# Value: 4, MAD: 0.0
# Value: 2, MAD: 2.0
# Value: 5, MAD: 1.0
# Value: 3, MAD: 1.0
# Value: 0, MAD: 1.0
# Value: 4, MAD: 1.0

# Comparing MAD with the standard deviation for robustness
import math

from river import stats

# Dataset with one extreme outlier
data_with_outlier = [1, 2, 3, 4, 5, 100]

mad_robust = stats.MAD()
var = stats.Var()  # running variance; its square root is the standard deviation

for x in data_with_outlier:
    mad_robust.update(x)
    var.update(x)

print(f"MAD: {mad_robust.get():.2f}")
print(f"Std Dev: {math.sqrt(var.get()):.2f}")
# MAD is much less affected by the outlier (100)
