Principle:Online ml River Anomaly Quantile Filtering

Knowledge Sources	River River Docs
Domains	Online Machine Learning, Anomaly Detection, Adaptive Thresholding
Last Updated	2026-02-08 16:00 GMT

Overview

Adaptive binary classification wrapper that converts continuous anomaly scores to binary anomaly/normal labels using a dynamically computed quantile threshold, automatically adapting to changing score distributions.

Description

Anomaly Quantile Filtering addresses a fundamental limitation of fixed-threshold anomaly filtering: the score distribution of an anomaly detector may change over time as the detector learns, as the data distribution drifts, or as the nature of anomalies evolves. A fixed threshold that works well initially may become too aggressive or too lenient over time.

The quantile filter solves this by maintaining a streaming estimate of a quantile of the anomaly score distribution. Instead of comparing each score with a fixed value, it classifies an observation as anomalous when its score is at or above the current estimate of the q-th quantile of the scores observed so far. The threshold therefore moves with the score distribution.

For example, with q=0.95 the filter flags approximately the top 5% of observations (by anomaly score) as anomalous once the estimate has stabilized, regardless of the absolute score values. The user specifies a target proportion rather than a score value, so the same setting carries over between detectors whose scores lie on different scales.

Like the fixed-threshold filter, the quantile filter supports anomaly detector protection: when enabled, the wrapped detector is not updated with observations classified as anomalous, preventing contamination of the normal model.

The quantile filter's learning behavior differs in one key respect: the quantile statistic is always updated with the score, even when protection prevents the wrapped detector from learning the observation. High scores therefore still enter the estimate, which keeps it representative of the full score distribution rather than of the normal scores alone.

Usage

Use anomaly quantile filtering when:

You do not know the appropriate fixed threshold in advance
The anomaly score distribution may shift over time
You want to flag a fixed proportion (e.g., top 5%) of observations as anomalous
You need an adaptive threshold that self-calibrates
You want to combine the filter with any anomaly detector (Half-Space Trees, OneClassSVM, etc.)

Theoretical Basis

Streaming quantile estimation:

The quantile filter uses River's stats.Quantile to maintain an online estimate of the q-th quantile of the score distribution. The estimator is incremental (River's documentation describes it as the P² algorithm): it updates a small, fixed-size state with each score and does not store the observed scores.
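To illustrate how a quantile can be tracked without storing scores, here is a minimal sketch of a stochastic-approximation estimator. It is a deliberately simpler technique than whatever stats.Quantile implements, and the class name SGDQuantile and the lr parameter are hypothetical, introduced only for this example.

```python
import random


class SGDQuantile:
    """Hypothetical stand-in for a streaming quantile estimator.

    Uses a Robbins-Monro style update, NOT the algorithm used by
    River's stats.Quantile. It only shows that a single number of
    state is enough to track a quantile online.
    """

    def __init__(self, q: float, lr: float = 0.001):
        self.q = q
        self.lr = lr          # step size (assumed value, tune per use case)
        self.estimate = None  # no data yet

    def update(self, x: float) -> None:
        if self.estimate is None:
            self.estimate = x  # initialise at the first observation
            return
        # Step up by lr*q when x is above the estimate, down by lr*(1-q)
        # otherwise. The steps balance when a fraction q of the data
        # lies at or below the estimate.
        if x > self.estimate:
            self.estimate += self.lr * self.q
        else:
            self.estimate -= self.lr * (1 - self.q)

    def get(self):
        return self.estimate


rng = random.Random(42)
est = SGDQuantile(q=0.95)
for _ in range(50_000):
    est.update(rng.random())  # scores drawn uniformly from [0, 1)
```

For uniform scores on [0, 1) the estimate should settle near 0.95. The step size trades convergence speed against noise in the estimate, which is one reason production estimators tend to use more elaborate schemes.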

Classification rule:

classify(score):
    quantile_threshold = quantile_estimator.get()
    if quantile_threshold is None:
        return False        # no data yet; cannot classify
    if score >= quantile_threshold:
        return True         # anomaly
    else:
        return False        # normal

Note: Before any score has been observed, quantile.get() returns None. The filter then uses math.inf as the threshold, which has the same effect as the None branch in the rule above: no observation is classified as anomalous until the quantile estimate is available.
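The classification rule and the math.inf fallback can be written as a small Python function. This is a sketch of the rule as described on this page, not a copy of River's source; the function signature is an assumption made for illustration.

```python
import math


def classify(score: float, quantile_threshold) -> bool:
    """Return True when the score is at or above the current threshold.

    quantile_threshold is None while no score has been observed.
    """
    # Explicit None check: a threshold of 0.0 is a legitimate value and
    # must not be mistaken for "no data yet".
    threshold = math.inf if quantile_threshold is None else quantile_threshold
    return score >= threshold
```

With this fallback, classify(10.0, None) is False, while classify(0.8, 0.8) is True because the comparison is inclusive.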

Protected learning with quantile update:

LEARN_ONE(x):
    score = anomaly_detector.score_one(x)
    if protect_anomaly_detector AND classify(score) == True:
        # Do NOT update the detector
        pass
    else:
        anomaly_detector.learn_one(x)
    # Always update the quantile estimate
    quantile_estimator.update(score)
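
The LEARN_ONE procedure above can be sketched in plain Python. Everything here is illustrative: MeanDistanceDetector is a toy detector, ExactQuantile stores every score as a stand-in for a streaming estimator, and ProtectedQuantileFilter is a hypothetical name rather than River's class.

```python
import bisect
import math


class MeanDistanceDetector:
    """Toy detector (hypothetical): the score is the distance to a running mean."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def score_one(self, x):
        return abs(x - self.mean)

    def learn_one(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n


class ExactQuantile:
    """Stand-in estimator that keeps every score; suitable for small demos only."""

    def __init__(self, q):
        self.q = q
        self.scores = []

    def update(self, score):
        bisect.insort(self.scores, score)

    def get(self):
        if not self.scores:
            return None  # no data yet
        idx = min(int(self.q * len(self.scores)), len(self.scores) - 1)
        return self.scores[idx]


class ProtectedQuantileFilter:
    """Sketch of the LEARN_ONE procedure (names are illustrative)."""

    def __init__(self, detector, q, protect_anomaly_detector=True):
        self.detector = detector
        self.quantile = ExactQuantile(q)
        self.protect_anomaly_detector = protect_anomaly_detector
        self.n_detector_updates = 0

    def classify(self, score):
        threshold = self.quantile.get()
        return score >= (math.inf if threshold is None else threshold)

    def learn_one(self, x):
        score = self.detector.score_one(x)
        if not (self.protect_anomaly_detector and self.classify(score)):
            self.detector.learn_one(x)
            self.n_detector_updates += 1
        self.quantile.update(score)  # always runs, even for skipped observations


def run(stream, protect):
    filt = ProtectedQuantileFilter(MeanDistanceDetector(), q=0.9,
                                   protect_anomaly_detector=protect)
    for x in stream:
        filt.learn_one(x)
    return filt


stream = [1.0, 1.1, 0.9, 1.0, 50.0, 1.05]
protected = run(stream, protect=True)
unprotected = run(stream, protect=False)
```

On this stream the protected filter skips the detector update for 50.0, so its running mean stays close to 1, while the unprotected detector's mean is pulled above 9. In both runs the quantile estimator receives all six scores.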

Adaptive behavior:

With q = 0.95, approximately 5% of observations will be classified as anomalous
The threshold automatically adjusts as the score distribution changes
Early observations may have an unstable threshold until enough scores are collected
The quantile estimate converges as more data is observed

Comparison with fixed threshold:

Property	Fixed Threshold	Quantile Threshold
Threshold value	Constant	Adapts to score distribution
Domain knowledge required	Yes (must know score range)	No (just specify desired quantile)
Adapts to drift	No	Yes
Anomaly fraction	Unpredictable	Approximately 1-q
Warm-up period	None	Needs some observations
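
The "Domain knowledge required" and "Anomaly fraction" rows can be checked with a small simulation. The sketch below assumes synthetic, uniformly distributed scores and uses an exact quantile over the stored history in place of a streaming estimator. The helper names are hypothetical. Each stream has a fixed scale, so the example shows scale independence rather than drift within a stream.

```python
import bisect
import random


def flagged_fraction_quantile(scores, q):
    """Fraction flagged when each score is compared with the q-th
    quantile of the scores seen before it."""
    seen = []    # sorted history; exact stand-in for a streaming estimator
    flagged = 0
    for s in scores:
        if seen:  # before the first score nothing is flagged
            threshold = seen[min(int(q * len(seen)), len(seen) - 1)]
            flagged += s >= threshold
        bisect.insort(seen, s)
    return flagged / len(scores)


def flagged_fraction_fixed(scores, threshold):
    """Fraction flagged by a constant threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


rng = random.Random(0)
small = [rng.random() for _ in range(5_000)]  # scores in [0, 1)
large = [100 * s for s in small]              # same scores on a 100x scale

quantile_small = flagged_fraction_quantile(small, q=0.95)
quantile_large = flagged_fraction_quantile(large, q=0.95)
fixed_small = flagged_fraction_fixed(small, threshold=0.95)
fixed_large = flagged_fraction_fixed(large, threshold=0.95)
```

Under these assumptions the quantile rule flags close to 5% of both streams, whereas the fixed threshold of 0.95 flags about 5% of the first stream and nearly all of the rescaled one.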
