Implementation:Online ml River Anomaly ThresholdFilter

From Leeroopedia


Knowledge Sources: River, River Docs
Domains: Online Machine Learning, Anomaly Detection, Binary Classification
Last Updated: 2026-02-08 16:00 GMT

Overview

A concrete tool in the River library that converts continuous anomaly scores into binary anomaly labels using a fixed threshold. It wraps any anomaly detector and can optionally protect that detector from learning on flagged anomalies.

Description

The anomaly.ThresholdFilter class wraps any anomaly detector implementing the AnomalyDetector interface and adds a classify method that converts continuous scores to binary labels. It inherits from anomaly.base.AnomalyFilter.

The classification rule is simple: a score is classified as anomalous if it is greater than or equal to the specified threshold. When protect_anomaly_detector is enabled (the default), learn_one first scores the observation and calls the wrapped detector's learn_one only if the observation is classified as normal, which prevents the detector from adapting to anomalous data.
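The rule and the protection gate described above can be sketched in plain Python. This is a minimal illustration, not River's source: classify and should_update are hypothetical helper names, and only the >= comparison and the protection flag are taken from the description.

```python
def classify(score: float, threshold: float) -> bool:
    """Return True when the score is at or above the threshold."""
    return score >= threshold


def should_update(score: float, threshold: float, protect: bool = True) -> bool:
    """Decide whether the wrapped detector would be updated (hypothetical helper)."""
    # With protection on, observations classified as anomalous are skipped.
    return not (protect and classify(score, threshold))


threshold = 0.95
for score in (0.10, 0.95, 0.99):
    # Print the score, its label, and whether the detector would learn from it.
    print(score, classify(score, threshold), should_update(score, threshold))
```

Note that the comparison is inclusive, so a score exactly equal to the threshold is flagged as anomalous.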

The ThresholdFilter can also be composed into a pipeline with the pipe operator (|), where it filters out anomalous observations before they reach a downstream supervised model.
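As a rough picture of what filtering before a downstream model means, the sketch below drops flagged observations from a training loop. It is plain Python under stated assumptions, not River's pipeline code: RunningMean and train_filtered are hypothetical names, and the anomaly scores in the stream are made up for illustration.

```python
class RunningMean:
    """Hypothetical downstream model: tracks the mean of the values it sees."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def learn_one(self, y: float) -> None:
        self.n += 1
        self.mean += (y - self.mean) / self.n


def train_filtered(stream, threshold: float) -> RunningMean:
    """Feed only observations scored below the threshold to the model."""
    model = RunningMean()
    for y, score in stream:
        if score >= threshold:  # flagged as anomalous: skip the downstream model
            continue
        model.learn_one(y)
    return model


# (value, anomaly score) pairs; the scores are invented for this sketch.
stream = [(10.0, 0.20), (12.0, 0.40), (500.0, 0.999), (11.0, 0.30)]
model = train_filtered(stream, threshold=0.995)
print(model.n, model.mean)
```

The outlier value 500.0 never reaches the downstream model, so its running mean reflects only the three unflagged values.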

Usage

Import and use anomaly.ThresholdFilter when:

  • You have a fixed score threshold for anomaly classification
  • You want to protect the underlying detector from learning on anomalies
  • You need to filter anomalous observations in a pipeline before a supervised model

Code Reference

Source Location

river/anomaly/filter.py, lines 8-107.

Signature

class ThresholdFilter(anomaly.base.AnomalyFilter):
    def __init__(
        self,
        anomaly_detector,
        threshold: float,
        protect_anomaly_detector=True,
    ):

Import

from river import anomaly
filter_model = anomaly.ThresholdFilter(
    anomaly_detector=anomaly.HalfSpaceTrees(),
    threshold=0.95
)

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| anomaly_detector | AnomalyDetector | (required) | The wrapped anomaly detector instance. |
| threshold | float | (required) | The score threshold at or above which an observation is classified as anomalous. |
| protect_anomaly_detector | bool | True | If True, the anomaly detector is not updated when the score is classified as anomalous. |

Methods

  • classify(score: float) -> bool -- Returns True if score >= threshold, False otherwise.
  • score_one(x: dict) -> float -- Delegates to the wrapped anomaly detector's score_one.
  • learn_one(x: dict) -> None -- Scores the observation first; if protection is enabled and the score is classified as anomalous, the detector is not updated. Otherwise the wrapped detector's learn_one is called.
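To make the interplay of the three methods concrete, here is a self-contained stand-in written in plain Python. It is a sketch under stated assumptions rather than River's implementation: MeanDistanceDetector and SketchThresholdFilter are hypothetical classes, and the scoring function is a toy chosen only so that the scores fall in [0, 1).

```python
class MeanDistanceDetector:
    """Hypothetical toy detector: scores by distance from a running mean."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def score_one(self, x: dict) -> float:
        if self.n == 0:
            return 0.0  # nothing learned yet
        dist = abs(x["value"] - self.mean)
        return dist / (1.0 + dist)  # squash the distance into [0, 1)

    def learn_one(self, x: dict) -> None:
        self.n += 1
        self.mean += (x["value"] - self.mean) / self.n


class SketchThresholdFilter:
    """Illustrative stand-in for the three methods listed above."""

    def __init__(self, anomaly_detector, threshold: float,
                 protect_anomaly_detector: bool = True) -> None:
        self.anomaly_detector = anomaly_detector
        self.threshold = threshold
        self.protect_anomaly_detector = protect_anomaly_detector

    def classify(self, score: float) -> bool:
        return score >= self.threshold

    def score_one(self, x: dict) -> float:
        return self.anomaly_detector.score_one(x)

    def learn_one(self, x: dict) -> None:
        if self.protect_anomaly_detector and self.classify(self.score_one(x)):
            return  # anomalous: leave the detector untouched
        self.anomaly_detector.learn_one(x)


stream = [{"value": v} for v in (10.0, 11.0, 9.0, 100.0, 10.0)]
protected = SketchThresholdFilter(MeanDistanceDetector(), threshold=0.9)
unprotected = SketchThresholdFilter(
    MeanDistanceDetector(), threshold=0.9, protect_anomaly_detector=False
)
for x in stream:
    protected.learn_one(x)
    unprotected.learn_one(x)

# Compare how many observations each wrapped detector actually learned from.
print(protected.anomaly_detector.n, unprotected.anomaly_detector.n)
```

With protection on, the toy detector skips the outlier and its running mean stays near the normal values. With protection off, the outlier is absorbed and pulls the mean upward.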

I/O Contract

Inputs

| Method | Parameter | Type | Description |
| --- | --- | --- | --- |
| classify | score | float | An anomaly score to classify. |
| score_one | x | dict | A dictionary mapping feature names to numeric values. |
| learn_one | x | dict | A dictionary mapping feature names to numeric values. |

Outputs

| Method | Return Type | Description |
| --- | --- | --- |
| classify | bool | True if the score indicates an anomaly (score >= threshold), False otherwise. |
| score_one | float | The anomaly score from the wrapped detector. |
| learn_one | None | Updates the wrapped detector (conditionally, based on protection setting). |

Usage Examples

Filtering anomalies in a time series pipeline:

from river import anomaly, datasets, metrics, time_series

dataset = datasets.WaterFlow()
metric = metrics.SMAPE()

period = 24  # 24 samples per day

model = (
    anomaly.ThresholdFilter(
        anomaly.GaussianScorer(
            window_size=period * 7,  # 7 days
            grace_period=30
        ),
        threshold=0.995
    ) |
    time_series.HoltWinters(
        alpha=0.3,
        beta=0.1,
        multiplicative=False
    )
)

time_series.evaluate(
    dataset,
    model,
    metric,
    horizon=period
)

Basic threshold classification with HalfSpaceTrees:

from river import anomaly, preprocessing, compose

model = anomaly.ThresholdFilter(
    anomaly_detector=compose.Pipeline(
        preprocessing.MinMaxScaler(),
        anomaly.HalfSpaceTrees(seed=42)
    ),
    threshold=0.95,
    protect_anomaly_detector=True
)

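# Process one observation: score it, classify the score, then learn
x = {'feature_a': 3.5, 'feature_b': 1.2}
score = model.score_one(x)          # continuous score from the wrapped pipeline
is_anomaly = model.classify(score)  # True if score >= threshold (0.95 here)
model.learn_one(x)                  # skipped internally if protected and flagged

print(f"Score: {score}, Anomaly: {is_anomaly}")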
