

Implementation:Online ml River Drift PageHinkley

From Leeroopedia


Knowledge Sources: River; River Docs; Continuous Inspection Schemes
Domains: Online Machine Learning, Concept Drift Detection, Sequential Analysis
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete tool for detecting concept drift using the Page-Hinkley sequential analysis test, which monitors cumulative deviations from the running mean and signals drift when a threshold is exceeded.

Description

The drift.PageHinkley class implements a two-sided CUSUM control chart for change detection in streaming data. It monitors the cumulative sum of deviations between observed values and their running mean (tracked via stats.Mean). The forgetting factor alpha applies exponential weighting to prevent old observations from dominating. Drift is detected when the difference between the cumulative sum and its historical minimum (for upward shifts) or maximum (for downward shifts) exceeds the threshold parameter.
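The description above can be written out as a pair of recursions. This is a sketch under assumptions: the symbols U_t, D_t and \bar{x}_t are notation introduced here, and the exact placement of alpha and delta is inferred from the description rather than copied from the River source.

```latex
\begin{aligned}
\bar{x}_t &= \frac{1}{t}\sum_{i=1}^{t} x_i
  && \text{(running mean)} \\
U_t &= \alpha\,U_{t-1} + \left(x_t - \bar{x}_t - \delta\right), \quad U_0 = 0
  && \text{(statistic for upward shifts)} \\
D_t &= \alpha\,D_{t-1} + \left(x_t - \bar{x}_t + \delta\right), \quad D_0 = 0
  && \text{(statistic for downward shifts)} \\
\text{upward drift} &\iff U_t - \min_{s \le t} U_s > \lambda \\
\text{downward drift} &\iff \max_{s \le t} D_s - D_t > \lambda
\end{aligned}
```

Here \lambda corresponds to the threshold parameter, \delta to delta, and \alpha to alpha. With \alpha slightly below 1, earlier deviations are discounted geometrically.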

The detector supports three modes: "up" for detecting increases only, "down" for decreases only, and "both" for either direction. Unlike detectors with a warning zone, such as drift.binary.DDM, Page-Hinkley does not provide warning detection; it emits drift signals only.
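To make the mechanics concrete, here is a minimal pure-Python sketch of a two-sided Page-Hinkley test with the three modes. It is an illustration written for this page, not River's source code: the class name PageHinkleySketch, its private attributes, and the helper first_drift are hypothetical, and the update rule is assumed from the description above.

```python
import math


class PageHinkleySketch:
    """Simplified two-sided Page-Hinkley test (illustrative, not River's code)."""

    def __init__(self, min_instances=30, delta=0.005, threshold=50.0,
                 alpha=0.9999, mode="both"):
        if mode not in ("up", "down", "both"):
            raise ValueError(f"unknown mode: {mode!r}")
        self.min_instances = min_instances
        self.delta = delta
        self.threshold = threshold
        self.alpha = alpha
        self.mode = mode
        self.drift_detected = False
        self._reset()

    def _reset(self):
        self._n = 0
        self._mean = 0.0
        self._sum_up = 0.0           # cumulative sum used for upward shifts
        self._sum_down = 0.0         # cumulative sum used for downward shifts
        self._min_up = math.inf      # historical minimum of _sum_up
        self._max_down = -math.inf   # historical maximum of _sum_down

    def update(self, x):
        # Assumption: start from a clean state after a drift was reported.
        if self.drift_detected:
            self._reset()
            self.drift_detected = False

        self._n += 1
        self._mean += (x - self._mean) / self._n   # incremental running mean
        dev = x - self._mean

        # alpha discounts the previous sum; delta is the magnitude allowance.
        self._sum_up = self.alpha * self._sum_up + dev - self.delta
        self._sum_down = self.alpha * self._sum_down + dev + self.delta
        self._min_up = min(self._min_up, self._sum_up)
        self._max_down = max(self._max_down, self._sum_down)

        if self._n < self.min_instances:
            return  # still warming up, no test yet

        up = self._sum_up - self._min_up > self.threshold
        down = self._max_down - self._sum_down > self.threshold
        self.drift_detected = (
            (up and self.mode in ("up", "both"))
            or (down and self.mode in ("down", "both"))
        )


def first_drift(detector, stream):
    """Return the index of the first drift signal, or None if there is none."""
    for i, x in enumerate(stream):
        detector.update(x)
        if detector.drift_detected:
            return i
    return None


# Noise-free stream whose level jumps upward at index 200.
stream = [0.0] * 200 + [5.0] * 200
up_idx = first_drift(PageHinkleySketch(mode="up"), stream)
down_idx = first_drift(PageHinkleySketch(mode="down"), stream)
print(f"mode='up': {up_idx}, mode='down': {down_idx}")
```

On this stream the "up" detector fires shortly after the jump, while the "down" detector stays silent because the level never decreases. The sketch keeps only a handful of scalars, which is the constant-memory property mentioned in the Usage section.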

Usage

Import drift.PageHinkley when you need a lightweight, constant-memory drift detector for monitoring mean shifts in a scalar signal. It is particularly useful when you need directional control over drift detection.

Code Reference

Source Location

river/drift/page_hinkley.py:L7-L128

Signature

class PageHinkley(DriftDetector):
    def __init__(
        self,
        min_instances: int = 30,
        delta: float = 0.005,
        threshold: float = 50.0,
        alpha: float = 1 - 0.0001,  # 0.9999
        mode: str = "both",
    )

Import

from river import drift

Key Parameters

Parameter | Type | Default | Description
min_instances | int | 30 | Minimum number of observations before drift detection begins
delta | float | 0.005 | Magnitude allowance parameter. Controls the minimum detectable change size
threshold | float | 50.0 | Detection threshold (lambda). Drift is flagged when the cumulative test statistic exceeds this value
alpha | float | 0.9999 | Forgetting factor for exponential weighting. Values closer to 1 give more weight to historical data
mode | str | "both" | Direction of change to detect: "up" (increases), "down" (decreases), or "both"

I/O Contract

Inputs

Method | Parameter | Type | Description
update | x | float | A single numeric value (e.g., classification error, loss value)

Outputs

Property/Method | Return Type | Description
drift_detected | bool | True if drift was detected on the most recent update call

Usage Examples

Basic Drift Detection

import random
from river import drift

rng = random.Random(12345)
ph = drift.PageHinkley()

# Simulate a data stream with a distribution change at index 1000
data_stream = rng.choices([0, 1], k=1000) + rng.choices(range(4, 8), k=1000)

for i, val in enumerate(data_stream):
    ph.update(val)
    if ph.drift_detected:
        print(f"Change detected at index {i}, input value: {val}")
# Change detected at index 1006, input value: 5

Detecting Only Upward Shifts

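import random
from river import drift

# Same synthetic stream as in the basic example: the mean rises at index 1000
rng = random.Random(12345)
data_stream = rng.choices([0, 1], k=1000) + rng.choices(range(4, 8), k=1000)

ph_up = drift.PageHinkley(mode="up", threshold=30.0)

for i, val in enumerate(data_stream):
    ph_up.update(val)
    if ph_up.drift_detected:
        print(f"Upward shift detected at step {i}")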

Tuning Sensitivity

from river import drift

# More sensitive (lower threshold, smaller delta)
sensitive_ph = drift.PageHinkley(threshold=20.0, delta=0.001, min_instances=10)

# Less sensitive (higher threshold, larger delta)
conservative_ph = drift.PageHinkley(threshold=100.0, delta=0.01, min_instances=50)
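To get a feel for how threshold and delta trade sensitivity against detection delay, the following sketch counts how many updates a one-sided statistic needs to cross the threshold after an idealized, noise-free upward step. The function steps_to_detect is hypothetical and does not use River; it assumes the update rule described earlier and ignores min_instances by assuming the pre-change segment is longer than the warm-up.

```python
def steps_to_detect(shift, delta, threshold, n_before=200, alpha=0.9999,
                    max_steps=10_000):
    """Count updates after a noise-free upward step until the one-sided
    statistic (cumulative sum minus its running minimum) exceeds `threshold`.

    Returns None if nothing is flagged within `max_steps` updates.
    """
    n, mean = 0, 0.0
    cum, cum_min = 0.0, float("inf")
    for t in range(n_before + max_steps):
        x = 0.0 if t < n_before else shift      # level 0, then level `shift`
        n += 1
        mean += (x - mean) / n                  # incremental running mean
        cum = alpha * cum + (x - mean - delta)  # discounted cumulative sum
        cum_min = min(cum_min, cum)
        if t >= n_before and cum - cum_min > threshold:
            return t - n_before + 1             # updates since the change
    return None


# Same settings as the two configurations above, applied to a step of size 5.
sensitive = steps_to_detect(shift=5.0, delta=0.001, threshold=20.0)
conservative = steps_to_detect(shift=5.0, delta=0.01, threshold=100.0)
print(f"sensitive: {sensitive} steps, conservative: {conservative} steps")
```

Real streams are noisy, so actual delays will differ. The comparison only illustrates the direction of the effect: a lower threshold reacts sooner but is also more prone to false alarms on noise.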

Monitoring Classification Errors

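from river import drift, tree, datasets

model = tree.HoeffdingTreeClassifier()
ph = drift.PageHinkley()

# Test-then-train loop: predict first, then learn from the same sample
for x, y in datasets.Elec2().take(5000):
    y_pred = model.predict_one(x)
    # predict_one may return None before the model has seen any labels
    if y_pred is not None:
        error = int(y_pred != y)  # 1 = misclassified, 0 = correct
        ph.update(error)
        if ph.drift_detected:
            print("Drift detected, consider resetting the model")
    model.learn_one(x, y)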
