Implementation:Online ml River Drift ADWIN
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River, River Docs, Learning from Time-Changing Data with Adaptive Windowing | Online Machine Learning, Concept Drift Detection, Statistical Testing | 2026-02-08 16:00 GMT |
Overview
Concrete tool for detecting concept drift in a data stream using the ADWIN adaptive windowing algorithm, which maintains a variable-length window and tests for statistically significant changes in sub-window means.
Description
The drift.ADWIN class implements the ADWIN2 algorithm for change detection in streaming data. It maintains a variable-length window of recent values, stored compactly with a bucket-based compression scheme. At regular intervals (controlled by the clock parameter), it splits the window into an older and a newer sub-window at candidate cut points and uses a Hoeffding-style bound to test whether the two sub-window means differ significantly. When drift is detected, the oldest portion of the window is dropped.
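The windowing idea described above can be illustrated with a deliberately naive sketch. It is not River's implementation: the class name NaiveAdaptiveWindow is hypothetical, the window stores raw values instead of compressed buckets, every split point is tested, and the threshold is the Hoeffding-style bound from the ADWIN paper, which assumes values in [0, 1]. River's compiled extension may use a different bound and a different shrinking strategy.

```python
import math
from collections import deque


class NaiveAdaptiveWindow:
    """Toy adaptive window (hypothetical class): raw values, no bucket compression."""

    def __init__(self, delta=0.002, clock=32, min_window_length=5):
        self.delta = delta
        self.clock = clock
        self.min_window_length = min_window_length
        self.window = deque()
        self.n_seen = 0
        self.drift_detected = False

    def _epsilon_cut(self, n0, n1):
        # Threshold from the ADWIN paper: m is the harmonic mean of the
        # sub-window sizes and delta' = delta / n, so ln(4 / delta') = ln(4n / delta).
        n = n0 + n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        return math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))

    def _has_significant_cut(self):
        values = list(self.window)
        n = len(values)
        total = sum(values)
        head_sum = 0.0
        for n0 in range(1, n):
            head_sum += values[n0 - 1]
            n1 = n - n0
            if n0 < self.min_window_length or n1 < self.min_window_length:
                continue  # ignore sub-windows that are too short
            mean_old = head_sum / n0
            mean_new = (total - head_sum) / n1
            if abs(mean_old - mean_new) > self._epsilon_cut(n0, n1):
                return True
        return False

    def update(self, x):
        self.window.append(x)
        self.n_seen += 1
        self.drift_detected = False
        if self.n_seen % self.clock != 0:
            return  # only test every `clock` observations
        # Drop the oldest value until no split shows a significant difference.
        while self._has_significant_cut():
            self.window.popleft()
            self.drift_detected = True


# Deterministic stream: the mean jumps from 0 to 1 at index 300.
stream = [0] * 300 + [1] * 300
detector = NaiveAdaptiveWindow()
detections = []
for i, value in enumerate(stream):
    detector.update(value)
    if detector.drift_detected:
        detections.append(i)

print("first detection at index", detections[0])
# first detection at index 319
```

The first detection falls on the first clock tick after the change (the 320th observation), which shows how a larger clock value adds detection delay.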
Internally, the heavy computation is delegated to a C extension (adwin_c.AdaptiveWindowing) for performance. The class exposes properties for inspecting the current window state: width, estimation (mean), variance, total, and n_detections.
Usage
Import drift.ADWIN when you need a standalone drift detector to monitor a scalar signal (such as classification error, loss values, or any numeric metric) for distributional changes. It is also the default drift detector used internally by tree.HoeffdingAdaptiveTreeClassifier and forest.ARFClassifier.
Code Reference
Source Location
river/drift/adwin.py:L8-L135
Signature
class ADWIN(DriftDetector):
    def __init__(
        self,
        delta=0.002,
        clock=32,
        max_buckets=5,
        min_window_length=5,
        grace_period=10,
    )
Import
from river import drift
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| delta | float | 0.002 | Confidence parameter for the drift test. Lower values mean fewer false positives but slower detection |
| clock | int | 32 | How often to check for drift. 1 = every new data point; higher values speed up processing but increase detection delay |
| max_buckets | int | 5 | Maximum number of buckets of each size before merging (controls compression granularity) |
| min_window_length | int | 5 | Minimum sub-window length to consider during drift checks. Smaller values may increase false positives |
| grace_period | int | 10 | Number of data points to observe before beginning drift detection |
Internal Delegation
The class delegates core computation to adwin_c.AdaptiveWindowing, a C extension that handles the bucket-based storage and Hoeffding bound calculations for performance.
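To make the bucket-based storage more concrete, here is a minimal sketch of an exponential-histogram layout in plain Python. It is an illustration under assumptions, not the code in adwin_c: the class name BucketRows and its attributes are hypothetical, and each bucket keeps only a sum, whereas a real implementation would also need per-bucket variance information to report the window variance.

```python
class BucketRows:
    """Toy exponential histogram (hypothetical class, not River's data structure)."""

    def __init__(self, max_buckets=5):
        self.max_buckets = max_buckets
        # rows[i] holds bucket sums, each summarizing 2**i values, oldest first.
        self.rows = [[]]

    def add(self, x):
        self.rows[0].append(float(x))
        level = 0
        # Cascade merges upward while a row holds too many buckets.
        while len(self.rows[level]) > self.max_buckets:
            oldest = self.rows[level].pop(0)
            second_oldest = self.rows[level].pop(0)
            if level + 1 == len(self.rows):
                self.rows.append([])
            # The merged bucket is newer than anything already in the next row.
            self.rows[level + 1].append(oldest + second_oldest)
            level += 1

    @property
    def width(self):
        # Number of raw values represented by all buckets.
        return sum(len(row) * 2**i for i, row in enumerate(self.rows))

    @property
    def total(self):
        return sum(sum(row) for row in self.rows)

    @property
    def n_buckets(self):
        return sum(len(row) for row in self.rows)


hist = BucketRows(max_buckets=5)
for _ in range(1000):
    hist.add(1.0)

print(hist.width, hist.total, hist.n_buckets)
```

In this sketch the number of stored buckets grows logarithmically with the window width, which is why a larger max_buckets gives finer cut-point granularity at the cost of more memory.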
I/O Contract
Inputs
| Method | Parameter | Type | Description |
|---|---|---|---|
| update | x | float | A single numeric value to add to the window (e.g., 0/1 for correct/incorrect classification) |
Outputs
| Property/Method | Return Type | Description |
|---|---|---|
| drift_detected | bool | True if drift was detected on the most recent update call |
| width | int | Current window size |
| estimation | float | Mean of the values in the current window |
| variance | float | Sample variance within the adaptive window |
| total | float | Sum of all values in the window |
| n_detections | int | Total number of drifts detected over the detector's lifetime |
Usage Examples
Basic Drift Detection
import random
from river import drift
rng = random.Random(12345)
adwin = drift.ADWIN()
# Simulate a data stream with a distribution change at index 1000
data_stream = rng.choices([0, 1], k=1000) + rng.choices(range(4, 8), k=1000)
for i, val in enumerate(data_stream):
    adwin.update(val)
    if adwin.drift_detected:
        print(f"Change detected at index {i}, input value: {val}")
# Change detected at index 1023, input value: 4
Monitoring Classification Error
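import random
from river import drift
# Synthetic stand-in for a model's error stream (illustrative only):
# the error rate jumps from 10% to 50% at step 1000
rng = random.Random(42)
errors = [int(rng.random() < 0.1) for _ in range(1000)]
errors += [int(rng.random() < 0.5) for _ in range(1000)]
adwin = drift.ADWIN(delta=0.001)
# Feed binary error indicators (0 = correct, 1 = incorrect)
for i, error in enumerate(errors):
    adwin.update(error)
    if adwin.drift_detected:
        print(f"Drift detected at step {i}")
        print(f"Window size: {adwin.width}, Mean error: {adwin.estimation:.4f}")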
Tuning Sensitivity
from river import drift
# More sensitive (more false positives, faster detection)
sensitive_adwin = drift.ADWIN(delta=0.01, clock=1)
# Less sensitive (fewer false positives, slower detection)
conservative_adwin = drift.ADWIN(delta=0.0001, clock=64)
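The effect of delta on sensitivity can be seen by computing a cut threshold directly. The sketch below uses the Hoeffding-style bound from the ADWIN paper for values in [0, 1]; the helper name epsilon_cut is hypothetical, and River's compiled extension may apply a different (for example variance-aware) bound, so treat the numbers as indicative only.

```python
import math


def epsilon_cut(n0, n1, delta):
    """Hoeffding-style threshold on |mean(W0) - mean(W1)| for values in [0, 1]."""
    n = n0 + n1
    m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic mean of the sub-window sizes
    delta_prime = delta / n          # correction for testing many split points
    return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))


# Same sub-window sizes, three confidence levels.
thresholds = {delta: epsilon_cut(500, 500, delta) for delta in (0.01, 0.002, 0.0001)}
for delta, eps in thresholds.items():
    print(f"delta={delta}: mean difference must exceed {eps:.4f}")
```

A smaller delta raises the threshold, so a larger mean shift (or more post-change data) is needed before drift is flagged, which matches the trade-off noted in the comments above.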
Related Pages
- Principle:Online_ml_River_ADWIN_Drift_Detection
- Implementation:Online_ml_River_Tree_HoeffdingAdaptiveTreeClassifier
- Implementation:Online_ml_River_Forest_ARFClassifier
- Implementation:Online_ml_River_Drift_DriftRetrainingClassifier
- Environment:Online_ml_River_Python_Runtime_Environment
- Environment:Online_ml_River_Build_Toolchain
- Heuristic:Online_ml_River_ARF_Drift_Detection_Sensitivity