Principle:Online ml River ADWIN Drift Detection
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River River Docs Learning from Time-Changing Data with Adaptive Windowing | Online Machine Learning, Concept Drift Detection, Statistical Testing | 2026-02-08 16:00 GMT |
Overview
ADWIN (ADaptive WINdowing) is an adaptive windowing algorithm that detects concept drift by maintaining a variable-length window and comparing sub-window means using statistical bounds.
Description
ADWIN is one of the most widely used drift detection methods in online machine learning. It works by maintaining a window of recent data values (typically error indicators) and efficiently testing whether any partition of this window into two consecutive sub-windows reveals a statistically significant difference in their means. When such a difference is found, the oldest portion of the window is dropped, effectively "forgetting" data from before the detected drift point.
The key advantage of ADWIN over fixed-window approaches is that its window size adapts automatically: it grows during stable periods to accumulate statistical power, and shrinks when drift is detected to quickly adapt to the new distribution. This gives ADWIN mathematical guarantees on both false positive and false negative rates.
ADWIN uses a bucket-based compression scheme (from ADWIN2, the improved version presented in the original paper) that keeps memory usage logarithmic in the window size, making it practical for long-running streams.
Usage
Use ADWIN drift detection when:
- You need a drift detector with formal statistical guarantees on detection accuracy.
- You want a parameter-light method (primarily the confidence parameter ).
- You need to monitor error rates, accuracy, or any scalar metric for distributional changes.
- You are building drift-adaptive models (e.g., Hoeffding Adaptive Tree, Adaptive Random Forest) that use ADWIN internally.
Theoretical Basis
ADWIN maintains a sliding window of the most recent values. At each step, it checks whether any partition of into two consecutive sub-windows (older) and (newer) exhibits a statistically significant difference in means.
For sub-windows of sizes and with sample means and , the test checks:
where the threshold is derived from the Hoeffding bound:
where is the harmonic mean of the sub-window sizes, and is a corrected confidence parameter derived from the user-specified .
The algorithm pseudocode:
ADWIN(delta, clock, max_buckets):
Initialize window W as empty
For each new value x:
1. Add x to W (insert into appropriate bucket)
2. Update running statistics (total, variance, width)
3. Every 'clock' steps:
For each possible partition of W into (W0, W1):
Compute epsilon_cut from Hoeffding bound
If |mean(W0) - mean(W1)| >= epsilon_cut:
Remove oldest elements (shrink W to W1)
Signal drift_detected = True
Break
4. Compress buckets if count exceeds max_buckets
Bucket compression: Instead of storing every individual value, ADWIN groups values into exponentially growing buckets. Each bucket stores the count, sum, and variance of its contained values. Adjacent buckets of the same size are merged when the number of same-size buckets exceeds max_buckets. This yields memory complexity.
Properties:
- False positive rate: Bounded by at any time step.
- False negative rate: If the true means differ by more than , drift will be detected with high probability.
- Memory: where is the window length.
- Time per update: Amortized with the
clockparameter controlling check frequency.