Heuristic:Online ml River ARF Drift Detection Sensitivity
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Concept_Drift, Ensemble_Methods |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Configuring ADWIN warning and drift detectors with a 10x delta ratio for smooth model adaptation in ARF.
Description
Adaptive Random Forest (ARF) uses a dual-detector strategy per tree: a warning detector (ADWIN with delta=0.01) and a drift detector (ADWIN with delta=0.001). Because a larger delta makes ADWIN more sensitive, the warning detector typically fires first and triggers training of a background tree. When the drift detector confirms the drift, the active tree is replaced by the already-trained background tree. The 10x delta ratio between the warning and drift thresholds is an empirical choice from the ARF paper (Gomes et al., 2017) that lets the forest adapt smoothly instead of restarting from scratch.
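The warning-then-swap lifecycle described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not River's implementation: `MeanModel` and `DualDetectorMember` are hypothetical names, and the `warning` and `drift` flags stand in for the outputs of the two ADWIN detectors.

```python
class MeanModel:
    """Toy learner (hypothetical): predicts the mean of the targets seen so far."""

    def __init__(self):
        self.n = 0
        self.total = 0.0

    def learn(self, y):
        self.n += 1
        self.total += y

    def predict(self):
        return self.total / self.n if self.n else 0.0


class DualDetectorMember:
    """One ensemble member: an active model plus an optional background model."""

    def __init__(self, model_factory):
        self.model_factory = model_factory
        self.active = model_factory()
        self.background = None
        self.n_swaps = 0

    def step(self, y, warning=False, drift=False):
        if warning:
            # Warning signal: start a fresh background model.
            self.background = self.model_factory()
        if drift:
            # Drift confirmed: promote the background model if one exists,
            # otherwise fall back to a cold start.
            if self.background is not None:
                self.active = self.background
            else:
                self.active = self.model_factory()
            self.background = None
            self.n_swaps += 1
        # Both models learn from every sample they are alive for.
        self.active.learn(y)
        if self.background is not None:
            self.background.learn(y)


# Stream whose mean jumps from 0.0 to 1.0 at t=100. The flag timings are
# made up for the demo; in ARF they would come from the two detectors.
member = DualDetectorMember(MeanModel)
for t, y in enumerate([0.0] * 100 + [1.0] * 100):
    member.step(y, warning=(t == 105), drift=(t == 120))

print(member.n_swaps, member.active.n, member.active.predict())
```

In this run the promoted model has only seen post-warning samples, so it predicts the new mean immediately after the swap rather than a blend of old and new data.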
Usage
Apply this heuristic when configuring ARFClassifier or ARFRegressor drift detection parameters, or when building custom ensemble methods that need adaptive model replacement. Also relevant when tuning standalone ADWIN detectors via DriftRetrainingClassifier.
The Insight (Rule of Thumb)
- Action: Set warning detector delta 10x larger than drift detector delta.
- Values:
  - Default: warning_detector=ADWIN(delta=0.01), drift_detector=ADWIN(delta=0.001)
  - More sensitive: warning_detector=ADWIN(delta=0.1), drift_detector=ADWIN(delta=0.01)
  - Less sensitive: warning_detector=ADWIN(delta=0.001), drift_detector=ADWIN(delta=0.0001)
- Trade-off: A larger delta produces more false-positive detections but a faster reaction to real drift. A smaller delta produces fewer false alarms but a slower response.
- Background model benefit: Warning detection spawns a background model trained on post-warning data. If drift confirms, the swap is immediate with no cold-start period.
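The rule of thumb above can be captured in a small helper. The sketch below is one possible way to do it: `detector_deltas`, `make_arf_classifier`, and `PRESETS` are hypothetical names, and the range check is this helper's own choice rather than something River enforces. The `forest.ARFClassifier` and `drift.ADWIN` calls assume a River version that ships the `river.forest` module.

```python
def detector_deltas(drift_delta=0.001, ratio=10.0):
    """Return the warning and drift ADWIN deltas for a given drift delta."""
    warning_delta = drift_delta * ratio
    # Helper-level sanity check (an assumption of this sketch).
    if not (0.0 < drift_delta < warning_delta < 1.0):
        raise ValueError("need 0 < drift_delta < drift_delta * ratio < 1")
    return {"warning_delta": warning_delta, "drift_delta": drift_delta}


def make_arf_classifier(drift_delta=0.001, ratio=10.0, **arf_kwargs):
    """Build an ARFClassifier whose detectors follow the ratio rule."""
    # Imported lazily so detector_deltas() works without River installed.
    from river import drift, forest

    deltas = detector_deltas(drift_delta, ratio)
    return forest.ARFClassifier(
        drift_detector=drift.ADWIN(delta=deltas["drift_delta"]),
        warning_detector=drift.ADWIN(delta=deltas["warning_delta"]),
        **arf_kwargs,
    )


# The three settings listed above, keyed by the drift delta.
PRESETS = {
    "default": detector_deltas(0.001),
    "more_sensitive": detector_deltas(0.01),
    "less_sensitive": detector_deltas(0.0001),
}
```

A call such as `make_arf_classifier(drift_delta=0.001, n_models=10, seed=42)` would then reproduce the default detector pair while keeping the ratio in a single place.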
Reasoning
The dual-detector strategy exploits the asymmetry between false warnings and false drift detections:
- False warnings are cheap: A background model is trained in parallel but discarded if drift isn't confirmed. The cost is limited to the extra memory and training time for the background tree.
- False drift detections are expensive: Swapping the active model unnecessarily loses learned knowledge. Hence, the drift detector must be more conservative (smaller delta).
- The 10x ratio: Empirically validated in the ARF paper (Gomes et al., 2017). The warning detector with delta=0.01 provides early signal, while the drift detector with delta=0.001 confirms with higher confidence. This ratio balances responsiveness with stability.
- ADWIN standalone defaults: ADWIN uses delta=0.002 by default when used as a standalone drift detector, which falls between the ARF warning and drift thresholds.
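To get a feel for how much a 10x change in delta shifts the detection threshold, the sketch below evaluates the Hoeffding-based cut threshold from the original ADWIN paper (Bifet and Gavaldà, 2007) for inputs bounded in [0, 1]. The function name `adwin_hoeffding_cut` and the 500/500 window split are assumptions for illustration, and River's implementation may use a different (variance-aware) bound, so the numbers are indicative only.

```python
import math


def adwin_hoeffding_cut(n0, n1, delta):
    """Hoeffding-style cut threshold for sub-windows of sizes n0 and n1.

    A change is signalled when |mean0 - mean1| >= the returned value.
    """
    n = n0 + n1
    m = 1.0 / (1.0 / n0 + 1.0 / n1)  # "harmonic mean" as defined in the paper
    delta_prime = delta / n
    return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))


eps_warning = adwin_hoeffding_cut(500, 500, delta=0.01)
eps_drift = adwin_hoeffding_cut(500, 500, delta=0.001)
print(f"warning threshold: {eps_warning:.4f}")
print(f"drift threshold:   {eps_drift:.4f}")
```

Under this bound, the drift threshold is less than 10% larger than the warning threshold, because delta enters only through a logarithm. The warning detector therefore reacts to slightly smaller mean shifts, which is what gives the background tree its head start.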
Code Evidence
ARF drift detector defaults from `river/forest/adaptive_random_forest.py:628-629`:
drift_detector=drift_detector or ADWIN(delta=0.001),
warning_detector=warning_detector or ADWIN(delta=0.01),
ADWIN standalone defaults from `river/drift/adwin.py:68`:
def __init__(self, delta=0.002, clock=32, max_buckets=5,
min_window_length=5, grace_period=10):
ADWIN clock parameter from `river/drift/adwin.py:68`:
# clock=32 means drift is checked every 32 samples
# Trade-off: clock=1 checks every sample (expensive);
# clock=32 reduces overhead with minimal detection delay
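The clock trade-off can be made concrete with simple arithmetic. The sketch below assumes that cut checks happen only on every `clock`-th sample, as the comments above describe; `clock_tradeoff` is a hypothetical helper, not part of River.

```python
def clock_tradeoff(n_samples, clock):
    """Return (max number of cut checks, worst-case extra delay in samples).

    The delay is measured relative to checking on every sample (clock=1).
    """
    if clock < 1:
        raise ValueError("clock must be >= 1")
    return n_samples // clock, clock - 1


for clock in (1, 32):
    checks, delay = clock_tradeoff(10_000, clock)
    print(f"clock={clock}: up to {checks} checks, up to {delay} samples of extra delay")
```

Over 10,000 samples, clock=32 cuts the number of checks by roughly a factor of 32 while adding at most 31 samples of detection delay.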