# Principle: Online ML River Drift Retraining
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River Docs | Online Machine Learning, Concept Drift, Meta-Learning | 2026-02-08 16:00 GMT |
## Overview
Drift retraining is a meta-learning strategy that wraps a base classifier with a drift detector and automatically retrains the model from scratch when concept drift is detected.
## Description
The drift retraining approach treats concept drift adaptation as a wrapper pattern: any base classifier can be augmented with drift detection and automatic retraining capabilities without modifying its internal implementation. The wrapper monitors the model's prediction errors by feeding them to a drift detector. When the detector signals a warning or a drift, the wrapper takes corrective action.
This strategy offers two operational modes:
- Background training (`train_in_background=True`): When a warning is detected, a fresh clone of the base model starts training on incoming data in parallel. If the warning escalates to a confirmed drift, the background model replaces the primary model. This provides a smoother transition, because the replacement model has already been partially trained on post-drift data.
- Immediate reset (`train_in_background=False`): When drift is detected, the primary model is simply reset (cloned fresh) with no background pre-training. This reacts faster but may cause a temporary performance dip while the new model catches up.
The key trade-off is between adaptation speed (how quickly the model adjusts to the new concept) and stability (how much performance degrades during the transition).
## Usage
Use drift retraining when:
- You want to add drift adaptation to any existing classifier without modifying its code.
- You need a simple, modular approach to handling concept drift.
- You want to compare different drift detectors while keeping the base model fixed.
- You prefer a "reset-and-retrain" strategy over incremental adaptation (as used in Hoeffding Adaptive Trees).
## Theoretical Basis
The drift retraining strategy is based on the principle that when the data-generating distribution changes, a model trained on the old distribution may perform poorly on the new distribution. Rather than incrementally adjusting the model (which may be impossible for some model types), the strategy resets the model entirely.
Formal framework:
Let `M` be the base classifier and `D` be the drift detector. At each time step `t`, the wrapper proceeds as follows:
```
DriftRetraining(model M, detector D, train_in_background):
    M_bg = None  (background model)
    For each (x_t, y_t):
        1. y_pred = M.predict_one(x_t)
        2. error = int(y_pred != y_t)
        3. D.update(error)
        4. If train_in_background:
           a. If D.warning_detected:
                  If M_bg is None: M_bg = M.clone()  (start a fresh background model)
                  M_bg.learn_one(x_t, y_t)
           b. If D.drift_detected:
                  If M_bg is not None: M = M_bg  (replace primary with background)
                  M_bg = None  (a new background model is created at the next warning)
        5. Else:
           a. If D.drift_detected:
                  M = M.clone()  (reset to fresh model)
        6. M.learn_one(x_t, y_t)
```
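The pseudocode above can be sketched as runnable Python with toy stand-ins. `ToyModel`, `ToyDetector`, and `DriftRetrainer` below are hypothetical illustrations, not River's implementation:

```python
class ToyModel:
    """Toy classifier: predicts the majority label seen so far."""

    def __init__(self):
        self.counts = {}

    def clone(self):
        # A fresh, unlearned copy (mirrors River's clone() semantics).
        return ToyModel()

    def predict_one(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None

    def learn_one(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


class ToyDetector:
    """Toy detector: warning after 3 consecutive errors, drift after 5."""

    def __init__(self):
        self.consecutive = 0
        self.warning_detected = False
        self.drift_detected = False

    def update(self, error):
        self.consecutive = self.consecutive + 1 if error else 0
        self.warning_detected = self.consecutive >= 3
        self.drift_detected = self.consecutive >= 5
        if self.drift_detected:
            self.consecutive = 0


class DriftRetrainer:
    """Wrapper implementing the drift-retraining pseudocode."""

    def __init__(self, model, detector, train_in_background=True):
        self.model = model
        self.detector = detector
        self.train_in_background = train_in_background
        self.bg_model = None

    def predict_one(self, x):
        return self.model.predict_one(x)

    def learn_one(self, x, y):
        error = int(self.model.predict_one(x) != y)  # steps 1-2
        self.detector.update(error)                  # step 3
        if self.train_in_background:                 # step 4
            if self.detector.warning_detected:
                if self.bg_model is None:
                    self.bg_model = self.model.clone()
                self.bg_model.learn_one(x, y)
            if self.detector.drift_detected:
                if self.bg_model is not None:
                    self.model = self.bg_model       # promote background model
                self.bg_model = None
        elif self.detector.drift_detected:           # step 5
            self.model = self.model.clone()          # immediate reset
        self.model.learn_one(x, y)                   # step 6
```

Feeding this wrapper a stream whose label flips from one constant to another shows the mechanics: the majority-class model keeps mispredicting after the flip, the consecutive errors trigger a warning and then a drift, and the wrapper swaps in (or resets to) a model that has only seen post-drift labels.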
Key considerations:
- The drift detector must support both warning and drift signals when `train_in_background=True`. River's `DriftAndWarningDetector` interface (e.g., `drift.binary.DDM`) provides both.
- The background model begins training from the warning point, not from scratch at drift detection. This gives it a head start on the new concept.
- The error indicator (0 for correct, 1 for incorrect) is a binary signal suitable for drift detectors that monitor the mean error rate.
- Upon reset, the new model starts with no learned state, so there is an inherent cold-start period where predictions may be poor.
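To illustrate how a binary error signal drives detection, here is a simplified DDM-style monitor. `SimpleDDM` is a hypothetical sketch of the idea behind DDM, not River's `drift.binary.DDM`: it tracks the running mean error rate `p` and its standard deviation `s`, and compares `p + s` against the best (lowest) `p_min + s_min` seen so far.

```python
import math


class SimpleDDM:
    """Simplified DDM-style drift monitor over a stream of 0/1 errors.

    Warning when p + s exceeds p_min + 2*s_min; drift at p_min + 3*s_min.
    (The reset-after-drift step of real DDM is omitted for brevity.)
    """

    def __init__(self, warm_up=30):
        self.n = 0
        self.p = 1.0                  # running mean of the error signal
        self.p_min = float("inf")
        self.s_min = float("inf")
        self.warm_up = warm_up        # minimum samples before signalling
        self.warning_detected = False
        self.drift_detected = False

    def update(self, error):
        self.n += 1
        self.p += (error - self.p) / self.n          # incremental mean
        s = math.sqrt(self.p * (1 - self.p) / self.n)
        if self.n < self.warm_up:
            return
        if self.p + s < self.p_min + self.s_min:     # record the best point
            self.p_min, self.s_min = self.p, s
        self.warning_detected = self.p + s > self.p_min + 2 * self.s_min
        self.drift_detected = self.p + s > self.p_min + 3 * self.s_min
```

Feeding it a stream with a 10% error rate keeps it quiet; once the error rate jumps (as it would after a concept change makes the model mispredict), `p + s` climbs past the stored minimum and the warning and drift flags fire in turn.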