# Principle: Online ML River Drift Retraining
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River Docs | Online Machine Learning, Concept Drift, Meta-Learning | 2026-02-08 16:00 GMT |
## Overview
Drift retraining is a meta-learning strategy that wraps a base classifier with a drift detector and automatically retrains the model from scratch when concept drift is detected.
## Description
The drift retraining approach treats concept drift adaptation as a wrapper pattern: any base classifier can be augmented with drift detection and automatic retraining capabilities without modifying its internal implementation. The wrapper monitors the model's prediction errors by feeding them to a drift detector. When the detector signals a warning or a drift, the wrapper takes corrective action.
This strategy offers two operational modes:
- Background training (`train_in_background=True`): When a warning is detected, a fresh clone of the base model starts training on incoming data in parallel. If the warning escalates to a confirmed drift, the background model replaces the primary model. This provides a smoother transition, because the replacement model has already been partially trained on post-drift data.
- Immediate reset (`train_in_background=False`): When drift is detected, the primary model is simply reset (cloned fresh) with no background pre-training. This reacts faster but may cause a temporary performance dip while the new model catches up.
The key trade-off is between adaptation speed (how quickly the model adjusts to the new concept) and stability (how much performance degrades during the transition).
## Usage
Use drift retraining when:
- You want to add drift adaptation to any existing classifier without modifying its code.
- You need a simple, modular approach to handling concept drift.
- You want to compare different drift detectors while keeping the base model fixed.
- You prefer a "reset-and-retrain" strategy over incremental adaptation (as used in Hoeffding Adaptive Trees).
## Theoretical Basis
The drift retraining strategy is based on the principle that when the data-generating distribution changes, a model trained on the old distribution may perform poorly on the new distribution. Rather than incrementally adjusting the model (which may be impossible for some model types), the strategy resets the model entirely.
Formal framework:
Let `M` be the base classifier and `D` be the drift detector. At each time step `t`, the wrapper proceeds as follows:
```
DriftRetraining(model M, detector D, train_in_background):
    M_bg = None  (background model)
    For each (x_t, y_t):
        1. y_pred = M.predict_one(x_t)
        2. error = int(y_pred != y_t)
        3. D.update(error)
        4. If train_in_background:
           a. If D.warning_detected:
                  If M_bg is None: M_bg = M.clone()  (start a fresh background model)
                  M_bg.learn_one(x_t, y_t)
           b. If D.drift_detected:
                  If M_bg is not None: M = M_bg  (replace primary with background)
                  M_bg = None  (a new background model is created at the next warning)
        5. Else:
           a. If D.drift_detected:
                  M = M.clone()  (reset to fresh model)
        6. M.learn_one(x_t, y_t)
```
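The pseudocode above can be sketched as runnable Python with toy stand-ins. `ToyModel`, `ToyDetector`, and `DriftRetrainer` below are hypothetical illustrations, not River's implementation:

```python
class ToyModel:
    """Toy classifier: predicts the majority label seen so far."""

    def __init__(self):
        self.counts = {}

    def clone(self):
        # A fresh, unlearned copy (mirrors River's clone() semantics).
        return ToyModel()

    def predict_one(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else None

    def learn_one(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1


class ToyDetector:
    """Toy detector: warning after 3 consecutive errors, drift after 5."""

    def __init__(self):
        self.consecutive = 0
        self.warning_detected = False
        self.drift_detected = False

    def update(self, error):
        self.consecutive = self.consecutive + 1 if error else 0
        self.warning_detected = self.consecutive >= 3
        self.drift_detected = self.consecutive >= 5
        if self.drift_detected:
            self.consecutive = 0


class DriftRetrainer:
    """Wrapper implementing the drift-retraining pseudocode."""

    def __init__(self, model, detector, train_in_background=True):
        self.model = model
        self.detector = detector
        self.train_in_background = train_in_background
        self.bg_model = None

    def predict_one(self, x):
        return self.model.predict_one(x)

    def learn_one(self, x, y):
        error = int(self.model.predict_one(x) != y)  # steps 1-2
        self.detector.update(error)                  # step 3
        if self.train_in_background:                 # step 4
            if self.detector.warning_detected:
                if self.bg_model is None:
                    self.bg_model = self.model.clone()
                self.bg_model.learn_one(x, y)
            if self.detector.drift_detected:
                if self.bg_model is not None:
                    self.model = self.bg_model       # promote background model
                self.bg_model = None
        elif self.detector.drift_detected:           # step 5
            self.model = self.model.clone()          # immediate reset
        self.model.learn_one(x, y)                   # step 6
```

Feeding this wrapper a stream whose label flips from one constant to another shows the mechanics: the majority-class model keeps mispredicting after the flip, the consecutive errors trigger a warning and then a drift, and the wrapper swaps in (or resets to) a model that has only seen post-drift labels.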
Key considerations:
- The drift detector must support both warning and drift signals when `train_in_background=True`. River's `DriftAndWarningDetector` interface (e.g., `drift.binary.DDM`) provides both.
- The background model begins training from the warning point, not from scratch at drift detection. This gives it a head start on the new concept.
- The error indicator (0 for correct, 1 for incorrect) is a binary signal suitable for drift detectors that monitor the mean error rate.
- Upon reset, the new model starts with no learned state, so there is an inherent cold-start period where predictions may be poor.
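To illustrate how a binary error signal drives detection, here is a simplified DDM-style monitor. `SimpleDDM` is a hypothetical sketch of the idea behind DDM, not River's `drift.binary.DDM`: it tracks the running mean error rate `p` and its standard deviation `s`, and compares `p + s` against the best (lowest) `p_min + s_min` seen so far.

```python
import math


class SimpleDDM:
    """Simplified DDM-style drift monitor over a stream of 0/1 errors.

    Warning when p + s exceeds p_min + 2*s_min; drift at p_min + 3*s_min.
    (The reset-after-drift step of real DDM is omitted for brevity.)
    """

    def __init__(self, warm_up=30):
        self.n = 0
        self.p = 1.0                  # running mean of the error signal
        self.p_min = float("inf")
        self.s_min = float("inf")
        self.warm_up = warm_up        # minimum samples before signalling
        self.warning_detected = False
        self.drift_detected = False

    def update(self, error):
        self.n += 1
        self.p += (error - self.p) / self.n          # incremental mean
        s = math.sqrt(self.p * (1 - self.p) / self.n)
        if self.n < self.warm_up:
            return
        if self.p + s < self.p_min + self.s_min:     # record the best point
            self.p_min, self.s_min = self.p, s
        self.warning_detected = self.p + s > self.p_min + 2 * self.s_min
        self.drift_detected = self.p + s > self.p_min + 3 * self.s_min
```

Feeding it a stream with a 10% error rate keeps it quiet; once the error rate jumps (as it would after a concept change makes the model mispredict), `p + s` climbs past the stored minimum and the warning and drift flags fire in turn.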