Principle:Online ml River Online Ensemble Methods

Knowledge Sources	Domains	Last Updated
Machine Learning Online Bagging and Boosting	Online_Learning, Ensemble_Learning, Classification	2026-02-08 18:00 GMT

Overview

Online ensemble methods combine multiple base learners, each trained incrementally on streaming data, to produce predictions that are typically more accurate and robust than those of a single learner. Classic ensemble paradigms such as bagging, boosting, stacking, and voting are adapted to the online setting, where data arrives one instance at a time.

Description

Ensemble learning is a standard way to improve predictive performance in machine learning. In the batch setting, methods like Random Forest (bagging), AdaBoost (boosting), and model stacking are well-established. Adapting these to online learning requires fundamental changes because the entire dataset is never available at once: bootstrap sampling and repeated reweighting passes over the data have to be replaced by updates made one instance at a time.

Key online ensemble paradigms include:

Voting: Each base learner independently processes each instance. Predictions are combined by majority vote (classification) or averaging (regression). This is the simplest ensemble strategy and requires no special coordination between learners.

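Online Boosting: Inspired by AdaBoost, online boosting gives each training instance a different weight for each base learner. The weight an instance carries into a learner depends on how the preceding learners handled it: it grows when they misclassify the instance and shrinks when they classify it correctly. The Oza-Russell online boosting algorithm approximates batch boosting by training each learner on the instance k times, with k drawn from a Poisson distribution whose parameter is that weight.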

Exponentially Weighted Average (EWA): Each base learner (expert) carries a weight that is multiplied by exp(-eta * loss) after every instance, so the weight decays exponentially with the learner's accumulated loss. Predictions are the weighted average of the base learner outputs. The EWA regressor implements this for regression tasks.

Stacking: A meta-learner is trained on the predictions of the base learners. In the online setting, the meta-learner and the base learners are all updated incrementally with each new instance.

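Streaming Random Patches (SRP): Combines random subspace sampling (feature bagging) with online bagging, which approximates bootstrap resampling by training each learner on each instance a Poisson-distributed number of times. The resulting diversity makes the ensemble effective in non-stationary environments.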
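
The voting strategy described above can be sketched in a few lines of plain Python. This is a minimal illustration, not River's API: the classes NearestClassMean and MajorityVote and the toy stream are hypothetical names introduced here. Each toy learner looks at a single feature, which loosely echoes the random-subspace idea, and the feature "c" is deliberately uninformative.

```python
from collections import Counter, defaultdict


class NearestClassMean:
    """Hypothetical toy learner: nearest running class mean of one feature."""

    def __init__(self, feature):
        self.feature = feature
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def learn_one(self, x, y):
        self.sums[y] += x[self.feature]
        self.counts[y] += 1

    def predict_one(self, x):
        if not self.counts:
            return None  # nothing learned yet
        value = x[self.feature]
        return min(self.counts,
                   key=lambda c: abs(value - self.sums[c] / self.counts[c]))


class MajorityVote:
    """Every learner sees every instance; predictions are combined by vote."""

    def __init__(self, learners):
        self.learners = learners

    def learn_one(self, x, y):
        for learner in self.learners:
            learner.learn_one(x, y)

    def predict_one(self, x):
        votes = Counter(learner.predict_one(x) for learner in self.learners)
        votes.pop(None, None)  # ignore learners that cannot predict yet
        return votes.most_common(1)[0][0] if votes else None


# Toy stream: "a" and "b" carry the signal, "c" is constant noise.
stream = [
    ({"a": 1.0, "b": 2.0, "c": 0.0}, 1),
    ({"a": -1.0, "b": -2.0, "c": 0.0}, 0),
    ({"a": 2.0, "b": 1.0, "c": 0.0}, 1),
    ({"a": -2.0, "b": -1.0, "c": 0.0}, 0),
]

ensemble = MajorityVote([NearestClassMean(f) for f in ("a", "b", "c")])
for x, y in stream:
    ensemble.learn_one(x, y)

positive = ensemble.predict_one({"a": 0.9, "b": 1.5, "c": 0.0})
negative = ensemble.predict_one({"a": -0.8, "b": -1.0, "c": 0.0})
```

The learner on "c" cannot separate the classes, but the two informative learners outvote it, which is the robustness argument for voting in miniature.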

Usage

Use online ensemble methods when:

You need higher predictive accuracy than a single online learner can achieve.
You want robustness against concept drift through learner diversity.
You need uncertainty estimates from ensemble disagreement.
You want to combine heterogeneous models in a streaming pipeline.
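
One simple way to turn ensemble disagreement into an uncertainty score is the entropy of the votes. The sketch below assumes classification and uses a hypothetical helper name, vote_entropy; it is one of several possible disagreement measures, not something the page prescribes.

```python
import math
from collections import Counter


def vote_entropy(predictions):
    """Entropy (in bits) of the label distribution among ensemble members."""
    counts = Counter(predictions)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


unanimous = vote_entropy([1, 1, 1, 1])  # all members agree
split = vote_entropy([0, 1, 0, 1])      # members are evenly divided
```

A score of zero means all members agree; larger values indicate more disagreement and can be used to flag instances for review.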

Theoretical Basis

Online Boosting (Oza-Russell)

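Initialize: N base learners h_1, ..., h_N
            lambda_sc_m = lambda_sw_m = 0 for m = 1..N   (weight mass classified correctly / wrongly)
For each instance (x, y):
    lambda = 1
    For m = 1 to N:
        Draw k ~ Poisson(lambda)
        Train h_m on (x, y) k times (or once with weight k)
        if h_m correctly classifies x:
            lambda_sc_m = lambda_sc_m + lambda
            epsilon_m = lambda_sw_m / (lambda_sc_m + lambda_sw_m)
            lambda = lambda * 1 / (2 * (1 - epsilon_m))
        else:
            lambda_sw_m = lambda_sw_m + lambda
            epsilon_m = lambda_sw_m / (lambda_sc_m + lambda_sw_m)
            lambda = lambda * 1 / (2 * epsilon_m)

Here epsilon_m is the running lambda-weighted error rate of h_m. It is updated before lambda is rescaled, so neither denominator can be zero.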
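
The loop above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not River's implementation: Centroid is a hypothetical toy base learner on a scalar input, poisson uses Knuth's sampling method, max_lambda is a practical cap added here to keep sampling cheap, and the prediction rule (a vote weighted by log((1 - epsilon_m) / epsilon_m)) follows the usual Oza-Russell formulation rather than anything stated on this page.

```python
import math
import random
from collections import defaultdict


def poisson(lam, rng):
    """Draw k ~ Poisson(lam) with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


class Centroid:
    """Hypothetical toy base learner: nearest weighted class mean of a scalar."""

    def __init__(self):
        self.sums = defaultdict(float)
        self.weights = defaultdict(float)

    def learn_one(self, x, y, w=1.0):
        self.sums[y] += w * x
        self.weights[y] += w

    def predict_one(self, x):
        if not self.weights:
            return None
        return min(self.weights,
                   key=lambda c: abs(x - self.sums[c] / self.weights[c]))


class OnlineBoosting:
    def __init__(self, n_models=3, seed=0, max_lambda=20.0):
        self.models = [Centroid() for _ in range(n_models)]
        self.correct = [0.0] * n_models  # lambda mass classified correctly
        self.wrong = [0.0] * n_models    # lambda mass classified wrongly
        self.rng = random.Random(seed)
        self.max_lambda = max_lambda     # assumption: cap for cheap sampling

    def learn_one(self, x, y):
        lam = 1.0
        for m, model in enumerate(self.models):
            k = poisson(min(lam, self.max_lambda), self.rng)
            if k > 0:
                model.learn_one(x, y, w=k)
            if model.predict_one(x) == y:
                self.correct[m] += lam
                eps = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam *= 1.0 / (2.0 * (1.0 - eps))  # shrink weight downstream
            else:
                self.wrong[m] += lam
                eps = self.wrong[m] / (self.correct[m] + self.wrong[m])
                lam *= 1.0 / (2.0 * eps)          # grow weight downstream

    def predict_one(self, x):
        votes = defaultdict(float)
        for m, model in enumerate(self.models):
            total = self.correct[m] + self.wrong[m]
            label = model.predict_one(x)
            if total == 0 or label is None:
                continue
            eps = self.wrong[m] / total
            if eps >= 0.5:
                continue  # no better than chance: give it no say
            votes[label] += math.log((1.0 - eps) / max(eps, 1e-10))
        return max(votes, key=votes.get) if votes else None


booster = OnlineBoosting()
for i in range(200):
    y = i % 2
    x = (1.0 if y else -1.0) * (1.0 + 0.1 * (i % 5))  # well-separated classes
    booster.learn_one(x, y)
```

Because the error sums are updated before epsilon_m is computed, the divisions by epsilon_m and 1 - epsilon_m are always defined.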

Exponentially Weighted Average

Initialize: N experts with weights w_1 = ... = w_N = 1/N
           Learning rate eta
For each instance (x, y):
    Predict: hat{y} = sum(w_m * h_m(x)) / sum(w_m)
    For each expert m:
        loss_m = L(y, h_m(x))
        w_m = w_m * exp(-eta * loss_m)
    Normalize weights: w_m = w_m / sum(w_j)
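
A minimal Python sketch of the update above, assuming squared loss and, for simplicity, experts that are fixed functions rather than learners that also update themselves. The class name EWA and the two constant experts are hypothetical and exist only for illustration.

```python
import math


class EWA:
    """Exponentially weighted average over a list of expert callables."""

    def __init__(self, experts, eta=0.5):
        self.experts = experts
        self.eta = eta
        self.weights = [1.0 / len(experts)] * len(experts)

    def predict_one(self, x):
        # Weights are kept normalized, so no division by their sum is needed.
        return sum(w * expert(x) for w, expert in zip(self.weights, self.experts))

    def learn_one(self, x, y):
        losses = [(y - expert(x)) ** 2 for expert in self.experts]
        self.weights = [w * math.exp(-self.eta * loss)
                        for w, loss in zip(self.weights, losses)]
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]


def always_zero(x):
    return 0.0


def always_one(x):
    return 1.0


ewa = EWA([always_zero, always_one], eta=0.5)
for _ in range(20):
    ewa.learn_one(None, 1.0)  # the target is always 1.0 in this toy stream
```

After a few instances almost all weight sits on the expert with the lower loss, so the combined prediction tracks the better expert without the ensemble having to know in advance which one that is.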

Stacking

Initialize: K base learners, 1 meta-learner M
For each instance (x, y):
    base_preds = [h_k.predict(x) for k in 1..K]
    final_pred = M.predict(base_preds)
    M.learn(base_preds, y)
    For each h_k: h_k.learn(x, y)
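
The stacking loop can be sketched as below. All class names are hypothetical, and the meta-learner is a deliberately simple lookup table keyed on the tuple of base predictions; in practice the meta-learner would usually be another incremental model.

```python
from collections import Counter, defaultdict


class MajorityClass:
    """Toy base learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class NearestClassMean:
    """Toy base learner: nearest running class mean of a scalar input."""

    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def learn_one(self, x, y):
        self.sums[y] += x
        self.counts[y] += 1

    def predict_one(self, x):
        if not self.counts:
            return None
        return min(self.counts,
                   key=lambda c: abs(x - self.sums[c] / self.counts[c]))


class TableMeta:
    """Toy meta-learner: most frequent label per tuple of base predictions."""

    def __init__(self):
        self.table = defaultdict(Counter)

    def learn_one(self, preds, y):
        self.table[tuple(preds)][y] += 1

    def predict_one(self, preds):
        seen = self.table.get(tuple(preds))
        return seen.most_common(1)[0][0] if seen else None


class Stacking:
    def __init__(self, base, meta):
        self.base = base
        self.meta = meta

    def predict_one(self, x):
        return self.meta.predict_one([h.predict_one(x) for h in self.base])

    def learn_one(self, x, y):
        # Base predictions are taken before the base learners see (x, y).
        base_preds = [h.predict_one(x) for h in self.base]
        self.meta.learn_one(base_preds, y)
        for h in self.base:
            h.learn_one(x, y)


stack = Stacking([NearestClassMean(), MajorityClass()], TableMeta())
for i in range(30):
    y = 1 if i % 3 == 2 else 0       # label 0 is twice as frequent
    x = 1.0 if y else -1.0
    stack.learn_one(x, y)
```

Taking the base predictions before the base learners are updated means the meta-learner is trained on the kind of predictions it will receive at inference time, not on predictions for instances the base learners have already memorized.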

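The theoretical advantage of ensembles is usually explained through the bias-variance decomposition. Averaging diverse learners, as in voting and bagging, reduces variance without increasing bias, while boosting mainly reduces bias by concentrating later learners on the instances that earlier learners misclassify. Both effects can improve generalization.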
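
As a worked illustration of the variance argument, assume N learners whose predictions each have variance \sigma^2 and pairwise correlation \rho. This is an idealization introduced here, not something the page states. The variance of their average is then:

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{m=1}^{N} h_m(x)\right)
  = \frac{1}{N^{2}}\left(N\sigma^{2} + N(N-1)\,\rho\,\sigma^{2}\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{N}\,\sigma^{2}
```

The second term shrinks as N grows, while the first is a floor set by the correlation between learners. This is why diversity (low \rho) matters as much as ensemble size.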
