Implementation:Online ml River Ensemble Boosting

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Ensemble_Methods, Boosting
Last Updated: 2026-02-08 16:00 GMT

Overview

This module provides online boosting classifiers that reweight each incoming training instance through Poisson-distributed resampling, so the ensemble can learn sequentially from a data stream.

Description

This module implements three boosting variants: AdaBoostClassifier (basic online boosting), ADWINBoostingClassifier (online boosting with drift detection), and BOLEClassifier (online boosting with reordered member training). All three use Poisson(lambda) resampling: each member model is trained on the current instance k times, where k is drawn from Poisson(lambda) and lambda adapts to model performance as the instance passes from one member to the next. AdaBoostClassifier updates lambda according to correctness, increasing it after an error and decreasing it after a correct prediction. ADWINBoostingClassifier adds a drift detector per member so that poorly performing models can be replaced. BOLEClassifier reorders member training, prioritizing the worst performers, and adjusts lambda according to prediction correctness, creating a "rich get richer" dynamic for the better models.
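The lambda update described above can be sketched in plain Python. This is a minimal illustration that assumes the classic Oza and Russell online boosting rule; the helper names (`poisson`, `boosting_round`, `predictions_correct`) are hypothetical and not part of river, and the library's exact arithmetic may differ.

```python
import math
import random
from collections import defaultdict


def poisson(lam: float, rng: random.Random) -> int:
    """Draw from Poisson(lam) with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def boosting_round(predictions_correct, correct_weight, wrong_weight, rng):
    """Pass one instance through the ensemble and adapt lambda.

    predictions_correct[i] is a hypothetical flag saying whether member i
    classified the instance correctly. Returns a list of (lambda, k) pairs,
    where k is how many times member i would be trained on the instance.
    """
    lam = 1.0
    trace = []
    for i, is_correct in enumerate(predictions_correct):
        k = poisson(lam, rng)  # member i would call learn_one(x, y) k times
        trace.append((lam, k))
        if is_correct:
            correct_weight[i] += lam
            total = correct_weight[i] + wrong_weight[i]
            lam *= total / (2 * correct_weight[i])
        else:
            wrong_weight[i] += lam
            total = correct_weight[i] + wrong_weight[i]
            lam *= total / (2 * wrong_weight[i])
    return trace
```

With a member whose running weights are 8 (correct) and 2 (wrong), a mistake raises lambda from 1 to 11/6 (about 1.83), while a correct prediction lowers it to 11/18 (about 0.61). Under this rule the direction of the change depends on the member's weighted error rate staying below one half.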

Usage

Use AdaBoostClassifier for basic online boosting with any base classifier. Choose ADWINBoostingClassifier when concept drift is expected, since it detects changes and adapts to them automatically. Use BOLEClassifier for imbalanced datasets or when you want a more sophisticated ordering of member training. All three work well with decision trees as base learners and can improve predictive performance over a single model.

Code Reference

Source Location

Signature

class AdaBoostClassifier(base.WrapperEnsemble, base.Classifier):
    def __init__(self, model: base.Classifier, n_models=10, seed: int | None = None):
        super().__init__(model, n_models, seed)
        self.wrong_weight: collections.defaultdict = collections.defaultdict(int)
        self.correct_weight: collections.defaultdict = collections.defaultdict(int)

class ADWINBoostingClassifier(AdaBoostClassifier):
    def __init__(self, model: base.Classifier, n_models=10, seed: int | None = None):
        super().__init__(model, n_models, seed)
        self._drift_detectors = [drift.ADWIN() for _ in range(self.n_models)]

class BOLEClassifier(AdaBoostClassifier):
    def __init__(
        self, model: base.Classifier, n_models=10, seed: int | None = None, error_bound=0.5
    ):
        super().__init__(model=model, n_models=n_models, seed=seed)
        self.error_bound = error_bound
        self.order_position = [i for i in range(n_models)]
        self.instances_seen = 0

Import

from river import ensemble

I/O Contract

Parameters

Parameter | Type | Default | Description
model | Classifier | required | Base classifier to boost
n_models | int | 10 | Number of models in the ensemble
seed | int or None | None | Random seed for reproducibility
error_bound | float | 0.5 | Error threshold for BOLE voting (BOLE only)

Attributes

Attribute | Type | Description
wrong_weight | defaultdict | Cumulative error weights per model
correct_weight | defaultdict | Cumulative correct weights per model
models | list | Ensemble of base classifiers
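As an illustration of how the two weight attributes could feed a prediction, the sketch below turns cumulative weights into AdaBoost-style vote weights and combines member probabilities. It is an assumption-laden sketch rather than river's code: `vote_weight` and `combine` are hypothetical helpers, and both the clipping of the error rate and the zero weight for members at or above 50% error are choices made here for illustration.

```python
import math


def vote_weight(correct_weight: float, wrong_weight: float) -> float:
    """Return log((1 - eps) / eps), where eps is the weighted error rate."""
    total = correct_weight + wrong_weight
    if total == 0:
        return 0.0  # assumption: a member with no history does not vote
    eps = max(wrong_weight / total, 1e-12)  # clip to avoid division by zero
    if eps >= 0.5:
        return 0.0  # assumption: members no better than chance are ignored
    return math.log((1 - eps) / eps)


def combine(probas, weights):
    """Weighted sum of per-member class probabilities, normalized to 1."""
    scores = {}
    for proba, weight in zip(probas, weights):
        for label, value in proba.items():
            scores[label] = scores.get(label, 0.0) + weight * value
    total = sum(scores.values())
    if total > 0:
        return {label: score / total for label, score in scores.items()}
    return scores
```

For example, a member with weights 9 (correct) and 1 (wrong) gets a vote weight of log(9), while a member with weights 5 and 5 contributes nothing to the combined prediction.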

Input/Output

Method | Input | Output
learn_one | x: dict, y: Any | None
predict_proba_one | x: dict | dict[Any, float]
Usage Examples

# AdaBoostClassifier
from river import datasets
from river import ensemble
from river import evaluate
from river import metrics
from river import tree

dataset = datasets.Phishing()

metric = metrics.LogLoss()

model = ensemble.AdaBoostClassifier(
    model=(
        tree.HoeffdingTreeClassifier(
            split_criterion='gini',
            delta=1e-5,
            grace_period=2000
        )
    ),
    n_models=5,
    seed=42
)

evaluate.progressive_val_score(dataset, model, metric)
# LogLoss: 0.370805

# ADWINBoostingClassifier
from river import linear_model
from river import preprocessing

dataset = datasets.Phishing()
model = ensemble.ADWINBoostingClassifier(
    model=(
        preprocessing.StandardScaler() |
        linear_model.LogisticRegression()
    ),
    n_models=3,
    seed=42
)
metric = metrics.F1()

evaluate.progressive_val_score(dataset, model, metric)
# F1: 87.61%

# BOLEClassifier
from river import drift

dataset = datasets.Elec2().take(3000)

model = ensemble.BOLEClassifier(
    model=drift.DriftRetrainingClassifier(
        model=tree.HoeffdingTreeClassifier(),
        drift_detector=drift.binary.DDM()
    ),
    n_models=10,
    seed=42
)

metric = metrics.Accuracy()

evaluate.progressive_val_score(dataset, model, metric)
# Accuracy: 93.63%
