Implementation:Online ml River Tree HoeffdingAdaptiveTreeClassifier

From Leeroopedia


Knowledge Sources: River, River Docs, Adaptive Learning from Evolving Data Streams
Domains: Online Machine Learning, Concept Drift, Decision Trees
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete tool for building an online decision tree classifier that automatically adapts its structure in response to concept drift using per-node ADWIN drift detectors.

Description

The tree.HoeffdingAdaptiveTreeClassifier extends HoeffdingTreeClassifier with drift-adaptive capabilities. Each node in the tree maintains its own ADWIN drift detector that monitors the local classification error. When drift is detected at a node, an alternate subtree is grown. If the alternate subtree proves statistically superior (assessed via the switch_significance parameter after observing at least drift_window_threshold instances), it replaces the original subtree. The tree tracks the total number of alternate trees created, pruned, and switched via dedicated properties.
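
The switch-or-prune decision described above can be sketched in plain Python. This is an illustration only: the function name alternate_decision is hypothetical, and the Hoeffding-style bound on the difference between the two error rates is an assumed form, so the library's actual test may differ in its details.

```python
import math


def alternate_decision(
    old_error: float,
    alt_error: float,
    n_old: int,
    n_alt: int,
    switch_significance: float = 0.05,
    drift_window_threshold: int = 300,
) -> str:
    """Return 'switch', 'prune', or 'wait' for an alternate subtree (illustrative)."""
    # Both subtrees must have observed enough instances before they are compared.
    if n_old <= drift_window_threshold or n_alt <= drift_window_threshold:
        return "wait"

    # Assumed Hoeffding-style bound on the gap between the two error rates.
    inv_n = 1.0 / n_old + 1.0 / n_alt
    bound = math.sqrt(
        2.0
        * old_error
        * (1.0 - old_error)
        * math.log(2.0 / switch_significance)
        * inv_n
    )

    if old_error - alt_error > bound:
        return "switch"  # alternate is significantly better: replace the original
    if alt_error - old_error > bound:
        return "prune"  # alternate is significantly worse: discard it
    return "wait"  # no significant difference yet: keep growing both


print(alternate_decision(0.30, 0.10, n_old=1000, n_alt=500))  # switch
print(alternate_decision(0.10, 0.30, n_old=1000, n_alt=500))  # prune
print(alternate_decision(0.20, 0.19, n_old=1000, n_alt=500))  # wait
```

In this sketch a smaller switch_significance widens the bound, so the alternate subtree needs a larger error advantage (or more observations) before it replaces the original.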

Bootstrap sampling is enabled by default (bootstrap_sampling=True). Each training instance is then weighted by a Poisson-distributed random value, which is intended to improve diversity and adaptation.
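
Poisson weighting can be sketched with the standard library alone. The helper names poisson_sample and bootstrap_weight are hypothetical, and the rate of 1.0 is an assumption. How the library treats a draw of zero is not reproduced here; in this sketch a zero draw simply yields a zero weight.

```python
import math
import random


def poisson_sample(rng: random.Random, lam: float = 1.0) -> int:
    """Draw one value from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def bootstrap_weight(w: float, rng: random.Random) -> float:
    """Scale an instance weight by a Poisson(1) draw (illustrative only)."""
    # Assumption: a zero draw gives zero weight in this sketch;
    # a real implementation may handle that case differently.
    return w * poisson_sample(rng, lam=1.0)


rng = random.Random(42)
weights = [bootstrap_weight(1.0, rng) for _ in range(10)]
print(weights)  # a mix of 0.0, 1.0, 2.0, ... that averages near 1.0 over many draws
```

Because the draws average to one, the expected total weight of the stream is unchanged, while individual instances are randomly emphasised or de-emphasised.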

Usage

Import this classifier when you need a single drift-adaptive decision tree for online classification on non-stationary data streams. It inherits all parameters from HoeffdingTreeClassifier and adds drift-specific parameters.

Code Reference

Source Location

river/tree/hoeffding_adaptive_tree_classifier.py:L22-L287

Signature

class HoeffdingAdaptiveTreeClassifier(HoeffdingTreeClassifier):
    def __init__(
        self,
        grace_period: int = 200,
        max_depth: int | None = None,
        split_criterion: str = "info_gain",
        delta: float = 1e-7,
        tau: float = 0.05,
        leaf_prediction: str = "nba",
        nb_threshold: int = 0,
        nominal_attributes: list | None = None,
        splitter: Splitter | None = None,
        bootstrap_sampling: bool = True,
        drift_window_threshold: int = 300,
        drift_detector: base.DriftDetector | None = None,
        switch_significance: float = 0.05,
        binary_split: bool = False,
        min_branch_fraction: float = 0.01,
        max_share_to_split: float = 0.99,
        max_size: float = 100.0,
        memory_estimate_period: int = 1000000,
        stop_mem_management: bool = False,
        remove_poor_attrs: bool = False,
        merit_preprune: bool = True,
        seed: int | None = None,
    )

Import

from river import tree

Key Parameters (Drift-Specific)

Parameter | Type | Default | Description
bootstrap_sampling | bool | True | If True, perform bootstrap sampling (Poisson weighting) in leaf nodes
drift_window_threshold | int | 300 | Minimum number of instances an alternate tree must observe before being considered as a replacement
drift_detector | DriftDetector or None | None | Drift detector used at each node; if None, drift.ADWIN() is used
switch_significance | float | 0.05 | Significance level for assessing whether an alternate subtree is better than the original
seed | int or None | None | Random seed for reproducibility

I/O Contract

Inputs

Method | Parameter | Type | Description
learn_one | x | dict | Feature dictionary
learn_one | y | ClfTarget | Target class label
learn_one | w | float (keyword, default=1.0) | Instance weight
predict_proba_one | x | dict | Feature dictionary
predict_one | x | dict | Feature dictionary

Outputs

Method / Property | Return Type | Description
predict_proba_one(x) | dict[ClfTarget, float] | Class probability distribution (normalized)
predict_one(x) | ClfTarget | Predicted class label (argmax of probabilities)
n_alternate_trees | int | Total number of alternate trees created
n_pruned_alternate_trees | int | Total number of alternate trees that were pruned
n_switch_alternate_trees | int | Total number of alternate trees that replaced original subtrees
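
The relationship between the two prediction outputs can be illustrated with a small standalone sketch. The helpers normalize and predict_label are hypothetical names, not part of the library; they only show what "normalized" and "argmax of probabilities" mean for a dictionary of class scores.

```python
def normalize(class_weights: dict) -> dict:
    """Rescale non-negative class weights so that they sum to 1."""
    total = sum(class_weights.values())
    if total == 0:
        return dict(class_weights)  # nothing to rescale
    return {label: weight / total for label, weight in class_weights.items()}


def predict_label(proba: dict):
    """Return the label with the highest probability, or None if there is none."""
    return max(proba, key=proba.get) if proba else None


proba = normalize({"up": 30.0, "down": 10.0})
print(proba)                 # {'up': 0.75, 'down': 0.25}
print(predict_label(proba))  # up
```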

Usage Examples

Basic Drift-Adaptive Classification

from river import datasets, evaluate, metrics, tree

dataset = datasets.synth.ConceptDriftStream(
    stream=datasets.synth.SEA(seed=42, variant=0),
    drift_stream=datasets.synth.SEA(seed=42, variant=1),
    seed=1, position=500, width=50
)
dataset = iter(dataset.take(1000))

model = tree.HoeffdingAdaptiveTreeClassifier(
    grace_period=100,
    delta=1e-5,
    leaf_prediction='nb',
    nb_threshold=10,
    seed=0
)

metric = metrics.Accuracy()
evaluate.progressive_val_score(dataset, model, metric)
# Accuracy: 91.49%
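
Progressive validation follows a test-then-train order: each instance is first used to score a prediction and only afterwards to update the model. The loop below is a simplified sketch of that idea, not the library's implementation. MajorityClass is a hypothetical stand-in model, and prequential_accuracy is a hypothetical helper.

```python
from collections import Counter


class MajorityClass:
    """Hypothetical stand-in model exposing predict_one / learn_one."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x):
        # Predict the most frequent label seen so far, or None before any training.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn_one(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, model) -> float:
    """Score each instance before the model learns from it."""
    correct = total = 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # test first
        if y_pred is not None:
            correct += int(y_pred == y)
            total += 1
        model.learn_one(x, y)  # then train
    return correct / total if total else 0.0


# Toy stream: three out of every four labels are True.
stream = [({"f": i}, i % 4 != 0) for i in range(100)]
print(f"Accuracy: {prequential_accuracy(stream, MajorityClass()):.2%}")
```

Because every prediction is made before the corresponding label is seen, the reported accuracy reflects performance on unseen data at each point in the stream.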

Using on the Elec2 Dataset

from river import datasets, evaluate, metrics, tree

dataset = datasets.Elec2().take(5000)

model = tree.HoeffdingAdaptiveTreeClassifier(
    grace_period=200,
    drift_window_threshold=300,
    seed=42
)

metric = metrics.Accuracy()
evaluate.progressive_val_score(dataset, model, metric)

Inspecting Alternate Tree Statistics

from river import datasets, tree

model = tree.HoeffdingAdaptiveTreeClassifier(seed=42)

for x, y in datasets.Elec2().take(10000):
    model.learn_one(x, y)

print(f"Alternate trees: {model.n_alternate_trees}")
print(f"Pruned alternates: {model.n_pruned_alternate_trees}")
print(f"Switched alternates: {model.n_switch_alternate_trees}")
