Implementation:Online ml River Tree HoeffdingAdaptiveTreeClassifier

Knowledge Sources	Domains	Last Updated
River, River Docs, Adaptive Learning from Evolving Data Streams	Online Machine Learning, Concept Drift, Decision Trees	2026-02-08 16:00 GMT

Overview

Concrete tool for building an online decision tree classifier that automatically adapts its structure in response to concept drift using per-node ADWIN drift detectors.

Description

The tree.HoeffdingAdaptiveTreeClassifier extends HoeffdingTreeClassifier with drift-adaptive capabilities. Each node in the tree maintains its own ADWIN drift detector that monitors the local classification error. When drift is detected at a node, an alternate subtree is grown. If the alternate subtree proves statistically superior (assessed via the switch_significance parameter after observing at least drift_window_threshold instances), it replaces the original subtree. The tree tracks the total number of alternate trees created, pruned, and switched via dedicated properties.
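The switch decision described above can be illustrated with a small standalone sketch. It is not River's code: the helper name `compare_subtrees`, the Hoeffding-style bound, and the requirement that both subtrees exceed `drift_window_threshold` are assumptions made for illustration. Only the roles of `switch_significance` and `drift_window_threshold` come from the description.

```python
import math


def compare_subtrees(old_err, alt_err, n_old, n_alt,
                     switch_significance=0.05, drift_window_threshold=300):
    """Hypothetical helper: decide what to do with an alternate subtree.

    Returns "switch" (alternate replaces original), "prune" (alternate is
    discarded), or "wait" (not enough evidence yet).
    """
    # Assumption: both subtrees must have seen enough instances.
    if n_old <= drift_window_threshold or n_alt <= drift_window_threshold:
        return "wait"

    # Assumed Hoeffding-style bound on the difference of two error rates.
    inv_n = 1.0 / n_old + 1.0 / n_alt
    bound = math.sqrt(
        2.0 * old_err * (1.0 - old_err)
        * math.log(2.0 / switch_significance) * inv_n
    )

    if old_err - alt_err > bound:
        return "switch"   # alternate is significantly better
    if alt_err - old_err > bound:
        return "prune"    # alternate is significantly worse
    return "wait"


print(compare_subtrees(0.40, 0.10, 1000, 1000))  # large gap in favor of the alternate
print(compare_subtrees(0.40, 0.10, 100, 100))    # too few instances observed
```

A smaller `switch_significance` widens the bound, so the tree demands stronger evidence before it replaces or discards a subtree.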

Bootstrap sampling is enabled by default (bootstrap_sampling=True), which weights each training instance with a Poisson-distributed random value to improve diversity and adaptation.
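To make the Poisson weighting concrete, here is a minimal standalone sketch. The rate of 1, the helper names `poisson` and `bootstrap_weight`, and the use of Knuth's sampling algorithm are assumptions for illustration. River's internal implementation may differ.

```python
import math
import random


def poisson(rate, rng):
    """Draw one Poisson-distributed integer using Knuth's algorithm."""
    limit = math.exp(-rate)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def bootstrap_weight(w, rng, rate=1.0):
    """Hypothetical helper: scale an instance weight by a Poisson draw.

    A draw of 0 means the instance is effectively skipped, a draw of 2
    means it counts twice, and so on (online bagging style).
    """
    return w * poisson(rate, rng)


rng = random.Random(42)  # seeded for reproducibility
weights = [bootstrap_weight(1.0, rng) for _ in range(10)]
print(weights)
```

With a rate of 1, roughly 37% of the draws are 0 (the probability is e^-1), so each node effectively trains on a different resampling of the stream.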

Usage

Import this classifier when you need a single drift-adaptive decision tree for online classification on non-stationary data streams. It inherits all parameters from HoeffdingTreeClassifier and adds drift-specific parameters.

Code Reference

Source Location

river/tree/hoeffding_adaptive_tree_classifier.py:L22-L287

Signature

class HoeffdingAdaptiveTreeClassifier(HoeffdingTreeClassifier):
    def __init__(
        self,
        grace_period: int = 200,
        max_depth: int | None = None,
        split_criterion: str = "info_gain",
        delta: float = 1e-7,
        tau: float = 0.05,
        leaf_prediction: str = "nba",
        nb_threshold: int = 0,
        nominal_attributes: list | None = None,
        splitter: Splitter | None = None,
        bootstrap_sampling: bool = True,
        drift_window_threshold: int = 300,
        drift_detector: base.DriftDetector | None = None,
        switch_significance: float = 0.05,
        binary_split: bool = False,
        min_branch_fraction: float = 0.01,
        max_share_to_split: float = 0.99,
        max_size: float = 100.0,
        memory_estimate_period: int = 1000000,
        stop_mem_management: bool = False,
        remove_poor_attrs: bool = False,
        merit_preprune: bool = True,
        seed: int | None = None,
    )

Import

from river import tree

Key Parameters (Drift-Specific)

Parameter	Type	Default	Description
`bootstrap_sampling`	bool	True	If True, perform bootstrap sampling (Poisson weighting) in leaf nodes
`drift_window_threshold`	int	300	Minimum instances an alternate tree must observe before being considered as a replacement
`drift_detector`	DriftDetector or None	None	Drift detector used at each node; if None, `drift.ADWIN()` is used
`switch_significance`	float	0.05	Significance level for assessing whether alternate subtrees are better than originals
`seed`	int or None	None	Random seed for reproducibility

I/O Contract

Inputs

Method	Parameter	Type	Description
`learn_one`	x	dict	Feature dictionary
`learn_one`	y	ClfTarget	Target class label
`learn_one`	w	float (keyword, default=1.0)	Instance weight
`predict_proba_one`	x	dict	Feature dictionary
`predict_one`	x	dict	Feature dictionary

Outputs

Method	Return Type	Description
`predict_proba_one(x)`	dict[ClfTarget, float]	Class probability distribution (normalized)
`predict_one(x)`	ClfTarget	Predicted class label (argmax of probabilities)
`n_alternate_trees`	int	Total number of alternate trees created
`n_pruned_alternate_trees`	int	Total number of alternate trees that were pruned
`n_switch_alternate_trees`	int	Total number of alternate trees that replaced original subtrees
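The contract in the tables above can be exercised with a predict-then-learn loop. The sketch below runs without River: `MajorityClassStandIn` is a hypothetical stand-in that only mimics the three method signatures listed, and `prequential_accuracy` is an illustrative helper, not a River function. Any object that follows the same contract, including the classifier described here, should fit the same loop.

```python
from collections import Counter


class MajorityClassStandIn:
    """Hypothetical stand-in with the same three methods as the contract."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y, *, w=1.0):
        # w is keyword-only, mirroring the Inputs table.
        self.counts[y] += w

    def predict_proba_one(self, x):
        total = sum(self.counts.values())
        if not total:
            return {}
        # Normalized class distribution.
        return {label: c / total for label, c in self.counts.items()}

    def predict_one(self, x):
        proba = self.predict_proba_one(x)
        # Argmax of the probabilities, or None before any training.
        return max(proba, key=proba.get) if proba else None


def prequential_accuracy(model, stream):
    """Predict on each instance first, then learn from it."""
    correct = seen = 0
    for x, y in stream:
        correct += int(model.predict_one(x) == y)
        seen += 1
        model.learn_one(x, y)
    return correct / seen if seen else 0.0


# Toy stream: the label is False for every fourth instance.
stream = [({"f": i}, i % 4 != 0) for i in range(100)]
accuracy = prequential_accuracy(MajorityClassStandIn(), stream)
print(f"Accuracy: {accuracy:.2%}")
```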

Usage Examples

Basic Drift-Adaptive Classification

from river import datasets, evaluate, metrics, tree

dataset = datasets.synth.ConceptDriftStream(
    stream=datasets.synth.SEA(seed=42, variant=0),
    drift_stream=datasets.synth.SEA(seed=42, variant=1),
    seed=1, position=500, width=50
)
dataset = iter(dataset.take(1000))

model = tree.HoeffdingAdaptiveTreeClassifier(
    grace_period=100,
    delta=1e-5,
    leaf_prediction='nb',
    nb_threshold=10,
    seed=0
)

metric = metrics.Accuracy()
print(evaluate.progressive_val_score(dataset, model, metric))
# Accuracy: 91.49%

Using on the Elec2 Dataset

from river import datasets, evaluate, metrics, tree

dataset = datasets.Elec2().take(5000)

model = tree.HoeffdingAdaptiveTreeClassifier(
    grace_period=200,
    drift_window_threshold=300,
    seed=42
)

metric = metrics.Accuracy()
print(evaluate.progressive_val_score(dataset, model, metric))

Inspecting Alternate Tree Statistics

from river import datasets, tree

model = tree.HoeffdingAdaptiveTreeClassifier(seed=42)

for x, y in datasets.Elec2().take(10000):
    model.learn_one(x, y)

print(f"Alternate trees: {model.n_alternate_trees}")
print(f"Pruned alternates: {model.n_pruned_alternate_trees}")
print(f"Switched alternates: {model.n_switch_alternate_trees}")
