Implementation:Online ml River Tree HoeffdingAdaptiveTreeClassifier
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River, River Docs, Adaptive Learning from Evolving Data Streams | Online Machine Learning, Concept Drift, Decision Trees | 2026-02-08 16:00 GMT |
Overview
Concrete tool for building an online decision tree classifier that automatically adapts its structure in response to concept drift using per-node ADWIN drift detectors.
Description
The tree.HoeffdingAdaptiveTreeClassifier extends HoeffdingTreeClassifier with drift-adaptive capabilities. Each node in the tree maintains its own drift detector (ADWIN by default) that monitors the local classification error. When drift is detected at a node, an alternate subtree is grown alongside the original. If the alternate subtree proves statistically superior (assessed at the switch_significance level once it has observed at least drift_window_threshold instances), it replaces the original subtree. The tree tracks the total number of alternate trees created, pruned, and switched via dedicated properties.
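The replacement decision for an alternate subtree can be sketched in plain Python. This is a minimal illustration, not River's code: the function name compare_subtrees, the requirement that both subtrees reach the threshold, and the specific Hoeffding-style bound are assumptions made for the sketch. Only drift_window_threshold and switch_significance correspond to parameters of the classifier.

```python
import math


def compare_subtrees(err_old, n_old, err_alt, n_alt,
                     drift_window_threshold=300, switch_significance=0.05):
    """Return 'switch', 'prune', or 'keep' for an alternate subtree.

    Hypothetical helper: the bound below is an assumed Hoeffding-style
    test on the difference of two error rates; the exact test used
    inside River may differ.
    """
    # Assumption: both subtrees must have seen enough instances
    # before any decision is taken.
    if n_old < drift_window_threshold or n_alt < drift_window_threshold:
        return "keep"

    inv_n = 1.0 / n_old + 1.0 / n_alt
    bound = math.sqrt(
        2.0 * err_old * (1.0 - err_old)
        * math.log(2.0 / switch_significance) * inv_n
    )

    if err_old - err_alt > bound:
        return "switch"  # alternate is significantly more accurate
    if err_alt - err_old > bound:
        return "prune"   # alternate is significantly less accurate
    return "keep"        # difference not yet significant


if __name__ == "__main__":
    print(compare_subtrees(err_old=0.40, n_old=1000, err_alt=0.10, n_alt=1000))
    print(compare_subtrees(err_old=0.20, n_old=1000, err_alt=0.19, n_alt=1000))
```

A smaller switch_significance widens the bound, so the sketch demands stronger evidence before switching or pruning.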
Bootstrap sampling is enabled by default (bootstrap_sampling=True), which weights each training instance with a Poisson-distributed random value to improve diversity and adaptation.
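The Poisson weighting can be illustrated with a short standalone sketch. The helper names poisson and bootstrap_weight are hypothetical, and two details are assumptions rather than facts from this page: the rate of 1 (the usual choice in online bagging) and the rule that a draw of 0 leaves the weight unchanged.

```python
import math
import random


def poisson(rate, rng):
    """Draw one Poisson(rate) sample using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def bootstrap_weight(w, rng, rate=1.0):
    """Scale an instance weight by a Poisson draw.

    Assumption: a draw of 0 keeps the original weight instead of
    discarding the instance; River's internal handling may differ.
    """
    k = poisson(rate, rng)
    return w * k if k > 0 else w


if __name__ == "__main__":
    rng = random.Random(42)
    # Each training instance receives its own randomized weight.
    print([bootstrap_weight(1.0, rng) for _ in range(10)])
```

Because each instance is effectively replayed a random number of times, two trees trained with different seeds see differently weighted versions of the same stream.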
Usage
Import this classifier when you need a single drift-adaptive decision tree for online classification on non-stationary data streams. It inherits all parameters from HoeffdingTreeClassifier and adds drift-specific parameters.
Code Reference
Source Location
river/tree/hoeffding_adaptive_tree_classifier.py:L22-L287
Signature
class HoeffdingAdaptiveTreeClassifier(HoeffdingTreeClassifier):
    def __init__(
        self,
        grace_period: int = 200,
        max_depth: int | None = None,
        split_criterion: str = "info_gain",
        delta: float = 1e-7,
        tau: float = 0.05,
        leaf_prediction: str = "nba",
        nb_threshold: int = 0,
        nominal_attributes: list | None = None,
        splitter: Splitter | None = None,
        bootstrap_sampling: bool = True,
        drift_window_threshold: int = 300,
        drift_detector: base.DriftDetector | None = None,
        switch_significance: float = 0.05,
        binary_split: bool = False,
        min_branch_fraction: float = 0.01,
        max_share_to_split: float = 0.99,
        max_size: float = 100.0,
        memory_estimate_period: int = 1000000,
        stop_mem_management: bool = False,
        remove_poor_attrs: bool = False,
        merit_preprune: bool = True,
        seed: int | None = None,
    ):
Import
from river import tree
Key Parameters (Drift-Specific)
| Parameter | Type | Default | Description |
|---|---|---|---|
| bootstrap_sampling | bool | True | If True, perform bootstrap sampling (Poisson weighting) in leaf nodes |
| drift_window_threshold | int | 300 | Minimum instances an alternate tree must observe before being considered as a replacement |
| drift_detector | DriftDetector or None | None (defaults to ADWIN()) | The drift detector used per node; if None, uses drift.ADWIN() |
| switch_significance | float | 0.05 | Significance level for assessing whether alternate subtrees are better than originals |
| seed | int or None | None | Random seed for reproducibility |
I/O Contract
Inputs
| Method | Parameter | Type | Description |
|---|---|---|---|
| learn_one | x | dict | Feature dictionary |
| learn_one | y | ClfTarget | Target class label |
| learn_one | w | float (keyword, default=1.0) | Instance weight |
| predict_proba_one | x | dict | Feature dictionary |
| predict_one | x | dict | Feature dictionary |
Outputs
| Method | Return Type | Description |
|---|---|---|
| predict_proba_one(x) | dict[ClfTarget, float] | Class probability distribution (normalized) |
| predict_one(x) | ClfTarget | Predicted class label (argmax of probabilities) |
| n_alternate_trees | int | Total number of alternate trees currently in the tree |
| n_pruned_alternate_trees | int | Total number of alternate trees that were pruned |
| n_switch_alternate_trees | int | Total number of alternate trees that replaced original subtrees |
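The relationship between the two prediction outputs can be shown with a small standalone sketch. The helpers normalize and argmax_label are hypothetical names used only for illustration; they mimic the documented contract (a normalized label-to-probability dict, and the label with the highest probability) without calling River.

```python
def normalize(scores):
    """Rescale non-negative class scores so that they sum to 1."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)  # nothing observed yet: leave scores as they are
    return {label: value / total for label, value in scores.items()}


def argmax_label(proba):
    """Return the label with the highest probability, or None if empty."""
    if not proba:
        return None
    return max(proba, key=proba.get)


if __name__ == "__main__":
    proba = normalize({"up": 3.0, "down": 1.0})
    print(proba, argmax_label(proba))
```

Handling the empty case explicitly matters in online learning, because a model may be asked for a prediction before it has seen any labeled instance.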
Usage Examples
Basic Drift-Adaptive Classification
from river import datasets, evaluate, metrics, tree
dataset = datasets.synth.ConceptDriftStream(
    stream=datasets.synth.SEA(seed=42, variant=0),
    drift_stream=datasets.synth.SEA(seed=42, variant=1),
    seed=1, position=500, width=50
)
dataset = iter(dataset.take(1000))
model = tree.HoeffdingAdaptiveTreeClassifier(
    grace_period=100,
    delta=1e-5,
    leaf_prediction='nb',
    nb_threshold=10,
    seed=0
)
metric = metrics.Accuracy()
evaluate.progressive_val_score(dataset, model, metric)
# Accuracy: 91.49%
Using on the Elec2 Dataset
from river import datasets, evaluate, metrics, tree
dataset = datasets.Elec2().take(5000)
model = tree.HoeffdingAdaptiveTreeClassifier(
    grace_period=200,
    drift_window_threshold=300,
    seed=42
)
metric = metrics.Accuracy()
evaluate.progressive_val_score(dataset, model, metric)
Inspecting Alternate Tree Statistics
from river import datasets, tree
model = tree.HoeffdingAdaptiveTreeClassifier(seed=42)
for x, y in datasets.Elec2().take(10000):
    model.learn_one(x, y)
print(f"Alternate trees: {model.n_alternate_trees}")
print(f"Pruned alternates: {model.n_pruned_alternate_trees}")
print(f"Switched alternates: {model.n_switch_alternate_trees}")