Implementation:Online ml River Tree HoeffdingAdaptiveTreeRegressor

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Decision_Trees, Regression, Concept_Drift
Last Updated	2026-02-08 16:00 GMT

Overview

Hoeffding Adaptive Tree Regressor (HATR) is the regression counterpart of the Hoeffding Adaptive Tree. It attaches a drift detector (ADWIN by default) to each node to monitor concept drift. When drift is detected, an alternate subtree is grown in the background and replaces the current subtree once it performs significantly better.

Description

HATR extends the standard Hoeffding Tree Regressor with adaptive mechanisms for non-stationary data streams. Each node maintains a drift detector (ADWIN by default) that monitors prediction errors. When drift is detected at a node, an alternate subtree rooted at that node begins growing in parallel. The algorithm periodically uses a statistical test to check whether the alternate subtree significantly outperforms the current one, and swaps them if it does.
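The replacement decision described above can be illustrated with a small, self-contained sketch. It is not River's implementation: the function name `compare_subtrees`, the return labels, and the two-sample Hoeffding-style bound on errors scaled to [0, 1] are assumptions chosen for illustration. Only the roles of `switch_significance` and `drift_window_threshold` come from this class's parameter list.

```python
import math


def compare_subtrees(curr_err, curr_n, alt_err, alt_n,
                     switch_significance=0.05, drift_window_threshold=300):
    """Decide what to do with an alternate subtree (illustrative only).

    curr_err / alt_err: mean errors of the current and alternate subtrees,
    assumed to be scaled to [0, 1].
    curr_n / alt_n: number of observations behind each mean.
    Returns "switch", "prune", or "keep".
    """
    # Assumption: both subtrees need enough observations before comparison.
    if curr_n <= drift_window_threshold or alt_n <= drift_window_threshold:
        return "keep"

    # Two-sample Hoeffding-style bound (a stand-in for the library's test).
    bound = math.sqrt(
        0.5 * math.log(2.0 / switch_significance) * (1.0 / curr_n + 1.0 / alt_n)
    )

    if curr_err - alt_err > bound:
        return "switch"  # alternate is significantly better
    if alt_err - curr_err > bound:
        return "prune"   # alternate is significantly worse
    return "keep"        # no significant difference yet


print(compare_subtrees(0.40, 1000, 0.20, 1000))  # switch
print(compare_subtrees(0.20, 1000, 0.40, 1000))  # prune
print(compare_subtrees(0.30, 1000, 0.31, 1000))  # keep
```

The three outcomes line up with the counters listed under Properties: a switch replaces the current subtree, a prune discards the alternate, and otherwise both keep learning.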

Key features:

ADWIN-based drift detection at each node
Background growth of alternate trees
Statistical significance testing for tree replacement
Bootstrap sampling support for improved performance
Error normalization based on the empirical rule for Gaussian distributions

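ADWIN expects bounded inputs, so HATR normalizes prediction errors before passing them to the detectors. The strategy assumes that prediction deviations follow a normal distribution, so nearly all of them fall within [-3σ, 3σ] (the empirical rule), and applies min-max normalization over that range.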
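A minimal sketch of this normalization in plain Python follows. The class name `ErrorNormalizer`, the use of Welford's running variance, the 0.5 fallback when no spread is available, and the final clipping to [0, 1] are assumptions made for illustration, not a copy of the library code.

```python
import math


class ErrorNormalizer:
    """Track signed errors and map them to [0, 1] using a 3-sigma range."""

    def __init__(self):
        self.n = 0       # number of errors seen
        self.mean = 0.0  # running mean of the errors
        self.m2 = 0.0    # running sum of squared deviations (Welford)

    def normalize(self, y_true, y_pred):
        err = y_true - y_pred

        # Welford's online update of mean and variance.
        self.n += 1
        delta = err - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (err - self.mean)

        # Without a usable spread estimate, return the midpoint (assumption).
        if self.n < 2:
            return 0.5
        sd = math.sqrt(self.m2 / (self.n - 1))
        if sd == 0.0:
            return 0.5

        # Min-max scaling with min = -3*sd and max = +3*sd.
        scaled = (err + 3.0 * sd) / (6.0 * sd)
        # Clip rare values outside the assumed range (assumption).
        return min(1.0, max(0.0, scaled))


normalizer = ErrorNormalizer()
for y_true, y_pred in [(1.0, 1.2), (2.0, 1.5), (0.5, 1.0), (3.0, 1.1)]:
    print(round(normalizer.normalize(y_true, y_pred), 3))
```

An error of zero maps to 0.5, and errors near ±3σ map to the ends of the unit interval, which gives a bounded-input detector a stable scale even when the target's magnitude changes.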

Usage

from river import datasets
from river import evaluate
from river import metrics
from river import tree
from river import preprocessing

dataset = datasets.TrumpApproval()

model = (
    preprocessing.StandardScaler() |
    tree.HoeffdingAdaptiveTreeRegressor(
        grace_period=50,
        model_selector_decay=0.3,
        seed=0
    )
)

metric = metrics.MAE()

evaluate.progressive_val_score(dataset, model, metric)
# MAE: 0.917576

Code Reference

Source Location: river/tree/hoeffding_adaptive_tree_regressor.py

Signature:

class HoeffdingAdaptiveTreeRegressor(HoeffdingTreeRegressor):
    def __init__(
        self,
        grace_period: int = 200,
        max_depth: int | None = None,
        delta: float = 1e-7,
        tau: float = 0.05,
        leaf_prediction: str = "adaptive",
        leaf_model: base.Regressor | None = None,
        model_selector_decay: float = 0.95,
        nominal_attributes: list | None = None,
        splitter: Splitter | None = None,
        min_samples_split: int = 5,
        bootstrap_sampling: bool = True,
        drift_window_threshold: int = 300,
        drift_detector: base.DriftDetector | None = None,
        switch_significance: float = 0.05,
        binary_split: bool = False,
        max_size: float = 500.0,
        memory_estimate_period: int = 1000000,
        stop_mem_management: bool = False,
        remove_poor_attrs: bool = False,
        merit_preprune: bool = True,
        seed: int | None = None,
    )

Import:

from river.tree import HoeffdingAdaptiveTreeRegressor

I/O Contract

Input:

x (dict): Feature dictionary with attribute names as keys
y (float): Target regression value
w (float, optional): Sample weight (default: 1.0)

Output:

predict_one(x): Predicted regression value (float)

Key Parameters

grace_period (int): Number of instances between split attempts
leaf_prediction (str): Prediction mechanism ('mean', 'model', 'adaptive')
leaf_model (Regressor): Base model for leaf predictions (default: LinearRegression)
model_selector_decay (float): Exponential decay factor applied to the squared errors of the leaf models, used when leaf_prediction='adaptive'; must lie between 0 and 1
bootstrap_sampling (bool): Enable bootstrap sampling in leaves
drift_window_threshold (int): Minimum number of observations an alternate tree must see before it is considered as a replacement for the current subtree
drift_detector (DriftDetector): Drift detection algorithm (default: ADWIN)
switch_significance (float): Significance level for subtree replacement tests
seed (int): Random seed for reproducibility

Implementation Details

Key Methods:

learn_one(x, y, w=1.0): Train on one instance with drift detection
predict_one(x): Predict by averaging predictions from reached leaves
_new_leaf(initial_stats, parent, is_active): Create adaptive leaf nodes
_branch_selector(numerical_feature, multiway_split): Select appropriate branch type

Node Types:

AdaLeafRegMean: Leaf predicting target mean with drift detection
AdaLeafRegModel: Leaf using learned model with drift detection
AdaLeafRegAdaptive: Leaf adaptively choosing between mean and model
AdaNumBinaryBranchReg/AdaNumMultiwayBranchReg: Numeric branch nodes
AdaNomBinaryBranchReg/AdaNomMultiwayBranchReg: Nominal branch nodes

Properties:

n_alternate_trees: Number of alternate trees currently growing
n_pruned_alternate_trees: Count of pruned alternate trees
n_switch_alternate_trees: Count of successful tree replacements

References

Bifet, Albert, and Ricard Gavaldà. "Adaptive learning from evolving data streams." In International Symposium on Intelligent Data Analysis, pp. 249-260. Springer, Berlin, Heidelberg, 2009.
