Implementation:Online ml River Tree HoeffdingAdaptiveTreeRegressor
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Regression, Concept_Drift |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Hoeffding Adaptive Tree Regressor (HATR) is the regression counterpart of the Hoeffding Adaptive Tree classifier. It attaches a drift detector (ADWIN by default) to each decision node to monitor concept drift. When a detector signals drift, an alternate subtree is grown in the background and replaces the original subtree once it performs significantly better.
Description
HATR extends the standard Hoeffding Tree Regressor with adaptive mechanisms for non-stationary data streams. Each decision node maintains an ADWIN drift detector that monitors prediction errors. When drift is detected at a node, an alternate tree begins growing in parallel with the existing subtree. Once the alternate tree has seen enough observations (drift_window_threshold), a statistical test at significance level switch_significance checks whether it outperforms the current subtree, and the two are swapped if it does.
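The replacement decision described above can be sketched as a one-sided two-sample z-test on the mean errors of the two subtrees. This is only an illustration under stated assumptions: the function name `alternate_is_better`, its arguments, and the choice of a z-test are hypothetical and are not taken from River's source, which may use a different test. Only the roles of `drift_window_threshold` and `switch_significance` come from this page.

```python
import math
from statistics import NormalDist


def alternate_is_better(
    main_err_mean, main_err_var, n_main,
    alt_err_mean, alt_err_var, n_alt,
    drift_window_threshold=300,
    switch_significance=0.05,
):
    """Hypothetical check: does the alternate subtree have a significantly lower mean error?"""
    # The alternate tree must first observe enough instances to be considered.
    if n_alt <= drift_window_threshold:
        return False

    # Standard error of the difference between the two mean errors.
    std_err = math.sqrt(main_err_var / n_main + alt_err_var / n_alt)
    if std_err == 0:
        return alt_err_mean < main_err_mean

    # One-sided test: a large positive z means the main subtree errs more.
    z = (main_err_mean - alt_err_mean) / std_err
    p_value = 1.0 - NormalDist().cdf(z)
    return p_value < switch_significance


# Clear improvement with enough observations: the swap is accepted.
print(alternate_is_better(1.0, 0.25, 1000, 0.5, 0.25, 1000))
# Same error statistics but too few alternate observations: no swap yet.
print(alternate_is_better(1.0, 0.25, 1000, 0.5, 0.25, 100))
```

A smaller `switch_significance` makes the test stricter, so subtrees are replaced less often.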
Key features:
- ADWIN-based drift detection at each node
- Background growth of alternate trees
- Statistical significance testing for tree replacement
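- Optional bootstrap sampling in the leaves (bootstrap_sampling, enabled by default)
- Error normalization based on the empirical (three-sigma) rule of Gaussian distributions
ADWIN expects bounded inputs, so HATR normalizes prediction errors before passing them to the detectors. The strategy assumes prediction deviations follow a normal distribution, in which case about 99.7% of them lie within [-3σ, 3σ]. Errors are min-max normalized over that range and then fed to the ADWIN detectors.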
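A minimal sketch of the normalization step, assuming the error spread σ is estimated online with a running variance and that errors are centered at zero. The class `ErrorNormalizer` and its methods are hypothetical names used for illustration, and clipping to [0, 1] is an added assumption rather than something stated on this page.

```python
import math


class ErrorNormalizer:
    """Hypothetical helper: running std of errors plus 3-sigma min-max scaling."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations (Welford's algorithm)

    def update(self, error):
        # Incrementally update the running mean and squared deviations.
        self.n += 1
        delta = error - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (error - self.mean)

    @property
    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def normalize(self, error):
        sd = self.std
        if sd == 0:
            # No spread estimate yet: return the midpoint of the range.
            return 0.5
        # Min-max scaling with min = -3*sd and max = +3*sd.
        scaled = (error + 3 * sd) / (6 * sd)
        # Assumption: clip the rare values that fall outside the 3-sigma band.
        return min(1.0, max(0.0, scaled))


normalizer = ErrorNormalizer()
for err in (-1.0, 1.0, 0.5, -0.5):
    normalizer.update(err)
print(normalizer.normalize(0.0))  # an error of zero maps to the midpoint, 0.5
```

With this mapping, an error of -3σ maps to 0, an error of 0 maps to 0.5, and an error of +3σ maps to 1, which gives the detector a bounded signal.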
Usage
from river import datasets
from river import evaluate
from river import metrics
from river import tree
from river import preprocessing
dataset = datasets.TrumpApproval()
model = (
    preprocessing.StandardScaler() |
    tree.HoeffdingAdaptiveTreeRegressor(
        grace_period=50,
        model_selector_decay=0.3,
        seed=0
    )
)
metric = metrics.MAE()
evaluate.progressive_val_score(dataset, model, metric)
# MAE: 0.917576
Code Reference
Source Location:
river/tree/hoeffding_adaptive_tree_regressor.py
Signature:
class HoeffdingAdaptiveTreeRegressor(HoeffdingTreeRegressor):
    def __init__(
        self,
        grace_period: int = 200,
        max_depth: int | None = None,
        delta: float = 1e-7,
        tau: float = 0.05,
        leaf_prediction: str = "adaptive",
        leaf_model: base.Regressor | None = None,
        model_selector_decay: float = 0.95,
        nominal_attributes: list | None = None,
        splitter: Splitter | None = None,
        min_samples_split: int = 5,
        bootstrap_sampling: bool = True,
        drift_window_threshold: int = 300,
        drift_detector: base.DriftDetector | None = None,
        switch_significance: float = 0.05,
        binary_split: bool = False,
        max_size: float = 500.0,
        memory_estimate_period: int = 1000000,
        stop_mem_management: bool = False,
        remove_poor_attrs: bool = False,
        merit_preprune: bool = True,
        seed: int | None = None,
    )
Import:
from river.tree import HoeffdingAdaptiveTreeRegressor
I/O Contract
Input:
- x (dict): Feature dictionary with attribute names as keys
- y (float): Target regression value
- w (float, optional): Sample weight (default: 1.0)
Output:
- predict_one(x): Predicted regression value (float)
Key Parameters
- grace_period (int): Number of instances a leaf observes between split attempts
- leaf_prediction (str): Prediction mechanism ('mean', 'model', 'adaptive')
- leaf_model (Regressor): Base model for leaf predictions (default: LinearRegression)
- model_selector_decay (float): Exponential decay factor applied to the leaf models' squared errors when leaf_prediction is 'adaptive'; must be between 0 and 1
- bootstrap_sampling (bool): Enable bootstrap sampling in leaves
- drift_window_threshold (int): Minimum number of observations an alternate tree must see before it is considered as a replacement for the current subtree
- drift_detector (DriftDetector): Drift detection algorithm (default: ADWIN)
- switch_significance (float): Significance level for subtree replacement tests
- seed (int): Random seed for reproducibility
Implementation Details
Key Methods:
- learn_one(x, y, w=1.0): Train on one instance with drift detection
- predict_one(x): Predict by averaging predictions from reached leaves
- _new_leaf(initial_stats, parent, is_active): Create adaptive leaf nodes
- _branch_selector(numerical_feature, multiway_split): Select appropriate branch type
Node Types:
- AdaLeafRegMean: Leaf predicting target mean with drift detection
- AdaLeafRegModel: Leaf using learned model with drift detection
- AdaLeafRegAdaptive: Leaf adaptively choosing between mean and model
- AdaNumBinaryBranchReg/AdaNumMultiwayBranchReg: Numeric branch nodes
- AdaNomBinaryBranchReg/AdaNomMultiwayBranchReg: Nominal branch nodes
Properties:
- n_alternate_trees: Number of alternate trees currently growing
- n_pruned_alternate_trees: Count of pruned alternate trees
- n_switch_alternate_trees: Count of successful tree replacements
Related Pages
- Online_ml_River_Tree_HoeffdingTreeRegressor
- Online_ml_River_Tree_HoeffdingAdaptiveTreeClassifier
- Online_ml_River_Drift_ADWIN
- Online_ml_River_Tree_Base_Nodes
References
Bifet, Albert, and Ricard Gavaldà. "Adaptive learning from evolving data streams." In International Symposium on Intelligent Data Analysis, pp. 249-260. Springer, Berlin, Heidelberg, 2009.