Implementation:Online ml River Forest OXTRegressor

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Random_Forests, Regression, Concept_Drift
Last Updated	2026-02-08 16:00 GMT

Overview

Online Extra Trees (OXT) is an ensemble regressor for data streams. It extends randomization beyond feature subsampling by also drawing split candidates at random, which speeds up training, and it uses drift detectors to adapt to concept drift.

Description

OXT builds on Adaptive Random Forests and adds three sources of randomization: each leaf tests only a single random split point per numerical feature instead of evaluating several candidates, the maximum depth can be randomized across trees (randomize_tree_depth), and the size of the feature subspace can vary per leaf. Each tree carries drift detectors that trigger its replacement when the concept changes. Optional resampling (bagging or subbagging) adds diversity among the trees. Random splits make training much faster, but they can cause a cold-start problem: early in the stream, predictions may be worse than those of methods that search for the best split. Each leaf predicts with the target mean, with a linear model, or by adaptively choosing between the two (leaf_prediction, "adaptive" by default).
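The single random split point described above can be illustrated with a short sketch. This is not River's implementation: the function name random_split_merit, the buffer of (feature value, target) pairs, and the use of variance reduction as the merit are assumptions chosen for illustration, using only the standard library.

```python
import random
from statistics import pvariance


def random_split_merit(buffer, rng):
    """Evaluate one random threshold for a single numerical feature.

    buffer: list of (feature_value, target) pairs observed at a leaf
            (hypothetical structure, assumed for this sketch).
    Returns (threshold, variance_reduction), or None if no split is possible.
    """
    values = [v for v, _ in buffer]
    lo, hi = min(values), max(values)
    if lo == hi:
        return None  # a constant feature cannot be split

    # The only candidate: one threshold drawn uniformly from the observed range.
    threshold = rng.uniform(lo, hi)

    left = [y for v, y in buffer if v <= threshold]
    right = [y for v, y in buffer if v > threshold]
    if not left or not right:
        return None  # degenerate draw, nothing falls on one side

    targets = [y for _, y in buffer]
    n = len(targets)
    weighted_child_var = (
        len(left) / n * pvariance(left) + len(right) / n * pvariance(right)
    )
    # Merit: how much the split reduces the variance of the target.
    return threshold, pvariance(targets) - weighted_child_var


rng = random.Random(0)
buffer = [(float(i), 0.0 if i < 5 else 10.0) for i in range(10)]
print(random_split_merit(buffer, rng))
```

Because only one threshold is scored per feature, the cost per split attempt does not grow with the number of distinct values, which is where the speedup comes from. The quality of that one threshold depends on the draw, which is one way to picture the cold-start effect.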

Usage

Use OXT for regression on data streams where concept drift is possible and fast processing is critical. Random splits make it cheaper to train than ensembles that evaluate many split candidates, at the price of weaker predictions early in the stream. It suits large-scale problems and streams with frequent drift. Set detection_mode to "all" to use both warning and drift detection, with background trees trained once a warning is raised; to "drop" to replace a tree as soon as drift is detected, without background trees; or to "off" to disable drift detection on stationary streams. Increase split_buffer_size (default 5) for more stable splits at the cost of memory.
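The guidance above can be written down as a small lookup. The helper oxt_drift_kwargs and its stream_kind labels are hypothetical and not part of River; only the keys detection_mode and split_buffer_size and the values "all", "drop", and "off" come from this page. The buffer size of 20 is an arbitrary illustrative value, not a recommendation.

```python
def oxt_drift_kwargs(stream_kind: str, stable_splits: bool = False) -> dict:
    """Map a rough description of the stream to OXTRegressor keyword arguments.

    stream_kind labels are invented for this sketch:
      "drifting_with_warnings" -> warning + drift detection, background trees
      "drifting"               -> replace a tree as soon as drift is detected
      "stationary"             -> no drift detection
    """
    modes = {
        "drifting_with_warnings": "all",
        "drifting": "drop",
        "stationary": "off",
    }
    if stream_kind not in modes:
        raise ValueError(f"unknown stream_kind: {stream_kind!r}")

    kwargs = {"detection_mode": modes[stream_kind]}
    if stable_splits:
        # Larger buffer: more stable split ranges, more memory (default is 5).
        kwargs["split_buffer_size"] = 20
    return kwargs


print(oxt_drift_kwargs("stationary", stable_splits=True))
```

The returned dictionary is meant to be unpacked into the constructor, for example forest.OXTRegressor(**oxt_drift_kwargs("drifting")), assuming River is installed.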

Code Reference

Source Location

Repository: Online_ml_River
File: river/forest/online_extra_trees.py

Signature

class OXTRegressor(
    n_models: int = 10,
    max_features: bool | str | int = "sqrt",
    resampling_strategy: str | None = "subbagging",
    resampling_rate: int | float = 0.5,
    detection_mode: str = "all",
    warning_detector: base.DriftDetector | None = None,
    drift_detector: base.DriftDetector | None = None,
    max_depth: int | None = None,
    randomize_tree_depth: bool = False,
    track_metric: metrics.base.RegressionMetric | None = None,
    disable_weighted_vote: bool = True,
    split_buffer_size: int = 5,
    seed: int | None = None,
    grace_period: int = 50,
    delta: float = 0.01,
    tau: float = 0.05,
    leaf_prediction: str = "adaptive",
    leaf_model: base.Regressor | None = None,
    model_selector_decay: float = 0.95,
    nominal_attributes: list | None = None,
    min_samples_split: int = 5,
    binary_split: bool = False,
    max_size: int = 500,
    memory_estimate_period: int = 2_000_000,
    stop_mem_management: bool = False,
    remove_poor_attrs: bool = False,
    merit_preprune: bool = True,
)

Import

from river import forest

I/O Contract

Input
Parameter	Type	Description
x	dict	Feature dictionary with numerical or categorical features
y	float	Target value for regression

Output
Method	Return Type	Description
predict_one(x)	float	Predicted target value: the plain or weighted average of the trees' predictions
learn_one(x, y)	None	Updates the trees with one instance, applying resampling and drift detection
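The contract above is typically exercised in a predict-then-learn loop. The sketch below shows that loop with a hypothetical stand-in model, RunningMean, so that it runs with the standard library alone. The stand-in is not part of River; any object exposing predict_one(x) and learn_one(x, y) as in the table could take its place.

```python
import math


class RunningMean:
    """Hypothetical stand-in exposing the two methods from the table above."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict_one(self, x: dict) -> float:
        # Ignores the features and predicts the mean target seen so far.
        return self.mean

    def learn_one(self, x: dict, y: float) -> None:
        # Incremental mean update with one (x, y) instance.
        self.n += 1
        self.mean += (y - self.mean) / self.n


def progressive_rmse(stream, model) -> float:
    """Predict each instance before learning from it, then report the RMSE."""
    squared_error, count = 0.0, 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # the label has not been seen yet
        model.learn_one(x, y)          # now update the model
        squared_error += (y - y_pred) ** 2
        count += 1
    return math.sqrt(squared_error / count) if count else 0.0


stream = [({"a": 1.0}, 2.0), ({"a": 2.0}, 2.0), ({"a": 3.0}, 2.0)]
print(progressive_rmse(stream, RunningMean()))
```

Predicting before learning means every instance acts as test data first, so the score reflects how the model would have performed on the live stream.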

Usage Examples

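from river import datasets
from river import evaluate
from river import metrics
from river import forest

# Synthetic regression stream, truncated to 5000 instances
dataset = datasets.synth.Friedman(seed=42).take(5000)

# Small ensemble of 3 trees; seed fixed for reproducibility
model = forest.OXTRegressor(n_models=3, seed=42)

metric = metrics.RMSE()

# Progressive validation: predict each instance, then learn from it
result = evaluate.progressive_val_score(dataset, model, metric)
print(result)  # RMSE: 3.16212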
