Implementation:Online ml River Forest OXTRegressor

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Random_Forests, Regression, Concept_Drift
Last Updated 2026-02-08 16:00 GMT

Overview

Online Extra Trees (OXT) is an ensemble regressor that extends randomization beyond feature subsampling to include random split candidates, providing fast adaptation to concept drift.

Description

OXT builds on Adaptive Random Forests by adding further randomization. Each tree leaf tests a single random split point per numerical feature instead of evaluating multiple candidates, the maximum depth can be randomized across trees (randomize_tree_depth), and the feature subspace size can vary per leaf. Each tree includes drift detection that triggers its replacement when the underlying concept changes. Optional resampling (bagging or subbagging) adds diversity among the trees. Random splits make training considerably faster, but they can cause a cold-start problem: early predictions may be worse than those of methods that search for the best split. Each tree can predict with the target mean, with a linear model, or by adaptively choosing between the two (leaf_prediction).
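The single-random-split idea can be illustrated with a small, self-contained sketch. This is not River's implementation: the helper name random_split_merit, the uniform draw between the observed minimum and maximum, and the use of variance reduction as the merit are all assumptions made for illustration.

```python
import random
from statistics import pvariance


def random_split_merit(values, targets, rng):
    """Score ONE randomly drawn threshold by variance reduction.

    Hypothetical helper: a real OXT leaf may draw and score its
    candidate differently.
    """
    lo, hi = min(values), max(values)
    threshold = rng.uniform(lo, hi)  # a single candidate, no search over thresholds

    left = [t for v, t in zip(values, targets) if v <= threshold]
    right = [t for v, t in zip(values, targets) if v > threshold]
    if not left or not right:
        return threshold, 0.0  # degenerate split: nothing gained

    n = len(targets)
    merit = (
        pvariance(targets)
        - (len(left) / n) * pvariance(left)
        - (len(right) / n) * pvariance(right)
    )
    return threshold, merit


rng = random.Random(42)
xs = [0.1, 0.4, 0.35, 0.8, 0.9, 0.7]   # buffered feature values (toy data)
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]    # matching targets (toy data)
threshold, merit = random_split_merit(xs, ys, rng)
print(f"threshold={threshold:.3f}, variance reduction={merit:.3f}")
```

Scoring one candidate costs a single pass over the buffered samples, which is where the speedup comes from. The price is that an unlucky draw can yield a weak split, which is consistent with the cold-start behavior described above.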

Usage

Use OXT for regression on data streams that may exhibit concept drift and where prediction speed is critical. It trades some initial accuracy for faster computation and faster adaptation, which suits large-scale problems and streams with frequent drift. Set detection_mode to "all" to use drift warnings with background trees, "drop" to simply replace drifted trees, or "off" for stationary streams. Increasing split_buffer_size gives more stable splits at the cost of memory.

Code Reference

Source Location

Signature

class OXTRegressor(
    n_models: int = 10,
    max_features: bool | str | int = "sqrt",
    resampling_strategy: str | None = "subbagging",
    resampling_rate: int | float = 0.5,
    detection_mode: str = "all",
    warning_detector: base.DriftDetector | None = None,
    drift_detector: base.DriftDetector | None = None,
    max_depth: int | None = None,
    randomize_tree_depth: bool = False,
    track_metric: metrics.base.RegressionMetric | None = None,
    disable_weighted_vote: bool = True,
    split_buffer_size: int = 5,
    seed: int | None = None,
    grace_period: int = 50,
    delta: float = 0.01,
    tau: float = 0.05,
    leaf_prediction: str = "adaptive",
    leaf_model: base.Regressor | None = None,
    model_selector_decay: float = 0.95,
    nominal_attributes: list | None = None,
    min_samples_split: int = 5,
    binary_split: bool = False,
    max_size: int = 500,
    memory_estimate_period: int = 2_000_000,
    stop_mem_management: bool = False,
    remove_poor_attrs: bool = False,
    merit_preprune: bool = True,
)

Import

from river import forest

I/O Contract

Input

Parameter | Type  | Description
x         | dict  | Feature dictionary with numerical or categorical features
y         | float | Target value for regression

Output

Method          | Return Type | Description
predict_one(x)  | float       | Predicted regression value (average or weighted)
learn_one(x, y) | None        | Updates trees with drift detection and resampling

Usage Examples

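from river import datasets
from river import evaluate
from river import forest
from river import metrics

# Synthetic regression stream (Friedman), truncated to 5,000 samples
dataset = datasets.synth.Friedman(seed=42).take(5000)

# Small ensemble of 3 trees; the seed is fixed for reproducibility
model = forest.OXTRegressor(n_models=3, seed=42)

metric = metrics.RMSE()

# Progressive validation: predict on each sample, then learn from it
result = evaluate.progressive_val_score(dataset, model, metric)
print(result)  # RMSE: 3.16212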
