Implementation:Online ml River Tree Splitter EBST
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Regression |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Extended Binary Search Tree (E-BST) splitter for regression trees. It stores every observation seen between splits in a binary search tree and prunes poor split candidates to limit memory use.
Description
EBSTSplitter implements the E-BST variant used by iSOUP-Tree. Each node of the binary search tree holds one observed feature value together with target statistics, so split candidates can be evaluated at any time with an in-order traversal. A node stores only the statistics of its left branch; the complete statistics for both sides of a candidate split are accumulated during the traversal. A memory management routine removes split candidates whose merit ratio marks them as poor, which prevents excessive memory usage.
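The structure described above can be illustrated with a minimal, library-free sketch. It is not River's implementation: the names `Stats`, `Node`, `candidates`, and `best_split` are hypothetical, the target is assumed to be a single number, and plain variance reduction stands in for whatever split criterion is actually used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stats:
    """Weighted count, sum, and sum of squares of target values (hypothetical helper)."""
    n: float = 0.0
    total: float = 0.0
    total_sq: float = 0.0

    def add(self, y: float, w: float = 1.0) -> None:
        self.n += w
        self.total += w * y
        self.total_sq += w * y * y

    def plus(self, other: "Stats") -> "Stats":
        return Stats(self.n + other.n, self.total + other.total,
                     self.total_sq + other.total_sq)

    def minus(self, other: "Stats") -> "Stats":
        return Stats(self.n - other.n, self.total - other.total,
                     self.total_sq - other.total_sq)

    def variance(self) -> float:
        if self.n <= 0:
            return 0.0
        mean = self.total / self.n
        return max(self.total_sq / self.n - mean * mean, 0.0)


class Node:
    """One observed feature value plus statistics of targets with value <= x."""

    def __init__(self, x: float, y: float, w: float = 1.0):
        self.x = x
        self.left_stats = Stats()
        self.left_stats.add(y, w)
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None

    def insert(self, x: float, y: float, w: float = 1.0) -> None:
        node = self
        while True:
            if x <= node.x:
                # Only left-branch statistics are updated on the way down.
                node.left_stats.add(y, w)
                if x == node.x:
                    return
                if node.left is None:
                    node.left = Node(x, y, w)
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(x, y, w)
                    return
                node = node.right


def candidates(node, total, carried=None):
    """In-order traversal yielding (threshold, left_stats, right_stats)."""
    if node is None:
        return
    if carried is None:
        carried = Stats()
    yield from candidates(node.left, total, carried)
    # Everything <= node.x: statistics carried from ancestors plus this node's own.
    left = carried.plus(node.left_stats)
    yield node.x, left, total.minus(left)
    yield from candidates(node.right, total, left)


def best_split(root, total):
    """Return (threshold, variance_reduction) of the best non-trivial split."""
    best = (None, float("-inf"))
    for x, left, right in candidates(root, total):
        if right.n <= 0:  # the largest value leaves nothing on the right
            continue
        vr = (total.variance()
              - (left.n / total.n) * left.variance()
              - (right.n / total.n) * right.variance())
        if vr > best[1]:
            best = (x, vr)
    return best


data = [(5.5, 10.2), (6.2, 12.5), (4.8, 9.1)]
root = Node(*data[0])
total = Stats()
total.add(data[0][1])
for x, y in data[1:]:
    root.insert(x, y)
    total.add(y)
threshold, reduction = best_split(root, total)
```

Because each node carries only left-branch statistics, an insertion touches just the nodes on one root-to-leaf path, while the right-hand statistics are recovered by subtracting from the pre-split total during traversal.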
Usage
Use EBSTSplitter when building regression trees that need every observed feature value evaluated as a split candidate. After a failed split attempt, memory management can be triggered through remove_bad_splits to discard poor candidates.
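To make "failed split attempt" concrete, the sketch below shows how a Hoeffding-style regression tree might decide whether to split and, when it does not, collect the three values that remove_bad_splits expects. This is an assumption about the calling tree, not something this page specifies: the function names, the `delta` and `tau` defaults, and the reading of `last_check_ratio` (second-best merit over best merit), `last_check_vr` (best merit), and `last_check_e` (Hoeffding bound) are all illustrative.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: float) -> float:
    """Standard Hoeffding bound: sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def check_split(best_merit: float, second_merit: float, n: float,
                delta: float = 1e-7, tau: float = 0.05):
    """Return (should_split, stats_for_memory_management). Hypothetical helper."""
    eps = hoeffding_bound(1.0, delta, n)  # merit ratios lie in [0, 1]
    ratio = second_merit / best_merit
    # Split when the best candidate is clearly ahead, or when the bound is
    # so small that the tie can be broken arbitrarily.
    should_split = ratio < 1.0 - eps or eps < tau
    stats = {
        "last_check_ratio": ratio,
        "last_check_vr": best_merit,
        "last_check_e": eps,
    }
    return should_split, stats


should_split, stats = check_split(best_merit=1.0, second_merit=0.95,
                                  n=100, delta=math.exp(-2))
```

When `should_split` is false, the values in `stats` could be passed as the keyword arguments of the same names to remove_bad_splits.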
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/ebst_splitter.py
Signature
class EBSTSplitter(Splitter):
    def __init__(self):
        ...

    def update(self, att_val, target_val, w):
        ...

    def cond_proba(self, att_val, target_val):
        raise NotImplementedError

    def best_evaluated_split_suggestion(self, criterion, pre_split_dist, att_idx, binary_only=True):
        ...

    def remove_bad_splits(
        self,
        criterion,
        last_check_ratio: float,
        last_check_vr: float,
        last_check_e: float,
        pre_split_dist: list | dict,
    ):
        ...

    @property
    def is_target_class(self) -> bool:
        return False


class EBSTNode:
    def __init__(self, att_val, target_val, w):
        ...

    def insert_value(self, att_val, target_val, w):
        ...
Import
from river.tree.splitter import EBSTSplitter
I/O Contract
| Input | Type | Description |
|---|---|---|
| att_val | float | Numerical feature value |
| target_val | float/dict | Target value (supports multi-target) |
| w | float | Sample weight |
| criterion | SplitCriterion | Split evaluation criterion |

| Output | Type | Description |
|---|---|---|
| split_suggestion | BranchFactory | Best binary split with merit and statistics |
| removed_nodes | int | Number of bad split candidates removed |
Usage Examples
from river.stats import Var
from river.tree.split_criterion import VarianceRatioSplitCriterion
from river.tree.splitter import EBSTSplitter

splitter = EBSTSplitter()

# Update with observations: (feature value, target value, sample weight)
splitter.update(5.5, 10.2, 1.0)
splitter.update(6.2, 12.5, 1.0)
splitter.update(4.8, 9.1, 1.0)

# Build the pre-split target statistics from the same targets
criterion = VarianceRatioSplitCriterion()
pre_split = Var()
pre_split.update(10.2, 1.0)
pre_split.update(12.5, 1.0)
pre_split.update(9.1, 1.0)

# Get the best split
suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='feature1',
    binary_only=True,
)

# Remove bad splits after a failed split attempt
splitter.remove_bad_splits(
    criterion=criterion,
    last_check_ratio=0.95,
    last_check_vr=0.5,
    last_check_e=0.01,
    pre_split_dist=pre_split,
)
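The effect of the three `last_check_*` arguments can be illustrated with a small standalone sketch. It assumes the pruning test follows the FIMT-DD heuristic, in which a candidate is considered bad when its merit, relative to the best merit of the last check, falls below the last merit ratio minus twice the Hoeffding bound. This page does not confirm that EBSTSplitter uses exactly this test, and `is_bad_split` and the candidate merits are hypothetical.

```python
def is_bad_split(merit: float, last_check_ratio: float,
                 last_check_vr: float, last_check_e: float) -> bool:
    """Assumed FIMT-DD-style test for a poor split candidate."""
    return merit / last_check_vr < last_check_ratio - 2.0 * last_check_e


# Hypothetical merits of four split candidates, checked against the
# values used in the remove_bad_splits call above.
candidate_merits = [0.49, 0.47, 0.30, 0.10]
kept = [
    m for m in candidate_merits
    if not is_bad_split(m, last_check_ratio=0.95, last_check_vr=0.5, last_check_e=0.01)
]
```

Under this assumption, candidates far below the best merit are dropped while close competitors are kept for later split attempts.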