Implementation:Online ml River Tree Splitter EBST

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Decision_Trees, Regression
Last Updated: 2026-02-08 16:00 GMT

Overview

Extended Binary Search Tree (E-BST) splitter for regression trees. It stores every observation seen between splits and prunes poor split candidates to limit memory use.

Description

EBSTSplitter implements the E-BST structure as used in iSOUP-Tree. Observed feature values are stored as nodes of a binary search tree, and each node keeps target statistics alongside its value, so split candidates can be evaluated at any time with an in-order traversal. A node stores statistics for its left branch only (observations whose feature value is less than or equal to the node's value); the complete statistics needed to score a split are computed during the traversal. A memory management routine removes split candidates whose merit ratio is too low, which prevents excessive memory usage.
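The structure described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not River's implementation: the names Stats, Node, insert, candidates, and best_split are hypothetical, targets are single floats, and merit is plain variance reduction.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Sufficient statistics of a weighted target sample (hypothetical helper)."""
    n: float = 0.0   # total weight
    s: float = 0.0   # weighted sum of targets
    ss: float = 0.0  # weighted sum of squared targets

    def add(self, y, w=1.0):
        self.n += w
        self.s += w * y
        self.ss += w * y * y

    def __add__(self, other):
        return Stats(self.n + other.n, self.s + other.s, self.ss + other.ss)

    def __sub__(self, other):
        return Stats(self.n - other.n, self.s - other.s, self.ss - other.ss)

    def variance(self):
        # Population variance; defined as 0 for an empty sample.
        if self.n <= 0:
            return 0.0
        mean = self.s / self.n
        return max(self.ss / self.n - mean * mean, 0.0)


class Node:
    def __init__(self, x, y, w):
        self.x = x
        # Statistics of observations with value <= x that passed through this node.
        self.left_stats = Stats()
        self.left_stats.add(y, w)
        self.left = None
        self.right = None


def insert(root, x, y, w=1.0):
    """Insert one observation and return the (possibly new) root."""
    if root is None:
        return Node(x, y, w)
    node = root
    while True:
        if x <= node.x:
            node.left_stats.add(y, w)  # only left-branch statistics are stored
            if x == node.x:
                return root
            if node.left is None:
                node.left = Node(x, y, w)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(x, y, w)
                return root
            node = node.right


def candidates(node, total, carried=None):
    """In-order traversal yielding (threshold, left_stats, right_stats)."""
    if node is None:
        return
    if carried is None:
        carried = Stats()
    yield from candidates(node.left, total, carried)
    # Complete left statistics: this node's stats plus those carried from
    # ancestors where the traversal went right.
    left = node.left_stats + carried
    yield node.x, left, total - left
    yield from candidates(node.right, total, left)


def best_split(root, total):
    """Return (threshold, merit) maximizing variance reduction, or None."""
    best = None
    for threshold, left, right in candidates(root, total):
        if right.n <= 0:
            continue  # no observation would go to the right branch
        merit = (total.variance()
                 - left.n / total.n * left.variance()
                 - right.n / total.n * right.variance())
        if best is None or merit > best[1]:
            best = (threshold, merit)
    return best
```

Feeding the observations (5, 1.0), (6, 10.0), (4, 1.0), (7, 10.0) through insert and then calling best_split with the total statistics selects the threshold 5, which separates the two target levels exactly.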

Usage

Use EBSTSplitter when building regression trees that require exhaustive split evaluation, where every stored feature value is considered as a split threshold. After a failed split attempt, call remove_bad_splits to discard poor candidates and free memory.
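The pruning decision can be illustrated with a small predicate. The sketch assumes that last_check_vr is the merit of the best candidate in the last failed split attempt, last_check_ratio is the ratio of the second-best merit to the best merit in that attempt, and last_check_e is the Hoeffding bound computed there. The function name is_bad_split is hypothetical, and the threshold form follows the FIMT-DD style rule rather than being copied from River's code.

```python
def is_bad_split(merit, last_check_ratio, last_check_vr, last_check_e):
    """Return True if a candidate looks clearly worse than the best one.

    Assumption: a candidate is bad when its merit, relative to the best
    merit seen at the last failed split attempt, falls below the last
    merit ratio by more than twice the Hoeffding bound.
    """
    if last_check_vr <= 0:
        return False  # no usable reference merit, so keep the candidate
    return merit / last_check_vr < last_check_ratio - 2 * last_check_e


# Hypothetical candidate merits keyed by split threshold.
candidate_merits = {4.8: 0.40, 5.5: 0.48}
kept = {
    threshold: merit
    for threshold, merit in candidate_merits.items()
    if not is_bad_split(merit, last_check_ratio=0.95,
                        last_check_vr=0.5, last_check_e=0.01)
}
```

With these values the cut-off ratio is roughly 0.93, so the candidate with merit 0.40 (ratio 0.8) is dropped and the one with merit 0.48 (ratio 0.96) is kept.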

Code Reference

Source Location

Signature

class EBSTSplitter(Splitter):
    def __init__(self):
        ...

    def update(self, att_val, target_val, w):
        ...

    def cond_proba(self, att_val, target_val):
        raise NotImplementedError

    def best_evaluated_split_suggestion(self, criterion, pre_split_dist, att_idx, binary_only=True):
        ...

    def remove_bad_splits(
        self,
        criterion,
        last_check_ratio: float,
        last_check_vr: float,
        last_check_e: float,
        pre_split_dist: list | dict,
    ):
        ...

    @property
    def is_target_class(self) -> bool:
        return False


class EBSTNode:
    def __init__(self, att_val, target_val, w):
        ...

    def insert_value(self, att_val, target_val, w):
        ...

Import

from river.tree.splitter import EBSTSplitter

I/O Contract

Input             Type            Description
att_val           float           Numerical feature value
target_val        float/dict      Target value (supports multi-target)
w                 float           Sample weight
criterion         SplitCriterion  Split evaluation criterion

Output            Type            Description
split_suggestion  BranchFactory   Best binary split with merit and statistics
removed_nodes     int             Number of bad split candidates removed

Usage Examples

from river.tree.splitter import EBSTSplitter
from river.tree.split_criterion import VarianceRatioSplitCriterion
from river.stats import Var

splitter = EBSTSplitter()

# Update with observations
splitter.update(5.5, 10.2, 1.0)
splitter.update(6.2, 12.5, 1.0)
splitter.update(4.8, 9.1, 1.0)

# Collect the pre-split target statistics, then ask for the best split
criterion = VarianceRatioSplitCriterion()
pre_split = Var()
pre_split.update(10.2, 1.0)
pre_split.update(12.5, 1.0)
pre_split.update(9.1, 1.0)

suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='feature1',
    binary_only=True
)

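# Remove bad splits after a failed split attempt.
# The last_check_* values here are illustrative; in a tree they are taken
# from the last failed split attempt.
splitter.remove_bad_splits(
    criterion=criterion,
    last_check_ratio=0.95,  # ratio of second-best to best merit
    last_check_vr=0.5,      # merit of the best candidate
    last_check_e=0.01,      # Hoeffding bound
    pre_split_dist=pre_split,
)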
