Implementation:Online ml River Tree Splitter TEBST
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Regression |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Truncated E-BST splitter that rounds feature values before storage to reduce memory and processing time.
Description
TEBSTSplitter is a memory-optimized variant of EBSTSplitter that rounds incoming feature values to a specified number of decimal places before inserting them into the binary search tree. Values that differ only beyond the chosen decimal place collapse into a single node, so the tree holds fewer unique nodes. The trade-off is slightly coarser split candidates in exchange for lower memory use and faster processing.
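The effect of rounding before insertion can be illustrated without River. The following is a minimal sketch, not the library's implementation: a plain dictionary stands in for the binary search tree, the class name `RoundedStats` and its `n_keys` property are hypothetical, and Python's built-in `round` is assumed as the rounding step.

```python
from collections import defaultdict


class RoundedStats:
    """Hypothetical stand-in for the tree: one bucket per rounded feature value."""

    def __init__(self, digits: int = 1):
        self.digits = digits
        # key -> [total weight, weighted sum of targets]
        self.buckets = defaultdict(lambda: [0.0, 0.0])

    def update(self, att_val: float, target_val: float, w: float = 1.0) -> None:
        # Values that agree after rounding share a bucket (assumed rounding rule).
        key = round(att_val, self.digits)
        bucket = self.buckets[key]
        bucket[0] += w
        bucket[1] += w * target_val

    @property
    def n_keys(self) -> int:
        return len(self.buckets)


stats = RoundedStats(digits=2)
for x, y in [(5.5234, 10.2), (5.5189, 12.5), (4.8123, 9.1)]:
    stats.update(x, y)

print(stats.n_keys)           # 2 distinct keys instead of 3
print(sorted(stats.buckets))  # [4.81, 5.52]
```

The two observations near 5.52 accumulate into one bucket, which is the same merging that keeps the truncated tree smaller than a tree keyed on the raw values.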
Usage
Use TEBSTSplitter when memory or processing time is constrained and the model can tolerate rounded numerical features. Set the digits parameter to balance split precision against memory use: fewer digits merge more values into each node.
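One way to pick a value for digits is to count how many distinct values remain after rounding a sample of the feature, as a rough proxy for the number of nodes the tree would need. This is a sketch under that assumption; the helper `distinct_keys_by_digits` and the sample values are hypothetical and not part of River.

```python
def distinct_keys_by_digits(values, candidates=(0, 1, 2, 3)):
    """Hypothetical helper: number of distinct rounded values per candidate `digits`."""
    return {d: len({round(v, d) for v in values}) for d in candidates}


# Illustrative sample of one numerical feature
sample = [0.1234, 0.1299, 0.1811, 0.4402, 0.4467, 0.9051]

counts = distinct_keys_by_digits(sample)
for digits, n_keys in counts.items():
    print(f"digits={digits}: {n_keys} distinct values")
```

For this sample the count stops growing between 2 and 3 digits, which suggests that extra precision beyond 2 digits would not add split candidates here. On real data, the sample should be large enough to reflect the feature's range.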
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/tebst_splitter.py
Signature
class TEBSTSplitter(EBSTSplitter):
    def __init__(self, digits: int = 1):
        ...
    def update(self, att_val, target_val, w):
        ...
    def cond_proba(self, att_val, target_val):
        raise NotImplementedError
Import
from river.tree.splitter import TEBSTSplitter
I/O Contract
| Input | Type | Description |
|---|---|---|
| digits | int | Decimal places for rounding (default 1) |
| att_val | float | Numerical feature value to be rounded |
| target_val | float/dict | Target value (supports multi-target) |
| w | float | Sample weight |
| Output | Type | Description |
|---|---|---|
| split_suggestion | BranchFactory | Best split using rounded values |
Usage Examples
from river.tree.splitter import TEBSTSplitter
from river.tree.split_criterion import VarianceRatioSplitCriterion
from river.stats import Var
# Create truncated E-BST with 2 decimal places
splitter = TEBSTSplitter(digits=2)
# Update with precise values (will be rounded)
splitter.update(5.5234, 10.2, 1.0)
splitter.update(5.5189, 12.5, 1.0) # Rounded to same value as above
splitter.update(4.8123, 9.1, 1.0)
# The tree has fewer nodes due to rounding
# 5.5234 and 5.5189 both become 5.52
# Get best split
criterion = VarianceRatioSplitCriterion()
pre_split = Var()
pre_split.update(10.2, 1.0)
pre_split.update(12.5, 1.0)
pre_split.update(9.1, 1.0)
suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='feature1',
    binary_only=True,
)
print(f"Threshold: {suggestion.split_info}")
# Compare with regular EBST
from river.tree.splitter import EBSTSplitter
regular = EBSTSplitter()
# Fed the same three observations, the regular E-BST would store 5.5234 and 5.5189 as separate nodes