Implementation:Online ml River Tree Splitter Base
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Abstract base classes defining the interface for attribute observers (splitters) and feature quantizers in decision trees.
Description
This module provides two abstract base classes, Splitter and Quantizer. Splitter defines the interface for attribute observers: objects that monitor an input feature, accumulate statistics about it as observations arrive, estimate the probability of a feature value given a target, and suggest the best split according to a supplied split criterion. Quantizer defines the interface used by Stochastic Gradient Trees to discretize a feature by accumulating gradient and hessian statistics in bins.
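To make the binning idea concrete, here is a minimal sketch of a quantizer that follows the same three-method shape (__len__, update, __iter__). It uses only the standard library and does not import River: QuantizerSketch, GradHessSketch, FixedRadiusQuantizer and the radius parameter are hypothetical names chosen for illustration, and the fixed-width binning rule is an assumption rather than the behavior of any River quantizer.

```python
import abc
import math
from dataclasses import dataclass


@dataclass
class GradHessSketch:
    """Hypothetical stand-in for a gradient/hessian pair."""
    gradient: float = 0.0
    hessian: float = 0.0


class QuantizerSketch(abc.ABC):
    """Stdlib-only mirror of the three abstract methods listed for Quantizer."""

    @abc.abstractmethod
    def __len__(self): ...

    @abc.abstractmethod
    def update(self, x_val, gh, w): ...

    @abc.abstractmethod
    def __iter__(self): ...


class FixedRadiusQuantizer(QuantizerSketch):
    """Toy quantizer: bins of fixed width `radius`, keyed by floor(x / radius)."""

    def __init__(self, radius=0.5):
        self.radius = radius
        # bin index -> [weighted sum of x, accumulated stats, total weight]
        self._bins = {}

    def __len__(self):
        return len(self._bins)

    def update(self, x_val, gh, w):
        # Assumes a positive sample weight w.
        key = math.floor(x_val / self.radius)
        slot = self._bins.setdefault(key, [0.0, GradHessSketch(), 0.0])
        slot[0] += w * x_val
        slot[1].gradient += w * gh.gradient
        slot[1].hessian += w * gh.hessian
        slot[2] += w

    def __iter__(self):
        # Yield (mean feature value in the bin, accumulated stats), in bin order.
        for key in sorted(self._bins):
            x_sum, stats, weight = self._bins[key]
            yield x_sum / weight, stats


quantizer = FixedRadiusQuantizer(radius=0.5)
quantizer.update(0.1, GradHessSketch(1.0, 0.5), 1.0)
quantizer.update(0.3, GradHessSketch(-0.5, 0.5), 1.0)
quantizer.update(1.2, GradHessSketch(2.0, 1.0), 1.0)
bins = list(quantizer)
```

The first two values fall into the same bin, so their gradients and hessians are summed, while the third value opens a new bin. A tree can then iterate over the bins in order and evaluate candidate thresholds between neighbouring bins.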
Usage
Do not instantiate these base classes directly. Instead, implement concrete splitters (like GaussianSplitter, EBSTSplitter) or quantizers (like DynamicQuantizer, StaticQuantizer) that fulfill their interface contracts.
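The sketch below illustrates what fulfilling the interface contract can look like. It is deliberately self-contained and does not import River: SplitterSketch mirrors the abstract methods listed in the Signature section, CountingSplitter is a hypothetical toy observer for a nominal feature, the criterion is assumed to be a plain callable taking the pre-split and post-split distributions, and the suggestion is returned as a dict rather than a BranchFactory.

```python
import abc
from collections import defaultdict


class SplitterSketch(abc.ABC):
    """Stdlib-only mirror of the Splitter interface (not River's class)."""

    @abc.abstractmethod
    def update(self, att_val, target_val, w): ...

    @abc.abstractmethod
    def cond_proba(self, att_val, target_val): ...

    @abc.abstractmethod
    def best_evaluated_split_suggestion(
        self, criterion, pre_split_dist, att_idx, binary_only
    ): ...

    @property
    def is_numeric(self):
        return True


class CountingSplitter(SplitterSketch):
    """Toy observer for a nominal feature: weighted counts per (class, value)."""

    def __init__(self):
        self._counts = defaultdict(lambda: defaultdict(float))

    @property
    def is_numeric(self):
        return False

    def update(self, att_val, target_val, w):
        self._counts[target_val][att_val] += w

    def cond_proba(self, att_val, target_val):
        # Weighted-frequency estimate of P(att_val | target_val).
        class_counts = self._counts.get(target_val)
        if not class_counts:
            return 0.0
        return class_counts.get(att_val, 0.0) / sum(class_counts.values())

    def best_evaluated_split_suggestion(
        self, criterion, pre_split_dist, att_idx, binary_only
    ):
        # One branch per observed value; binary_only is ignored in this sketch.
        values = sorted({v for per_class in self._counts.values() for v in per_class})
        post_split_dist = [
            {c: per_class[v] for c, per_class in self._counts.items() if v in per_class}
            for v in values
        ]
        merit = criterion(pre_split_dist, post_split_dist)
        return {"merit": merit, "feature": att_idx, "children_stats": post_split_dist}


# The abstract class itself cannot be instantiated.
try:
    SplitterSketch()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

splitter = CountingSplitter()
splitter.update("red", "A", 1.0)
splitter.update("red", "A", 1.0)
splitter.update("blue", "A", 2.0)
splitter.update("blue", "B", 1.0)
p_red_given_a = splitter.cond_proba("red", "A")
suggestion = splitter.best_evaluated_split_suggestion(
    criterion=lambda pre, post: float(len(post)),  # placeholder criterion
    pre_split_dist={"A": 4.0, "B": 1.0},
    att_idx="color",
    binary_only=False,
)
```

A subclass that leaves any of the three abstract methods unimplemented fails at instantiation time with a TypeError, which is the mechanism that enforces the contract.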
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/base.py
Signature
class Splitter(base.Estimator, abc.ABC):
    @abc.abstractmethod
    def update(self, att_val, target_val: base.typing.Target, w: float) -> None:
        pass

    @abc.abstractmethod
    def cond_proba(self, att_val, target_val: base.typing.ClfTarget) -> float:
        pass

    @abc.abstractmethod
    def best_evaluated_split_suggestion(
        self,
        criterion: SplitCriterion,
        pre_split_dist: list | dict,
        att_idx: base.typing.FeatureName,
        binary_only: bool,
    ) -> BranchFactory:
        pass

    @property
    def is_numeric(self) -> bool:
        return True

    @property
    def is_target_class(self) -> bool:
        return True


class Quantizer(base.Estimator, abc.ABC):
    @abc.abstractmethod
    def __len__(self):
        pass

    @abc.abstractmethod
    def update(self, x_val, gh: GradHess, w: float) -> None:
        pass

    @abc.abstractmethod
    def __iter__(self) -> tuple[float, typing.Iterator[GradHessStats]]:
        pass
Import
from river.tree.splitter.base import Splitter
from river.tree.splitter.base import Quantizer
I/O Contract
| Input (Splitter) | Type | Description |
|---|---|---|
| att_val | any | Feature value |
| target_val | Target | Target value (class or numeric) |
| w | float | Sample weight |
| criterion | SplitCriterion | Split evaluation criterion |
| pre_split_dist | list or dict | Target statistics observed before the split |
| att_idx | FeatureName | Identifier of the feature being evaluated |
| binary_only | bool | Whether only binary splits may be suggested |
| Output (Splitter) | Type | Description |
|---|---|---|
| cond_proba | float | Conditional probability of att_val given target_val |
| split_suggestion | BranchFactory | Best split candidate with merit score |
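To give a sense of how a criterion turns pre-split and post-split distributions into a merit score, the following sketch computes a textbook Gini gain. This is an assumption made for illustration: gini_impurity and gini_gain are hypothetical helpers, and River's own GiniSplitCriterion may scale or define merit differently.

```python
def gini_impurity(dist):
    """Gini impurity of a class-weight distribution such as {'A': 10, 'B': 8}."""
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in dist.values())


def gini_gain(pre_split_dist, post_split_dist):
    """Impurity before the split minus the weighted impurity of the branches."""
    total = sum(pre_split_dist.values())
    if total == 0:
        return 0.0
    weighted = sum(
        (sum(branch.values()) / total) * gini_impurity(branch)
        for branch in post_split_dist
    )
    return gini_impurity(pre_split_dist) - weighted


pre_split = {"A": 10, "B": 8}
perfect = [{"A": 10}, {"B": 8}]                 # each branch is pure
useless = [{"A": 5, "B": 4}, {"A": 5, "B": 4}]  # branches mirror the parent

merit_perfect = gini_gain(pre_split, perfect)
merit_useless = gini_gain(pre_split, useless)
```

Under this definition a split that separates the classes completely recovers the full parent impurity as merit, while a split whose branches keep the parent's class proportions scores zero.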
Usage Examples
# Splitter is abstract, use concrete implementation
from river.tree.splitter import GaussianSplitter
from river.tree.split_criterion import GiniSplitCriterion
splitter = GaussianSplitter(n_splits=10)
# Update with observations
splitter.update(att_val=5.5, target_val='A', w=1.0)
splitter.update(att_val=6.2, target_val='B', w=1.0)
# Get conditional probability
prob = splitter.cond_proba(5.5, 'A')
# Find best split
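# Assumption: recent River releases require min_branch_fraction; the value is illustrative
criterion = GiniSplitCriterion(min_branch_fraction=0.01)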
pre_split = {'A': 10, 'B': 8}
suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='feature1',
    binary_only=True,
)
print(f"Split merit: {suggestion.merit}")
print(f"Split threshold: {suggestion.split_info}")