Implementation:Online ml River Tree Splitter QO
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Regression |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
QOSplitter is the Quantization Observer (QO) splitter. It uses hash-based quantization to discretize numerical features so that split candidates in regression trees can be evaluated efficiently.
Description
QOSplitter implements hash-based feature quantization for regression trees. It discretizes incoming feature values into equal-sized intervals (slots) defined by the radius parameter. Split candidates are the midpoints between consecutive hash slots. The splitter supports both binary and multiway splits. Memory and computation grow as the radius shrinks: a smaller radius yields more slots and a finer discretization, at a higher cost. The splitter assumes that features are scaled to similar ranges. It is regression-only: is_target_class returns False and cond_proba raises NotImplementedError.
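The quantization idea described above can be illustrated with a minimal, library-free sketch. It is not River's code: the helper names quantize and candidate_split_points are hypothetical, and both the slot key floor(x / radius) and the use of per-slot means as slot representatives are assumptions made for illustration.

```python
import math
from collections import defaultdict


def quantize(values, radius):
    """Group values into slots of width `radius`, keyed by floor(x / radius).

    Assumption: the key function is illustrative, not taken from River.
    """
    slots = defaultdict(list)
    for x in values:
        slots[math.floor(x / radius)].append(x)
    return dict(slots)


def candidate_split_points(slots):
    """Midpoints between the mean values of consecutive occupied slots."""
    centers = [sum(v) / len(v) for _, v in sorted(slots.items())]
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


slots = quantize([0.1, 0.2, 0.6, 1.3, -0.4], radius=0.5)
print(sorted(slots))  # [-1, 0, 1, 2]
print(candidate_split_points(slots))
```

Only occupied slots are stored, so the number of candidates is at most the number of occupied slots minus one, regardless of how many observations were seen.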
Usage
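Use QOSplitter for regression when you want efficient, memory-bounded split evaluation. Scale features first with preprocessing.StandardScaler, then set radius as a fraction of the feature's standard deviation. For a standardized feature (unit variance), the default of 0.25 corresponds to a quarter of a standard deviation.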
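To get a feel for the radius trade-off, the following sketch counts how many slots a standardized feature would occupy for several radii. It is a rough illustration under stated assumptions: count_slots is a hypothetical helper, the slot key floor(x / radius) is assumed rather than taken from River, and the Gaussian samples merely stand in for a feature that has already been scaled to zero mean and unit variance.

```python
import math
import random


def count_slots(values, radius):
    """Number of distinct slots occupied when keying by floor(x / radius)."""
    return len({math.floor(x / radius) for x in values})


rng = random.Random(42)
# Stand-in for a standardized feature: zero mean, unit variance
scaled = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Smaller radius -> more occupied slots -> more memory and more candidates
slot_counts = {r: count_slots(scaled, r) for r in (1.0, 0.5, 0.25, 0.1)}
for radius, n_slots in slot_counts.items():
    print(f"radius={radius}: {n_slots} occupied slots")
```

Because the number of occupied slots depends on the value range divided by the radius, unscaled features with very different ranges would need very different radii, which is why scaling first is recommended.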
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/qo_splitter.py
Signature
class QOSplitter(Splitter):
    def __init__(self, radius: float = 0.25, allow_multiway_splits=False):
        ...
    def update(self, att_val, target_val, w):
        ...
    def cond_proba(self, att_val, target_val):
        raise NotImplementedError
    def best_evaluated_split_suggestion(self, criterion, pre_split_dist, att_idx, binary_only=True):
        ...
    @property
    def is_target_class(self) -> bool:
        return False
class FeatureQuantizer:
    def __init__(self, radius: float):
        ...
    def update(self, x: float, y: float | utils.VectorDict, weight: float):
        ...
    def __iter__(self):
        ...
Import
from river.tree.splitter import QOSplitter
I/O Contract
| Input | Type | Description |
|---|---|---|
| att_val | float | Numerical feature value passed to update |
| target_val | float/dict | Target value passed to update (supports multi-target) |
| w | float | Sample weight passed to update |
| radius | float | Quantization radius, set in the constructor (default 0.25) |
| allow_multiway_splits | bool | Enable multiway splits, set in the constructor (default False) |

| Output | Type | Description |
|---|---|---|
| split_suggestion | BranchFactory | Best split (binary or multiway) with statistics, returned by best_evaluated_split_suggestion |
Usage Examples
from river import preprocessing
from river.stats import Var
from river.tree.split_criterion import VarianceRatioSplitCriterion
from river.tree.splitter import QOSplitter
# Scale features first so that radius is expressed in standard deviations
scaler = preprocessing.StandardScaler()
# Create QO splitter with custom radius
splitter = QOSplitter(radius=0.5, allow_multiway_splits=True)
# Running target statistics observed before the split
pre_split = Var()
# Update the scaler, the splitter, and the target statistics one observation at a time
for x_val, y_val in [(0.5, 25.5), (0.8, 30.2), (0.3, 22.1)]:
    scaler.learn_one({'x': x_val})
    x_scaled = scaler.transform_one({'x': x_val})['x']
    splitter.update(x_scaled, y_val, w=1.0)
    pre_split.update(y_val, 1.0)
# Get best split; binary_only=False lets the multiway candidate be considered
criterion = VarianceRatioSplitCriterion()
suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='scaled_feature',
    binary_only=False
)
print(f"Split: {suggestion.split_info}, Merit: {suggestion.merit}")