Implementation:Online ml River Tree Splitter SGTQuantizer
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Gradient_Boosting |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Feature quantizers for Stochastic Gradient Trees that discretize features using dynamic or static quantization strategies.
Description
This module provides two quantization strategies for Stochastic Gradient Trees (SGT). DynamicQuantizer partitions inputs using an initial radius; quantizers created later, as the tree grows, take their radius from the feature's observed standard deviation scaled by std_prop. StaticQuantizer buffers the first warm_start samples, derives n_bins fixed bins from them, and replicates those bins to every new quantizer. Both accumulate gradient and hessian statistics per bin. Dynamic quantization therefore adapts to each feature's scale, while static quantization keeps the same bins across the tree.
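The radius-based idea can be illustrated with a minimal, standard-library sketch. It is not the library's code: the helper names `bin_index`, `quantize`, and `next_radius` are hypothetical, and the per-bin record is a plain list, not river's statistics object. It assumes a value falls into bin `floor(x / radius)` and that a later quantizer would use `std_prop` times the sample standard deviation as its radius.

```python
import math
import statistics
from collections import defaultdict


def bin_index(x: float, radius: float) -> int:
    """Map a feature value to an integer bin of width `radius` (assumed scheme)."""
    return math.floor(x / radius)


def quantize(values, grads, hessians, radius):
    """Accumulate [weight, gradient sum, hessian sum] per bin (illustrative only)."""
    bins = defaultdict(lambda: [0.0, 0.0, 0.0])
    for x, g, h in zip(values, grads, hessians):
        slot = bins[bin_index(x, radius)]
        slot[0] += 1.0  # unit sample weight
        slot[1] += g
        slot[2] += h
    return dict(bins)


def next_radius(values, std_prop):
    """Radius a later quantizer might use: a fraction of the observed std."""
    return std_prop * statistics.stdev(values)


values = [i / 10 for i in range(100)]  # 0.0, 0.1, ..., 9.9
bins = quantize(values, [-0.1] * 100, [0.2] * 100, radius=0.5)
new_radius = next_radius(values, std_prop=0.25)
```

With these inputs the 100 values spread evenly over bins of width 0.5, and the derived radius is larger than the initial one because the data span a wider range than the starting radius anticipated.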
Usage
Use DynamicQuantizer when features have varying scales and you want the quantization radius to adapt to each feature. Use StaticQuantizer when you want consistent binning derived from the initial data distribution.
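To make the static option concrete, the following sketch shows one way fixed bins could be derived from a warm-up buffer. It assumes equal-width bins between the buffer's minimum and maximum, with out-of-range values clipped to the first or last bin; `build_edges` and `locate` are hypothetical helpers, not part of river.

```python
def build_edges(buffer, n_bins):
    """Equal-width bin edges spanning the warm-up buffer (assumed scheme)."""
    lo, hi = min(buffer), max(buffer)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def locate(x, edges):
    """Index of the bin holding x; out-of-range values go to the first or last bin."""
    n_bins = len(edges) - 1
    width = (edges[-1] - edges[0]) / n_bins
    if width == 0:  # degenerate buffer: every warm-up value was identical
        return 0
    idx = int((x - edges[0]) // width)
    return min(max(idx, 0), n_bins - 1)


warmup = [float(i) for i in range(100)]  # stand-in for the first `warm_start` samples
edges = build_edges(warmup, n_bins=4)

inside = locate(50.0, edges)    # falls in the third bin (index 2)
below = locate(-5.0, edges)     # clipped to the first bin
above = locate(1000.0, edges)   # clipped to the last bin
```

Because the edges are computed once and then reused, every later value is mapped against the same boundaries, which is the consistency property the static strategy trades adaptivity for.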
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/sgt_quantizer.py
Signature
class DynamicQuantizer(Quantizer):
    def __init__(self, radius: float = 0.5, std_prop: float = 0.25):
        ...
    def update(self, x_val, gh: GradHess, w: float):
        ...
    def __len__(self):
        ...
    def __iter__(self):
        ...
    def _get_params(self):
        ...

class StaticQuantizer(Quantizer):
    def __init__(self, n_bins: int = 64, warm_start: int = 100, *, buckets: list | None = None):
        ...
    def update(self, x_val, gh: GradHess, w: float):
        ...
    def __len__(self):
        ...
    def __iter__(self):
        ...
    def _get_params(self):
        ...
Import
from river.tree.splitter.sgt_quantizer import DynamicQuantizer
from river.tree.splitter.sgt_quantizer import StaticQuantizer
I/O Contract
| Input | Type | Description |
|---|---|---|
| x_val | float | Feature value |
| gh | GradHess | Gradient and hessian pair |
| w | float | Sample weight |
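| radius | float | Initial quantization radius (DynamicQuantizer) |
| std_prop | float | Proportion of the standard deviation used as the radius of new quantizers (DynamicQuantizer) |
| n_bins | int | Number of bins (StaticQuantizer) |
| warm_start | int | Number of samples buffered before the bins are fixed (StaticQuantizer) |
| buckets | list \| None | Optional bucket list, keyword-only (StaticQuantizer) |

| Output | Type | Description |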
|---|---|---|
| __len__ | int | Number of bins |
| __iter__ | Iterator[tuple] | (threshold, GradHessStats) pairs |
| _get_params | dict | Parameters for cloning |
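As an illustration of how a consumer might use the (threshold, statistics) pairs, the sketch below scans them in threshold order and keeps running totals of everything at or below each threshold. `BinStats` and `cumulative_left_stats` are hypothetical stand-ins written for this example; they are not river classes or functions, and the real per-bin object may expose different attributes.

```python
from dataclasses import dataclass


@dataclass
class BinStats:
    """Hypothetical stand-in for per-bin statistics; not the river class."""
    grad_sum: float
    hess_sum: float
    weight: float


def cumulative_left_stats(pairs):
    """For each threshold, sum the statistics of all bins up to and including it."""
    g = h = w = 0.0
    out = []
    # Sort defensively so the scan does not depend on the iteration order.
    for threshold, s in sorted(pairs, key=lambda p: p[0]):
        g += s.grad_sum
        h += s.hess_sum
        w += s.weight
        out.append((threshold, g, h, w))
    return out


pairs = [
    (1.0, BinStats(-0.5, 1.0, 5.0)),
    (0.5, BinStats(-0.25, 0.5, 2.5)),
    (1.5, BinStats(0.5, 1.0, 5.0)),
]
left = cumulative_left_stats(pairs)
```

Each entry of `left` describes the "left side" of a candidate split at that threshold; the right side follows by subtracting from the overall totals.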
Usage Examples
from river.tree.splitter.sgt_quantizer import DynamicQuantizer, StaticQuantizer
from river.tree.utils import GradHess

# Dynamic quantizer adapts to data scale
dq = DynamicQuantizer(radius=0.5, std_prop=0.25)
for i in range(100):
    x_val = float(i) / 10
    gh = GradHess(gradient=-0.1, hessian=0.2)
    dq.update(x_val, gh, w=1.0)
print(f"Number of bins: {len(dq)}")

# Iterate over bins
for threshold, ghs in dq:
    print(f"Threshold: {threshold}, Total weight: {ghs.total_weight}")

# Get adapted parameters for new quantizer
params = dq._get_params()

# Static quantizer uses fixed bins
sq = StaticQuantizer(n_bins=64, warm_start=100)

# Update during warmup (fills buffer)
for i in range(100):
    gh = GradHess(gradient=-0.1, hessian=0.2)
    sq.update(float(i), gh, w=1.0)

# After warmup, bins are fixed
print(f"Bins created: {len(sq)}")