Implementation:Online ml River Tree Splitter Exhaustive
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Classification |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Exhaustive attribute observer for classification that stores all observations in a binary search tree to evaluate all possible split points.
Description
ExhaustiveSplitter implements the original attribute observer from Domingos and Hulten's VFDT algorithm. Between split attempts it stores every observed feature value, together with its class counts, in a binary search tree, so each observed value can be evaluated as a split candidate. Each node holds one candidate cut point plus two class distributions: one for observations at or below the cut point (left) and one for observations above it (right). The structure cannot perform probability density estimation, so cond_proba always returns 0.0, and the splitter therefore does not work well with Naive Bayes leaf models.
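The tree structure described above can be sketched in a few lines. This is a minimal, standalone illustration and not the library's source: the class name `SketchNode` and its attribute names are hypothetical, and sending ties to the left branch is an assumption.

```python
from collections import defaultdict


class SketchNode:
    """Hypothetical BST node: one cut point plus left/right class counts."""

    def __init__(self, att_val, target_val, w):
        self.cut_point = att_val
        self.class_count_left = defaultdict(float)   # weight seen with value <= cut_point
        self.class_count_right = defaultdict(float)  # weight seen with value > cut_point
        self.left = None
        self.right = None
        # The value that created the node sits exactly at the cut point (assumed: left).
        self.class_count_left[target_val] += w

    def insert_value(self, val, label, w):
        if val <= self.cut_point:
            self.class_count_left[label] += w
            if val < self.cut_point:
                # A strictly smaller value becomes (or descends into) the left child.
                if self.left is None:
                    self.left = SketchNode(val, label, w)
                else:
                    self.left.insert_value(val, label, w)
        else:
            self.class_count_right[label] += w
            if self.right is None:
                self.right = SketchNode(val, label, w)
            else:
                self.right.insert_value(val, label, w)


root = SketchNode(5.5, "A", 1.0)
for val, label in [(6.2, "B"), (4.8, "A"), (7.1, "B")]:
    root.insert_value(val, label, 1.0)

root_left = dict(root.class_count_left)
root_right = dict(root.class_count_right)
print(root_left, root_right)
```

Because a repeated value only increments counters on an existing node, the number of nodes in this sketch grows with the number of distinct values rather than with the raw observation count.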
Usage
Use ExhaustiveSplitter for classification tasks on numerical features when every observed value should be considered as a split point. The evaluation is thorough, but because all observations are stored between split attempts, memory use is higher than with approximate methods.
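To make "evaluate all possible split points" concrete, the sketch below brute-forces every observed value as a threshold on a plain list. It is an illustration only: the helper names `gini` and `best_threshold` are hypothetical, and merit is taken here as plain Gini impurity reduction, which may differ from how the library's criterion scores a split.

```python
from collections import Counter


def gini(dist):
    """Gini impurity of a class-count mapping."""
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in dist.values())


def best_threshold(samples):
    """samples: list of (value, label). Returns (threshold, merit)."""
    total = Counter(label for _, label in samples)
    n = len(samples)
    parent_impurity = gini(total)
    best = (None, float("-inf"))
    # Every distinct observed value is tried as a "value <= threshold" split.
    for threshold in sorted({value for value, _ in samples}):
        left = Counter(label for value, label in samples if value <= threshold)
        right = total - left
        if not right:
            continue  # nothing would be routed to the right branch
        n_left = sum(left.values())
        weighted = (n_left / n) * gini(left) + ((n - n_left) / n) * gini(right)
        merit = parent_impurity - weighted
        if merit > best[1]:
            best = (threshold, merit)
    return best


samples = [(5.5, "A"), (6.2, "B"), (4.8, "A"), (7.1, "B")]
threshold, merit = best_threshold(samples)
print(threshold, merit)
```

This version rescans the list for each candidate, which keeps the code short; a tree of per-node class counts avoids that rescan at the cost of keeping the counts in memory.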
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/exhaustive_splitter.py
Signature
class ExhaustiveSplitter(Splitter):
    def __init__(self):
        ...

    def update(self, att_val, target_val, w):
        ...

    def cond_proba(self, att_val, target_val):
        return 0.0

    def best_evaluated_split_suggestion(self, criterion, pre_split_dist, att_idx, binary_only):
        ...


class ExhaustiveNode:
    def __init__(self, att_val, target_val, w):
        ...

    def insert_value(self, val, label, w):
        ...
Import
from river.tree.splitter import ExhaustiveSplitter
I/O Contract
| Input | Type | Description |
|---|---|---|
| att_val | float | Numerical feature value |
| target_val | int/str | Class label |
| w | float | Sample weight |
| criterion | SplitCriterion | Split evaluation criterion |
| pre_split_dist | dict | Class distribution before the split |
| att_idx | int/str | Identifier of the feature being evaluated |
| binary_only | bool | Flag requesting binary splits only |

| Output | Type | Description |
|---|---|---|
| split_suggestion | BranchFactory | Best split found, with its merit and post-split class distributions |
| cond_proba | float | Always 0.0 (no density estimation) |
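The practical effect of a constant 0.0 from cond_proba can be seen with a textbook Naive Bayes score (class prior times the product of per-feature likelihoods). The sketch below is generic and is not the library's leaf code; `naive_bayes_scores` and `zero_cond_proba` are hypothetical names used only for illustration.

```python
def naive_bayes_scores(class_counts, cond_proba, x):
    """Textbook Naive Bayes: prior times the product of per-feature likelihoods."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # class prior
        for value in x:
            score *= cond_proba(value, label)
        scores[label] = score
    return scores


def zero_cond_proba(att_val, target_val):
    # Stand-in for an observer that cannot estimate densities.
    return 0.0


scores = naive_bayes_scores({"A": 10, "B": 8}, zero_cond_proba, [5.5])
print(scores)
```

Every class ends up with the same score, so under this scoring the feature contributes nothing to telling the classes apart.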
Usage Examples
from river.tree.splitter import ExhaustiveSplitter
from river.tree.split_criterion import GiniSplitCriterion

splitter = ExhaustiveSplitter()

# Update with observations: (feature value, class label, sample weight)
splitter.update(5.5, 'A', 1.0)
splitter.update(6.2, 'B', 1.0)
splitter.update(4.8, 'A', 1.0)
splitter.update(7.1, 'B', 1.0)

# Get best split
criterion = GiniSplitCriterion()
# Pre-split class distribution, matching the four observations above
pre_split = {'A': 2.0, 'B': 2.0}
suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='feature1',
    binary_only=True,
)

print(f"Best threshold: {suggestion.split_info}")
print(f"Merit: {suggestion.merit}")
print(f"Left distribution: {suggestion.children_stats[0]}")
print(f"Right distribution: {suggestion.children_stats[1]}")