Implementation:Online ml River Tree Splitter NominalClassif
| Knowledge Sources | |
|---|---|
| Domains | Online_Learning, Decision_Trees, Classification |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Splitter for nominal (categorical) features in classification tasks using dictionary-based class count tracking.
Description
NominalSplitterClassif monitors a categorical feature by maintaining class counts for each feature value in dictionary structures. It supports both binary splits (testing equality to a specific value) and multiway splits (one branch per category). The splitter also tracks the set of observed feature values and the weight of samples with a missing value, and its counts provide the conditional probability estimates used by Naive Bayes.
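The bookkeeping described above can be illustrated with a small standalone sketch. This is not River's code: `ToyNominalCounter`, its attribute names, the separate handling of `None` as a missing value, and the unsmoothed probability estimate are all assumptions made for illustration.

```python
from collections import defaultdict


class ToyNominalCounter:
    """Hypothetical stand-in for the bookkeeping described above (not River's code)."""

    def __init__(self):
        # class label -> feature value -> accumulated weight
        self.counts = defaultdict(lambda: defaultdict(float))
        self.seen_values = set()   # every feature value observed so far
        self.missing_weight = 0.0  # weight of samples whose value was None

    def update(self, att_val, target_val, w=1.0):
        if att_val is None:
            # Assumption: missing values are tallied separately from the counts.
            self.missing_weight += w
            return
        self.seen_values.add(att_val)
        self.counts[target_val][att_val] += w

    def cond_proba(self, att_val, target_val):
        # Unsmoothed estimate of P(att_val | target_val).
        class_counts = self.counts.get(target_val, {})
        total = sum(class_counts.values())
        if total == 0:
            return 0.0
        return class_counts.get(att_val, 0.0) / total


counter = ToyNominalCounter()
for value, label in [("red", "positive"), ("blue", "negative"),
                     ("red", "positive"), ("green", "negative")]:
    counter.update(value, label)
counter.update(None, "positive")  # a sample with a missing feature value

print(counter.cond_proba("red", "positive"))   # 1.0
print(counter.cond_proba("blue", "negative"))  # 0.5
```

The nested dictionaries grow only as new classes and values appear, which suits a stream where the set of categories is not known in advance.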
Usage
Use NominalSplitterClassif when monitoring categorical features in classification trees. It evaluates binary splits and, unless binary_only is True, multiway splits as well.
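To make the difference between the two split types concrete, the sketch below enumerates candidate splits from per-value class counts and scores them. It is a simplified illustration, not River's implementation: the helper names are hypothetical, the counts are keyed by feature value for convenience, and a size-weighted Gini impurity (lower is better) stands in for a real split criterion.

```python
def gini(dist):
    """Gini impurity of a {class: weight} distribution."""
    total = sum(dist.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in dist.values())


def weighted_gini(branches):
    """Impurity of a split: branch impurities weighted by branch size."""
    sizes = [sum(b.values()) for b in branches]
    total = sum(sizes)
    if total == 0:
        return 0.0
    return sum(size / total * gini(b) for size, b in zip(sizes, branches))


def candidate_splits(counts_by_value, binary_only):
    """Return (label, branches) pairs; each branch is a {class: weight} dict."""
    candidates = []
    for value in counts_by_value:
        # Binary split: samples equal to `value` versus all other samples.
        rest = {}
        for other, dist in counts_by_value.items():
            if other != value:
                for label, w in dist.items():
                    rest[label] = rest.get(label, 0.0) + w
        candidates.append((f"== {value!r}", [dict(counts_by_value[value]), rest]))
    if not binary_only:
        # Multiway split: one branch per observed category.
        candidates.append(("multiway", [dict(d) for d in counts_by_value.values()]))
    return candidates


# Hypothetical counts: feature value -> {class: weight}
counts_by_value = {
    "red": {"positive": 2.0},
    "blue": {"negative": 1.0},
    "green": {"negative": 1.0},
}

scores = {
    label: weighted_gini(branches)
    for label, branches in candidate_splits(counts_by_value, binary_only=False)
}
best = min(scores, key=scores.get)
print(best, scores[best])
```

With these counts, testing equality to 'red' already separates the classes perfectly, so a binary split is as pure as the multiway one while creating fewer branches.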
Code Reference
Source Location
- Repository: Online_ml_River
- File: river/tree/splitter/nominal_splitter_classif.py
Signature
class NominalSplitterClassif(Splitter):
    def __init__(self):
        ...
    @property
    def is_numeric(self):
        return False
    def update(self, att_val, target_val, w):
        ...
    def cond_proba(self, att_val, target_val):
        ...
    def best_evaluated_split_suggestion(self, criterion, pre_split_dist, att_idx, binary_only):
        ...
Import
from river.tree.splitter.nominal_splitter_classif import NominalSplitterClassif
I/O Contract
| Input | Type | Description |
|---|---|---|
| att_val | any | Categorical feature value |
| target_val | int/str | Class label |
| w | float | Sample weight |
| binary_only | bool | If True, only binary splits; else include multiway |
| Output | Type | Description |
|---|---|---|
| cond_proba | float | Conditional probability of att_val given the class target_val |
| split_suggestion | BranchFactory | Best split (binary or multiway) |
Usage Examples
from river.tree.splitter.nominal_splitter_classif import NominalSplitterClassif
from river.tree.split_criterion import GiniSplitCriterion
splitter = NominalSplitterClassif()
# Update with categorical observations
splitter.update('red', 'positive', 1.0)
splitter.update('blue', 'negative', 1.0)
splitter.update('red', 'positive', 1.0)
splitter.update('green', 'negative', 1.0)
# Get conditional probability
prob = splitter.cond_proba('red', 'positive')
print(f"P(red | positive) = {prob}")
# Get best binary split
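# GiniSplitCriterion takes the minimum fraction of weight each branch must receive
criterion = GiniSplitCriterion(min_branch_fraction=0.01)
# Class distribution before the split, matching the four updates above
pre_split = {'positive': 2.0, 'negative': 2.0}
binary_split = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='color',
    binary_only=True
)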
# Get best multiway split
multiway_split = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='color',
    binary_only=False
)