Implementation:Online ml River Tree Splitter Gaussian

From Leeroopedia


Knowledge Sources
Domains Online_Learning, Decision_Trees, Classification
Last Updated 2026-02-08 16:00 GMT

Overview

A Gaussian-based attribute observer for classification. It fits one Gaussian estimator per class to a numerical feature and uses those estimates for probability density calculation and split evaluation.

Description

GaussianSplitter approximates the distribution of a numerical feature within each class using a Gaussian (normal) distribution. This makes the conditional probability density needed for Naive Bayes predictions cheap to compute. The splitter also tracks the minimum and maximum values observed per class and proposes split candidates at evenly spaced points across the observed feature range, with n_splits controlling how many candidates are considered. Each candidate is evaluated by using the Gaussian CDF to estimate how much of each class falls on either side of the threshold.
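The split-candidate and CDF steps described above can be sketched in plain Python. This is a minimal illustration using only the standard library, not river's actual code: the helper names split_candidates and post_split_dists, the (weight, mean, std) tuple layout, and the sample statistics are all assumptions made for the example, and river's own implementation may treat edge cases (such as weight lying exactly on the threshold) differently.

```python
from statistics import NormalDist


def split_candidates(min_value, max_value, n_splits=10):
    """Return evenly spaced thresholds strictly inside (min_value, max_value)."""
    if not min_value < max_value:
        return []  # no usable range, so nothing to suggest
    step = (max_value - min_value) / (n_splits + 1)
    return [min_value + step * (i + 1) for i in range(n_splits)]


def post_split_dists(threshold, class_stats):
    """Estimate the class weights on each side of a threshold.

    class_stats maps label -> (weight, mean, std), with std > 0.
    The Gaussian CDF gives the fraction of each class expected below the threshold.
    """
    left, right = {}, {}
    for label, (weight, mean, std) in class_stats.items():
        below = NormalDist(mean, std).cdf(threshold)
        left[label] = weight * below
        right[label] = weight * (1.0 - below)
    return left, right


# Hypothetical per-class statistics: (total weight, mean, standard deviation)
stats = {"cat": (100.0, 5.6, 0.3), "dog": (80.0, 6.7, 0.5)}
candidates = split_candidates(5.0, 8.0, n_splits=5)

for t in candidates:
    left, right = post_split_dists(t, stats)
    print(f"threshold {t:.2f}: left={left} right={right}")
```

A split criterion would then score each (left, right) pair and keep the threshold with the highest merit.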

Usage

Use GaussianSplitter for classification tasks when features approximately follow normal distributions within each class and when Naive Bayes leaf models are used.

Code Reference

Source Location

Signature

class GaussianSplitter(Splitter):
    def __init__(self, n_splits: int = 10):
        ...

    def update(self, att_val, target_val, w):
        ...

    def cond_proba(self, att_val, target_val):
        ...

    def best_evaluated_split_suggestion(self, criterion, pre_split_dist, att_idx, binary_only):
        ...

Import

from river.tree.splitter import GaussianSplitter

I/O Contract

Input Type Description
att_val float Numerical feature value
target_val int/str Class label
w float Sample weight
n_splits int Number of split candidates to evaluate (default 10)
Output Type Description
cond_proba float P(att_val | class), evaluated from the Gaussian PDF
split_suggestion BranchFactory Best split with estimated post-split distributions
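To make the cond_proba output concrete, the following standard-library sketch keeps one incrementally updated Gaussian per class and evaluates its density. It is an illustration under stated assumptions, not river's implementation: the RunningGaussian class and the module-level cond_proba helper are hypothetical names, the update rule is a weighted Welford-style recurrence, and returning 0.0 for an unseen class or a zero-variance estimate is a choice made for this sketch.

```python
import math
from collections import defaultdict


class RunningGaussian:
    """Incremental weighted mean and variance (Welford-style update)."""

    def __init__(self):
        self.weight = 0.0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of weighted squared deviations

    def update(self, x, w=1.0):
        self.weight += w
        delta = x - self.mean
        self.mean += (w / self.weight) * delta
        self._m2 += w * delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; stays 0.0 until more than one unit of weight is seen
        return self._m2 / (self.weight - 1.0) if self.weight > 1.0 else 0.0

    def pdf(self, x):
        var = self.variance
        if var <= 0.0:
            return 0.0  # assumption: a degenerate estimate yields zero density
        return math.exp(-((x - self.mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# One estimator per class label, created on first update
observers = defaultdict(RunningGaussian)
for value, label in [(5.5, "cat"), (6.2, "dog"), (5.8, "cat"), (7.1, "dog")]:
    observers[label].update(value, w=1.0)


def cond_proba(att_val, target_val):
    """Density of att_val under the Gaussian fitted to target_val (0.0 if unseen)."""
    est = observers.get(target_val)
    return est.pdf(att_val) if est is not None else 0.0


print(f"p(5.5 | cat) = {cond_proba(5.5, 'cat'):.4f}")
print(f"p(5.5 | dog) = {cond_proba(5.5, 'dog'):.4f}")
```

Because the returned value is a density rather than a probability mass, it can exceed 1 when the estimated variance is small.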

Usage Examples

from river.tree.splitter import GaussianSplitter
from river.tree.split_criterion import GiniSplitCriterion

# Create splitter with 15 split candidates
splitter = GaussianSplitter(n_splits=15)

# Update with observations
splitter.update(5.5, 'cat', 1.0)
splitter.update(6.2, 'dog', 1.0)
splitter.update(5.8, 'cat', 1.0)
splitter.update(7.1, 'dog', 1.0)

# Get conditional probability for Naive Bayes
prob = splitter.cond_proba(att_val=5.5, target_val='cat')
print(f"P(5.5 | cat) = {prob}")

# Get best split
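# min_branch_fraction is passed explicitly on the assumption that river's
# GiniSplitCriterion requires it; 0.01 mirrors the Hoeffding tree default
criterion = GiniSplitCriterion(min_branch_fraction=0.01)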
# Class weights observed so far, kept consistent with the four updates above
pre_split = {'cat': 2.0, 'dog': 2.0}

suggestion = splitter.best_evaluated_split_suggestion(
    criterion=criterion,
    pre_split_dist=pre_split,
    att_idx='height',
    binary_only=True
)
