Implementation:Online ml River Base Classifier

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Classification, Base_Classes
Last Updated	2026-02-08 16:00 GMT

Overview

The Classifier class is the abstract base class that defines the interface for all classification algorithms in River. It covers single-instance learning, and its MiniBatchClassifier subclass adds mini-batch learning.

Description

The Classifier class extends Estimator to provide the standard interface for classification models in River. It defines three core methods: learn_one, which is abstract and updates the model with a single labeled example; predict_proba_one, which predicts class probabilities; and predict_one, which predicts the most likely class label. The class includes a default implementation of predict_one that selects the class with maximum probability from predict_proba_one. The MiniBatchClassifier subclass extends this interface to support vectorized operations on pandas DataFrames through the learn_many, predict_proba_many, and predict_many methods.
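The way the three methods fit together can be illustrated with a small stand-in written in plain Python. This is a minimal sketch of the contract described above, not River's source: SketchClassifier and PriorClassifier are hypothetical names, and the assumption that predict_proba_one raises NotImplementedError unless overridden is part of the sketch.

```python
import abc
from typing import Any, Hashable, Optional


class SketchClassifier(abc.ABC):
    """Hypothetical stand-in for the interface described above."""

    @abc.abstractmethod
    def learn_one(self, x: dict, y: Hashable) -> None:
        """Update the model with one labeled example."""

    def predict_proba_one(self, x: dict, **kwargs: Any) -> dict:
        # Assumption: classifiers that can estimate probabilities override this.
        raise NotImplementedError

    def predict_one(self, x: dict, **kwargs: Any) -> Optional[Hashable]:
        # Default behaviour: pick the label with the highest probability.
        y_proba = self.predict_proba_one(x, **kwargs)
        if y_proba:
            return max(y_proba, key=y_proba.get)
        return None  # no class seen yet, so no prediction can be made


class PriorClassifier(SketchClassifier):
    """Toy subclass: ignores features and predicts observed class frequencies."""

    def __init__(self) -> None:
        self.counts = {}

    def learn_one(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_proba_one(self, x, **kwargs):
        total = sum(self.counts.values())
        if total == 0:
            return {}
        return {label: n / total for label, n in self.counts.items()}


model = PriorClassifier()
first_guess = model.predict_one({"length": 3})  # nothing learned yet
for label in ["spam", "ham", "spam"]:
    model.learn_one({"length": 3}, label)
later_guess = model.predict_one({"length": 3})
```

Because PriorClassifier only overrides learn_one and predict_proba_one, its predict_one comes entirely from the inherited default.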

Usage

Use Classifier as the parent class when implementing new classification algorithms that learn from individual examples. Extend MiniBatchClassifier instead if your algorithm can efficiently process multiple examples simultaneously. Every classifier must implement learn_one and at least one of predict_proba_one or predict_one; implementing predict_proba_one is enough to obtain predict_one through the default implementation.

Code Reference

Source Location

Repository: Online_ml_River
File: river/base/classifier.py

Signature

class Classifier(estimator.Estimator):
    """A classifier."""

    @abc.abstractmethod
    def learn_one(self, x: dict[base.typing.FeatureName, Any], y: base.typing.ClfTarget) -> None

    def predict_proba_one(
        self,
        x: dict[base.typing.FeatureName, Any],
        **kwargs: Any
    ) -> dict[base.typing.ClfTarget, float]

    def predict_one(
        self,
        x: dict[base.typing.FeatureName, Any],
        **kwargs: Any
    ) -> base.typing.ClfTarget | None

    @property
    def _multiclass(self) -> bool

    @property
    def _supervised(self) -> bool


class MiniBatchClassifier(Classifier):
    """A classifier that can operate on mini-batches."""

    @abc.abstractmethod
    def learn_many(self, X: pd.DataFrame, y: pd.Series) -> None

    def predict_proba_many(self, X: pd.DataFrame) -> pd.DataFrame

    def predict_many(self, X: pd.DataFrame) -> pd.Series

Import

from river.base import Classifier, MiniBatchClassifier

I/O Contract

Classifier.learn_one

Parameter	Type	Description
x	dict[FeatureName, Any]	Dictionary of features
y	ClfTarget	The true class label

Classifier.predict_proba_one

Parameter	Type	Description
x	dict[FeatureName, Any]	Dictionary of features
**kwargs	Any	Additional algorithm-specific arguments

Returns	Type	Description
probabilities	dict[ClfTarget, float]	Dictionary mapping each class label to its probability

Classifier.predict_one

Parameter	Type	Description
x	dict[FeatureName, Any]	Dictionary of features
**kwargs	Any	Additional algorithm-specific arguments

Returns	Type	Description
prediction	ClfTarget | None	The predicted class label, or None if no prediction can be made

MiniBatchClassifier Methods

Method	Input	Output	Description
learn_many	X: DataFrame, y: Series	None	Update model with multiple examples
predict_proba_many	X: DataFrame	DataFrame	Predict probabilities for multiple examples
predict_many	X: DataFrame	Series	Predict labels for multiple examples
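The mini-batch methods in the table can be sketched with pandas alone. The class below is a hypothetical stand-in (PriorMiniBatchSketch is not part of River) that predicts class priors for every row. A real implementation would subclass MiniBatchClassifier and also provide learn_one. The empty-frame handling in predict_many is an assumption of this sketch.

```python
from collections import Counter

import pandas as pd


class PriorMiniBatchSketch:
    """Hypothetical stand-in that predicts class priors for every row."""

    def __init__(self) -> None:
        self.counts = Counter()

    def learn_many(self, X: pd.DataFrame, y: pd.Series) -> None:
        # Count the labels of the whole batch in one vectorized call.
        self.counts.update(y.value_counts().to_dict())

    def predict_proba_many(self, X: pd.DataFrame) -> pd.DataFrame:
        total = sum(self.counts.values())
        if total == 0:
            # Nothing learned yet: no class columns, but keep X's index.
            return pd.DataFrame(index=X.index)
        priors = {label: n / total for label, n in self.counts.items()}
        # One column per class, one row per example, aligned on X's index.
        return pd.DataFrame(priors, index=X.index)

    def predict_many(self, X: pd.DataFrame) -> pd.Series:
        y_proba = self.predict_proba_many(X)
        if y_proba.empty:
            return pd.Series([None] * len(X), index=X.index, dtype="object")
        # Row-wise argmax over the class columns.
        return y_proba.idxmax(axis="columns")


X = pd.DataFrame({"length": [3.0, 7.0, 5.0]})
y = pd.Series(["spam", "ham", "spam"])

batch_model = PriorMiniBatchSketch()
before = batch_model.predict_many(X)
batch_model.learn_many(X, y)
proba = batch_model.predict_proba_many(X)
after = batch_model.predict_many(X)
```

Returning a DataFrame with one column per class keeps predict_many a simple row-wise argmax, mirroring how predict_one relates to predict_proba_one in the single-instance interface.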

Usage Examples

from river import tree
from river import datasets

# Create a classifier
model = tree.HoeffdingTreeClassifier()

# Single instance learning
for x, y in datasets.Phishing():
    # Get probability predictions
    y_proba = model.predict_proba_one(x)

    # Get class prediction
    y_pred = model.predict_one(x)

    # Update the model
    model.learn_one(x, y)

# Implementing a custom classifier
from river.base import Classifier

class MyClassifier(Classifier):
    """Toy classifier that ignores features and predicts class frequencies."""

    def __init__(self):
        self.classes = {}

    def learn_one(self, x, y):
        # Update the per-class counts with one example
        if y not in self.classes:
            self.classes[y] = 0
        self.classes[y] += 1

    def predict_proba_one(self, x, **kwargs):
        # Return the empirical class distribution (empty before any learning,
        # which also avoids dividing by zero)
        total = sum(self.classes.values())
        if total == 0:
            return {}
        return {
            label: count / total
            for label, count in self.classes.items()
        }
