Overview
The Classifier class is an abstract base class that defines the interface for all classification algorithms in River, supporting both single-instance and mini-batch learning.
Description
The Classifier class extends Estimator to provide the standard interface for classification models in River. It defines three methods: learn_one (abstract), which updates the model with a single labeled example; predict_proba_one, which returns class probabilities; and predict_one, which returns the most likely class label. predict_one has a default implementation that selects the class with the maximum probability from predict_proba_one. The MiniBatchClassifier subclass extends this interface to support vectorized operations on pandas DataFrames through the learn_many, predict_proba_many, and predict_many methods.
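The relationship between the three methods can be sketched without River installed. The block below is a minimal illustration, not River's source: SketchClassifier and PriorClassifier are hypothetical stand-ins, and the assumption that predict_proba_one raises NotImplementedError when not overridden is mine. It shows how a subclass that implements learn_one and predict_proba_one gets predict_one through the default maximum-probability rule.

```python
import abc
from typing import Any, Hashable, Optional


class SketchClassifier(abc.ABC):
    """Hypothetical stand-in that mirrors the interface described above."""

    @abc.abstractmethod
    def learn_one(self, x: dict, y: Hashable) -> None:
        """Update the model with one labeled example."""

    def predict_proba_one(self, x: dict, **kwargs: Any) -> dict:
        # Assumption: models without probability estimates leave this unimplemented.
        raise NotImplementedError

    def predict_one(self, x: dict, **kwargs: Any) -> Optional[Hashable]:
        # Default rule: return the label with the highest probability.
        proba = self.predict_proba_one(x, **kwargs)
        if proba:
            return max(proba, key=proba.get)
        return None  # empty distribution: no prediction can be made


class PriorClassifier(SketchClassifier):
    """Predicts from observed class frequencies and ignores the features."""

    def __init__(self) -> None:
        self.counts: dict = {}

    def learn_one(self, x: dict, y: Hashable) -> None:
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict_proba_one(self, x: dict, **kwargs: Any) -> dict:
        total = sum(self.counts.values())
        return {label: n / total for label, n in self.counts.items()}


model = PriorClassifier()
untrained_prediction = model.predict_one({"f": 1.0})  # None before any learning
for label in ["spam", "ham", "spam"]:
    model.learn_one({"f": 1.0}, label)
proba = model.predict_proba_one({"f": 1.0})
prediction = model.predict_one({"f": 1.0})  # inherited default implementation
print(untrained_prediction, proba, prediction)
```

PriorClassifier never defines predict_one; the call at the end is served entirely by the base class.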
Usage
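Use Classifier as the parent class when implementing new classification algorithms that learn from individual examples. Extend MiniBatchClassifier instead if your algorithm can efficiently process multiple examples simultaneously. Every classifier must implement learn_one and at least one of predict_proba_one or predict_one. Implementing predict_proba_one is enough to obtain predict_one through the default implementation; a classifier that only produces hard labels should override predict_one directly.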
Code Reference
Source Location
Signature
class Classifier(estimator.Estimator):
    """A classifier."""

    @abc.abstractmethod
    def learn_one(self, x: dict[base.typing.FeatureName, Any], y: base.typing.ClfTarget) -> None

    def predict_proba_one(
        self,
        x: dict[base.typing.FeatureName, Any],
        **kwargs: Any
    ) -> dict[base.typing.ClfTarget, float]

    def predict_one(
        self,
        x: dict[base.typing.FeatureName, Any],
        **kwargs: Any
    ) -> base.typing.ClfTarget | None

    @property
    def _multiclass(self) -> bool

    @property
    def _supervised(self) -> bool

class MiniBatchClassifier(Classifier):
    """A classifier that can operate on mini-batches."""

    @abc.abstractmethod
    def learn_many(self, X: pd.DataFrame, y: pd.Series) -> None

    def predict_proba_many(self, X: pd.DataFrame) -> pd.DataFrame

    def predict_many(self, X: pd.DataFrame) -> pd.Series
Import
from river.base import Classifier, MiniBatchClassifier
I/O Contract
Classifier.learn_one
| Parameter | Type | Description |
|---|---|---|
| x | dict[FeatureName, Any] | Dictionary of features |
| y | ClfTarget | The true class label |
Classifier.predict_proba_one
| Parameter | Type | Description |
|---|---|---|
| x | dict[FeatureName, Any] | Dictionary of features |
| **kwargs | Any | Additional algorithm-specific arguments |

| Returns | Type | Description |
|---|---|---|
| probabilities | dict[ClfTarget, float] | Dictionary mapping each class label to its probability |
Classifier.predict_one
| Parameter | Type | Description |
|---|---|---|
| x | dict[FeatureName, Any] | Dictionary of features |
| **kwargs | Any | Additional algorithm-specific arguments |

| Returns | Type | Description |
|---|---|---|
| prediction | ClfTarget \| None | The predicted class label, or None if no prediction can be made |
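Because predict_one may return None and a probability dictionary may omit a class, calling code usually needs a small amount of defensive handling. The helpers below are hypothetical (probability_of and is_correct are not part of River), and the probability values are made up for illustration. The sketch assumes that a label missing from the dictionary should be read as probability 0.0.

```python
from typing import Hashable, Optional


def probability_of(proba: dict, label: Hashable) -> float:
    """Look up one class, treating a label missing from the dict as 0.0."""
    return proba.get(label, 0.0)


def is_correct(y_true: Hashable, y_pred: Optional[Hashable]) -> bool:
    """Score a prediction, counting a None prediction as a miss."""
    return y_pred is not None and y_pred == y_true


# Illustrative output of predict_proba_one for a model that has seen two classes.
proba = {"spam": 0.75, "ham": 0.25}

p_spam = probability_of(proba, "spam")
p_unseen = probability_of(proba, "newsletter")  # class absent from the dict
hit = is_correct("spam", "spam")
miss_on_none = is_correct("spam", None)  # no prediction could be made
print(p_spam, p_unseen, hit, miss_on_none)
```

Counting None as a miss is one possible policy; skipping such examples when computing a metric is another reasonable choice.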
MiniBatchClassifier Methods
| Method | Input | Output | Description |
|---|---|---|---|
| learn_many | X: DataFrame, y: Series | None | Update model with multiple examples |
| predict_proba_many | X: DataFrame | DataFrame | Predict probabilities for multiple examples |
| predict_many | X: DataFrame | Series | Predict labels for multiple examples |
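The shapes in the table above can be illustrated with a small class-frequency model. This is a sketch only: PriorMiniBatchSketch is a hypothetical stand-in that does not subclass MiniBatchClassifier, and its choices (one probability column per known class, labels picked with idxmax) are assumptions for illustration, not River's implementation.

```python
import pandas as pd


class PriorMiniBatchSketch:
    """Hypothetical stand-in, not a River class: learns class frequencies per batch."""

    def __init__(self) -> None:
        self.counts: dict = {}

    def learn_many(self, X: pd.DataFrame, y: pd.Series) -> None:
        # Accumulate label counts from the whole batch at once.
        for label, n in y.value_counts().items():
            self.counts[label] = self.counts.get(label, 0) + int(n)

    def predict_proba_many(self, X: pd.DataFrame) -> pd.DataFrame:
        total = sum(self.counts.values())
        probs = {label: n / total for label, n in self.counts.items()}
        # One row per input row, one column per known class.
        return pd.DataFrame([probs] * len(X), index=X.index)

    def predict_many(self, X: pd.DataFrame) -> pd.Series:
        proba = self.predict_proba_many(X)
        if proba.empty:
            # Nothing learned yet (or an empty batch): no label to report.
            return pd.Series([None] * len(X), index=X.index, dtype="object")
        # Pick the column (class label) with the highest probability in each row.
        return proba.idxmax(axis=1)


X = pd.DataFrame({"f1": [0.1, 0.4, 0.9], "f2": [1, 0, 1]})
y = pd.Series(["a", "b", "a"])

model = PriorMiniBatchSketch()
before = model.predict_many(X)  # all None before any learning
model.learn_many(X, y)
batch_proba = model.predict_proba_many(X)
batch_pred = model.predict_many(X)
print(batch_proba)
print(batch_pred)
```

The returned DataFrame and Series reuse the index of X, so predictions line up with the input rows.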
Usage Examples
from river import tree
from river import datasets

# Create a classifier
model = tree.HoeffdingTreeClassifier()

# Single instance learning
for x, y in datasets.Phishing():
    # Get probability predictions
    y_proba = model.predict_proba_one(x)
    # Get class prediction
    y_pred = model.predict_one(x)
    # Update the model
    model.learn_one(x, y)
# Implementing a custom classifier
from river.base import Classifier

class MyClassifier(Classifier):
    def __init__(self):
        self.classes = {}

    def learn_one(self, x, y):
        # Update model with one example
        if y not in self.classes:
            self.classes[y] = 0
        self.classes[y] += 1

    def predict_proba_one(self, x):
        # Return probability distribution
        total = sum(self.classes.values())
        return {
            label: count / total
            for label, count in self.classes.items()
        }