Implementation:Online ml River Base Transformer

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Feature_Engineering, Base_Classes
Last Updated: 2026-02-08 16:00 GMT

Overview

The Transformer classes define the interface for feature transformation components in River, including unsupervised transformers, supervised transformers, and their mini-batch variants.

Description

River provides several transformer base classes for different transformation scenarios. BaseTransformer overloads the + operator to build a TransformerUnion and the * operator to build a TransformerProduct or a Grouper, and it requires subclasses to implement transform_one, which maps one feature dictionary to another. Transformer extends it for unsupervised transformations and adds an optional learn_one method. SupervisedTransformer covers transformations whose learn_one needs the target value. MiniBatchTransformer and MiniBatchSupervisedTransformer add transform_many and learn_many so that a whole pandas DataFrame can be processed in one call.
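The operator dispatch described here can be sketched in plain Python, without River installed. This is a simplified stand-in and not River's source: SketchTransformer, SketchUnion, SketchProduct, SketchGrouper, and Prefix are hypothetical names that only mimic the shape of BaseTransformer, compose.TransformerUnion, compose.TransformerProduct, and compose.Grouper. The rule that * yields a product for transformer operands and a grouper for anything else is an assumption inferred from the return types in the signature.

```python
class SketchTransformer:
    """Hypothetical stand-in for BaseTransformer's operator overloads."""

    def __add__(self, other):
        # a + b: both transformers see the same input, outputs are merged
        return SketchUnion(self, other)

    def __mul__(self, other):
        # a * b: product for transformer operands, grouper for feature names
        if isinstance(other, SketchTransformer):
            return SketchProduct(self, other)
        return SketchGrouper(self, by=other)

    def __rmul__(self, other):
        # Reached for expressions such as "store" * transformer
        return self.__mul__(other)

    def transform_one(self, x):
        raise NotImplementedError


class SketchUnion(SketchTransformer):
    def __init__(self, *parts):
        self.parts = parts

    def transform_one(self, x):
        # Merge the outputs of every part into one feature dictionary
        out = {}
        for part in self.parts:
            out.update(part.transform_one(x))
        return out


class SketchProduct(SketchTransformer):
    def __init__(self, left, right):
        self.left, self.right = left, right


class SketchGrouper(SketchTransformer):
    def __init__(self, transformer, by):
        self.transformer, self.by = transformer, by


class Prefix(SketchTransformer):
    """Toy transformer that renames every feature with a prefix."""

    def __init__(self, prefix):
        self.prefix = prefix

    def transform_one(self, x):
        return {f"{self.prefix}_{k}": v for k, v in x.items()}


union = Prefix("a") + Prefix("b")
product = Prefix("a") * Prefix("b")
grouped = Prefix("a") * "store"
reverse_grouped = "store" * Prefix("a")
```

Returning a new composite object from each operator, instead of mutating an operand, is what lets expressions such as a + b * c be built up freely in this sketch.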

Usage

Use Transformer for unsupervised feature transformations like scaling or encoding. Use SupervisedTransformer when your transformation needs access to target values during learning, such as target encoding. Use the MiniBatch variants when your transformer can efficiently process multiple examples simultaneously. All transformers must implement transform_one (and transform_many for mini-batch versions).
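To make the supervised case concrete, the sketch below shows a running-mean target encoder. It is duck-typed plain Python so that it runs without River; with River installed, a class like this would presumably subclass river.base.SupervisedTransformer. The class name MeanTargetEncoder, its feature argument, and the fallback value of 0.0 for unseen categories are all assumptions made for illustration.

```python
from collections import defaultdict


class MeanTargetEncoder:
    """Hypothetical encoder: replaces a category with the mean target seen for it."""

    def __init__(self, feature):
        self.feature = feature
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def learn_one(self, x, y):
        # The target y is required here, which is what makes this supervised
        key = x[self.feature]
        self._sums[key] += y
        self._counts[key] += 1

    def transform_one(self, x):
        key = x[self.feature]
        count = self._counts.get(key, 0)
        # Unseen categories fall back to 0.0 (an arbitrary choice for this sketch)
        encoded = self._sums.get(key, 0.0) / count if count else 0.0
        out = dict(x)  # copy so the caller's dictionary is not mutated
        out[self.feature] = encoded
        return out


encoder = MeanTargetEncoder("city")
stream = [
    ({"city": "Paris"}, 10.0),
    ({"city": "Paris"}, 20.0),
    ({"city": "Lyon"}, 4.0),
]
for x, y in stream:
    encoder.learn_one(x, y)

encoded_seen = encoder.transform_one({"city": "Paris"})
encoded_unseen = encoder.transform_one({"city": "Nice"})
```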

Code Reference

Source Location

Signature

class BaseTransformer:
    """Base functionality for transformers."""

    def __add__(self, other: BaseTransformer) -> compose.TransformerUnion
    def __radd__(self, other: BaseTransformer) -> compose.TransformerUnion
    def __mul__(
        self,
        other: BaseTransformer | compose.Pipeline | FeatureName | list[FeatureName]
    ) -> compose.Grouper | compose.TransformerProduct
    def __rmul__(
        self,
        other: BaseTransformer | compose.Pipeline | FeatureName | list[FeatureName]
    ) -> compose.Grouper | compose.TransformerProduct

    @abc.abstractmethod
    def transform_one(self, x: dict[FeatureName, Any]) -> dict[FeatureName, Any]


class Transformer(base.Estimator, BaseTransformer):
    """A transformer."""

    @property
    def _supervised(self) -> bool

    def learn_one(self, x: dict[FeatureName, Any]) -> None


class SupervisedTransformer(base.Estimator, BaseTransformer):
    """A supervised transformer."""

    @property
    def _supervised(self) -> bool

    def learn_one(self, x: dict[FeatureName, Any], y: base.typing.Target) -> None


class MiniBatchTransformer(Transformer):
    """A transform that can operate on mini-batches."""

    @abc.abstractmethod
    def transform_many(self, X: pd.DataFrame) -> pd.DataFrame

    def learn_many(self, X: pd.DataFrame) -> None


class MiniBatchSupervisedTransformer(Transformer):
    """A supervised transformer that can operate on mini-batches."""

    @property
    def _supervised(self) -> bool

    @abc.abstractmethod
    def learn_many(self, X: pd.DataFrame, y: pd.Series) -> None

    @abc.abstractmethod
    def transform_many(self, X: pd.DataFrame) -> pd.DataFrame

Import

from river.base import Transformer, SupervisedTransformer
from river.base import MiniBatchTransformer, MiniBatchSupervisedTransformer

I/O Contract

transform_one

| Parameter | Type | Description |
| --- | --- | --- |
| x | dict[FeatureName, Any] | Dictionary of features to transform |

| Returns | Type | Description |
| --- | --- | --- |
| transformed | dict[FeatureName, Any] | Dictionary of transformed features |

Transformer.learn_one

| Parameter | Type | Description |
| --- | --- | --- |
| x | dict[FeatureName, Any] | Dictionary of features to learn from (unsupervised) |
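A stateful unsupervised transformer can be sketched as follows. The class is duck-typed plain Python so that it runs without River; RunningMeanCenterer is a hypothetical name, and the choice to treat unseen features as having mean 0.0 is an assumption. The loop calls transform_one before learn_one, so each example is transformed using only the statistics of earlier examples.

```python
class RunningMeanCenterer:
    """Hypothetical transformer: subtracts each feature's running mean."""

    def __init__(self):
        self.counts = {}
        self.means = {}

    def learn_one(self, x):
        # Incremental mean update, one feature at a time; no target is needed
        for name, value in x.items():
            n = self.counts.get(name, 0) + 1
            mean = self.means.get(name, 0.0)
            self.counts[name] = n
            self.means[name] = mean + (value - mean) / n

    def transform_one(self, x):
        # Features never seen before are centered around 0.0
        return {name: value - self.means.get(name, 0.0) for name, value in x.items()}


centerer = RunningMeanCenterer()
outputs = []
for x in [{"t": 2.0}, {"t": 4.0}, {"t": 6.0}]:
    outputs.append(centerer.transform_one(x))  # transform with past statistics
    centerer.learn_one(x)                      # then update the statistics
```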

SupervisedTransformer.learn_one

| Parameter | Type | Description |
| --- | --- | --- |
| x | dict[FeatureName, Any] | Dictionary of features to learn from |
| y | Target | The target value (supervised) |

MiniBatch Methods

| Method | Input | Output | Description |
| --- | --- | --- | --- |
| transform_many | X: DataFrame | DataFrame | Transform multiple examples at once |
| learn_many | X: DataFrame (plus y: Series in the supervised variant) | None | Update from multiple examples |
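The mini-batch method shapes can be illustrated with a small pandas sketch. It is a duck-typed stand-in and does not subclass River's MiniBatchTransformer; the class name BatchMeanCenterer is hypothetical, and the sketch assumes every batch contains the same numeric columns.

```python
import pandas as pd


class BatchMeanCenterer:
    """Hypothetical mini-batch transformer: subtracts running column means."""

    def __init__(self):
        self.n = 0
        self.sums = None

    def learn_many(self, X):
        # Accumulate column sums over whole batches instead of row by row
        batch_sums = X.sum()
        if self.sums is None:
            self.sums = batch_sums
        else:
            self.sums = self.sums.add(batch_sums, fill_value=0.0)
        self.n += len(X)

    def transform_many(self, X):
        if self.n == 0:
            return X.copy()  # nothing learned yet, return the batch unchanged
        means = self.sums / self.n
        # Subtracting a Series from a DataFrame aligns on column names
        return X - means[X.columns]


centerer = BatchMeanCenterer()
batch = pd.DataFrame({"a": [1.0, 3.0], "b": [10.0, 30.0]})
centerer.learn_many(batch)
centered = centerer.transform_many(batch)
```

Working on column sums keeps the update vectorized, which is the efficiency argument for the mini-batch variants.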

Usage Examples

from river import preprocessing
from river import feature_extraction
from river import compose
from river import datasets

# Create transformers
scaler = preprocessing.StandardScaler()
poly = feature_extraction.PolynomialExtender(degree=2)

# Compose transformers with + operator (TransformerUnion)
union = scaler + poly

# Compose transformers with * operator (TransformerProduct)
product = scaler * poly

# Chain transformers into a pipeline with the | operator
model = scaler | preprocessing.MinMaxScaler()

# Single instance transformation
for x, y in datasets.TrumpApproval().take(10):
    # Transform features
    x_transformed = scaler.transform_one(x)

    # Learn from features
    scaler.learn_one(x)

# Implementing a custom transformer
from river.base import Transformer

class AddConstant(Transformer):
    def __init__(self, value=1.0):
        self.value = value

    def transform_one(self, x):
        # Add the constant to every feature (assumes all values are numeric)
        return {k: v + self.value for k, v in x.items()}

    def learn_one(self, x):
        # Stateless, so there is nothing to learn; learn_one is optional for Transformer subclasses
        pass

# Implementing a supervised transformer
from river.base import SupervisedTransformer

class TargetScaler(SupervisedTransformer):
    def __init__(self):
        self.mean = 0.0
        self.n = 0

    def learn_one(self, x, y):
        # Learn from target
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def transform_one(self, x):
        # Transform based on learned statistics
        return {k: v / self.mean if self.mean != 0 else v for k, v in x.items()}
