Implementation:Online ml River Preprocessing MinMaxScaler

From Leeroopedia


Knowledge Sources: River, River Docs
Domains: Online Machine Learning, Data Preprocessing, Feature Engineering
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete tool for performing online min-max feature scaling in the River library, transforming each feature to the [0, 1] range using incrementally maintained minimum and maximum statistics.

Description

The preprocessing.MinMaxScaler class implements online min-max normalization. It maintains one running stats.Min and one running stats.Max instance for each feature encountered in the data stream. Each call to learn_one updates these statistics. Each call to transform_one applies the min-max formula (x - min) / (max - min) to each feature, using a safe division that returns 0 when the denominator is zero, which happens when only a single distinct value has been observed for that feature.
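The update and transform steps described above can be sketched in plain Python. This is a minimal illustration, not River's source: the class name SketchMinMaxScaler is hypothetical, plain dicts stand in for the stats.Min and stats.Max objects, and treating a never-seen feature as having zero span is an assumption of this sketch.

```python
class SketchMinMaxScaler:
    """Illustrative stand-in for online min-max scaling (not River's code)."""

    def __init__(self):
        # One running minimum and maximum per feature name.
        self.min = {}
        self.max = {}

    def learn_one(self, x):
        # Widen the running range of every feature present in this sample.
        for name, value in x.items():
            self.min[name] = min(self.min.get(name, value), value)
            self.max[name] = max(self.max.get(name, value), value)

    def transform_one(self, x):
        # Apply (x - min) / (max - min); fall back to 0.0 when max == min.
        scaled = {}
        for name, value in x.items():
            # Assumption of this sketch: an unseen feature gets zero span.
            lo = self.min.get(name, value)
            hi = self.max.get(name, value)
            span = hi - lo
            scaled[name] = (value - lo) / span if span else 0.0
        return scaled


scaler = SketchMinMaxScaler()
for sample in [{"a": 2.0}, {"a": 6.0}, {"a": 4.0}]:
    scaler.learn_one(sample)

result = scaler.transform_one({"a": 4.0})
print(result)  # {'a': 0.5}
```

After the three samples the running range for "a" is [2, 6], so 4 maps to (4 - 2) / (6 - 2) = 0.5.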

The class inherits from base.Transformer, making it composable with other River transformers and models via pipelines. It requires no constructor parameters and automatically initializes statistics for new features as they are encountered.

Usage

Import and use preprocessing.MinMaxScaler when:

  • You need to normalize features to [0, 1] in a streaming context
  • You are building a pipeline with anomaly detectors like anomaly.HalfSpaceTrees that expect bounded features
  • You want a zero-configuration scaler that adapts to the data distribution

Code Reference

Source Location

river/preprocessing/scale.py, lines 252-306.

Signature

class MinMaxScaler(base.Transformer):
    def __init__(self):
        self.min = collections.defaultdict(stats.Min)
        self.max = collections.defaultdict(stats.Max)

Import

from river import preprocessing
scaler = preprocessing.MinMaxScaler()

Parameters

No constructor parameters. Internally initializes:

  • self.min -- defaultdict of stats.Min instances, one per feature
  • self.max -- defaultdict of stats.Max instances, one per feature

Methods

  • learn_one(x: dict) -> None -- Updates running min and max statistics for each feature in x.
  • transform_one(x: dict) -> dict -- Returns a new dict with each feature scaled to [0, 1] relative to the minimum and maximum observed so far.

I/O Contract

Inputs

Parameter | Type | Description
x | dict | A dictionary mapping feature names to numeric values.

Outputs

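Method | Return Type | Description
learn_one | None | Updates internal min/max statistics; no return value.
transform_one | dict | Dictionary with the same keys as input; each value is scaled relative to the running minimum and maximum and lands in [0, 1] when it lies within the range observed so far.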
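The [0, 1] bound on the output depends on the input lying inside the range learned so far. The sketch below works through the formula from the Description by hand, assuming it is applied without any clipping (this page does not mention clipping); the helper name min_max is hypothetical and is not part of River.

```python
def min_max(value, lo, hi):
    """Apply (value - lo) / (hi - lo), returning 0.0 when hi == lo."""
    span = hi - lo
    return (value - lo) / span if span else 0.0


# Suppose a feature has been observed between 10 and 20 so far.
lo, hi = 10.0, 20.0

inside = min_max(15.0, lo, hi)  # within the observed range
above = min_max(25.0, lo, hi)   # larger than any value learned so far
below = min_max(5.0, lo, hi)    # smaller than any value learned so far

print(inside, above, below)  # 0.5 1.5 -0.5
```

Under that assumption, a value transformed before it has been learned can fall outside [0, 1]. Calling learn_one before transform_one on the same sample keeps that sample within the bounds.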

Usage Examples

Basic usage with random data:

import random
from river import preprocessing

random.seed(42)
X = [{'x': random.uniform(8, 12)} for _ in range(5)]

scaler = preprocessing.MinMaxScaler()

for x in X:
    scaler.learn_one(x)
    print(scaler.transform_one(x))
# Output (third and fourth values shortened to six decimals):
# {'x': 0.0}
# {'x': 0.0}
# {'x': 0.406920}
# {'x': 0.322582}
# {'x': 1.0}
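The outputs above can be checked without River by tracking the running minimum and maximum by hand. This is a sketch under the assumption that each sample is learned before it is transformed, as in the loop above; the variable names are illustrative only.

```python
import random

random.seed(42)
X = [random.uniform(8, 12) for _ in range(5)]

lo = hi = None
outputs = []
for value in X:
    # Learn first: widen the running range to include the new value.
    lo = value if lo is None else min(lo, value)
    hi = value if hi is None else max(hi, value)
    # Then transform with the updated range; zero span maps to 0.0.
    span = hi - lo
    outputs.append((value - lo) / span if span else 0.0)

print([round(out, 4) for out in outputs])
# [0.0, 0.0, 0.4069, 0.3226, 1.0]
```

The first sample gives 0.0 because the range has zero width, the second is a new minimum, and the fifth is a new maximum, which accounts for the 0.0, 0.0, and 1.0 entries.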

In a pipeline with Half-Space Trees for anomaly detection:

from river import compose, preprocessing, anomaly, datasets, metrics

model = compose.Pipeline(
    preprocessing.MinMaxScaler(),
    anomaly.HalfSpaceTrees(seed=42)
)

auc = metrics.ROCAUC()

for x, y in datasets.CreditCard().take(2500):
    # Score first, then learn, so each sample is evaluated before the model has seen it.
    score = model.score_one(x)
    model.learn_one(x)
    auc.update(y, score)

print(auc)
# ROCAUC: 91.15%
