
Principle:Online ML River Feature Normalization

From Leeroopedia


Knowledge Sources: River Docs
Domains: Online Machine Learning, Data Preprocessing, Feature Engineering
Last Updated: 2026-02-08 16:00 GMT

Overview

Online feature scaling technique that transforms features to a fixed [0, 1] range using running minimum and maximum statistics, enabling incremental normalization without storing the full dataset.

Description

Feature normalization is a critical preprocessing step in machine learning that rescales feature values to a common range. In the streaming (online) setting, traditional batch min-max scaling is not feasible because the full dataset is never available at once. Instead, online feature normalization maintains running statistics -- specifically a running minimum and a running maximum for each feature -- and uses these to incrementally transform each incoming observation.

The key advantage of online feature normalization is that it operates in constant memory and constant time per observation. As new data arrives, the running minimum and maximum are updated, and the transformation is applied using only these two statistics. This makes it suitable for data streams where observations arrive one at a time and the distribution of features may shift over time.

Online min-max scaling is particularly important as a preprocessing step for anomaly detection algorithms such as Half-Space Trees, which assume features are bounded in [0, 1]. Without proper normalization, such algorithms may produce unreliable anomaly scores.
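The incremental behaviour described above can be checked against batch min-max scaling: at any point in the stream, the online transform of the current value equals the batch transform computed over all values seen so far, even though only two numbers are stored. A minimal single-feature sketch in plain Python (hypothetical values; not River's actual implementation):

```python
# Online min-max scaling of a single feature, keeping only two numbers
# (the running min and max) instead of the full history.
def scale_stream(values):
    lo, hi = float("inf"), float("-inf")
    scaled = []
    for x in values:
        lo, hi = min(lo, x), max(hi, x)  # update running statistics
        rng = hi - lo
        scaled.append((x - lo) / rng if rng else 0.0)
    return scaled

print(scale_stream([5.0, 3.0, 9.0, 7.0]))
```

Each output equals what batch min-max scaling would produce over the prefix of the stream seen up to that point, which is why no buffering is needed.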

Usage

Use online feature normalization when:

  • Features have different scales and need to be brought into a common [0, 1] range
  • Data arrives as a stream and batch normalization is not possible
  • A downstream model (e.g., Half-Space Trees) requires features in a bounded range
  • Memory is constrained and the full dataset cannot be stored
  • The feature distribution may shift over time (the running min/max will adapt)

Theoretical Basis

The min-max normalization formula for a single feature value x is:

x_scaled = (x - x_min) / (x_max - x_min)

Where:

  • x_min is the running minimum of the feature observed so far
  • x_max is the running maximum of the feature observed so far

Safe division: When x_max equals x_min (i.e., all observed values for a feature are identical), the denominator is zero. In this case, the transformation returns 0 to avoid division-by-zero errors.
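The guard can be written as a two-line helper; this sketch returns 0.0 when the denominator vanishes, matching the convention above:

```python
def safe_div(a, b):
    """Return a / b, or 0.0 when b == 0 (all observed values identical)."""
    return a / b if b != 0 else 0.0

print(safe_div(2.0, 4.0))  # 0.5
print(safe_div(0.0, 0.0))  # 0.0
```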

Online update rules:

For each new observation x_t:
    x_min = min(x_min, x_t)
    x_max = max(x_max, x_t)
    x_scaled = safe_div(x_t - x_min, x_max - x_min)

Where safe_div(a, b) returns a / b if b != 0, otherwise returns 0.
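Tracing the update rules on a short stream makes the behaviour concrete (hypothetical values; note that the running min and max only ever widen, they never shrink):

```python
# Step-by-step trace of the online update rules for one feature.
lo, hi = float("inf"), float("-inf")
for x_t in [4.0, 10.0, 1.0, 7.0]:
    lo = min(lo, x_t)                          # x_min update
    hi = max(hi, x_t)                          # x_max update
    rng = hi - lo
    x_scaled = (x_t - lo) / rng if rng != 0 else 0.0   # safe division
    print(f"x_t={x_t}  min={lo}  max={hi}  scaled={x_scaled:.3f}")
```

The first observation always scales to 0 (min equals max), and each new extreme value scales to exactly 0 or 1 at the moment it arrives.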

Properties:

  • Time complexity: O(d) per observation, where d is the number of features
  • Space complexity: O(d) -- stores one Min and one Max statistic per feature
  • Output range: [0, 1] for each feature -- guaranteed because each observation updates the running min/max before it is transformed, so x always lies within [x_min, x_max]

Pseudocode:

INIT:
    min_stats = {}    # running Min per feature
    max_stats = {}    # running Max per feature

LEARN_ONE(x):
    for each feature i in x:
        min_stats[i].update(x[i])
        max_stats[i].update(x[i])

TRANSFORM_ONE(x):
    result = {}
    for each feature i in x:
        result[i] = safe_div(x[i] - min_stats[i].get(), max_stats[i].get() - min_stats[i].get())
    return result
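The pseudocode maps directly onto a small class. The following is a plain-Python sketch of the same learn/transform split, modelled on (but not identical to) River's preprocessing.MinMaxScaler interface; the class name and internals here are illustrative:

```python
class OnlineMinMaxScaler:
    """Scale each feature to [0, 1] using running min/max statistics."""

    def __init__(self):
        self.min = {}  # running minimum per feature
        self.max = {}  # running maximum per feature

    def learn_one(self, x):
        for i, v in x.items():
            self.min[i] = min(self.min.get(i, v), v)
            self.max[i] = max(self.max.get(i, v), v)

    def transform_one(self, x):
        # Assumes learn_one has already seen every feature in x.
        out = {}
        for i, v in x.items():
            rng = self.max[i] - self.min[i]
            out[i] = (v - self.min[i]) / rng if rng != 0 else 0.0
        return out

scaler = OnlineMinMaxScaler()
for x in [{"a": 2.0, "b": 10.0}, {"a": 4.0, "b": 30.0}, {"a": 3.0, "b": 20.0}]:
    scaler.learn_one(x)
    print(scaler.transform_one(x))
```

In River itself, the corresponding transformer lives in the preprocessing module and follows the same learn_one/transform_one contract, so it can be chained ahead of a bounded-range model such as Half-Space Trees in a pipeline.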
