Implementation:Online ml River Preprocessing OrdinalEncoder

From Leeroopedia
Revision as of 16:10, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Online_ml_River_Preprocessing_OrdinalEncoder.md)


Knowledge Sources
Domains: Online_Learning, Preprocessing, Categorical_Encoding
Last Updated: 2026-02-08 16:00 GMT

Overview

Encodes categorical features as integers in a streaming fashion, with configurable codes for unknown categories and None values.

Description

OrdinalEncoder maps each unique category within a feature to a unique integer code. Codes are assigned incrementally, in order of first appearance, by a separate auto-incrementing counter for each feature. A category that has not been learned yet is encoded as unknown_value (0 by default), and None is encoded as none_value (-1 by default). With the defaults, the first category learned for a feature receives code 1, as the usage example on this page shows, so learned codes do not collide with the unknown code. The encoder stores the category-to-code mappings internally and supports single-observation processing on dicts as well as mini-batch processing on pandas DataFrames.
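The mechanism described above can be sketched in plain Python. This is a minimal illustration, not River's source: the class name TinyOrdinalEncoder and its internals are hypothetical, and it assumes that the counters skip the reserved unknown and None codes, which is consistent with the outputs in the usage example on this page.

```python
from collections import defaultdict
from itertools import count


class TinyOrdinalEncoder:
    """Hypothetical stand-in for illustration; not River's implementation."""

    def __init__(self, unknown_value=0, none_value=-1):
        self.unknown_value = unknown_value
        self.none_value = none_value
        reserved = {unknown_value, none_value}
        # One auto-incrementing counter per feature, skipping reserved codes.
        self._counters = defaultdict(
            lambda: (i for i in count() if i not in reserved)
        )
        # feature name -> {category -> integer code}
        self.categories = defaultdict(dict)

    def learn_one(self, x):
        # Assign the next free code to every category not seen before.
        for feature, value in x.items():
            if value is not None and value not in self.categories[feature]:
                self.categories[feature][value] = next(self._counters[feature])

    def transform_one(self, x):
        # None gets its own code; unseen categories get the unknown code.
        return {
            feature: (
                self.none_value
                if value is None
                else self.categories[feature].get(value, self.unknown_value)
            )
            for feature, value in x.items()
        }


encoder = TinyOrdinalEncoder()
first = encoder.transform_one({"country": "France"})  # not learned yet
encoder.learn_one({"country": "France"})
encoder.learn_one({"country": "Sweden"})
seen = encoder.transform_one({"country": "Sweden"})
missing = encoder.transform_one({"country": None})
print(first, seen, missing)
# {'country': 0} {'country': 2} {'country': -1}
```

Keeping the counters and mappings per feature means the same category string can receive different codes in different features, which matches the "country" and "place" columns of the usage example.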

Usage

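Use this encoder when you need a simple integer encoding for categorical variables, particularly for tree-based models that can split on integer codes directly. It is more memory-efficient than one-hot encoding for high-cardinality features, because each feature stays a single integer column instead of expanding into one column per category. The unknown_value parameter sets the code given to categories that were not seen during learning, so new categories at inference time are mapped to a fixed code rather than causing a failure. The codes reflect the order in which categories were first seen, not a meaningful ranking, so the encoder suits cases where category order does not matter but a numeric representation is required.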
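The memory argument can be illustrated with a rough plain-Python comparison. The feature name "merchant", the 10,000 labels, and both helper functions are hypothetical, and the one-hot variant here is deliberately dense; sparse one-hot representations narrow the gap.

```python
# 10,000 hypothetical category labels for a single feature.
categories = [f"merchant_{i}" for i in range(10_000)]

# Ordinal: one lookup table shared by all observations, codes starting at 1.
ordinal_codes = {cat: code for code, cat in enumerate(categories, start=1)}


def ordinal_encode(value):
    # A single integer per observation; 0 stands for "unknown" here.
    return {"merchant": ordinal_codes.get(value, 0)}


def dense_one_hot_encode(value):
    # One indicator entry per known category.
    return {f"merchant_{cat}": int(cat == value) for cat in categories}


ordinal_row = ordinal_encode("merchant_42")
one_hot_row = dense_one_hot_encode("merchant_42")
print(len(ordinal_row), len(one_hot_row))
# 1 10000
```

The trade-off is that the integer codes imply an ordering that carries no meaning, which is why the encoding is usually paired with models that do not rely on numeric distance between codes.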

Code Reference

Source Location

Signature

class OrdinalEncoder(base.MiniBatchTransformer):
    def __init__(
        self,
        unknown_value: int | None = 0,
        none_value: int = -1,
    )

Import

from river import preprocessing

I/O Contract

Input: Dict[str, Any] - Categorical features
Output: Dict[str, int] - Integer-encoded features

Usage Examples

from river import preprocessing

X = [
    {"country": "France", "place": "Taco Bell"},
    {"country": None, "place": None},
    {"country": "Sweden", "place": "Burger King"},
    {"country": "France", "place": "Burger King"},
    {"country": "Russia", "place": "Starbucks"},
    {"country": "Russia", "place": "Starbucks"},
    {"country": "Sweden", "place": "Taco Bell"},
    {"country": None, "place": None},
]

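encoder = preprocessing.OrdinalEncoder()
for x in X:
    # Transform before learning: a category's first occurrence is encoded
    # as unknown (0), and None is always encoded as -1.
    print(encoder.transform_one(x))
    encoder.learn_one(x)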
# {'country': 0, 'place': 0}
# {'country': -1, 'place': -1}
# {'country': 0, 'place': 0}
# {'country': 1, 'place': 2}
# {'country': 0, 'place': 0}
# {'country': 3, 'place': 3}
# {'country': 2, 'place': 1}
# {'country': -1, 'place': -1}

Related Pages
