Principle:Online ml River Estimator Base Architecture

Knowledge Sources	Machine Learning Design Patterns
Domains	Online_Learning Software_Design API_Design
Last Updated	2026-02-08 18:00 GMT

Overview

The base estimator architecture defines a hierarchy of abstract interfaces for online machine learning models. Each interface specifies the minimal contract (methods and properties) that a particular type of estimator must satisfy, enabling polymorphic composition, evaluation, and testing across the entire framework.

Description

A well-designed online ML framework requires a consistent interface hierarchy that separates concerns by task type. The base architecture typically defines:

Base estimator: The root interface providing common functionality (cloning, parameter access, string representation).
Classifier: Adds predict_one(x) and predict_proba_one(x) for classification tasks.
Regressor: Adds predict_one(x) returning a continuous value.
Clusterer: Adds predict_one(x), which returns a cluster label, and manages cluster centers.
Transformer: Adds transform_one(x) for feature transformations.
Drift detector: Adds update(value) and exposes drift/warning state.
Ensemble: Manages a collection of sub-models with coordinated learning.
Multi-output: Handles tasks with multiple target variables simultaneously.
Wrapper: Delegates to an inner estimator while adding behavior (decorator pattern).

This hierarchy enables programming to interfaces: evaluation functions, pipelines, and meta-learners depend only on abstract types, not concrete implementations. Any model that satisfies the interface can be used interchangeably.
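As a minimal sketch of programming to interfaces and of the wrapper (decorator) role, the example below trains a plain model and a wrapped model through the same generic function. MajorityClassifier, CountingWrapper, and train are hypothetical names introduced only for illustration; they are not taken from any particular framework.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical toy classifier: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1
        return self

    def predict_one(self, x):
        # None signals that nothing has been learned yet.
        return self.counts.most_common(1)[0][0] if self.counts else None


class CountingWrapper:
    """Hypothetical wrapper: delegates to an inner model and counts updates."""

    def __init__(self, model):
        self.model = model
        self.n_updates = 0

    def learn_one(self, x, y):
        self.n_updates += 1          # behavior added by the wrapper
        self.model.learn_one(x, y)   # delegation to the inner estimator
        return self

    def predict_one(self, x):
        return self.model.predict_one(x)


def train(model, stream):
    """Generic code: relies only on learn_one, never on a concrete class."""
    for x, y in stream:
        model.learn_one(x, y)
    return model


stream = [({"a": 1}, "spam"), ({"a": 2}, "ham"), ({"a": 3}, "spam")]
plain = train(MajorityClassifier(), stream)
wrapped = train(CountingWrapper(MajorityClassifier()), stream)
```

Because train depends only on the learn_one contract, the wrapped model can replace the plain one without any change to the calling code.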

Usage

Use a base estimator architecture when:

You are designing or extending an online ML framework.
You need to write generic code that works with any classifier, regressor, or transformer.
You want type checking and validation to confirm that models implement the required methods.
You need to compose models into pipelines, ensembles, or meta-learners.
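One possible way to validate that a model implements the required methods is a runtime-checkable protocol from Python's typing module. The sketch below is an assumption about how such a check could look; SupportsClassification, require_classifier, ConstantClassifier, and IdentityTransformer are hypothetical names.

```python
from typing import Any, Dict, Hashable, Protocol, runtime_checkable


@runtime_checkable
class SupportsClassification(Protocol):
    """Structural description of the classifier contract."""

    def learn_one(self, x: Dict[str, Any], y: Hashable) -> Any: ...

    def predict_one(self, x: Dict[str, Any]) -> Hashable: ...

    def predict_proba_one(self, x: Dict[str, Any]) -> Dict[Hashable, float]: ...


def require_classifier(model: object) -> SupportsClassification:
    """Fail early, with a readable message, if a required method is missing."""
    if not isinstance(model, SupportsClassification):
        raise TypeError(
            f"{type(model).__name__} does not satisfy the classifier contract"
        )
    return model


class ConstantClassifier:
    """Hypothetical classifier that always predicts the same label."""

    def __init__(self, label):
        self.label = label

    def learn_one(self, x, y):
        return self

    def predict_one(self, x):
        return self.label

    def predict_proba_one(self, x):
        return {self.label: 1.0}


class IdentityTransformer:
    """Hypothetical transformer: has transform_one but no prediction methods."""

    def learn_one(self, x):
        return self

    def transform_one(self, x):
        return dict(x)
```

Note that isinstance against a runtime-checkable protocol only verifies that the methods exist; it does not check their signatures or return types, so a static type checker remains useful alongside it.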

Theoretical Basis

The base estimator architecture applies two core design principles, complemented by the mixin pattern:

Interface Segregation Principle (ISP): Each task type defines only the methods it requires:

Base
  clone() -> Base          # plus parameter access and string representation

Classifier(Base)
  learn_one(x, y) -> self
  predict_one(x) -> label
  predict_proba_one(x) -> dict[label, float]

Regressor(Base)
  learn_one(x, y) -> self
  predict_one(x) -> float

Transformer(Base)
  learn_one(x) -> self     # unsupervised variant
  transform_one(x) -> dict

Clusterer(Base)
  learn_one(x) -> self     # unsupervised
  predict_one(x) -> int

DriftDetector(Base)
  update(value) -> self    # no learn_one: a detector consumes a single value
  drift_detected: bool
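The interface listing above could be expressed with Python abstract base classes roughly as follows. This is a sketch under stated assumptions, not any library's actual class layout: learn_one is declared on each task interface rather than on the root so that a drift detector does not inherit it, clone is simplified to a deep copy, and RunningMean is a hypothetical toy model.

```python
import abc
import copy
from typing import Any, Dict, Hashable, Optional


class Base(abc.ABC):
    """Root interface: behavior shared by every component."""

    def clone(self) -> "Base":
        # Simplification: a deep copy keeps learned state; a framework may
        # instead rebuild a fresh instance from constructor parameters.
        return copy.deepcopy(self)

    def __repr__(self) -> str:
        return f"{type(self).__name__}()"


class Classifier(Base):
    @abc.abstractmethod
    def learn_one(self, x: Dict[str, Any], y: Hashable) -> "Classifier": ...

    @abc.abstractmethod
    def predict_proba_one(self, x: Dict[str, Any]) -> Dict[Hashable, float]: ...

    def predict_one(self, x: Dict[str, Any]) -> Optional[Hashable]:
        # Default: the most probable label, or None before any learning.
        proba = self.predict_proba_one(x)
        return max(proba, key=proba.get) if proba else None


class Regressor(Base):
    @abc.abstractmethod
    def learn_one(self, x: Dict[str, Any], y: float) -> "Regressor": ...

    @abc.abstractmethod
    def predict_one(self, x: Dict[str, Any]) -> float: ...


class Transformer(Base):
    def learn_one(self, x: Dict[str, Any]) -> "Transformer":
        return self  # stateless transformers need no update

    @abc.abstractmethod
    def transform_one(self, x: Dict[str, Any]) -> Dict[str, Any]: ...


class DriftDetector(Base):
    @abc.abstractmethod
    def update(self, value: float) -> "DriftDetector": ...

    @property
    @abc.abstractmethod
    def drift_detected(self) -> bool: ...


class RunningMean(Regressor):
    """Hypothetical toy regressor: ignores x, predicts the mean of past targets."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n  # incremental mean update
        return self

    def predict_one(self, x):
        return self.mean


model = RunningMean().learn_one({}, 2.0).learn_one({}, 4.0)
```

Instantiating Regressor directly raises TypeError because its abstract methods are unimplemented, which is one way the hierarchy enforces each contract.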

Liskov Substitution Principle (LSP): Any concrete implementation of an interface can be substituted wherever that interface is expected. This enables generic evaluation:

def evaluate(model: Classifier, stream, metric):
    for x, y in stream:
        y_pred = model.predict_one(x)    # works for ANY classifier
        metric.update(y, y_pred)         # score the prediction before learning from it
        model.learn_one(x, y)
    return metric
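A self-contained version of this loop might look like the sketch below, which assumes the metric is passed in explicitly. Accuracy, MajorityClassifier, and LastLabelClassifier are hypothetical stand-ins for illustration; the point is that evaluate runs unchanged with either classifier.

```python
from collections import Counter


class Accuracy:
    """Hypothetical streaming metric: fraction of correct predictions."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, y_true, y_pred):
        self.correct += int(y_true == y_pred)
        self.total += 1

    def get(self):
        return self.correct / self.total if self.total else 0.0


class MajorityClassifier:
    """Predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1
        return self

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class LastLabelClassifier:
    """Predicts the label of the most recent example."""

    def __init__(self):
        self.last = None

    def learn_one(self, x, y):
        self.last = y
        return self

    def predict_one(self, x):
        return self.last


def evaluate(model, stream, metric):
    """Predict first, score, then learn, so each example is unseen when scored."""
    for x, y in stream:
        y_pred = model.predict_one(x)
        metric.update(y, y_pred)
        model.learn_one(x, y)
    return metric


stream = [({"f": 1}, "a"), ({"f": 2}, "a"), ({"f": 3}, "b"), ({"f": 4}, "a")]
majority_score = evaluate(MajorityClassifier(), stream, Accuracy()).get()
last_label_score = evaluate(LastLabelClassifier(), stream, Accuracy()).get()
```

Predicting before learning means every example acts as test data exactly once, which is why the order of the three calls inside the loop matters.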

Mixin pattern: Some interfaces may be combined. For example, a model that is both a classifier and a transformer (e.g., for embedding extraction) implements both interfaces. This is achieved through multiple inheritance or mixin classes.
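A rough illustration of combining two roles through mixins follows. All class names are hypothetical, and exposing class probabilities as features is just one assumed way a classifier might double as a transformer.

```python
from collections import Counter


class ClassifierMixin:
    """Derives predict_one from predict_proba_one."""

    def predict_one(self, x):
        proba = self.predict_proba_one(x)
        return max(proba, key=proba.get) if proba else None


class ProbaTransformerMixin:
    """Exposes class probabilities as features for a downstream model."""

    def transform_one(self, x):
        return {
            f"proba_{label}": p
            for label, p in self.predict_proba_one(x).items()
        }


class PriorClassifier(ClassifierMixin, ProbaTransformerMixin):
    """Hypothetical toy model: class probabilities are observed label frequencies."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1
        return self

    def predict_proba_one(self, x):
        total = sum(self.counts.values())
        # With no observations the comprehension is empty, so no division occurs.
        return {label: n / total for label, n in self.counts.items()}


model = PriorClassifier()
for label in ["cat", "cat", "dog", "cat"]:
    model.learn_one({}, label)
```

Each mixin contributes one capability and relies only on predict_proba_one, so the concrete class needs to implement just the learning and probability logic to satisfy both interfaces.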
