Principle:Online ml River Sklearn Compatibility

From Leeroopedia


Knowledge Sources: Machine Learning, Software Engineering
Domains: Online_Learning, Software_Design, Interoperability
Last Updated: 2026-02-08 18:00 GMT

Overview

Framework interoperability wrappers provide bidirectional compatibility between online (incremental) and batch machine learning APIs. They allow online models to be used within batch-oriented ecosystems (e.g., scikit-learn's cross-validation, grid search) and batch models to be used within streaming pipelines, bridging two fundamentally different learning paradigms.

Description

Online and batch ML frameworks differ in their core abstractions:

  • Batch frameworks (e.g., scikit-learn) expect fit(X, y) on a complete dataset and predict(X) on arrays. They support cross-validation, grid search, and other tools that assume the full dataset is available.
  • Online frameworks (e.g., River) expect learn_one(x, y) on individual instances and predict_one(x) on single dictionaries. They are designed for streaming, incremental updates.
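The difference between the two calling conventions can be seen with a pair of toy models. This is a minimal sketch in plain Python, not code from scikit-learn or River: BatchMeanRegressor and OnlineMeanRegressor are hypothetical classes invented for illustration, and the feature names "a" and "b" are assumptions.

```python
class BatchMeanRegressor:
    """Toy batch model (hypothetical): needs the full dataset up front."""

    def fit(self, X, y):
        # The whole target vector is available at once.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        # One prediction per row of the input array.
        return [self.mean_ for _ in X]


class OnlineMeanRegressor:
    """Toy online model (hypothetical): updated one instance at a time."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x, y):
        # Incremental running-mean update; no past data is stored.
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict_one(self, x):
        return self.mean


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [10.0, 20.0, 30.0]

# Batch style: arrays in, arrays out.
batch_preds = BatchMeanRegressor().fit(X, y).predict(X)

# Online style: one feature dictionary at a time.
online = OnlineMeanRegressor()
for row, target in zip(X, y):
    online.learn_one({"a": row[0], "b": row[1]}, target)
online_pred = online.predict_one({"a": 1.0, "b": 2.0})
```

Both toy models end up with the same estimate here, but only the online one could have produced a prediction after each individual instance.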

Interoperability wrappers bridge this gap in both directions:

Online-to-batch (River-to-sklearn):

  • Wraps an online model to expose scikit-learn's fit/predict API.
  • fit(X, y) iterates through the dataset calling learn_one for each instance.
  • predict(X) calls predict_one for each instance and collects results into an array.
  • Enables use of scikit-learn's cross-validation, pipeline, and hyperparameter tuning tools.
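The online-to-batch direction listed above might look roughly like the following. This is a sketch under stated assumptions, not River's actual wrapper: MajorityClassOnline and OnlineToBatchAdapter are hypothetical names, and the feature_names argument is an assumption about how array columns get mapped to dictionary keys.

```python
from collections import Counter


class MajorityClassOnline:
    """Toy online classifier (hypothetical): predicts the most frequent label."""

    def __init__(self):
        self.counts = Counter()

    def learn_one(self, x, y):
        self.counts[y] += 1

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


class OnlineToBatchAdapter:
    """Exposes fit/predict on top of learn_one/predict_one."""

    def __init__(self, online_model, feature_names):
        self.online_model = online_model
        self.feature_names = list(feature_names)

    def _to_dict(self, row):
        # Array row -> feature dictionary, using a fixed column order.
        return dict(zip(self.feature_names, row))

    def fit(self, X, y):
        # A single pass over the dataset, one instance at a time.
        for row, target in zip(X, y):
            self.online_model.learn_one(self._to_dict(row), target)
        return self

    def predict(self, X):
        return [self.online_model.predict_one(self._to_dict(row)) for row in X]


X_train = [[0.1, 1.0], [0.2, 0.9], [0.9, 0.1]]
y_train = ["spam", "spam", "ham"]

adapter = OnlineToBatchAdapter(MajorityClassOnline(), ["f0", "f1"])
predictions = adapter.fit(X_train, y_train).predict([[0.5, 0.5], [0.3, 0.3]])
```

An adapter meant for scikit-learn's cross-validation or grid search would also need the estimator conventions those tools rely on, such as get_params and set_params, which this sketch leaves out.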

Batch-to-online (sklearn-to-River):

  • Wraps a scikit-learn model that supports partial_fit so that it exposes River's learn_one/predict_one API.
  • learn_one(x, y) calls partial_fit on a single-instance array.
  • predict_one(x) calls predict on a single-instance array and returns a scalar.
  • Enables use of scikit-learn's incremental learners within River's streaming pipelines.
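The batch-to-online direction can be sketched the same way. The example below is illustrative only and does not use scikit-learn: ToySGDRegressor is a hypothetical stand-in for an incremental learner with partial_fit, and BatchToOnlineAdapter and its feature_names argument are assumptions made for this sketch.

```python
class ToySGDRegressor:
    """Hypothetical incremental learner with a partial_fit/predict interface."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.weights_ = None

    def _predict_row(self, row):
        return sum(w * v for w, v in zip(self.weights_, row))

    def partial_fit(self, X, y):
        if self.weights_ is None:
            self.weights_ = [0.0] * len(X[0])
        for row, target in zip(X, y):
            # One squared-error gradient step per row.
            error = self._predict_row(row) - target
            self.weights_ = [
                w - self.lr * error * v for w, v in zip(self.weights_, row)
            ]
        return self

    def predict(self, X):
        if self.weights_ is None:
            return [0.0 for _ in X]
        return [self._predict_row(row) for row in X]


class BatchToOnlineAdapter:
    """Exposes learn_one/predict_one on top of partial_fit/predict."""

    def __init__(self, batch_model, feature_names):
        self.batch_model = batch_model
        self.feature_names = list(feature_names)

    def _to_row(self, x):
        # Feature dictionary -> single row, in a fixed column order.
        return [x.get(name, 0.0) for name in self.feature_names]

    def learn_one(self, x, y):
        # Wrap the instance as a one-row batch.
        self.batch_model.partial_fit([self._to_row(x)], [y])
        return self

    def predict_one(self, x):
        # Unwrap the one-element prediction into a scalar.
        return self.batch_model.predict([self._to_row(x)])[0]


adapter = BatchToOnlineAdapter(ToySGDRegressor(lr=0.1), ["a", "b"])
for _ in range(100):
    adapter.learn_one({"a": 1.0, "b": 0.0}, 2.0)

prediction = adapter.predict_one({"a": 1.0, "b": 0.0})
same_prediction = adapter.predict_one({"b": 0.0, "a": 1.0})
```

Because the adapter looks features up by name, the order of keys in the incoming dictionary does not affect the row handed to the wrapped model.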

Usage

Use framework interoperability wrappers when:

  • You want to use an online model with scikit-learn's evaluation tools.
  • You want to include a scikit-learn incremental learner in a River pipeline.
  • You need to benchmark online models against batch baselines.
  • You are migrating between frameworks and need transitional compatibility.

Theoretical Basis

Adapter pattern: Interoperability wrappers implement the adapter design pattern, translating one interface into another:

# feature_names: a fixed, ordered list of column names shared by both adapters
Online-to-Batch Adapter:
  fit(X, y):
      for (x_i, y_i) in zip(X, y):
          online_model.learn_one(dict(zip(feature_names, x_i)), y_i)
      return self

  predict(X):
      return [online_model.predict_one(dict(zip(feature_names, x_i))) for x_i in X]

Batch-to-Online Adapter:
  learn_one(x, y):
      batch_model.partial_fit([[x[name] for name in feature_names]], [y])
      return self

  predict_one(x):
      return batch_model.predict([[x[name] for name in feature_names]])[0]

Data format translation: Online models use dictionaries ({"feature_a": 1.0, "feature_b": 2.0}) while batch models use arrays/DataFrames. Wrappers handle this conversion, including maintaining consistent feature ordering.
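The conversion itself is small but easy to get wrong if it relies on dictionary key order. A minimal sketch follows, assuming a fixed list of feature names and a default fill value for missing features; dict_to_row, row_to_dict, and the fill parameter are hypothetical helpers, not part of either library.

```python
def dict_to_row(x, feature_names, fill=0.0):
    """Feature dictionary -> list, ordered by feature_names (assumed fixed)."""
    return [x.get(name, fill) for name in feature_names]


def row_to_dict(row, feature_names):
    """Array row -> feature dictionary, pairing each value with its column name."""
    if len(row) != len(feature_names):
        raise ValueError("row length does not match the number of feature names")
    return dict(zip(feature_names, row))


feature_names = ["feature_a", "feature_b"]

# The same instance with keys in a different order maps to the same row.
row_1 = dict_to_row({"feature_a": 1.0, "feature_b": 2.0}, feature_names)
row_2 = dict_to_row({"feature_b": 2.0, "feature_a": 1.0}, feature_names)

# A feature that is absent from the dictionary gets the fill value.
row_3 = dict_to_row({"feature_a": 1.0}, feature_names)

round_trip = row_to_dict(row_1, feature_names)
```

Filling missing features with a default is only one possible policy; a wrapper could equally raise an error, and which choice is right depends on the wrapped model.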

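Partial fit requirement: The batch-to-online direction requires that the scikit-learn model supports partial_fit, which limits compatibility to incremental learners (e.g., SGDClassifier, MiniBatchKMeans). Models without partial_fit cannot be meaningfully wrapped for streaming use, because each update would require refitting on all data seen so far. For classifiers, partial_fit also needs the complete set of class labels on its first call, since an early part of the stream may not contain every class.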
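A wrapper can enforce this requirement when it is constructed rather than failing on the first streamed instance. The check below is a sketch: require_partial_fit, FitOnlyModel, and IncrementalModel are hypothetical names used for illustration, and the toy models only mimic the shape of the relevant methods.

```python
class FitOnlyModel:
    """Hypothetical batch-only model: has fit and predict, but no partial_fit."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [0 for _ in X]


class IncrementalModel(FitOnlyModel):
    """Hypothetical incremental model: adds partial_fit."""

    def partial_fit(self, X, y):
        return self


def require_partial_fit(model):
    """Return the model unchanged, or raise if it cannot learn incrementally."""
    if not callable(getattr(model, "partial_fit", None)):
        raise TypeError(
            f"{type(model).__name__} has no partial_fit method and cannot be "
            "wrapped for streaming use"
        )
    return model


accepted = require_partial_fit(IncrementalModel())

try:
    require_partial_fit(FitOnlyModel())
    rejected = False
except TypeError:
    rejected = True
```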
