Environment:Microsoft Onnxruntime Sklearn Conversion Environment

From Leeroopedia


Field Value
sources docs/python/examples/plot_train_convert_predict.py, docs/python/requirements.txt
domains scikit-learn, model-conversion, onnx, inference, validation
last_updated 2026-02-10

Overview

Python environment for training scikit-learn models, converting them to ONNX format with skl2onnx, and validating predictions using onnxruntime.

Description

The Sklearn Conversion Environment supports the complete workflow of training a machine learning model with scikit-learn, converting it to ONNX format, and running validation inference through ONNX Runtime. Conversion relies on the skl2onnx library, whose convert_sklearn() function translates trained scikit-learn estimators (such as LogisticRegression, RandomForestClassifier, and GradientBoostingRegressor) into ONNX graphs. Input types are declared with FloatTensorType, which fixes the expected tensor shape and data type. The converted model is then loaded into an InferenceSession for prediction, so its outputs can be checked side by side against the original scikit-learn model. The workflow is demonstrated in plot_train_convert_predict.py, which serves as both a runnable example and a Sphinx-gallery documentation page. Building the documentation additionally requires sphinx, matplotlib, and related packages, as listed in docs/python/requirements.txt.

Usage

Use this environment whenever you need to:

  • Convert a trained scikit-learn model to ONNX format for deployment.
  • Validate that the ONNX-converted model produces outputs matching the original scikit-learn model.
  • Deploy scikit-learn models into ONNX Runtime-based inference pipelines.
  • Generate documentation or tutorials for the sklearn-to-ONNX conversion workflow.

System Requirements

| Requirement | Minimum | Recommended |
|---|---|---|
| Python | 3.10 | 3.12 |
| Operating System | Linux, Windows, macOS | Any |
| RAM | 2 GB | 8 GB (for larger datasets) |
| Disk | 500 MB | 1 GB (with documentation build) |

Dependencies

System Packages

No additional system packages are required beyond a standard Python installation. All dependencies are Python packages.

Python Packages

| Package | Version Constraint | Purpose |
|---|---|---|
| scikit-learn | (latest) | Model training (e.g., LogisticRegression, RandomForest) |
| skl2onnx | (latest) | Conversion of sklearn models to ONNX via convert_sklearn() |
| numpy | >= 1.21.6 | Data array construction and manipulation |
| onnxruntime | 1.25.0 | Inference on converted ONNX models |
| onnx | (latest) | ONNX model format library (dependency of skl2onnx) |

Documentation Build Dependencies (docs/python/requirements.txt)

| Package | Purpose |
|---|---|
| sphinx | Documentation generator |
| matplotlib | Plot generation for documentation examples |
| sphinx-gallery | Auto-generates documentation pages from example scripts |
| numpydoc | NumPy-style docstring rendering in Sphinx |

Credentials

No credentials, API keys, or environment variables are required for this environment.

Quick Install

pip install scikit-learn skl2onnx onnxruntime numpy

For documentation build:

pip install -r docs/python/requirements.txt

Verify installation:

python -c "import sklearn; import skl2onnx; import onnxruntime; print('All packages loaded successfully')"
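For a more detailed check than the one-liner, the following sketch reports the installed version of each package using only the standard library. The helper name `installed_versions` and the `REQUIRED` list are illustrative and do not come from the source repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as listed in the Python Packages table above.
REQUIRED = ["scikit-learn", "skl2onnx", "onnxruntime", "numpy", "onnx"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Looking up distribution metadata avoids importing the packages themselves, so the check still reports cleanly when one of them is missing or broken.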

Code Evidence

Sklearn model training and conversion (plot_train_convert_predict.py)

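# docs/python/examples/plot_train_convert_predict.py
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Load example data (the Iris dataset, 4 features) and hold out a test split
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

# Train a scikit-learn model
model = LogisticRegression()
model.fit(X_train, y_train)

# Define the input type for conversion: any batch size, one column per feature
initial_type = [('float_input', FloatTensorType([None, X_train.shape[1]]))]

# Convert the trained model to ONNX format
onnx_model = convert_sklearn(model, initial_types=initial_type)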

This snippet from the example script demonstrates the core three-step workflow: train a scikit-learn estimator, define input types using FloatTensorType, and convert to ONNX with convert_sklearn().

ONNX Runtime validation inference (plot_train_convert_predict.py)

# docs/python/examples/plot_train_convert_predict.py
import numpy
import onnxruntime as rt

# Load the converted ONNX model into an InferenceSession
session = rt.InferenceSession(onnx_model.SerializeToString())

# Run inference using the ONNX Runtime session
input_name = session.get_inputs()[0].name
pred_onnx = session.run(None, {input_name: X_test.astype(numpy.float32)})

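After conversion, the ONNX model is loaded into an InferenceSession and used for prediction, allowing direct comparison against model.predict(X_test) from scikit-learn. For a converted classifier, session.run() returns a list whose first element holds the predicted labels.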

FloatTensorType for input specification (plot_train_convert_predict.py)

# docs/python/examples/plot_train_convert_predict.py
from skl2onnx.common.data_types import FloatTensorType

# FloatTensorType([None, n_features]) specifies:
#   - None: dynamic batch dimension
#   - n_features: fixed feature count matching training data
initial_type = [('float_input', FloatTensorType([None, 4]))]

Common Errors

| Error | Cause | Solution |
|---|---|---|
| ModuleNotFoundError: No module named 'skl2onnx' | skl2onnx not installed | Run pip install skl2onnx |
| RuntimeError: Unsupported sklearn operator | The sklearn estimator type is not supported by skl2onnx | Check the skl2onnx supported operators list and update skl2onnx to the latest version |
| InvalidArgument: input tensor data type mismatch | Input data passed as float64 instead of float32 | Cast input: X_test.astype(numpy.float32) |
| RuntimeError: shape mismatch | Number of features in input does not match the model | Ensure FloatTensorType([None, n_features]) matches training data dimensions |
| ValueError: initial_types cannot be None | convert_sklearn() called without specifying input types | Always provide initial_types parameter with FloatTensorType definitions |
| ImportError: cannot import name 'FloatTensorType' | Outdated version of skl2onnx | Upgrade: pip install --upgrade skl2onnx |
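The data type and shape errors in the table can be caught before the data reaches ONNX Runtime. The helper below is a minimal sketch that depends only on NumPy. The name `prepare_input` is hypothetical and not part of skl2onnx or onnxruntime.

```python
import numpy as np


def prepare_input(X, n_features):
    """Return X as a 2-D float32 array, checking the feature count first."""
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {X.ndim} dimension(s)")
    if X.shape[1] != n_features:
        raise ValueError(
            f"expected {n_features} features per row, got {X.shape[1]}"
        )
    # float64 is NumPy's default; a FloatTensorType input needs float32.
    return X.astype(np.float32, copy=False)


batch = prepare_input([[5.1, 3.5, 1.4, 0.2]], n_features=4)
print(batch.dtype, batch.shape)  # prints: float32 (1, 4)
```

Failing early with a Python ValueError gives a clearer message than a runtime error raised from inside the inference session.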

Compatibility Notes

  • scikit-learn versions: The skl2onnx converter supports scikit-learn 1.0 and later. Some newer estimators or parameters may require updating skl2onnx to the latest version.
  • ONNX opset versions: The default target opset for conversion depends on the skl2onnx version. You can specify a target opset explicitly: convert_sklearn(model, initial_types=initial_type, target_opset=18).
  • Data types: A model converted with a FloatTensorType input expects float32 data. Cast NumPy arrays to float32 before running inference, even if scikit-learn was trained on float64 arrays (NumPy's default).
  • Pipelines: Complete scikit-learn Pipeline objects (including preprocessors like StandardScaler, OneHotEncoder) can be converted as a single unit.
  • Custom estimators: Custom sklearn-compatible estimators require registering a custom converter with skl2onnx before conversion.
  • Cross-platform: The converted ONNX model is platform-independent and can be deployed on any OS or runtime that supports ONNX Runtime.
