Principle:Microsoft Onnxruntime Source Framework Training

Metadata

Field	Value
Principle Name	Source_Framework_Training
Repository	Microsoft_Onnxruntime
Source Repository	https://github.com/microsoft/onnxruntime
Domain	ML_Inference, Model_Conversion
Last Updated	2026-02-10
Workflow	Train_Convert_Predict
Pair	1 of 5

Overview

This principle covers training a machine learning model in a source framework (e.g., scikit-learn) before converting it to ONNX format.

Description

The first step in the ONNX conversion workflow is training a model using a familiar ML framework. The trained model captures learned parameters (weights, biases, thresholds) that will be preserved during conversion to ONNX format for optimized inference.

This is an External Tool Doc for scikit-learn. The ONNX Runtime ecosystem supports conversion from multiple source frameworks, with scikit-learn being one of the most common for classical machine learning models.

The training process follows the standard scikit-learn pattern:

1. Load and prepare the dataset (features and labels).
2. Split into training and test sets using train_test_split.
3. Instantiate a model estimator (e.g., LogisticRegression, RandomForestClassifier).
4. Fit the model on training data using the .fit() method.
5. Validate predictions on the test set before conversion.

The training workflow is demonstrated at docs/python/examples/plot_train_convert_predict.py:L22-34 for LogisticRegression and L178-181 for RandomForestClassifier.
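The five steps above can be sketched end to end as follows. This is a minimal illustration, not a copy of the referenced example file: the Iris dataset, `random_state=0`, `max_iter=1000`, and the use of `accuracy_score` for the validation step are all assumptions chosen to keep the run reproducible and free of convergence warnings.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. Load the dataset as a feature matrix X and a label vector y.
X, y = load_iris(return_X_y=True)

# 2. Split into training and test sets.
#    random_state=0 is an assumption, used only to make the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 3. Instantiate the estimator.
#    max_iter=1000 is an assumption that gives the solver room to converge.
clr = LogisticRegression(max_iter=1000)

# 4. Fit the model on the training split.
clr.fit(X_train, y_train)

# 5. Validate predictions on the held-out split before any conversion.
y_pred = clr.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(f"test accuracy: {test_accuracy:.3f}")
```

Recording the test-set predictions at this stage gives a baseline that the converted ONNX model's outputs can later be compared against.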

Theoretical Basis

Model training is the process of learning, from data, the parameters that minimize a loss function. The trained model is a stateful object containing:

Learned parameters -- Weights, biases, coefficients, and other numerical values determined during training.
Model structure -- The architecture and hyperparameters of the model (e.g., number of trees, tree depth, regularization strength).
Preprocessing state -- Any fitted transformers or scalers that are part of the pipeline.

These components must all be faithfully preserved during conversion to ONNX format. The ONNX conversion tools (such as skl2onnx) inspect the trained model's internal state and translate it into an equivalent ONNX computational graph.
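To make the three kinds of state concrete, the sketch below fits two small models and prints where scikit-learn stores each piece. It only reads standard fitted attributes (`mean_`, `coef_`, `intercept_`, `estimators_`). It does not show how any converter actually walks them, and the dataset and hyperparameter values are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# A pipeline bundles preprocessing state with the model's learned parameters.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

# make_pipeline names each step after its lowercased class name.
scaler = pipe.named_steps["standardscaler"]
clf = pipe.named_steps["logisticregression"]

# Preprocessing state: per-feature mean and scale learned by the scaler.
print("scaler mean shape:", scaler.mean_.shape)
print("scaler scale shape:", scaler.scale_.shape)

# Learned parameters: one weight row and one intercept per class.
print("coef shape (n_classes, n_features):", clf.coef_.shape)
print("intercept shape:", clf.intercept_.shape)

# Model structure: the fitted trees of a forest and their depths.
# n_estimators=10 and max_depth=3 are illustrative assumptions.
forest = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)
forest.fit(X, y)
tree_depths = [tree.get_depth() for tree in forest.estimators_]
print("number of trees:", len(forest.estimators_))
print("deepest tree:", max(tree_depths))
```

Because the scaler is part of the fitted pipeline, converting the pipeline object rather than the bare classifier is what keeps the preprocessing state together with the model.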

The choice of source framework is independent of the inference runtime -- the key benefit of ONNX is enabling training in any supported framework while deploying via a unified, optimized inference engine.

Usage

A scikit-learn model is trained using the standard fit/predict pattern:

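from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the Iris features and class labels.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out a test set (25% of the samples by default).
X_train, X_test, y_train, y_test = train_test_split(X, y)

# Fit the classifier on the training split.
clr = LogisticRegression()
clr.fit(X_train, y_train)

# Predict on the held-out split to check the model before conversion.
y_pred = clr.predict(X_test)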
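The same pattern carries over to the other estimator named earlier, RandomForestClassifier. The following is a hedged sketch rather than the code from the referenced example file: `n_estimators=10` and `random_state=0` are assumptions picked to keep the run small and repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Only the estimator changes; fit and predict are called exactly as before.
# n_estimators=10 and random_state=0 are illustrative assumptions.
rf = RandomForestClassifier(n_estimators=10, random_state=0)
rf.fit(X_train, y_train)

# Class labels and per-class probabilities for the held-out samples.
rf_pred = rf.predict(X_test)
rf_proba = rf.predict_proba(X_test)

# Mean accuracy on the test split, as a quick check before conversion.
print("test accuracy:", rf.score(X_test, y_test))
```

Keeping both the predicted labels and the probabilities is useful because a converted classifier typically exposes both, so each can be checked against its scikit-learn counterpart.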
