Principle: Microsoft Onnxruntime Conversion Validation
Metadata
| Field | Value |
|---|---|
| Principle Name | Conversion_Validation |
| Repository | Microsoft_Onnxruntime |
| Source Repository | https://github.com/microsoft/onnxruntime |
| Domain | ML_Inference, Model_Conversion |
| Last Updated | 2026-02-10 |
| Workflow | Train_Convert_Predict |
| Pair | 4 of 5 |
Overview
Verification that ONNX-converted model predictions match the original source framework predictions.
Description
After converting a model to ONNX format, it is critical to validate that the converted model produces identical or near-identical predictions compared to the original. This is done by running both models on the same test data and comparing outputs.
The validation process follows these steps:
- Load the converted ONNX model into an InferenceSession.
- Run inference on the same test data used to validate the original model.
- Compare the ONNX Runtime predictions with the original scikit-learn predictions.
- Use a confusion matrix or numerical comparison to verify equivalence.
The validation pattern is demonstrated at docs/python/examples/plot_train_convert_predict.py:L66-80. In this example, the confusion matrix between the original scikit-learn predictions and the ONNX Runtime predictions should be purely diagonal (all off-diagonal entries zero), indicating that every prediction matches.
Theoretical Basis
Conversion validation is essential because the translation from one model format to another involves multiple potential sources of discrepancy:
- Numerical precision -- Differences in floating-point arithmetic between frameworks can cause small deviations, especially for models sensitive to rounding.
- Operator semantics -- Subtle differences in how operators handle edge cases (e.g., tie-breaking in argmax) can lead to different predictions.
- Graph transformations -- Optimizations applied during conversion may alter the computation order, affecting floating-point accumulation.
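The precision point above is easy to demonstrate: casting a value from float64 to float32 (as ONNX conversion pipelines commonly do) discards precision beyond roughly seven significant digits. A minimal sketch, using a made-up value purely for illustration:

```python
import numpy as np

# A float64 value with more precision than float32 can represent.
x = np.array([0.1234567890123456], dtype=np.float64)

# Round-trip through float32, as a converted model effectively does.
x32 = x.astype(np.float32).astype(np.float64)

# The residual is tiny but nonzero -- the kind of deviation that
# makes exact output equality across frameworks unrealistic.
residual = abs(x - x32)[0]
print(residual)
```

Deviations at this scale are why probability outputs are compared with a tolerance rather than exact equality.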
For classification models, a confusion matrix comparison is the standard validation approach. A perfect diagonal matrix confirms that every sample receives the same predicted label from both the original and converted model.
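A minimal sketch of this diagonal check, using small hypothetical label arrays in place of real model outputs:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical predicted labels from the original model and the
# ONNX-converted model on the same test samples.
y_orig = np.array([0, 1, 2, 1, 0, 2])
y_onnx = np.array([0, 1, 2, 1, 0, 2])

cm = confusion_matrix(y_orig, y_onnx)

# A lossless conversion puts every count on the diagonal:
# the total number of samples equals the trace.
assert cm.sum() == np.trace(cm)
print(cm)
```

Any nonzero off-diagonal entry pinpoints exactly which class pairs disagree, which makes the confusion matrix more informative than a single accuracy delta.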
For regression models or probability outputs, numerical comparison with a tolerance threshold (e.g., numpy.allclose()) is more appropriate, as small floating-point differences are expected and acceptable.
Usage
Validation compares predictions from both the original model and the ONNX-converted version:
# X_test and pred (the scikit-learn predictions) come from the earlier
# train-and-convert steps of this workflow.
import numpy
import onnxruntime as rt
from sklearn.metrics import confusion_matrix

# Load the converted model and look up its input and output names.
sess = rt.InferenceSession("logreg_iris.onnx", providers=rt.get_available_providers())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

# Run on the same test data; this model expects float32 inputs.
pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]

# A purely diagonal confusion matrix means every prediction matches.
print(confusion_matrix(pred, pred_onx))