Principle: Microsoft Onnxruntime Conversion Validation
Metadata
| Field | Value |
|---|---|
| Principle Name | Conversion_Validation |
| Repository | Microsoft_Onnxruntime |
| Source Repository | https://github.com/microsoft/onnxruntime |
| Domain | ML_Inference, Model_Conversion |
| Last Updated | 2026-02-10 |
| Workflow | Train_Convert_Predict |
| Pair | 4 of 5 |
Overview
Verification that ONNX-converted model predictions match the original source framework predictions.
Description
After converting a model to ONNX format, it is critical to validate that the converted model produces identical or near-identical predictions compared to the original. This is done by running both models on the same test data and comparing outputs.
The validation process follows these steps:
- Load the converted ONNX model into an InferenceSession.
- Run inference on the same test data used to validate the original model.
- Compare the ONNX Runtime predictions with the original scikit-learn predictions.
- Use a confusion matrix or numerical comparison to verify equivalence.
The validation pattern is demonstrated at docs/python/examples/plot_train_convert_predict.py:L66-80. In this example, the confusion matrix between the original scikit-learn predictions and the ONNX Runtime predictions should be purely diagonal (all off-diagonal entries zero), indicating that every prediction matches.
Theoretical Basis
Conversion validation is essential because the translation from one model format to another involves multiple potential sources of discrepancy:
- Numerical precision -- Differences in floating-point arithmetic between frameworks can cause small deviations, especially for models sensitive to rounding.
- Operator semantics -- Subtle differences in how operators handle edge cases (e.g., tie-breaking in argmax) can lead to different predictions.
- Graph transformations -- Optimizations applied during conversion may alter the computation order, affecting floating-point accumulation.
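The precision point above is easy to demonstrate: casting a value from float64 to float32 (as ONNX conversion pipelines commonly do) discards precision beyond roughly seven significant digits. A minimal sketch, using a made-up value purely for illustration:

```python
import numpy as np

# A float64 value with more precision than float32 can represent.
x = np.array([0.1234567890123456], dtype=np.float64)

# Round-trip through float32, as a converted model effectively does.
x32 = x.astype(np.float32).astype(np.float64)

# The residual is tiny but nonzero -- the kind of deviation that
# makes exact output equality across frameworks unrealistic.
residual = abs(x - x32)[0]
print(residual)
```

Deviations at this scale are why probability outputs are compared with a tolerance rather than exact equality.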
For classification models, a confusion matrix comparison is the standard validation approach. A perfect diagonal matrix confirms that every sample receives the same predicted label from both the original and converted model.
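A minimal sketch of this diagonal check, using small hypothetical label arrays in place of real model outputs:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical predicted labels from the original model and the
# ONNX-converted model on the same test samples.
y_orig = np.array([0, 1, 2, 1, 0, 2])
y_onnx = np.array([0, 1, 2, 1, 0, 2])

cm = confusion_matrix(y_orig, y_onnx)

# A lossless conversion puts every count on the diagonal:
# the total number of samples equals the trace.
assert cm.sum() == np.trace(cm)
print(cm)
```

Any nonzero off-diagonal entry pinpoints exactly which class pairs disagree, which makes the confusion matrix more informative than a single accuracy delta.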
For regression models or probability outputs, numerical comparison with a tolerance threshold (e.g., numpy.allclose()) is more appropriate, as small floating-point differences are expected and acceptable.
Usage
Validation compares predictions from both the original model and the ONNX-converted version:
# X_test and pred (the scikit-learn predictions) come from the earlier
# train-and-convert steps of this workflow.
import numpy
import onnxruntime as rt
from sklearn.metrics import confusion_matrix

# Load the converted model and look up its input and output names.
sess = rt.InferenceSession("logreg_iris.onnx", providers=rt.get_available_providers())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

# Run on the same test data; this model expects float32 inputs.
pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]

# A purely diagonal confusion matrix means every prediction matches.
print(confusion_matrix(pred, pred_onx))