Workflow:Dotnet Machinelearning ONNX Model Scoring

From Leeroopedia
Revision as of 11:04, 16 February 2026 by Admin (auto-imported from workflows/Dotnet_Machinelearning_ONNX_Model_Scoring.md)


Knowledge Sources
Domains: Machine_Learning, Model_Interop, Inference
Last Updated: 2026-02-09 12:00 GMT

Overview

End-to-end process for importing pre-trained ONNX models into ML.NET pipelines and using them to score data, regardless of the framework the model was trained in (PyTorch, TensorFlow, scikit-learn), provided that framework can export to ONNX.

Description

This workflow outlines the procedure for consuming pre-trained models in ONNX (Open Neural Network Exchange) format within ML.NET applications. ONNX is a cross-platform model interchange format supported by most major ML frameworks. By using the Microsoft.ML.OnnxTransformer package, developers can import models trained in Python frameworks (PyTorch, TensorFlow, scikit-learn, XGBoost) and run inference on them within a .NET pipeline. This enables teams to leverage state-of-the-art models (BERT, ResNet, YOLO, custom DNNs) developed in the Python ecosystem while deploying them in production .NET services. The ONNX scorer integrates seamlessly with ML.NET's IDataView pipeline, allowing pre- and post-processing transforms to be chained with the model.

Usage

Execute this workflow when you have a pre-trained model in ONNX format and want to run inference on it from a .NET application. This is common when the model was developed by a data science team using Python/PyTorch/TensorFlow and needs to be deployed in a .NET production service, or when leveraging pre-trained models from ONNX Model Zoo for tasks like image classification, object detection, NLP, or time series analysis.

Execution Steps

Step 1: Obtain ONNX Model

Acquire the ONNX model file (.onnx) from one of three sources: export from a Python framework (torch.onnx.export, tf2onnx, sklearn-onnx), download from the ONNX Model Zoo, or convert from another format using the ONNX converter tools. Verify the model's input and output tensor specifications.

Key considerations:

  • Verify ONNX opset version compatibility with the ONNX Runtime version used by ML.NET
  • Document the model's expected input tensor shapes, data types, and names
  • Document the model's output tensor shapes, data types, and names
  • Test the ONNX model independently (e.g., with ONNX Runtime Python) before integrating into ML.NET
  • Some models may require specific preprocessing (normalization, resizing) that must be replicated in the ML.NET pipeline
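The documented tensor specifications can be captured as data and used to check candidate inputs before any ML.NET integration work begins. The following is a minimal, framework-free sketch in Python. The `TensorSpec` type, the `shape_matches` helper, and the example input (named `input`, shaped 3x224x224 with a dynamic batch dimension) are assumptions made for illustration and are not part of ONNX or ML.NET.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TensorSpec:
    """Documented description of one model input or output (hypothetical helper)."""
    name: str
    dtype: str                        # e.g. "float32" or "int64"
    shape: Tuple[Optional[int], ...]  # None marks a dynamic dimension


def shape_matches(spec: TensorSpec, actual: Sequence[int]) -> bool:
    """Return True if `actual` has the same rank as the spec and every fixed
    dimension agrees; dynamic (None) dimensions accept any size."""
    if len(actual) != len(spec.shape):
        return False
    return all(exp is None or exp == got for exp, got in zip(spec.shape, actual))


# Assumed example: an image classifier with a dynamic batch dimension.
image_input = TensorSpec(name="input", dtype="float32", shape=(None, 3, 224, 224))
```

With this spec, `shape_matches(image_input, (1, 3, 224, 224))` is accepted, while a channels-last shape such as `(1, 224, 224, 3)` is rejected, which is the kind of layout mismatch worth catching before it surfaces as a runtime error.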

Step 2: Initialize MLContext and Load Data

Create an MLContext and load the input data into an IDataView. The data schema should contain columns that will be transformed to match the ONNX model's expected input tensor format. For image models, this typically means loading image file paths; for tabular models, loading numeric features.

Key considerations:

  • Column types in the IDataView must be convertible to the ONNX model's expected input types
  • For image inputs, use ML.NET's image loading and resizing transforms to prepare pixel data
  • For text inputs, tokenization may be needed to produce integer token ID arrays
  • Ensure column names match the ONNX model's input node names; if they differ, rename them (for example with a ColumnName attribute on the input class or a CopyColumns transform)
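The name-matching requirement can be checked up front, before the pipeline is built. The sketch below is a language-neutral illustration written in Python. The `resolve_input_columns` helper and the column names in the usage note are hypothetical, and it only models the bookkeeping, not ML.NET's own schema validation.

```python
from typing import Dict, Iterable, List


def resolve_input_columns(
    data_columns: Iterable[str],
    model_inputs: Iterable[str],
    renames: Dict[str, str],
) -> List[str]:
    """Apply `renames` (data column -> model input name) and return the model
    inputs that still have no matching column afterwards."""
    available = {renames.get(col, col) for col in data_columns}
    return [name for name in model_inputs if name not in available]
```

An empty result means every model input is covered. For instance, a data set with a `pixels` column and a model input assumed to be named `data_0` resolves cleanly once `{"pixels": "data_0"}` is supplied as the rename map.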

Step 3: Build Preprocessing Pipeline

Construct a transform pipeline that converts raw input data into the tensor format expected by the ONNX model. This may include image resizing and pixel extraction, numeric normalization, type conversion, or reshaping columns into the correct tensor dimensions.

Key considerations:

  • Image models typically require the LoadImages, ResizeImages, and ExtractPixels transforms to turn image files into float arrays
  • The pixel extraction order (RGB vs BGR) and layout (interleaved vs planar) must match what the model was trained on
  • Numeric features may need normalization matching the training preprocessing (e.g., ImageNet mean/std)
  • Column shapes must match the ONNX model's input tensor shapes exactly
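To make the layout and normalization points concrete, here is a small sketch of what the preprocessing has to compute. It is written in plain Python for clarity and is not how ML.NET implements its transforms. The function names are hypothetical, and the mean and standard deviation values are the commonly used ImageNet statistics for RGB inputs scaled to [0, 1]; whether a given model expects them is an assumption to verify against its training code.

```python
from typing import List, Sequence

# Commonly used ImageNet statistics for RGB inputs scaled to [0, 1].
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def interleaved_to_planar(pixels: Sequence[float], channels: int = 3) -> List[float]:
    """Reorder [R0, G0, B0, R1, G1, B1, ...] into [R0, R1, ..., G0, G1, ..., B0, B1, ...]."""
    if len(pixels) % channels != 0:
        raise ValueError("pixel count is not a multiple of the channel count")
    return [pixels[i] for c in range(channels) for i in range(c, len(pixels), channels)]


def normalize_planar(
    pixels: Sequence[float],
    mean: Sequence[float] = IMAGENET_MEAN,
    std: Sequence[float] = IMAGENET_STD,
) -> List[float]:
    """Scale 0-255 planar values to [0, 1], then apply per-channel (x - mean) / std."""
    channels = len(mean)
    plane = len(pixels) // channels  # number of pixels per channel plane
    return [
        (pixels[c * plane + i] / 255.0 - mean[c]) / std[c]
        for c in range(channels)
        for i in range(plane)
    ]
```

Feeding an interleaved buffer to a model that expects planar data produces no error, only wrong predictions, so comparing a few values against the Python preprocessing used at training time is a cheap sanity check.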

Step 4: Apply ONNX Model Scorer

Add the ApplyOnnxModel transform to the pipeline, specifying the path to the .onnx file and the input/output column mappings. This transform loads the ONNX model via ONNX Runtime and executes inference as part of the ML.NET pipeline.

Key considerations:

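  • Specify inputColumnNames to select the IDataView columns fed to the model; each name must match an ONNX input node name
  • Specify outputColumnNames to select which ONNX output nodes are surfaced as IDataView columns
  • GPU execution can be enabled through the gpuDeviceId parameter, which requires the Microsoft.ML.OnnxRuntime.Gpu package and a compatible CUDA installation
  • The ONNX model is loaded once and reused across all data rows for efficiency
  • The fallbackToCpu parameter lets scoring fall back to CPU if the GPU cannot be initialized; the original input columns remain available alongside the model outputs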
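The load-once behaviour and the named input/output selection can be illustrated with a small stand-in. This Python sketch is conceptual only: `ScorerSketch`, `load_doubling_model`, and the column names `x` and `y` are hypothetical, and the "model" is an ordinary function rather than an ONNX Runtime session.

```python
from typing import Callable, Dict, Sequence

# A "model" here is any callable from named inputs to named outputs; a real
# pipeline would delegate to ONNX Runtime instead (assumption for illustration).
Model = Callable[[Dict[str, object]], Dict[str, object]]


class ScorerSketch:
    """Loads a model lazily on first use, then reuses it for every row."""

    def __init__(self, loader: Callable[[], Model],
                 input_names: Sequence[str], output_names: Sequence[str]):
        self._loader = loader
        self._model = None
        self.load_count = 0
        self.input_names = list(input_names)
        self.output_names = list(output_names)

    def score(self, row: Dict[str, object]) -> Dict[str, object]:
        if self._model is None:  # load once, on the first row
            self._model = self._loader()
            self.load_count += 1
        outputs = self._model({name: row[name] for name in self.input_names})
        # Keep the original columns and append only the requested outputs.
        return {**row, **{name: outputs[name] for name in self.output_names}}


def load_doubling_model() -> Model:
    """Stand-in for reading an .onnx file: doubles every value in `x`."""
    return lambda inputs: {"y": [2 * v for v in inputs["x"]], "unused": None}
```

Scoring several rows through one `ScorerSketch` instance leaves `load_count` at 1, and outputs that were not requested (here `unused`) never reach the result row.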

Step 5: Build Postprocessing Pipeline

Add transforms after the ONNX scorer to interpret the model's raw output tensors. This may include converting logits to probabilities (softmax), mapping class indices to label strings, extracting bounding boxes for detection models, or applying thresholds.

Key considerations:

  • Classification models typically output raw logits that need softmax normalization
  • Detection models output bounding boxes and class scores that need non-maximum suppression
  • Regression models output scalar predictions that may need denormalization
  • Custom postprocessing can be implemented via CustomMapping transforms
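The two most common postprocessing computations, softmax over logits and non-maximum suppression over detection boxes, can be sketched as follows. This is a plain Python reference for the arithmetic, assuming boxes in `(x1, y1, x2, y2)` corner format and a greedy NMS variant; real detection models may use a different box encoding or per-class suppression, and the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed format


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: Sequence[float], labels: Sequence[str]) -> Tuple[str, float]:
    """Map the highest-probability class index to its label string."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box], scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    overlaps every already-kept box by at most `iou_threshold`.
    Returns the kept indices in descending score order."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept
```

In an ML.NET pipeline the same logic would typically live in a CustomMapping transform or in application code that consumes the scored output.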

Step 6: Evaluate and Deploy

Score the data through the complete pipeline, evaluate results against ground truth if available, and deploy the pipeline for production inference. The complete pipeline (preprocessing + ONNX scorer + postprocessing) can be saved as a single ML.NET model.

Key considerations:

  • The saved pipeline includes preprocessing and postprocessing but references the ONNX model file externally
  • Ensure the ONNX model file is deployed alongside the ML.NET model
  • PredictionEngine wraps the full pipeline for single-row real-time inference
  • Batch scoring via Transform() is more efficient for large datasets
  • Monitor inference latency; ONNX Runtime GPU execution can reduce latency for deep models, but the gain depends on model size and batch size, so measure it on the target hardware
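Latency monitoring usually comes down to timing each prediction and reporting percentiles rather than averages. The sketch below shows one way to do that in Python. The `predict` callable stands in for whatever wraps the deployed pipeline, and both helper names and the nearest-rank percentile method are choices made for this example.

```python
import math
import time
from typing import Callable, List, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of `samples` for 0 < pct <= 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def measure_latency_ms(predict: Callable[[object], object],
                       rows: Sequence[object]) -> List[float]:
    """Time one `predict` call per row and return latencies in milliseconds."""
    latencies = []
    for row in rows:
        start = time.perf_counter()
        predict(row)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies
```

Comparing `percentile(latencies, 50)` with `percentile(latencies, 99)` for single-row and batched scoring gives a quick picture of whether tail latency, rather than typical latency, is the bottleneck.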
