Principle: Microsoft ONNX Runtime Result Processing
Metadata
| Field | Value |
|---|---|
| Principle Name | Result_Processing |
| Repository | Microsoft_Onnxruntime |
| Source Repository | https://github.com/microsoft/onnxruntime |
| Domain | ML_Inference, Model_Optimization |
| Last Updated | 2026-02-10 |
| Workflow | Python_Inference_Pipeline |
| Pair | 6 of 6 |
Overview
Extraction and interpretation of inference output tensors for downstream use.
Description
After inference execution, results are returned as a list of numpy arrays. Each array corresponds to one of the requested outputs and can be indexed, sliced, or transformed using standard numpy operations for downstream tasks like classification, regression, or visualization. This is a Pattern Doc describing common patterns for handling ONNX Runtime output.
The return value of `session.run()` is a Python list where:
- The list index corresponds to the position of the output name in the `output_names` argument.
- Each element is a `numpy.ndarray` with shape and dtype matching the model's declared output.
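The positional mapping between requested output names and returned arrays can be sketched with plain numpy; the output names and values below are hypothetical stand-ins for what a real `session.run()` call would return:

```python
import numpy as np

# Hypothetical call: results = sess.run(["probs", "labels"], feeds)
# The returned list is aligned with the requested output names.
output_names = ["probs", "labels"]
results = [np.array([[0.1, 0.9], [0.8, 0.2]]), np.array([1, 0])]

# A name-to-array mapping built by position is often convenient downstream.
by_name = dict(zip(output_names, results))
probs = by_name["probs"]  # identical object to results[0]
```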
Common post-processing patterns include:
- Classification -- Using `numpy.argmax()` to extract predicted class labels from probability distributions.
- Regression -- Direct use of the output array as predicted values.
- Probability extraction -- Accessing probability vectors for confidence scoring and ROC analysis.
- Batch processing -- Slicing output arrays along the batch dimension to process individual samples.
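Several of these patterns can be shown together on a simulated output tensor; the probabilities below are made up for illustration and stand in for `results[0]` from a real inference call:

```python
import numpy as np

# Simulated classifier output: batch of 3 samples over 4 classes.
probs = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.05, 0.05, 0.80, 0.10],
    [0.25, 0.25, 0.25, 0.25],
])

# Classification: argmax over the class axis (ties resolve to the first index).
labels = np.argmax(probs, axis=-1)

# Probability extraction: confidence of the predicted class, per sample.
confidence = probs[np.arange(len(probs)), labels]

# Batch processing: slice along the batch dimension to isolate one sample.
first_sample = probs[0]
```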
The pattern is demonstrated at docs/python/examples/plot_load_and_predict.py:L55.
Theoretical Basis
ONNX Runtime returns outputs as numpy arrays to maintain compatibility with the broader Python scientific computing ecosystem. This design choice enables seamless integration with downstream libraries such as scikit-learn (for metrics), matplotlib (for visualization), and pandas (for tabular analysis).
The output list ordering is deterministic and matches the order of requested output names. When None is passed for output names, outputs are returned in the order they appear in the model's graph definition.
Post-processing operations are performed entirely in Python/numpy space and do not involve the ONNX Runtime execution engine. This clean separation between inference and post-processing allows for flexible output handling without runtime overhead.
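As an example of post-processing that happens entirely in numpy space, a stable softmax can turn raw logits into probabilities without touching the ONNX Runtime engine; the logits here are a hypothetical stand-in for a model output:

```python
import numpy as np

# Hypothetical raw logits, as a model might return via session.run().
logits = np.array([[2.0, 1.0, 0.1]])

# Numerically stable softmax, computed purely in numpy.
shifted = logits - logits.max(axis=-1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
```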
Usage
Results are accessed by list indexing and processed with numpy operations:

```python
import numpy

results = sess.run([output_name], {input_name: x})
predictions = results[0]  # First output tensor

# For classification:
predicted_class = numpy.argmax(predictions, axis=-1)
```
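Downstream, the predicted class indices are typically mapped to human-readable labels. A minimal numpy-only sketch, where both the class names and the prediction values are hypothetical stand-ins for real model output:

```python
import numpy as np

# Hypothetical class names and batched predictions.
class_names = ["cat", "dog"]
predictions = np.array([[0.2, 0.8], [0.9, 0.1]])

# Map each sample's argmax index to its label.
indices = np.argmax(predictions, axis=-1)
labels = [class_names[i] for i in indices]
```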