Principle: Kornia ONNX Sequential Pipeline
| Knowledge Sources | |
|---|---|
| Domains | ONNX, Deployment, Pipeline_Design |
| Last Updated | 2026-02-09 15:00 GMT |
Overview
A technique for composing multiple ONNX models into a single sequential inference pipeline, with the computation graphs merged automatically.
Description
ONNX model composition combines multiple standalone models into a single inference pipeline. The technique involves:
- Loading individual ONNX models from files, URLs, or HuggingFace Hub.
- Defining input/output mappings between consecutive models.
- Merging computation graphs into a single combined graph.
- Creating a single inference session for the merged graph.
This approach enables preprocessing, model inference, and postprocessing to run as one optimized pipeline. Graph merging allows ONNX Runtime to apply cross-model optimizations that would not be possible with separate sessions.
Usage
Use when deploying multi-stage inference pipelines (e.g., resize -> normalize -> model -> postprocess) where each stage is available as an ONNX model. This yields single-session efficiency and allows the whole pipeline to be exported as one combined ONNX file.
Theoretical Basis
Given models M1, M2, ..., Mn with io_maps defining the connections between consecutive models, the combined graph is:
G = merge(M1, M2, ..., Mn, io_maps)
A single ONNX Runtime InferenceSession is created for G, enabling:
- Single allocation — one memory plan for all models.
- Cross-model optimization — operator fusion across model boundaries.
- Single invocation — one call executes the entire pipeline.