# Principle: Alibaba MNN Source Model Preparation
| Field | Value |
|---|---|
| Principle Name | Source_Model_Preparation |
| Category | Model_Conversion_Pipeline |
| Description | Preparing trained models for conversion to on-device formats |
| Applies To | Pre-conversion stage of the MNN deployment workflow |
## Overview
Before a trained deep learning model can be deployed on mobile or edge devices through MNN, it must first be serialized from its in-memory representation into a portable file format. This step, known as model serialization or model export, transforms a live computational graph with associated weights into a self-contained artifact that downstream tools can parse, optimize, and convert.
## Why Model Serialization Is Needed
During training, a model exists as an in-memory data structure tightly coupled to the training framework's runtime (e.g., PyTorch's autograd graph, TensorFlow's eager tensors). This representation is unsuitable for deployment for several reasons:
- Runtime dependency -- The in-memory graph requires the full training framework to execute, which is far too large for mobile devices.
- Dynamic vs. static semantics -- Training frameworks allow dynamic control flow and Python-level logic that must be resolved or traced before conversion.
- Weight consolidation -- Model parameters may be distributed across multiple GPU memories, gradient buffers, and optimizer states. Serialization extracts only the inference-relevant weights.
- Reproducibility -- A serialized file is a frozen snapshot of the model at a specific point in time, guaranteeing deterministic behavior across environments.
## Serialization Approaches
### TorchScript Tracing
Tracing executes the model with a representative input and records the operations performed. The result is a ScriptModule that captures the exact sequence of operations for that particular input path. Tracing works well for models without data-dependent control flow.
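A minimal sketch of tracing, assuming PyTorch is available; the module name and file name here are illustrative, not part of any MNN API:

```python
import torch
import torch.nn as nn

# Toy model with no data-dependent control flow -- a good tracing candidate.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = SmallNet().eval()     # switch to inference mode before export
example = torch.randn(1, 8)   # representative input: shape and dtype matter

# Tracing runs the model once and records the operations executed
# for this particular input path into a ScriptModule.
traced = torch.jit.trace(model, example)
traced.save("small_net_traced.pt")
```

Because tracing only records what actually ran, any `if`/`for` branch not taken for `example` is silently absent from the saved graph.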
### TorchScript Scripting
Scripting analyzes the Python source code of the model and compiles it to TorchScript IR. Unlike tracing, scripting preserves control flow (if/else, loops) but requires all code to be compatible with the TorchScript subset of Python.
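A hedged sketch of scripting, assuming PyTorch; `GatedNet` is a made-up example whose branch depends on the input values, which is exactly the case tracing cannot capture:

```python
import torch
import torch.nn as nn

# Data-dependent control flow: tracing would bake in whichever branch
# the example input happened to take; scripting preserves the if/else.
class GatedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        if x.sum() > 0:
            return self.fc(x)
        return -self.fc(x)

# Scripting compiles the module's Python source to TorchScript IR.
scripted = torch.jit.script(GatedNet().eval())
scripted.save("gated_net_scripted.pt")
```

The trade-off is that every construct in `forward` must fall within the TorchScript-supported subset of Python, or compilation fails.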
### ONNX Export
The Open Neural Network Exchange (ONNX) format provides a framework-agnostic intermediate representation. Models are exported by tracing execution against a given input and mapping framework-specific operations to standardized ONNX operators at a specified opset version. ONNX is one of the most commonly used input formats for MNNConvert.
### TensorFlow SavedModel / Frozen Graph
TensorFlow models are serialized as Protocol Buffer (.pb) files containing the graph definition and variable values. A "frozen" graph has all variables converted to constants, making it self-contained.
### Caffe Model Files
Caffe uses a pair of files: a .prototxt file defining the network architecture and a .caffemodel file containing the learned weights, both serialized via Protocol Buffers.
### TFLite FlatBuffers
TensorFlow Lite models (.tflite) use the FlatBuffers format for efficient zero-copy deserialization on mobile devices. These can be produced from TensorFlow SavedModels via the TFLite converter.
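A sketch of producing a `.tflite` file, assuming TensorFlow with its Keras API is available; the tiny model here is purely illustrative:

```python
import tensorflow as tf

# Minimal Keras model to convert.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
])

# The TFLite converter serializes the model into the FlatBuffers format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```

The FlatBuffers layout is what enables zero-copy deserialization: the runtime can map the file and read tensors in place without a parse step.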
## Key Considerations
- Eval mode -- Models must be switched to evaluation mode before export (e.g., `model.eval()` in PyTorch) to disable dropout, make batch normalization use its stored running statistics, and turn off other training-only behavior.
- Representative input -- For tracing-based approaches, the example input must have the correct shape, dtype, and value range to exercise the intended code path.
- Opset compatibility -- When exporting to ONNX, the opset version determines which operators are available. MNN's ONNX converter supports a broad range of opsets; opset 13 is a commonly used default.
- Dynamic shapes -- If the model should support variable batch sizes or sequence lengths, dynamic axes must be explicitly declared during export.
## Supported Input Formats for MNN
MNN's converter (MNNConvert) accepts the following serialized formats:
- ONNX -- `.onnx` files (the most widely used path)
- TensorFlow -- `.pb` frozen-graph files
- Caffe -- `.caffemodel` + `.prototxt` file pairs
- TFLite -- `.tflite` FlatBuffer files
- TorchScript -- `.pt` / `.torchscript` files (requires building MNN with `MNN_BUILD_TORCH=ON`)
- MNN -- `.mnn` files (for re-optimization or format conversion)
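As a command-line fragment (not runnable without an MNN build), converting a serialized ONNX model typically looks like the following; the file names are placeholders:

```shell
# Convert an exported ONNX model to the MNN format.
# -f selects the source format; --bizCode tags the output model.
MNNConvert -f ONNX --modelFile model.onnx --MNNModel model.mnn --bizCode biz
```

The same tool handles the other formats listed above by changing the `-f` argument (e.g., `TF`, `CAFFE`, `TFLITE`).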