# Principle: Alibaba MNN Source Model Preparation
| Field | Value |
|---|---|
| Principle Name | Source_Model_Preparation |
| Category | Model_Conversion_Pipeline |
| Description | Preparing trained models for conversion to on-device formats |
| Applies To | Pre-conversion stage of the MNN deployment workflow |
## Overview
Before a trained deep learning model can be deployed on mobile or edge devices through MNN, it must first be serialized from its in-memory representation into a portable file format. This step, known as model serialization or model export, transforms a live computational graph with associated weights into a self-contained artifact that downstream tools can parse, optimize, and convert.
## Why Model Serialization Is Needed
During training, a model exists as an in-memory data structure tightly coupled to the training framework's runtime (e.g., PyTorch's autograd graph, TensorFlow's eager tensors). This representation is unsuitable for deployment for several reasons:
- Runtime dependency -- The in-memory graph requires the full training framework to execute, which is far too large for mobile devices.
- Dynamic vs. static semantics -- Training frameworks allow dynamic control flow and Python-level logic that must be resolved or traced before conversion.
- Weight consolidation -- Model parameters may be distributed across multiple GPU memories, gradient buffers, and optimizer states. Serialization extracts only the inference-relevant weights.
- Reproducibility -- A serialized file is a frozen snapshot of the model at a specific point in time, guaranteeing deterministic behavior across environments.
## Serialization Approaches
### TorchScript Tracing
Tracing executes the model with a representative input and records the operations performed. The result is a ScriptModule that captures the exact sequence of operations for that particular input path. Tracing works well for models without data-dependent control flow.
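A minimal sketch of tracing, assuming PyTorch is available; the module name and file name here are illustrative, not part of any MNN API:

```python
import torch
import torch.nn as nn

# Toy model with no data-dependent control flow -- a good tracing candidate.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = SmallNet().eval()     # switch to inference mode before export
example = torch.randn(1, 8)   # representative input: shape and dtype matter

# Tracing runs the model once and records the operations executed
# for this particular input path into a ScriptModule.
traced = torch.jit.trace(model, example)
traced.save("small_net_traced.pt")
```

Because tracing only records what actually ran, any `if`/`for` branch not taken for `example` is silently absent from the saved graph.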
### TorchScript Scripting
Scripting analyzes the Python source code of the model and compiles it to TorchScript IR. Unlike tracing, scripting preserves control flow (if/else, loops) but requires all code to be compatible with the TorchScript subset of Python.
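A hedged sketch of scripting, assuming PyTorch; `GatedNet` is a made-up example whose branch depends on the input values, which is exactly the case tracing cannot capture:

```python
import torch
import torch.nn as nn

# Data-dependent control flow: tracing would bake in whichever branch
# the example input happened to take; scripting preserves the if/else.
class GatedNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        if x.sum() > 0:
            return self.fc(x)
        return -self.fc(x)

# Scripting compiles the module's Python source to TorchScript IR.
scripted = torch.jit.script(GatedNet().eval())
scripted.save("gated_net_scripted.pt")
```

The trade-off is that every construct in `forward` must fall within the TorchScript-supported subset of Python, or compilation fails.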
### ONNX Export
The Open Neural Network Exchange (ONNX) format provides a framework-agnostic intermediate representation. Models are exported by tracing execution against a given input and mapping framework-specific operations to standardized ONNX operators at a specified opset version. ONNX is one of the most commonly used input formats for MNNConvert.
### TensorFlow SavedModel / Frozen Graph
TensorFlow models are serialized as Protocol Buffer (.pb) files containing the graph definition and variable values. A "frozen" graph has all variables converted to constants, making it self-contained.
### Caffe Model Files
Caffe uses a pair of files: a .prototxt file defining the network architecture and a .caffemodel file containing the learned weights, both serialized via Protocol Buffers.
### TFLite FlatBuffers
TensorFlow Lite models (.tflite) use the FlatBuffers format for efficient zero-copy deserialization on mobile devices. These can be produced from TensorFlow SavedModels via the TFLite converter.
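A sketch of producing a `.tflite` file, assuming TensorFlow with its Keras API is available; the tiny model here is purely illustrative:

```python
import tensorflow as tf

# Minimal Keras model to convert.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
])

# The TFLite converter serializes the model into the FlatBuffers format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_bytes = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```

The FlatBuffers layout is what enables zero-copy deserialization: the runtime can map the file and read tensors in place without a parse step.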
## Key Considerations
- Eval mode -- Models must be switched to evaluation mode before export (e.g., `model.eval()` in PyTorch) to disable dropout, make batch normalization use its stored running statistics, and turn off other training-only behavior.
- Representative input -- For tracing-based approaches, the example input must have the correct shape, dtype, and value range to exercise the intended code path.
- Opset compatibility -- When exporting to ONNX, the opset version determines which operators are available. MNN's ONNX converter supports a broad range of opsets; opset 13 is a commonly used default.
- Dynamic shapes -- If the model should support variable batch sizes or sequence lengths, dynamic axes must be explicitly declared during export.
## Supported Input Formats for MNN
MNN's converter (MNNConvert) accepts the following serialized formats:
- ONNX -- `.onnx` files (the most widely used path)
- TensorFlow -- `.pb` frozen-graph files
- Caffe -- `.caffemodel` + `.prototxt` file pairs
- TFLite -- `.tflite` FlatBuffer files
- TorchScript -- `.pt` / `.torchscript` files (requires building MNN with `MNN_BUILD_TORCH=ON`)
- MNN -- `.mnn` files (for re-optimization or format conversion)
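As a command-line fragment (not runnable without an MNN build), converting a serialized ONNX model typically looks like the following; the file names are placeholders:

```shell
# Convert an exported ONNX model to the MNN format.
# -f selects the source format; --bizCode tags the output model.
MNNConvert -f ONNX --modelFile model.onnx --MNNModel model.mnn --bizCode biz
```

The same tool handles the other formats listed above by changing the `-f` argument (e.g., `TF`, `CAFFE`, `TFLITE`).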