Workflow: Alibaba MNN Model Conversion Pipeline
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Model_Deployment, Model_Conversion |
| Last Updated | 2026-02-10 08:00 GMT |
Overview
End-to-end process for converting deep learning models from external frameworks (ONNX, TensorFlow, Caffe, TorchScript, TFLite) into the MNN format for efficient on-device inference.
Description
This workflow covers the standard procedure for transforming models trained in popular frameworks into MNN's native format (.mnn). The MNN model format uses FlatBuffers serialization for fast, zero-copy loading. The conversion pipeline handles graph optimization, operator mapping, optional weight quantization, and correctness verification. Supported inputs are TensorFlow frozen graphs (.pb), TensorFlow Lite (.tflite), Caffe (.caffemodel plus .prototxt), ONNX (.onnx), and TorchScript (.pt) models.
Key outputs:
- A .mnn model file ready for on-device inference
- Optionally, a separated .mnn.weight file for reduced peak memory
- Optional weight quantization (2-8 bit) for model size reduction
Usage
Execute this workflow when you have a trained deep learning model in TensorFlow, PyTorch/ONNX, Caffe, or TFLite format and need to deploy it on mobile devices (iOS/Android), embedded systems, or edge hardware using the MNN inference engine. This is the prerequisite step before any MNN inference workflow.
Execution Steps
Step 1: Prepare the source model
Ensure the source model is in a supported format. For PyTorch models, export to either ONNX (using torch.onnx.export) or TorchScript (using torch.jit.trace or torch.jit.script). For TensorFlow models, ensure the model is a frozen graph (.pb), not a SavedModel directory. For models with dynamic input shapes, specify dynamic_axes during ONNX export.
Key considerations:
- PyTorch models must be exported via torch.jit or torch.onnx, not raw .pth weight files
- TorchScript export requires the model to be in eval mode
- If input dimensions vary at runtime, mark them as dynamic during ONNX export
Step 2: Install or compile MNNConvert
Obtain the MNNConvert tool either by installing the PyMNN Python package (pip install MNN, which provides the mnnconvert CLI) or by compiling from source with the CMake option -DMNN_BUILD_CONVERTER=ON. The Python package is recommended for quick experimentation; the compiled binary is preferred for production.
Key considerations:
- Python route: pip install MNN provides mnnconvert command
- Source route: cmake .. -DMNN_BUILD_CONVERTER=ON && make -j8 produces MNNConvert binary
- Ensure the tool version matches the target MNN runtime version
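When wiring conversion into a build script, it can help to detect which route is installed. A small sketch; the two binary names checked here follow the pip and source routes above:

```python
import shutil
from typing import Optional

def find_converter() -> Optional[str]:
    """Return the path of an available converter CLI, or None.

    Checks the pip-installed `mnnconvert` first, then a
    source-built `MNNConvert` on PATH.
    """
    for name in ("mnnconvert", "MNNConvert"):
        path = shutil.which(name)
        if path:
            return path
    return None
```

If neither is found, fall back to installing the PyMNN package or compiling with -DMNN_BUILD_CONVERTER=ON as described above.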
Step 3: Execute model conversion
Run MNNConvert, specifying the source framework (-f), the input model file (--modelFile), and the output MNN model path (--MNNModel). Some frameworks need extra flags: Caffe, for example, also requires --prototxt for the network definition. Optionally set the graph optimization level (--optimizeLevel), enable weight quantization (--weightQuantBits), FP16 weight storage (--fp16), or external weight separation (--saveExternalData).
What happens:
- The converter parses the source model's graph and operators
- Operators are mapped to MNN's internal representation using FlatBuffers schema
- Graph optimization passes simplify the computation graph (fusing BatchNorm, removing Identity nodes, etc.)
- If weight quantization is enabled, float32 weights are compressed to the specified bit width
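The flag set above can be assembled programmatically; a sketch that builds the argument list for a conversion (flag names are taken from this step, the "CAFFE" framework string and defaults are illustrative assumptions):

```python
from typing import List, Optional

def build_convert_cmd(
    src: str,
    dst: str,
    framework: str = "ONNX",
    prototxt: Optional[str] = None,        # Caffe network definition
    weight_quant_bits: Optional[int] = None,  # 2-8 bit weight quantization
    fp16: bool = False,
) -> List[str]:
    """Assemble an MNNConvert command line from the options above."""
    cmd = ["mnnconvert", "-f", framework,
           "--modelFile", src, "--MNNModel", dst]
    if framework == "CAFFE" and prototxt:
        cmd += ["--prototxt", prototxt]
    if weight_quant_bits is not None:
        cmd += ["--weightQuantBits", str(weight_quant_bits)]
    if fp16:
        cmd += ["--fp16"]
    return cmd

# Example: ONNX conversion with 8-bit weight quantization.
print(" ".join(build_convert_cmd("net.onnx", "net.mnn", weight_quant_bits=8)))
```

The resulting list can be passed directly to subprocess.run for scripted batch conversion.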
Step 4: Verify conversion correctness
Run the provided verification scripts (testMNNFromOnnx.py, testMNNFromTf.py, testMNNFromTflite.py, or testMNNFromTorch.py) to compare MNN inference results against the original framework's outputs. The scripts generate random inputs, run inference in both the original framework and MNN, and compare outputs within a configurable threshold (default 0.01).
Key considerations:
- Requires the original framework's runtime installed (e.g., onnxruntime for ONNX)
- If TEST_SUCCESS is reported, the conversion is correct
- For errors, use DEBUG mode, which binary-searches the graph to locate the first problematic layer
- Identity op removal during optimization may require specifying an alternative output layer name
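At its core, the check these scripts perform is an element-wise comparison within a tolerance. A simplified sketch of that comparison (the 0.01 default is taken from this step; the function name is illustrative):

```python
def outputs_match(reference, candidate, threshold=0.01):
    """Return True if every element differs by at most `threshold`.

    `reference` holds the original framework's outputs and `candidate`
    the MNN outputs, both flattened to plain floats.
    """
    if len(reference) != len(candidate):
        return False
    return all(abs(r - c) <= threshold
               for r, c in zip(reference, candidate))

status = "TEST_SUCCESS" if outputs_match([0.12, 0.88], [0.121, 0.879]) else "TEST_FAILED"
print(status)  # prints "TEST_SUCCESS"
```

A failing comparison usually means a mis-mapped operator or a layout mismatch rather than numerical noise, which is where the DEBUG binary search helps.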
Step 5: Inspect and validate the MNN model
Use MNNConvert with the --info flag to print model metadata (input/output names, shapes, data types, version). Optionally export to JSON (--JsonFile) for human-readable inspection of the full model structure, including operator parameters and weight statistics.
Key considerations:
- Verify input/output names and shapes match expectations
- JSON export allows manual editing and re-conversion back to MNN format
- Check dimensionFormat (NCHW vs NHWC) matches your intended input pipeline
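The JSON export lends itself to scripted inspection; a sketch that counts operator types in an exported model. The key names used here ("oplists", "type") are assumptions about the exported schema and may differ across MNN versions:

```python
import json
from collections import Counter

def summarize_ops(json_path: str) -> Counter:
    """Count operator types in an MNN model exported via --JsonFile.

    Assumes the export contains an "oplists" array whose entries carry
    a "type" field; adjust the keys to match your MNN version.
    """
    with open(json_path) as f:
        net = json.load(f)
    return Counter(op.get("type", "?") for op in net.get("oplists", []))
```

A quick operator histogram makes it easy to confirm that optimization passes ran as expected, e.g. that no standalone BatchNorm or Identity ops remain after fusion.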