
Implementation:Alibaba MNN MNNConvert CLI

From Leeroopedia


Field Value
Implementation Name MNNConvert_CLI
Type API Doc
Category Model_Conversion_Pipeline
Source tools/converter/source/MNNConverter.cpp:L11-22 (entry point), tools/converter/source/common/cli.cpp:L142-565 (CLI parsing)
External Dependencies protobuf, flatbuffers (bundled), optionally libtorch

Summary

MNNConvert is the primary command-line tool for converting deep learning models from various framework formats into MNN's optimized .mnn format. It supports ONNX, TensorFlow, Caffe, TFLite, TorchScript (optional), and MNN-to-MNN re-optimization.

API

MNNConvert -f {ONNX,TF,CAFFE,TFLITE,TORCH,MNN,JSON} --modelFile <source> --MNNModel <dest.mnn> [options]

Entry Point

The converter entry point is in tools/converter/source/MNNConverter.cpp:

// tools/converter/source/MNNConverter.cpp:L11-22
int main(int argc, char *argv[]) {
    modelConfig modelPath;

    // parser command line arg
    auto res = MNN::Cli::initializeMNNConvertArgs(modelPath, argc, argv);
    if (!res) {
        return 0;
    }
    // Convert
    MNN::Cli::convertModel(modelPath);
    return 0;
}

The core API is exposed through the MNN::Cli class defined in tools/converter/include/cli.hpp:

// tools/converter/include/cli.hpp
namespace MNN {
class MNN_PUBLIC Cli {
public:
    static bool initializeMNNConvertArgs(modelConfig &modelPath, int argc, char **argv);
    static bool convertModel(modelConfig& modelPath);
    static int testconvert(const std::string& defaultCacheFile, const std::string& directName,
                           float maxErrorRate, const std::string& configJson);
    static bool mnn2json(const char* modelFile, const char* jsonFile, int flag = 3);
    static bool json2mnn(const char* jsonFile, const char* modelFile);
};
};

Key Parameters

Parameter Type Default Description
-f / --framework string (required) Source model framework: TF, CAFFE, ONNX, TFLITE, MNN, TORCH, JSON
--modelFile string (required) Path to the source model file
--MNNModel string (required) Output path for the converted .mnn model
--prototxt string (Caffe only) Path to .prototxt file (required for Caffe models)
--optimizeLevel int 1 Graph optimization level: 0 (none, MNN source only), 1 (safe optimizations), 2 (aggressive)
--optimizePrefer int 0 Optimization preference: 0 (normal), 1 (smallest), 2 (fastest)
--fp16 flag false Store Conv weights/biases in half-float (float16) to reduce model size
--weightQuantBits int 0 Quantize conv/matmul/LSTM weights to N-bit integers (2-8). 0 means no quantization
--weightQuantAsymmetric bool false Use asymmetric weight quantization (better accuracy, requires newer MNN runtime)
--weightQuantBlock int -1 Block size for block-wise weight quantization. -1 means channel-wise
--hqq flag false Use Half-Quadratic Quantization method (requires weightQuantAsymmetric=true)
--transformerFuse bool false Fuse key transformer operations (attention, etc.)
--keepInputFormat bool true Preserve input tensor dimension format
--batch int (unset) Set batch size for inputs with unspecified batch dimension
--bizCode string "MNNTest" Model flag/identifier embedded in the MNN model
--forTraining bool false Preserve training ops (BN, Dropout) in the converted model
--saveStaticModel bool false Save as a static model with fixed shapes
--saveExternalData bool false Save weights to a separate external binary file (.mnn.weight)
--compressionParamsFile string (unset) Path to compression parameters file for INT8 quantization
--targetVersion float (current) Target MNN version for backward compatibility
--customOpLibs string (unset) Semicolon-separated list of custom operator shared libraries
--convertMatmulToConv int 1 Convert MatMul with constant input to convolution (0 or 1)
--useGeluApproximation int (unset) Use approximate GELU instead of ERF-based GELU
--groupConvNative bool false Keep native group convolution (do not decompose)
--allowCustomOp bool false Allow custom/unknown operators during conversion
--useOriginRNNImpl bool false Use original LSTM/GRU ops instead of While-module implementation
--detectSparseSpeedUp flag false Detect weight sparsity and enable sparse speedup
--splitBlockQuant flag false Split block-quantized convolutions
--alignDenormalizedValue int 1 Flush denormalized float values (|x| < 1.18e-38) to zero
--info flag false Dump MNN model metadata (for MNN-format input only)
--JsonFile string (unset) Export MNN model to JSON file (for MNN-format input)
--testdir string (unset) Test directory for post-conversion verification
--thredhold float 0.01 Maximum error threshold for test verification (the flag is spelled this way in the tool)
--testconfig string (unset) JSON config file for test backend settings
--dumpPass flag false Verbose output for each optimization pass
--OP flag false Print all supported operators for the specified framework
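To make the --alignDenormalizedValue behavior concrete, the following is a minimal sketch (not MNN's implementation) of flushing denormalized weights to zero. The helper names are hypothetical; the threshold is FLT_MIN, the ~1.18e-38 value cited above.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <vector>

// Conceptual sketch of --alignDenormalizedValue: weights whose magnitude
// falls below FLT_MIN (~1.18e-38) are flushed to zero, so runtimes that
// lack denormal support see the same values the converter saw.
inline float flushDenormal(float x) {
    return (x != 0.0f && std::fabs(x) < FLT_MIN) ? 0.0f : x;
}

// Hypothetical helper applying the flush over a weight buffer.
void flushDenormals(std::vector<float>& weights) {
    for (auto& w : weights) {
        w = flushDenormal(w);
    }
}
```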

Inputs

  • Source model file in one of the supported formats:
    • ONNX: .onnx
    • TensorFlow: .pb (frozen graph)
    • Caffe: .caffemodel + .prototxt
    • TFLite: .tflite
    • TorchScript: .pt / .torchscript (requires MNN_BUILD_TORCH)
    • MNN: .mnn (for re-optimization or JSON export)
    • JSON: .json (MNN JSON representation)

Outputs

  • Primary output -- MNN model file (.mnn) in FlatBuffers binary format
  • Optional -- External weight file (.mnn.weight) when --saveExternalData is used
  • Optional -- JSON structure file when --JsonFile is specified

Usage Examples

Convert ONNX Model

./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model.mnn \
    --bizCode MNN

Convert TensorFlow Model

./MNNConvert -f TF \
    --modelFile frozen_graph.pb \
    --MNNModel model.mnn \
    --bizCode MNN

Convert Caffe Model

./MNNConvert -f CAFFE \
    --modelFile model.caffemodel \
    --prototxt deploy.prototxt \
    --MNNModel model.mnn \
    --bizCode MNN

Convert with FP16 Weights and Transformer Fusion

./MNNConvert -f ONNX \
    --modelFile transformer_model.onnx \
    --MNNModel model.mnn \
    --fp16 \
    --transformerFuse=true \
    --bizCode MNN
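For intuition on what --fp16 storage means, here is a simplified sketch of fp32-to-fp16 conversion (1 sign bit, 5 exponent bits, 10 mantissa bits), not MNN's actual code: it handles normal half-range values and truncates the mantissa, halving the bytes per weight.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Sketch of fp32 -> fp16 storage as applied to Conv weights by --fp16.
// Simplified: normals only, mantissa truncated (no rounding).
uint16_t floatToHalf(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    uint16_t sign     = (bits >> 16) & 0x8000u;
    int32_t  exponent = (int32_t)((bits >> 23) & 0xFF) - 127 + 15; // re-bias
    uint16_t mantissa = (bits >> 13) & 0x3FFu;                     // top 10 bits
    if (exponent <= 0)  return sign;             // underflow -> signed zero
    if (exponent >= 31) return sign | 0x7C00u;   // overflow  -> infinity
    return sign | (uint16_t)(exponent << 10) | mantissa;
}

// Inverse mapping for normal half values.
float halfToFloat(uint16_t h) {
    uint32_t sign     = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exponent = (h >> 10) & 0x1F;
    uint32_t mantissa = h & 0x3FFu;
    uint32_t bits     = sign | ((exponent - 15 + 127) << 23) | (mantissa << 13);
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Values exactly representable in half precision (powers of two, small integers) round-trip losslessly; other weights lose low-order mantissa bits, which is the accuracy trade-off behind the flag.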

Convert with Weight Quantization

./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model_quant.mnn \
    --weightQuantBits 8 \
    --weightQuantAsymmetric=true \
    --bizCode MNN
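The asymmetric scheme selected by --weightQuantBits and --weightQuantAsymmetric can be sketched as follows. This is illustrative code under assumed conventions (per-channel min/max affine mapping), not MNN's implementation; the struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Sketch of asymmetric N-bit weight quantization: map [min, max] of a
// channel onto the signed integer range via a scale and zero point, so
// weights are stored as int8 plus two floats per channel.
struct QuantizedChannel {
    std::vector<int8_t> values;
    float scale;
    float zeroPoint;
};

// Assumes the channel is not constant (wmax > wmin).
QuantizedChannel quantizeAsymmetric(const std::vector<float>& w, int bits) {
    const float qmin = -(float)(1 << (bits - 1));        // e.g. -128 for 8 bits
    const float qmax = (float)(1 << (bits - 1)) - 1.0f;  // e.g. +127
    float wmin = *std::min_element(w.begin(), w.end());
    float wmax = *std::max_element(w.begin(), w.end());
    QuantizedChannel q;
    q.scale = (wmax - wmin) / (qmax - qmin);
    q.zeroPoint = wmin - q.scale * qmin;                 // dequant: v*scale + zeroPoint
    for (float v : w) {
        float t = std::round((v - q.zeroPoint) / q.scale);
        q.values.push_back((int8_t)std::max(qmin, std::min(qmax, t)));
    }
    return q;
}

float dequantize(const QuantizedChannel& q, size_t i) {
    return q.values[i] * q.scale + q.zeroPoint;
}
```

Because the zero point absorbs any offset in the weight distribution, asymmetric quantization uses the full integer range even for skewed weights, which is why the table above notes it gives better accuracy than the symmetric variant.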

Convert and Verify Against Test Data

./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model.mnn \
    --testdir ./test_data/ \
    --thredhold 0.01 \
    --bizCode MNN
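Conceptually, the --testdir / --thredhold check compares each output element of the converted model against a reference and fails if the relative error exceeds the threshold. The helper below is a hypothetical sketch of such a criterion; MNN's actual check lives in Cli::testconvert().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of a max-error-rate check: relative error per element, with a
// small floor on the denominator so near-zero references do not blow up.
bool outputsMatch(const std::vector<float>& expected,
                  const std::vector<float>& actual,
                  float maxErrorRate) {
    if (expected.size() != actual.size()) return false;
    for (size_t i = 0; i < expected.size(); ++i) {
        float denom = std::max(std::fabs(expected[i]), 1e-6f);
        if (std::fabs(actual[i] - expected[i]) / denom > maxErrorRate) {
            return false;
        }
    }
    return true;
}
```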

List Supported Operators

./MNNConvert -f ONNX --modelFile model.onnx --OP=true

Internal Conversion Flow

The conversion flow in Cli::convertModel() (from cli.cpp:L673-790) follows these steps:

  1. Parse source model -- Call the appropriate framework-specific parser (e.g., onnx2MNNNet, tensorflow2MNNNet) to populate an MNN::NetT structure
  2. Set batch size -- If --batch is specified, update Input ops with unspecified batch dimensions
  3. Run graph optimization -- Call optimizeNet() with the configured optimize level and passes
  4. Compute unary buffers -- For quantized models, pre-compute lookup tables for unary operations
  5. Reorder inputs -- Ensure input operator ordering matches the original model
  6. Serialize to FlatBuffers -- Call writeFb() to produce the final .mnn file
  7. Post-conversion test (optional) -- If --testdir is specified, run Cli::testconvert() to verify correctness

Supported Frameworks

The framework identifier (passed via -f) maps to internal enum values in cli.cpp:L360-386:

Framework Flag Internal Type Parser Function
ONNX modelConfig::ONNX onnx2MNNNet()
TF modelConfig::TENSORFLOW tensorflow2MNNNet()
CAFFE modelConfig::CAFFE caffe2MNNNet()
TFLITE modelConfig::TFLITE tflite2MNNNet()
TORCH modelConfig::TORCH torch2MNNNet() (requires MNN_BUILD_TORCH)
MNN modelConfig::MNN addBizCode() or mnn2json()
JSON modelConfig::JSON json2mnn()
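The flag-to-parser dispatch summarized above amounts to a string lookup; the following is a hypothetical sketch of that mapping (the real logic is in cli.cpp:L360-386, and the enum here is illustrative rather than MNN's modelConfig type):

```cpp
#include <cassert>
#include <map>
#include <string>

// Sketch of the -f string -> framework dispatch from the table above.
enum class Framework { ONNX, TENSORFLOW, CAFFE, TFLITE, TORCH, MNN, JSON, UNKNOWN };

Framework parseFramework(const std::string& flag) {
    static const std::map<std::string, Framework> table = {
        {"ONNX", Framework::ONNX},   {"TF", Framework::TENSORFLOW},
        {"CAFFE", Framework::CAFFE}, {"TFLITE", Framework::TFLITE},
        {"TORCH", Framework::TORCH}, {"MNN", Framework::MNN},
        {"JSON", Framework::JSON},
    };
    auto it = table.find(flag);
    return it == table.end() ? Framework::UNKNOWN : it->second;
}
```

An unrecognized flag maps to UNKNOWN, mirroring how the CLI rejects unsupported -f values before any parsing begins.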
