
Implementation:Alibaba MNN MNNConvert CLI

From Leeroopedia


Field Value
Implementation Name MNNConvert_CLI
Type API Doc
Category Model_Conversion_Pipeline
Source tools/converter/source/MNNConverter.cpp:L11-22 (entry point), tools/converter/source/common/cli.cpp:L142-565 (CLI parsing)
External Dependencies protobuf, flatbuffers (bundled), optionally libtorch

Summary

MNNConvert is the primary command-line tool for converting deep learning models from various framework formats into MNN's optimized .mnn format. It supports ONNX, TensorFlow, Caffe, TFLite, TorchScript (optional), and MNN-to-MNN re-optimization.

API

MNNConvert -f {ONNX,TF,CAFFE,TFLITE,TORCH,MNN,JSON} --modelFile <source> --MNNModel <dest.mnn> [options]

Entry Point

The converter entry point is in tools/converter/source/MNNConverter.cpp:

// tools/converter/source/MNNConverter.cpp:L11-22
int main(int argc, char *argv[]) {
    modelConfig modelPath;

    // parser command line arg
    auto res = MNN::Cli::initializeMNNConvertArgs(modelPath, argc, argv);
    if (!res) {
        return 0;
    }
    // Convert
    MNN::Cli::convertModel(modelPath);
    return 0;
}

The core API is exposed through the MNN::Cli class defined in tools/converter/include/cli.hpp:

// tools/converter/include/cli.hpp
namespace MNN {
class MNN_PUBLIC Cli {
public:
    static bool initializeMNNConvertArgs(modelConfig &modelPath, int argc, char **argv);
    static bool convertModel(modelConfig& modelPath);
    static int testconvert(const std::string& defaultCacheFile, const std::string& directName,
                           float maxErrorRate, const std::string& configJson);
    static bool mnn2json(const char* modelFile, const char* jsonFile, int flag = 3);
    static bool json2mnn(const char* jsonFile, const char* modelFile);
};
};

Key Parameters

Parameter Type Default Description
-f / --framework string (required) Source model framework: TF, CAFFE, ONNX, TFLITE, MNN, TORCH, JSON
--modelFile string (required) Path to the source model file
--MNNModel string (required) Output path for the converted .mnn model
--prototxt string (Caffe only) Path to .prototxt file (required for Caffe models)
--optimizeLevel int 1 Graph optimization level: 0 (none, MNN source only), 1 (safe optimizations), 2 (aggressive)
--optimizePrefer int 0 Optimization preference: 0 (normal), 1 (smallest), 2 (fastest)
--fp16 flag false Store Conv weights/biases in half-float (float16) to reduce model size
--weightQuantBits int 0 Quantize conv/matmul/LSTM weights to N-bit integers (2-8). 0 means no quantization
--weightQuantAsymmetric bool false Use asymmetric weight quantization (better accuracy, requires newer MNN runtime)
--weightQuantBlock int -1 Block size for block-wise weight quantization. -1 means channel-wise
--hqq flag false Use Half-Quadratic Quantization method (requires weightQuantAsymmetric=true)
--transformerFuse bool false Fuse key transformer operations (attention, etc.)
--keepInputFormat bool true Preserve input tensor dimension format
--batch int (unset) Set batch size for inputs with unspecified batch dimension
--bizCode string "MNNTest" Model flag/identifier embedded in the MNN model
--forTraining bool false Preserve training ops (BN, Dropout) in the converted model
--saveStaticModel bool false Save as a static model with fixed shapes
--saveExternalData bool false Save weights to a separate external binary file (.mnn.weight)
--compressionParamsFile string (unset) Path to compression parameters file for INT8 quantization
--targetVersion float (current) Target MNN version for backward compatibility
--customOpLibs string (unset) Semicolon-separated list of custom operator shared libraries
--convertMatmulToConv int 1 Convert MatMul with constant input to convolution (0 or 1)
--useGeluApproximation int (unset) Use approximate GELU instead of ERF-based GELU
--groupConvNative bool false Keep native group convolution (do not decompose)
--allowCustomOp bool false Allow custom/unknown operators during conversion
--useOriginRNNImpl bool false Use original LSTM/GRU ops instead of While-module implementation
--detectSparseSpeedUp flag false Detect weight sparsity and enable sparse speedup
--splitBlockQuant flag false Split block-quantized convolutions
--alignDenormalizedValue int 1 Flush denormalized float values (|x| < 1.18e-38) to zero
--info flag false Dump MNN model metadata (for MNN-format input only)
--JsonFile string (unset) Export MNN model to JSON file (for MNN-format input)
--testdir string (unset) Test directory for post-conversion verification
--thredhold float 0.01 Maximum error threshold for test verification (the flag is spelled this way in the tool)
--testconfig string (unset) JSON config file for test backend settings
--dumpPass flag false Verbose output for each optimization pass
--OP flag false Print all supported operators for the specified framework
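To make the --alignDenormalizedValue behavior concrete, the following is a minimal sketch (not MNN's implementation) of flushing denormalized weights to zero. The helper names are hypothetical; the threshold is FLT_MIN, the ~1.18e-38 value cited above.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <vector>

// Conceptual sketch of --alignDenormalizedValue: weights whose magnitude
// falls below FLT_MIN (~1.18e-38) are flushed to zero, so runtimes that
// lack denormal support see the same values the converter saw.
inline float flushDenormal(float x) {
    return (x != 0.0f && std::fabs(x) < FLT_MIN) ? 0.0f : x;
}

// Hypothetical helper applying the flush over a weight buffer.
void flushDenormals(std::vector<float>& weights) {
    for (auto& w : weights) {
        w = flushDenormal(w);
    }
}
```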

Inputs

  • Source model file in one of the supported formats:
    • ONNX: .onnx
    • TensorFlow: .pb (frozen graph)
    • Caffe: .caffemodel + .prototxt
    • TFLite: .tflite
    • TorchScript: .pt / .torchscript (requires MNN_BUILD_TORCH)
    • MNN: .mnn (for re-optimization or JSON export)
    • JSON: .json (MNN JSON representation)

Outputs

  • Primary output -- MNN model file (.mnn) in FlatBuffers binary format
  • Optional -- External weight file (.mnn.weight) when --saveExternalData is used
  • Optional -- JSON structure file when --JsonFile is specified

Usage Examples

Convert ONNX Model

./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model.mnn \
    --bizCode MNN

Convert TensorFlow Model

./MNNConvert -f TF \
    --modelFile frozen_graph.pb \
    --MNNModel model.mnn \
    --bizCode MNN

Convert Caffe Model

./MNNConvert -f CAFFE \
    --modelFile model.caffemodel \
    --prototxt deploy.prototxt \
    --MNNModel model.mnn \
    --bizCode MNN

Convert with FP16 Weights and Transformer Fusion

./MNNConvert -f ONNX \
    --modelFile transformer_model.onnx \
    --MNNModel model.mnn \
    --fp16 \
    --transformerFuse=true \
    --bizCode MNN
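For intuition on what --fp16 storage means, here is a simplified sketch of fp32-to-fp16 conversion (1 sign bit, 5 exponent bits, 10 mantissa bits), not MNN's actual code: it handles normal half-range values and truncates the mantissa, halving the bytes per weight.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Sketch of fp32 -> fp16 storage as applied to Conv weights by --fp16.
// Simplified: normals only, mantissa truncated (no rounding).
uint16_t floatToHalf(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    uint16_t sign     = (bits >> 16) & 0x8000u;
    int32_t  exponent = (int32_t)((bits >> 23) & 0xFF) - 127 + 15; // re-bias
    uint16_t mantissa = (bits >> 13) & 0x3FFu;                     // top 10 bits
    if (exponent <= 0)  return sign;             // underflow -> signed zero
    if (exponent >= 31) return sign | 0x7C00u;   // overflow  -> infinity
    return sign | (uint16_t)(exponent << 10) | mantissa;
}

// Inverse mapping for normal half values.
float halfToFloat(uint16_t h) {
    uint32_t sign     = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exponent = (h >> 10) & 0x1F;
    uint32_t mantissa = h & 0x3FFu;
    uint32_t bits     = sign | ((exponent - 15 + 127) << 23) | (mantissa << 13);
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

Values exactly representable in half precision (powers of two, small integers) round-trip losslessly; other weights lose low-order mantissa bits, which is the accuracy trade-off behind the flag.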

Convert with Weight Quantization

./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model_quant.mnn \
    --weightQuantBits 8 \
    --weightQuantAsymmetric=true \
    --bizCode MNN
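The asymmetric scheme selected by --weightQuantBits and --weightQuantAsymmetric can be sketched as follows. This is illustrative code under assumed conventions (per-channel min/max affine mapping), not MNN's implementation; the struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Sketch of asymmetric N-bit weight quantization: map [min, max] of a
// channel onto the signed integer range via a scale and zero point, so
// weights are stored as int8 plus two floats per channel.
struct QuantizedChannel {
    std::vector<int8_t> values;
    float scale;
    float zeroPoint;
};

// Assumes the channel is not constant (wmax > wmin).
QuantizedChannel quantizeAsymmetric(const std::vector<float>& w, int bits) {
    const float qmin = -(float)(1 << (bits - 1));        // e.g. -128 for 8 bits
    const float qmax = (float)(1 << (bits - 1)) - 1.0f;  // e.g. +127
    float wmin = *std::min_element(w.begin(), w.end());
    float wmax = *std::max_element(w.begin(), w.end());
    QuantizedChannel q;
    q.scale = (wmax - wmin) / (qmax - qmin);
    q.zeroPoint = wmin - q.scale * qmin;                 // dequant: v*scale + zeroPoint
    for (float v : w) {
        float t = std::round((v - q.zeroPoint) / q.scale);
        q.values.push_back((int8_t)std::max(qmin, std::min(qmax, t)));
    }
    return q;
}

float dequantize(const QuantizedChannel& q, size_t i) {
    return q.values[i] * q.scale + q.zeroPoint;
}
```

Because the zero point absorbs any offset in the weight distribution, asymmetric quantization uses the full integer range even for skewed weights, which is why the table above notes it gives better accuracy than the symmetric variant.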

Convert and Verify Against Test Data

./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model.mnn \
    --testdir ./test_data/ \
    --thredhold 0.01 \
    --bizCode MNN
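Conceptually, the --testdir / --thredhold check compares each output element of the converted model against a reference and fails if the relative error exceeds the threshold. The helper below is a hypothetical sketch of such a criterion; MNN's actual check lives in Cli::testconvert().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sketch of a max-error-rate check: relative error per element, with a
// small floor on the denominator so near-zero references do not blow up.
bool outputsMatch(const std::vector<float>& expected,
                  const std::vector<float>& actual,
                  float maxErrorRate) {
    if (expected.size() != actual.size()) return false;
    for (size_t i = 0; i < expected.size(); ++i) {
        float denom = std::max(std::fabs(expected[i]), 1e-6f);
        if (std::fabs(actual[i] - expected[i]) / denom > maxErrorRate) {
            return false;
        }
    }
    return true;
}
```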

List Supported Operators

./MNNConvert -f ONNX --modelFile model.onnx --OP=true

Internal Conversion Flow

The conversion flow in Cli::convertModel() (from cli.cpp:L673-790) follows these steps:

  1. Parse source model -- Call the appropriate framework-specific parser (e.g., onnx2MNNNet, tensorflow2MNNNet) to populate an MNN::NetT structure
  2. Set batch size -- If --batch is specified, update Input ops with unspecified batch dimensions
  3. Run graph optimization -- Call optimizeNet() with the configured optimize level and passes
  4. Compute unary buffers -- For quantized models, pre-compute lookup tables for unary operations
  5. Reorder inputs -- Ensure input operator ordering matches the original model
  6. Serialize to FlatBuffers -- Call writeFb() to produce the final .mnn file
  7. Post-conversion test (optional) -- If --testdir is specified, run Cli::testconvert() to verify correctness

Supported Frameworks

The framework identifier (passed via -f) maps to internal enum values in cli.cpp:L360-386:

Framework Flag Internal Type Parser Function
ONNX modelConfig::ONNX onnx2MNNNet()
TF modelConfig::TENSORFLOW tensorflow2MNNNet()
CAFFE modelConfig::CAFFE caffe2MNNNet()
TFLITE modelConfig::TFLITE tflite2MNNNet()
TORCH modelConfig::TORCH torch2MNNNet() (requires MNN_BUILD_TORCH)
MNN modelConfig::MNN addBizCode() or mnn2json()
JSON modelConfig::JSON json2mnn()
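The flag-to-parser dispatch summarized above amounts to a string lookup; the following is a hypothetical sketch of that mapping (the real logic is in cli.cpp:L360-386, and the enum here is illustrative rather than MNN's modelConfig type):

```cpp
#include <cassert>
#include <map>
#include <string>

// Sketch of the -f string -> framework dispatch from the table above.
enum class Framework { ONNX, TENSORFLOW, CAFFE, TFLITE, TORCH, MNN, JSON, UNKNOWN };

Framework parseFramework(const std::string& flag) {
    static const std::map<std::string, Framework> table = {
        {"ONNX", Framework::ONNX},   {"TF", Framework::TENSORFLOW},
        {"CAFFE", Framework::CAFFE}, {"TFLITE", Framework::TFLITE},
        {"TORCH", Framework::TORCH}, {"MNN", Framework::MNN},
        {"JSON", Framework::JSON},
    };
    auto it = table.find(flag);
    return it == table.end() ? Framework::UNKNOWN : it->second;
}
```

An unrecognized flag maps to UNKNOWN, mirroring how the CLI rejects unsupported -f values before any parsing begins.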
