Implementation: Alibaba MNN MNNConvert CLI
| Field | Value |
|---|---|
| Implementation Name | MNNConvert_CLI |
| Type | API Doc |
| Category | Model_Conversion_Pipeline |
| Source | tools/converter/source/MNNConverter.cpp:L11-22 (entry point), tools/converter/source/common/cli.cpp:L142-565 (CLI parsing) |
| External Dependencies | protobuf, flatbuffers (bundled), optionally libtorch |
Summary
MNNConvert is the primary command-line tool for converting deep learning models from various framework formats into MNN's optimized .mnn format. It supports ONNX, TensorFlow, Caffe, TFLite, TorchScript (optional), and MNN-to-MNN re-optimization.
API
```shell
MNNConvert -f {ONNX,TF,CAFFE,TFLITE,TORCH,MNN,JSON} --modelFile <source> --MNNModel <dest.mnn> [options]
```
Entry Point
The converter entry point is in tools/converter/source/MNNConverter.cpp:
```cpp
// tools/converter/source/MNNConverter.cpp:L11-22
int main(int argc, char *argv[]) {
    modelConfig modelPath;
    // parser command line arg
    auto res = MNN::Cli::initializeMNNConvertArgs(modelPath, argc, argv);
    if (!res) {
        return 0;
    }
    // Convert
    MNN::Cli::convertModel(modelPath);
    return 0;
}
```
The core API is exposed through the MNN::Cli class defined in tools/converter/include/cli.hpp:
```cpp
// tools/converter/include/cli.hpp
namespace MNN {
class MNN_PUBLIC Cli {
public:
    static bool initializeMNNConvertArgs(modelConfig &modelPath, int argc, char **argv);
    static bool convertModel(modelConfig& modelPath);
    static int testconvert(const std::string& defaultCacheFile, const std::string& directName,
                           float maxErrorRate, const std::string& configJson);
    static bool mnn2json(const char* modelFile, const char* jsonFile, int flag = 3);
    static bool json2mnn(const char* jsonFile, const char* modelFile);
};
};
```
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| -f / --framework | string | (required) | Source model framework: TF, CAFFE, ONNX, TFLITE, MNN, TORCH, JSON |
| --modelFile | string | (required) | Path to the source model file |
| --MNNModel | string | (required) | Output path for the converted .mnn model |
| --prototxt | string | (Caffe only) | Path to .prototxt file (required for Caffe models) |
| --optimizeLevel | int | 1 | Graph optimization level: 0 (none, MNN source only), 1 (safe optimizations), 2 (aggressive) |
| --optimizePrefer | int | 0 | Optimization preference: 0 (normal), 1 (smallest), 2 (fastest) |
| --fp16 | flag | false | Store Conv weights/biases in half-float (float16) to reduce model size |
| --weightQuantBits | int | 0 | Quantize conv/matmul/LSTM weights to N-bit integers (2-8); 0 means no quantization |
| --weightQuantAsymmetric | bool | false | Use asymmetric weight quantization (better accuracy, requires a newer MNN runtime) |
| --weightQuantBlock | int | -1 | Block size for block-wise weight quantization; -1 means channel-wise |
| --hqq | flag | false | Use Half-Quadratic Quantization method (requires weightQuantAsymmetric=true) |
| --transformerFuse | bool | false | Fuse key transformer operations (attention, etc.) |
| --keepInputFormat | bool | true | Preserve input tensor dimension format |
| --batch | int | (unset) | Set batch size for inputs with an unspecified batch dimension |
| --bizCode | string | "MNNTest" | Model flag/identifier embedded in the MNN model |
| --forTraining | bool | false | Preserve training ops (BN, Dropout) in the converted model |
| --saveStaticModel | bool | false | Save as a static model with fixed shapes |
| --saveExternalData | bool | false | Save weights to a separate external binary file (.mnn.weight) |
| --compressionParamsFile | string | (unset) | Path to compression parameters file for INT8 quantization |
| --targetVersion | float | (current) | Target MNN version for backward compatibility |
| --customOpLibs | string | (unset) | Semicolon-separated list of custom operator shared libraries |
| --convertMatmulToConv | int | 1 | Convert MatMul with constant input to convolution (0 or 1) |
| --useGeluApproximation | int | (unset) | Use approximate GELU instead of ERF-based GELU |
| --groupConvNative | bool | false | Keep native group convolution (do not decompose) |
| --allowCustomOp | bool | false | Allow custom/unknown operators during conversion |
| --useOriginRNNImpl | bool | false | Use original LSTM/GRU ops instead of the While-module implementation |
| --detectSparseSpeedUp | flag | false | Detect weight sparsity and enable sparse speedup |
| --splitBlockQuant | flag | false | Split block-quantized convolutions |
| --alignDenormalizedValue | int | 1 | Flush denormalized values (\|x\| < 1.18e-38) to zero (0 or 1) |
| --info | flag | false | Dump MNN model metadata (MNN-format input only) |
| --JsonFile | string | (unset) | Export MNN model to a JSON file (MNN-format input only) |
| --testdir | string | (unset) | Test directory for post-conversion verification |
| --thredhold | float | 0.01 | Maximum error threshold for test verification |
| --testconfig | string | (unset) | JSON config file for test backend settings |
| --dumpPass | flag | false | Verbose output for each optimization pass |
| --OP | flag | false | Print all supported operators for the specified framework |
Inputs
- Source model file in one of the supported formats:
  - ONNX: .onnx
  - TensorFlow: .pb (frozen graph)
  - Caffe: .caffemodel + .prototxt
  - TFLite: .tflite
  - TorchScript: .pt / .torchscript (requires MNN_BUILD_TORCH)
  - MNN: .mnn (for re-optimization or JSON export)
  - JSON: .json (MNN JSON representation)
Outputs
- Primary output -- MNN model file (.mnn) in FlatBuffers binary format
- Optional -- external weight file (.mnn.weight) when --saveExternalData is used
- Optional -- JSON structure file when --JsonFile is specified
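A quick sanity check after a conversion run is to verify that exactly these files appeared. A small illustrative helper (not an MNN API; the `.mnn.weight` naming follows the list above) that predicts the expected output set:

```python
def expected_outputs(mnn_model, save_external_data=False, json_file=None):
    """Return the files a conversion run should produce.

    Illustrative only: mirrors the Outputs list above, where
    --saveExternalData adds <model>.mnn.weight and --JsonFile
    adds the requested JSON structure file.
    """
    outputs = [mnn_model]
    if save_external_data:
        outputs.append(mnn_model + ".weight")
    if json_file is not None:
        outputs.append(json_file)
    return outputs

files = expected_outputs("model.mnn", save_external_data=True)
```

Checking each path with `os.path.exists` after the tool exits catches silent failures early.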
Usage Examples
Convert ONNX Model
```shell
./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model.mnn \
    --bizCode MNN
```
Convert TensorFlow Model
```shell
./MNNConvert -f TF \
    --modelFile frozen_graph.pb \
    --MNNModel model.mnn \
    --bizCode MNN
```
Convert Caffe Model
```shell
./MNNConvert -f CAFFE \
    --modelFile model.caffemodel \
    --prototxt deploy.prototxt \
    --MNNModel model.mnn \
    --bizCode MNN
```
Convert with FP16 Weights and Transformer Fusion
```shell
./MNNConvert -f ONNX \
    --modelFile transformer_model.onnx \
    --MNNModel model.mnn \
    --fp16 \
    --transformerFuse=true \
    --bizCode MNN
```
Convert with Weight Quantization
```shell
./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model_quant.mnn \
    --weightQuantBits 8 \
    --weightQuantAsymmetric=true \
    --bizCode MNN
```
Convert and Verify Against Test Data
```shell
./MNNConvert -f ONNX \
    --modelFile model.onnx \
    --MNNModel model.mnn \
    --testdir ./test_data/ \
    --thredhold 0.01 \
    --bizCode MNN
```
List Supported Operators
```shell
./MNNConvert -f ONNX --modelFile model.onnx --OP=true
```
Internal Conversion Flow
The conversion flow in Cli::convertModel() (from cli.cpp:L673-790) follows these steps:
1. Parse source model -- call the appropriate framework-specific parser (e.g., onnx2MNNNet, tensorflow2MNNNet) to populate an MNN::NetT structure
2. Set batch size -- if --batch is specified, update Input ops with unspecified batch dimensions
3. Run graph optimization -- call optimizeNet() with the configured optimize level and passes
4. Compute unary buffers -- for quantized models, pre-compute lookup tables for unary operations
5. Reorder inputs -- ensure input operator ordering matches the original model
6. Serialize to FlatBuffers -- call writeFb() to produce the final .mnn file
7. Post-conversion test (optional) -- if --testdir is specified, run Cli::testconvert() to verify correctness
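The step order above can be traced with a toy driver. This is a sketch of the sequence as described, not MNN code: the stage labels are invented, and only --batch and --testdir toggle the optional stages:

```python
def pipeline_stages(config):
    """List the convertModel() stages, in order, for a given config dict.

    Sketch of the flow described above; stage names are illustrative
    labels, not MNN symbols.
    """
    stages = ["parse_source_model"]
    if config.get("batch"):          # --batch: only runs when specified
        stages.append("set_batch_size")
    stages += ["optimize_net", "compute_unary_buffers",
               "reorder_inputs", "write_flatbuffers"]
    if config.get("testdir"):        # --testdir: optional verification
        stages.append("test_convert")
    return stages

trace = pipeline_stages({"batch": 4, "testdir": "./test_data/"})
```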
Supported Frameworks
The framework identifier (passed via -f) maps to internal enum values in cli.cpp:L360-386:
| Framework Flag | Internal Type | Parser Function |
|---|---|---|
| ONNX | modelConfig::ONNX | onnx2MNNNet() |
| TF | modelConfig::TENSORFLOW | tensorflow2MNNNet() |
| CAFFE | modelConfig::CAFFE | caffe2MNNNet() |
| TFLITE | modelConfig::TFLITE | tflite2MNNNet() |
| TORCH | modelConfig::TORCH | torch2MNNNet() (requires MNN_BUILD_TORCH) |
| MNN | modelConfig::MNN | addBizCode() or mnn2json() |
| JSON | modelConfig::JSON | json2mnn() |
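The dispatch can be modeled as a simple lookup. The mapping below copies the table (parser names only, as strings; the real dispatch is the C++ code in cli.cpp, and the lookup helper is purely illustrative):

```python
# Framework flag -> parser entry point, copied from the table above.
# The values are just names for illustration; the real functions are C++.
PARSERS = {
    "ONNX":   "onnx2MNNNet",
    "TF":     "tensorflow2MNNNet",
    "CAFFE":  "caffe2MNNNet",
    "TFLITE": "tflite2MNNNet",
    "TORCH":  "torch2MNNNet",   # requires MNN_BUILD_TORCH
    "MNN":    "addBizCode",     # or mnn2json for JSON export
    "JSON":   "json2mnn",
}

def parser_for(framework):
    """Resolve a -f flag to its parser name; unknown flags raise ValueError."""
    try:
        return PARSERS[framework.upper()]
    except KeyError:
        raise ValueError("unsupported framework: " + framework)
```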