Principle:Tencent Ncnn Foreign Model Conversion
| Knowledge Sources | |
|---|---|
| Domains | Model_Conversion, Deep_Learning_Deployment |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
Translating neural network model definitions and weights from foreign framework-specific serialization formats into a unified target representation suitable for on-device inference.
Description
Foreign model conversion is the process of reading a trained neural network graph and its associated weight tensors from one framework's serialization format and rewriting them into a different framework's format. Each source framework (Caffe, ONNX, Darknet, MXNet, TensorFlow/MLIR) uses its own serialization scheme: Caffe stores topology in protobuf text (.prototxt) and weights in binary protobuf (.caffemodel); ONNX uses a single protobuf binary (.onnx); Darknet uses INI-style configuration (.cfg) paired with raw float weights (.weights); MXNet stores the graph as JSON (.json) alongside a binary parameter file (.params); and TensorFlow graphs can be represented in MLIR intermediate form. The converter must do three things: map each source operator to a semantically equivalent operator in the target representation; reorder or reshape weight tensors to match the target layout conventions; and emit a pair of output files, a human-readable parameter file (.param) describing the graph topology and a binary file (.bin) containing packed weight data.
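The weight reordering step can be illustrated with a small sketch. TensorFlow stores 2-D convolution kernels as HWIO (height, width, input channels, output channels); the example below assumes, for illustration only, that the target expects OIHW. The helper name `hwio_to_oihw` is hypothetical and is not taken from any converter source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy a flat HWIO kernel into a flat OIHW kernel.
// The OIHW target layout is an assumption made for this sketch.
std::vector<float> hwio_to_oihw(const std::vector<float>& src,
                                std::size_t h, std::size_t w,
                                std::size_t in_ch, std::size_t out_ch)
{
    assert(src.size() == h * w * in_ch * out_ch);
    std::vector<float> dst(src.size());
    for (std::size_t y = 0; y < h; ++y)
        for (std::size_t x = 0; x < w; ++x)
            for (std::size_t i = 0; i < in_ch; ++i)
                for (std::size_t o = 0; o < out_ch; ++o)
                {
                    // Row-major offset in the source (H, W, I, O) ordering
                    std::size_t src_idx = ((y * w + x) * in_ch + i) * out_ch + o;
                    // Row-major offset in the target (O, I, H, W) ordering
                    std::size_t dst_idx = ((o * in_ch + i) * h + y) * w + x;
                    dst[dst_idx] = src[src_idx];
                }
    return dst;
}
```

Only the index arithmetic changes between layouts; the values themselves are copied unmodified.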
The primary challenge is that different frameworks define the same logical operation with different parameter conventions (e.g., padding modes, axis ordering, broadcast semantics), so each converter must implement a per-operator translation table with appropriate parameter remapping.
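A per-operator translation table might look roughly like the following sketch. Every operator name, parameter key, and type here (`OpRule`, `translate`, `kernel_size`, and so on) is a hypothetical stand-in chosen for illustration; real converters handle many more cases, including parameters whose values must be recomputed rather than renamed.

```cpp
#include <map>
#include <string>
#include <unordered_map>

using ParamDict = std::map<std::string, int>;

struct Op {
    std::string type;
    ParamDict params;
};

struct OpRule {
    std::string target_type;                          // operator name in the target format
    std::map<std::string, std::string> param_renames; // source key -> target key
};

// Hypothetical table covering two operators only.
const std::unordered_map<std::string, OpRule>& translation_table()
{
    static const std::unordered_map<std::string, OpRule> table = {
        {"Conv", OpRule{"Convolution", {{"kernel_shape", "kernel_size"},
                                        {"strides", "stride"},
                                        {"pads", "pad"}}}},
        {"Relu", OpRule{"ReLU", {}}},
    };
    return table;
}

// Returns false when the source operator has no entry in the table.
bool translate(const Op& src, Op& dst)
{
    auto rule = translation_table().find(src.type);
    if (rule == translation_table().end())
        return false;

    dst.type = rule->second.target_type;
    dst.params.clear();
    for (const auto& kv : src.params)
    {
        auto rename = rule->second.param_renames.find(kv.first);
        if (rename != rule->second.param_renames.end())
            dst.params[rename->second] = kv.second; // unmapped keys are dropped
    }
    return true;
}
```

Returning `false` for unknown operators lets the caller report an unsupported layer instead of silently emitting a broken graph.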
Usage
Apply this principle whenever deploying a model trained in a research or cloud framework to a mobile or embedded inference engine. The conversion step is typically performed offline as a build-time preprocessing step, producing compact model files that the runtime loads directly.
Theoretical Basis
General conversion pipeline:
1. Parse source format (protobuf / INI / MLIR)
2. Build in-memory directed acyclic graph (DAG) of operators
3. For each operator:
   a. Look up target operator name from mapping table
   b. Remap parameters (kernel_size, stride, padding, dilation, groups, ...)
   c. Transpose / reshape weight tensors to target layout order
4. Topologically sort the DAG
5. Emit .param (layer_type, name, bottom_blobs, top_blobs, param_dict)
6. Emit .bin (raw weight data in target byte order)
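Steps 4 and 5 of the pipeline above can be sketched as follows, using Kahn's algorithm for the topological sort. The `Layer` struct and the one-line-per-layer text output are simplifications assumed for this example; they follow the field order listed in step 5 but omit the param_dict and any file header the real .param format may require.

```cpp
#include <cstddef>
#include <queue>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

struct Layer {
    std::string type;
    std::string name;
    std::vector<std::string> bottoms; // input blob names
    std::vector<std::string> tops;    // output blob names
};

// Kahn's algorithm: a layer becomes ready once all producers of its inputs are emitted.
// If the graph contains a cycle, the result is shorter than the input.
std::vector<Layer> topo_sort(const std::vector<Layer>& layers)
{
    std::unordered_map<std::string, std::size_t> producer; // blob name -> producing layer
    for (std::size_t i = 0; i < layers.size(); ++i)
        for (const std::string& top : layers[i].tops)
            producer[top] = i;

    std::vector<std::vector<std::size_t>> consumers(layers.size());
    std::vector<std::size_t> pending(layers.size(), 0);
    for (std::size_t i = 0; i < layers.size(); ++i)
        for (const std::string& bottom : layers[i].bottoms)
        {
            auto it = producer.find(bottom);
            if (it != producer.end())
            {
                consumers[it->second].push_back(i);
                ++pending[i];
            }
        }

    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < layers.size(); ++i)
        if (pending[i] == 0)
            ready.push(i);

    std::vector<Layer> sorted;
    while (!ready.empty())
    {
        std::size_t i = ready.front();
        ready.pop();
        sorted.push_back(layers[i]);
        for (std::size_t c : consumers[i])
            if (--pending[c] == 0)
                ready.push(c);
    }
    return sorted;
}

// Emit one text line per layer: type, name, input count, output count, blob names.
std::string emit_param(const std::vector<Layer>& layers)
{
    std::ostringstream out;
    for (const Layer& l : layers)
    {
        out << l.type << ' ' << l.name << ' ' << l.bottoms.size() << ' ' << l.tops.size();
        for (const std::string& b : l.bottoms) out << ' ' << b;
        for (const std::string& t : l.tops) out << ' ' << t;
        out << '\n';
    }
    return out.str();
}
```

Sorting before emission means a runtime that reads the file sequentially never encounters a layer whose input blobs have not yet been produced.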
Protobuf-based parsing (Caffe, ONNX):
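// Read a protobuf model from a binary file.
// fs is a std::ifstream opened in binary mode; message is the protobuf message to fill.
google::protobuf::io::IstreamInputStream input(&fs);
google::protobuf::io::CodedInputStream codedstr(&input);
// Raise the message size limit so that large weight files can be parsed.
// Older protobuf releases take a second warning-threshold argument here.
codedstr.SetTotalBytesLimit(INT_MAX);
bool success = message->ParseFromCodedStream(&codedstr);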
Darknet INI-style config parsing:
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
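Each section maps to one or more target layers with appropriate parameter translation. For example, the [convolutional] section above would typically expand into a convolution layer, a batch normalization layer (because batch_normalize=1), and a leaky ReLU activation layer.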
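Reading such a configuration into sections of key/value pairs can be sketched as below. This is a minimal parser written for illustration, not the code of any existing converter; `Section`, `trim`, and `parse_cfg` are hypothetical names, and the parser assumes one `key=value` pair per line with `#` or `;` starting a comment line.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct Section {
    std::string name;                           // text between '[' and ']'
    std::map<std::string, std::string> options; // key -> raw string value
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    std::size_t begin = s.find_first_not_of(ws);
    if (begin == std::string::npos)
        return "";
    std::size_t end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

std::vector<Section> parse_cfg(std::istream& in)
{
    std::vector<Section> sections;
    std::string line;
    while (std::getline(in, line))
    {
        line = trim(line);
        if (line.empty() || line[0] == '#' || line[0] == ';')
            continue; // blank line or comment

        if (line.front() == '[' && line.back() == ']')
        {
            Section s;
            s.name = line.substr(1, line.size() - 2);
            sections.push_back(s);
            continue;
        }

        std::size_t eq = line.find('=');
        if (eq == std::string::npos || sections.empty())
            continue; // not a key=value line, or no section opened yet

        sections.back().options[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return sections;
}
```

Values are kept as strings so that the per-section translation step can decide how to interpret each one (integer, float, or enumeration such as `leaky`).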