
Principle:Alibaba MNN Conversion Verification

From Leeroopedia


Field           Value
Principle Name  Conversion_Verification
Category        Model_Conversion_Pipeline
Description     Verifying numerical correctness of model conversion through reference comparison
Applies To      Post-conversion validation stage

Overview

After converting a model from its original framework format (e.g., ONNX, TensorFlow) to MNN's .mnn format, it is essential to verify that the conversion was numerically correct. Conversion verification ensures that the converted model produces outputs that match the original model's outputs within acceptable tolerance thresholds, catching operator mapping errors, optimization bugs, and precision loss before the model reaches production.

Theory: Reference-Based Comparison

The fundamental approach to conversion verification is reference comparison: running identical inputs through both the original model (using the source framework's runtime) and the converted MNN model (using MNN's inference engine), then comparing their outputs element-by-element.

Test Data Generation

The verification process requires:

  • Input data -- A set of input tensors with values representative of real inference workloads. These can be:
    • Randomly generated values within the expected input range
    • A subset of real validation data
    • Carefully crafted edge-case inputs
  • Reference output data -- The outputs produced by running the input data through the original model using the source framework's own inference runtime (e.g., ONNX Runtime for ONNX models)
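As a minimal sketch of the first option, random inputs can be drawn uniformly from the expected input range using only the Python standard library (the helper name `generate_random_input` is illustrative, not part of MNN's tooling):

```python
import random

def generate_random_input(shape, low=0.0, high=1.0, seed=None):
    """Generate a flattened input tensor with values drawn uniformly
    from the expected input range [low, high]."""
    rng = random.Random(seed)
    count = 1
    for dim in shape:
        count *= dim
    return [rng.uniform(low, high) for _ in range(count)]

# Example: an input shaped like a single 224x224 RGB image.
values = generate_random_input([1, 3, 224, 224], seed=42)
```

A fixed seed keeps the test reproducible, which matters when comparing runs across machines or converter versions.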

Error Metrics

The comparison between reference outputs and MNN outputs uses the following metrics:

  • absMaxV -- The absolute maximum value in the reference output tensor. This serves as a normalization factor to determine relative error significance.
  • DiffMax -- The maximum absolute difference between any corresponding elements in the reference and MNN output tensors.
  • Relative error threshold -- The test passes if DiffMax < absMaxV * threshold, where the threshold (typically 0.01 or 1%) accounts for acceptable floating-point precision differences.
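The pass criterion above can be expressed in a few lines of pure Python (the function name `passes_tolerance` is illustrative, not MNN's API):

```python
def passes_tolerance(reference, candidate, threshold=0.01):
    """Return True if DiffMax stays below absMaxV * threshold.

    reference, candidate: flattened output tensors of equal length.
    """
    abs_max_v = max(abs(v) for v in reference)                        # absMaxV
    diff_max = max(abs(r - c) for r, c in zip(reference, candidate))  # DiffMax
    return diff_max < abs_max_v * threshold
```

For example, with reference `[1.0, -2.0]` the allowed DiffMax at the default 1% threshold is `2.0 * 0.01 = 0.02`, so a candidate of `[1.005, -2.005]` passes while `[1.1, -2.0]` fails.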

Why Tolerance Is Needed

Exact numerical equality between the original and converted models is generally not achievable due to:

  • Floating-point non-associativity -- Different computation orders produce different rounding results. MNN may reorder operations during optimization.
  • Operator implementation differences -- MNN's operator kernels may use different algorithms or approximations than the source framework.
  • Precision reduction -- FP16 storage, weight quantization, and other compression techniques intentionally sacrifice precision for size or speed.
  • Platform-specific behavior -- Denormalized float handling differs between platforms. MNN's converter can optionally align denormalized values to zero (alignDenormalizedValue).

Special Cases

  • Infinity and NaN detection -- Any output containing infinity or NaN values is flagged as a test failure regardless of the threshold, since these typically indicate a conversion or computation error.
  • Zero-size tensors -- Tensors with zero elements are skipped during verification since there is nothing to compare.
  • Format conversion -- If the MNN model outputs in NC4HW4 internal format, the verification process first converts to the model's default format (NCHW or NHWC) before comparison.
  • Type casting -- Non-float output tensors are cast to float before comparison.
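These special-case rules can be sketched as a pre-check that runs before the numeric comparison (a simplified illustration; the function name and return convention are hypothetical):

```python
import math

def precheck_output(tensor):
    """Apply the special-case rules before numeric comparison.

    Returns "skip" for zero-size tensors, "fail" if any value is
    infinite or NaN, otherwise a float-cast copy of the tensor.
    """
    if len(tensor) == 0:
        return "skip"  # zero-size tensor: nothing to compare
    values = [float(v) for v in tensor]  # cast non-float outputs to float
    if any(math.isinf(v) or math.isnan(v) for v in values):
        return "fail"  # inf/NaN fails regardless of the threshold
    return values
```

Only tensors that survive this pre-check proceed to the DiffMax/absMaxV comparison.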

Verification Strategies

Full Model Verification

The simplest approach: run the complete model end-to-end and compare final outputs. This catches any error that propagates to the model's outputs but cannot pinpoint which operator caused the issue.

Layer-by-Layer Verification

For debugging conversion failures, individual intermediate layers can be tested by:

  1. Modifying the source model to expose intermediate outputs
  2. Running the model up to that layer in both frameworks
  3. Comparing the intermediate outputs

This is particularly useful with ONNX models where the graph structure makes it straightforward to select arbitrary nodes as outputs.

Binary Search Debugging

When a full-model test fails, a binary search strategy can efficiently locate the first operator that produces incorrect results. This works by:

  1. Building the model's dominator tree (using the Lengauer-Tarjan algorithm)
  2. Testing the output at the midpoint of the dominator path between a known-good and known-bad node
  3. Recursively narrowing the search range

This approach reduces the number of test runs from O(n) to O(log n), where n is the number of operators in the graph.
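The search itself can be sketched as a standard bisection. The version below assumes a linear chain of operators for simplicity; MNN's real tooling generalizes this to a dominator path computed with the Lengauer-Tarjan algorithm, and the callback name `test_output_at` is hypothetical:

```python
def locate_first_bad_op(op_names, test_output_at):
    """Binary-search for the first operator whose intermediate output
    fails verification.

    op_names: operators in execution order; the last one is known bad.
    test_output_at(name): returns True if the output at that operator
    still matches the reference.
    """
    lo, hi = 0, len(op_names) - 1  # invariant: op_names[hi] is known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if test_output_at(op_names[mid]):
            lo = mid + 1  # still correct here: the bug lies later
        else:
            hi = mid      # already wrong here: bug at or before mid
    return op_names[lo]
```

Each probe requires one conversion-and-compare run, so halving the range at every step is what yields the O(log n) bound.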

Test Data Format

The standard test data directory structure used by MNN verification tools:

test_dir/
  input.json          # JSON metadata: input names, shapes, output names
  input_name.txt      # Flattened input tensor values (one per line)
  output_name.txt     # Flattened reference output values (one per line)

The input.json file has the following structure:

{
    "inputs": [
        {"name": "input_0", "shape": [1, 3, 224, 224]},
        {"name": "input_1", "shape": [1, 10], "value": 1.0}
    ],
    "outputs": ["output_0", "output_1"]
}
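A directory in this layout can be produced with the standard library alone. The helper below is a sketch (its name and argument shapes are hypothetical, not part of MNN's tools), but the files it writes match the format above:

```python
import json
import os

def write_test_dir(path, inputs, outputs):
    """Write the MNN test-data layout: input.json plus one flattened
    .txt file per tensor, one value per line.

    inputs:  name -> (shape, flat values)
    outputs: name -> flat reference values
    """
    os.makedirs(path, exist_ok=True)
    meta = {
        "inputs": [{"name": n, "shape": shape}
                   for n, (shape, _) in inputs.items()],
        "outputs": list(outputs.keys()),
    }
    with open(os.path.join(path, "input.json"), "w") as f:
        json.dump(meta, f, indent=4)
    for name, (_, values) in inputs.items():
        with open(os.path.join(path, name + ".txt"), "w") as f:
            f.write("\n".join(str(v) for v in values))
    for name, values in outputs.items():
        with open(os.path.join(path, name + ".txt"), "w") as f:
            f.write("\n".join(str(v) for v in values))

write_test_dir("test_dir",
               inputs={"input_0": ([1, 2], [0.5, 1.5])},
               outputs={"output_0": [2.0, 3.0]})
```

Keeping values one per line makes the files trivially diffable when a verification run fails.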
