Implementation:Alibaba MNN Test Script

Metadata

Source Repository	https://github.com/alibaba/MNN
Source File	`test.sh` (761 lines)
Language	Bash
Domains	Testing, CI_CD
Last Updated	2026-02-10

Summary

test.sh is the comprehensive CI/CD test script for the MNN framework. It serves as the central orchestrator for build validation and test execution across multiple platforms and configurations. The script supports four operational modes: local (developer machine testing), linux (full CI pipeline with coverage), android (cross-compiled ARM testing), and static (code quality checks only). Depending on the mode, it runs unit tests, model inference tests, format converter tests (ONNX, TensorFlow, TFLite, PyTorch), quantization tests, LLM inference, OpenCV integration tests, and Python binding tests.

I/O Contract

Input	Output
CLI mode argument: `local`, `linux`, `android`, or `static`	Test results with structured pass/fail counts, coverage reports (linux mode)
Environment: `ANDROID_NDK` path (android mode)	Build artifacts, test binaries, ADB-based test execution
Git history (change detection)	Conditional test execution based on changed file paths

Main Dispatch (L716-761)

The script's entry point dispatches to different test pipelines based on the first CLI argument:

case "$1" in
    local)
        pushd build
        unit_test
        model_test
        onnx_convert_test
        tf_convert_test
        tflite_convert_test
        torch_convert_test
        ptq_test
        pymnn_test
        ;;
    linux)
        doc_check
        static_check
        py_check
        linux_build 1
        coverage_init
        unit_test
        model_test
        onnx_convert_test
        tf_convert_test
        tflite_convert_test
        torch_convert_test
        ptq_test
        pymnn_test
        opencv_test
        llm_test
        coverage_report
        ;;
    android)
        android_static_build
        android_test
        ;;
    static)
        doc_check
        static_check
        py_check
        echo_static_success
        ;;
    *)
        $1
        echo $"Usage: $0 {local|linux|android|func}"
        exit 2
esac

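The wildcard case (*) executes its first argument as a command, so any function defined in the script can be invoked by name to run a single test stage. As excerpted, this branch then prints the usage message and exits with status 2, even when the invoked function succeeds. The usage string also omits the static mode, although the dispatcher handles it.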
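The wildcard behaviour can be illustrated with a reduced dispatcher. This is a minimal sketch, not the script's own code: the stage functions are stubs, and `dispatch` is a hypothetical wrapper that uses `return 2` where the real script uses `exit 2`, so the example keeps running.

```shell
#!/usr/bin/env bash
# Stub stand-ins for two of the script's test stages (assumption).
unit_test()  { echo "running unit_test"; }
model_test() { echo "running model_test"; }

# Hypothetical reduced dispatcher: one named mode plus the wildcard branch.
dispatch() {
    case "$1" in
        local)
            unit_test
            model_test
            ;;
        *)
            # Treat the argument as a function name and call it, then
            # print usage and return 2 (the real script calls `exit 2`).
            "$1"
            echo "Usage: $0 {local|linux|android|func}"
            return 2
            ;;
    esac
}

dispatch local                      # runs both stubbed stages
status=0
dispatch model_test || status=$?    # runs only model_test, then usage
echo "wildcard status: $status"
```

Because the wildcard branch always ends with a non-zero status, a CI job that calls a single stage this way would need to judge success from the stage's own output rather than from the exit code.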

Change Detection

The script uses git show --name-only to detect which files changed in the latest commit, enabling conditional test execution:

SOURCE_CHANGE=$(git show --name-only | grep -E "^source/(internal|backend|core|common|cv|geometry|math|plugin|shape|utils)/.*\.(cpp|cc|c|hpp)$" | \
                grep -Ev "aliyun-log-c-sdk|hiai|tensorrt|Backend|FunctionDispatcher|ThreadPool")
PYMNN_CHANGE=$(git show --name-only | grep -E "^pymnn/.*\.(cpp|cc|c|h|hpp|py)$")
PY_CHANGE=$(git show --name-only | grep -E "^pymnn/pip_package/MNN/.*\.(py)$")
OPENCV_CHANGE=$(git show --name-only | grep -E "^tools/cv/.*\.(cpp|cc|c|h|hpp)$")
OPENCL_CHANGE=true

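Tests for PyMNN, OpenCV, and static analysis are skipped when the corresponding variable is empty, that is, when no matching files changed in the latest commit, which reduces CI execution time. OPENCL_CHANGE is the exception: it is hard-coded to true, so the OpenCL checks it guards always run.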
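To see how the filters behave, the same patterns can be applied to a fixed file list instead of live Git output. This is a sketch under stated assumptions: the paths in `changed_files` are invented for illustration, and `|| true` is added only so that an empty match does not abort a shell running with `set -e`.

```shell
#!/usr/bin/env bash
# Hypothetical file list standing in for `git show --name-only` output.
changed_files='source/core/Tensor.cpp
source/backend/cpu/CPUBackend.cpp
pymnn/src/llm.h
docs/index.md'

# Same patterns as the excerpt, applied to the sample list.
SOURCE_CHANGE=$(printf '%s\n' "$changed_files" \
    | grep -E '^source/(internal|backend|core|common|cv|geometry|math|plugin|shape|utils)/.*\.(cpp|cc|c|hpp)$' \
    | grep -Ev 'aliyun-log-c-sdk|hiai|tensorrt|Backend|FunctionDispatcher|ThreadPool' || true)
PYMNN_CHANGE=$(printf '%s\n' "$changed_files" \
    | grep -E '^pymnn/.*\.(cpp|cc|c|h|hpp|py)$' || true)
OPENCV_CHANGE=$(printf '%s\n' "$changed_files" \
    | grep -E '^tools/cv/.*\.(cpp|cc|c|h|hpp)$' || true)

echo "SOURCE_CHANGE=$SOURCE_CHANGE"
echo "PYMNN_CHANGE=$PYMNN_CHANGE"

# An empty variable means "nothing relevant changed", so the stage is skipped.
if [ -z "$OPENCV_CHANGE" ]; then
    echo "skip opencv_test"
fi
```

In this sample, the second path is dropped from SOURCE_CHANGE because the exclusion filter matches `Backend` in the file name, and the OpenCV stage is skipped because no path under `tools/cv/` is present.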

Key Functions

unit_test() (L274-293)

unit_test() {
    ./run_test.out
    if [ $? -ne 0 ]; then
        echo '### Unit test failed!'
        failed
    fi
    ./run_test.out op 0 0 4
    if [ $? -ne 0 ]; then
        echo '### Multi-threaded unit test failed!'
        failed
    fi
    if [ "$OPENCL_CHANGE" ]; then
        ./run_test.out op 3 1 4
        if [ $? -ne 0 ]; then
            echo '### OpenCL unit test failed!'
            failed
        fi
    fi
}

Runs unit tests in three configurations: single-threaded, multi-threaded (4 threads), and the OpenCL backend. The OpenCL run is guarded by OPENCL_CHANGE, which is hard-coded to true in the change-detection block, so it always executes.
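The run-then-check pattern repeated in unit_test() could be factored into a small wrapper. The following is only a sketch of that idea, not code from test.sh: `run_stage` is a hypothetical helper, and `failed` is stubbed to return 1 instead of exiting so the example can continue.

```shell
#!/usr/bin/env bash
# Stub for the script's failed() helper; the real one exits with code 1.
failed() { echo "failed() called"; return 1; }

# Hypothetical helper: run a command and report a labelled failure
# through failed() if the command returns a non-zero status.
run_stage() {
    local label=$1
    shift
    if ! "$@"; then
        echo "### $label failed!"
        failed
    fi
}

# `true` and `false` stand in for commands such as ./run_test.out.
run_stage "Unit test" true
run_stage "Multi-threaded unit test" false || echo "stage reported failure"
```

With such a wrapper, each configuration in unit_test() would reduce to a single line, for example `run_stage "OpenCL unit test" ./run_test.out op 3 1 4`.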

model_test() (L295-315)

model_test() {
    ../tools/script/modelTest.py ~/AliNNModel 0 0.002
    if [ $? -ne 0 ]; then
        echo '### Model test failed!'
        failed
    fi
    ../tools/script/modelTest.py ~/AliNNModel 0 0.002 0 1
    if [ $? -ne 0 ]; then
        echo '### Static model test failed!'
        failed
    fi
    if [ "$OPENCL_CHANGE" ]; then
        ../tools/script/modelTest.py ~/AliNNModel 3 0.002 1
    fi
}

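Validates model inference against reference outputs with a tolerance of 0.002. Tests both dynamic and static model configurations, plus the OpenCL backend when OPENCL_CHANGE is set. In the excerpt shown, the exit status of the OpenCL run is not checked, so a failure there does not call failed().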

onnx_convert_test() (L317-323)

onnx_convert_test() {
    ../tools/script/convertOnnxTest.py ~/AliNNModel
    if [ $? -ne 0 ]; then
        echo '### ONNX convert test failed!'
        failed
    fi
}

Validates ONNX model conversion to MNN format. Similar functions exist for TensorFlow (tf_convert_test), TFLite (tflite_convert_test), and PyTorch (torch_convert_test).

linux_build() (L227-272)

linux_build() {
    if [ $# -gt 0 ]; then
        COVERAGE=ON
    else
        COVERAGE=OFF
    fi
    mkdir build_non_sse
    pushd build_non_sse
    cmake .. -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DMNN_USE_SSE=OFF && make -j16
    # ...
    mkdir build
    pushd build
    cmake .. \
        -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
        -DCMAKE_BUILD_TYPE=Release \
        -DMNN_BUILD_TEST=ON \
        -DMNN_CUDA=ON \
        -DMNN_OPENCL=ON \
        -DMNN_BUILD_QUANTOOLS=ON \
        -DMNN_BUILD_DEMO=ON \
        -DMNN_BUILD_TRAIN=ON \
        -DMNN_BUILD_CONVERTER=ON \
        -DMNN_BUILD_TORCH=ON \
        -DMNN_BUILD_OPENCV=ON \
        -DMNN_LOW_MEMORY=ON \
        -DMNN_IMGCODECS=ON \
        -DMNN_SUPPORT_TRANSFORMER_FUSE=ON \
        -DMNN_ENABLE_COVERAGE=$COVERAGE
    make -j16
}

Performs two Linux builds: one without SSE (to verify portability) and one full build with all features enabled (CUDA, OpenCL, training, converter, OpenCV, and transformer fuse). Coverage instrumentation is turned on when linux_build is called with any argument, as in `linux_build 1` in the linux mode, and off otherwise. Uses ccache for build acceleration.
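The coverage switch depends only on whether an argument is present, not on its value. A minimal sketch of that check in isolation, using a hypothetical `coverage_flag` function that is not part of test.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the `$# -gt 0` test in linux_build().
coverage_flag() {
    if [ $# -gt 0 ]; then
        echo ON
    else
        echo OFF
    fi
}

# Any argument enables coverage; calling with none disables it.
echo "with argument:    -DMNN_ENABLE_COVERAGE=$(coverage_flag 1)"
echo "without argument: -DMNN_ENABLE_COVERAGE=$(coverage_flag)"
```

Since the value is ignored, `linux_build 0` would also enable coverage under this logic.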

android_test() (L671-714)

android_test() {
    pushd project/android
    # 1. Build Android32
    mkdir build_32
    pushd build_32
    ../build_32.sh -DMNN_BUILD_TRAIN=OFF -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
                   -DMNN_OPENCL=true -DMNN_LOW_MEMORY=ON \
                   -DMNN_SUPPORT_TRANSFORMER_FUSE=ON -DMNN_ARM82=OFF
    # ... push to device and run tests
    android_unit_test 32bit 1
    android_unit_test_low_memory_armv7 32bit
    android_model_test 32
    popd
    # 2. Build Android64
    mkdir build_64
    pushd build_64
    ../build_64.sh -DMNN_BUILD_TRAIN=OFF -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
                   -DMNN_ARM82=true -DMNN_OPENCL=true -DMNN_LOW_MEMORY=true \
                   -DMNN_SUPPORT_TRANSFORMER_FUSE=ON
    # ... push to device and run tests
    android_unit_test 64 0
    android_unit_test_low_memory_armv8 64
    android_model_test 64
    popd
    popd
}

Builds and tests MNN on Android for both 32-bit (ARMv7) and 64-bit (ARMv8) architectures. Tests include standard unit tests, low-memory mode tests, and model inference tests. The 64-bit build enables ARM82 (half-precision) extensions.

pymnn_test() (L357-397)

pymnn_test() {
    if [ -z "$PYMNN_CHANGE" ]; then
        return
    fi
    # 1. Build pymnn
    pushd pymnn/pip_package
    python3 build_deps.py
    pip uninstall --yes MNN MNN-Internal
    python3 setup.py install --version 1.0 --install-lib=/usr/lib/python3/dist-packages
    popd
    # 2. Unit test
    pushd pymnn/test
    python3 unit_test.py
    # 3. Model test
    python3 model_test.py ~/AliNNModel
    # 4. Train test
    ./train_test.sh
    # 5. Uninstall
    pip uninstall --yes MNN-Internal
    popd
}

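Conditionally tests the Python MNN bindings (PyMNN): the function returns immediately when PYMNN_CHANGE is empty, that is, when no C/C++ source, header, or Python file under pymnn/ changed in the latest commit. Otherwise it covers building, unit tests, model tests, and training tests.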

llm_test() (L421-438)

llm_test() {
    cmake -DMNN_LOW_MEMORY=ON -DMNN_BUILD_LLM=ON -DMNN_SUPPORT_TRANSFORMER_FUSE=ON ..
    make -j8
    ./llm_demo ~/AliNNModel/qwen1.5-0.5b-int4/config.json \
               ~/AliNNModel/qwen1.5-0.5b-int4/prompt.txt
}

Builds MNN with LLM support enabled (low memory mode, transformer fuse) and runs inference on a Qwen 1.5 0.5B INT4 quantized model.

coverage_report() (L446-471)

coverage_report() {
    popd
    lcov -c -d ./ -o cover.info
    lcov -a init.info -a cover.info -o total.info
    lcov --remove total.info \
        '*/usr/include/*' '*/usr/lib/*' '*/usr/lib64/*' '*/usr/local/*' \
        '*/3rd_party/*' '*/build/*' '*/schema/*' '*/test/*' '/tmp/*' \
        '*/demo/*' '*/tools/cpp/*' '*/tools/train/*' '*/source/backend/cuda/*' \
        -o final.info
    commitId=$(git log | head -n1 | awk '{print $2}')
    genhtml -o cover_report --legend \
        --title "MNN Coverage Report [commit SHA1:${commitId}]" \
        --prefix=$(pwd) final.info
}

Generates an HTML coverage report using lcov and genhtml. Excludes system headers, third-party code, build artifacts, test code, and CUDA backend from the coverage metrics. The report is titled with the current commit SHA1.

Error Handling

Test functions report failures through a common failed() helper that outputs structured test result metadata and exits with code 1:

failed() {
    printf "TEST_NAME_EXCEPTION: Exception\nTEST_CASE_AMOUNT_EXCEPTION: {\"blocked\":0,\"failed\":1,\"passed\":0,\"skipped\":0}\n"
    exit 1
}

This structured output format allows CI systems to parse test results programmatically.
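As an illustration of how a consumer might read this format, the sketch below extracts the failed count from a captured copy of the output. It assumes only the two-line format printed by failed(); the `output` and `failed_count` variables are illustrative, and how the actual CI system parses these lines is not described in the source.

```shell
#!/usr/bin/env bash
# Sample output in the format printed by failed().
output='TEST_NAME_EXCEPTION: Exception
TEST_CASE_AMOUNT_EXCEPTION: {"blocked":0,"failed":1,"passed":0,"skipped":0}'

# Pull the number that follows "failed": on the TEST_CASE_AMOUNT line.
failed_count=$(printf '%s\n' "$output" \
    | sed -n 's/^TEST_CASE_AMOUNT_[A-Z_]*: .*"failed":\([0-9][0-9]*\).*/\1/p')

echo "failed cases: $failed_count"
```

A dedicated JSON parser would be more robust for the payload itself; sed is used here only to keep the example free of external dependencies.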

Test Matrix

Mode	Static Check	Build	Unit Tests	Model Tests	Converter Tests	PyMNN	OpenCV	LLM	Coverage
`local`	No	Pre-built	Yes	Yes	ONNX, TF, TFLite, Torch	Conditional	No	No	No
`linux`	Yes	Full + non-SSE	Yes	Yes	ONNX, TF, TFLite, Torch	Conditional	Conditional	Yes	Yes
`android`	No	ARM32 + ARM64	Yes (ADB)	Yes (ADB)	No	No	No	No	No
`static`	Yes	No	No	No	No	No	No	No	No

Related Pages

Alibaba_MNN_Continuous_Integration_Testing -- The principle of automated multi-platform testing for the MNN framework
