Implementation:Alibaba MNN PyMNN CV Preprocessing

| Field | Value |
|---|---|
| implementation_name | PyMNN_CV_Preprocessing |
| schema_version | 0.1.0 |
| workflow | Python_Model_Inference |
| implementation_type | API_Doc |
| domain | Deep_Learning_Inference |
| scope | Image loading, resizing, normalization, and data format conversion for model input |
| source_file | docs/pymnn/cv.md (Python API reference) |
| related_patterns | OpenCV_Compatibility, Fused_Preprocessing, Tensor_Format_Conversion |
| last_updated | 2026-02-10 14:00 GMT |

Summary

This page documents the MNN Python APIs that turn raw image data into model-ready tensors: cv.imread loads an image from disk, cv.imdecode decodes one from an in-memory buffer, cv.resize resizes with optional fused color conversion and normalization, and expr.convert changes the data format. The cv functions follow OpenCV's interface conventions.

API Signatures

cv.imread

cv.imread(filename, flag=cv.IMREAD_COLOR) -> Var

Reads an image file from disk and returns it as a Var. With the default flag, the result has shape [H, W, 3] and dtype uint8, in BGR channel order.

cv.imdecode

cv.imdecode(buf, flag=cv.IMREAD_COLOR) -> Var

Decodes an image from an in-memory buffer (ndarray, list, tuple, or bytes).

cv.resize

cv.resize(src, dsize, fx=0, fy=0, interpolation=cv.INTER_LINEAR, code=-1, mean=[], norm=[]) -> Var

Resizes an image. Optionally performs fused color conversion, mean subtraction, and normalization.
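The fused normalization amounts to per-channel arithmetic of the form (pixel - mean) * norm. The following is a minimal pure-Python sketch of that arithmetic for a single pixel; it does not call MNN, and the helper name normalize_pixel is hypothetical.

```python
def normalize_pixel(pixel, mean, norm):
    """Apply (value - mean) * norm to each channel and return floats."""
    if not (len(pixel) == len(mean) == len(norm)):
        raise ValueError("pixel, mean, and norm need one entry per channel")
    return [(float(p) - m) * n for p, m, n in zip(pixel, mean, norm)]


# One BGR pixel with the constants used elsewhere on this page.
bgr_pixel = [104, 117, 124]
mean = [103.94, 116.78, 123.68]
norm = [0.017, 0.017, 0.017]

normalized = normalize_pixel(bgr_pixel, mean, norm)
print(normalized)
```

Because the result is fractional, the output must be a floating-point type, which is consistent with cv.resize casting to float32 whenever mean is supplied.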

expr.convert

expr.convert(x, format) -> Var

Converts the data_format of a Var between NCHW, NHWC, and NC4HW4.
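To illustrate how the three formats differ, the sketch below computes where one logical element (n, c, h, w) lands in a flat buffer under each layout. It assumes the commonly described NC4HW4 packing, in which channels are grouped in blocks of four and the channel count is padded up to a multiple of four; the actual memory layout inside MNN may vary by backend. All function names are hypothetical and nothing here calls MNN.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in an NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def nhwc_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) in an NHWC buffer."""
    return ((n * H + h) * W + w) * C + c


def nc4hw4_offset(n, c, h, w, C, H, W):
    """Flat offset assuming channels are packed in blocks of four."""
    blocks = (C + 3) // 4  # channel blocks after padding C up to a multiple of 4
    return (((n * blocks + c // 4) * H + h) * W + w) * 4 + c % 4


def nc4hw4_size(N, C, H, W):
    """Element count of the padded NC4HW4 buffer under the same assumption."""
    return N * ((C + 3) // 4) * H * W * 4


C, H, W = 3, 2, 2
element = (0, 1, 1, 1)  # n, c, h, w
offsets = {
    "NCHW": nchw_offset(*element, C, H, W),
    "NHWC": nhwc_offset(*element, C, H, W),
    "NC4HW4": nc4hw4_offset(*element, C, H, W),
}
print(offsets, nc4hw4_size(1, C, H, W))
```

The same logical element sits at a different offset in each layout, and the packed buffer is larger than the unpadded one when C is not a multiple of four.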

Parameters

cv.imread Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| filename | str | (required) | Path to the image file (JPEG, PNG, BMP, etc.) |
| flag | int | cv.IMREAD_COLOR | Read mode: cv.IMREAD_GRAYSCALE (grayscale uint8), cv.IMREAD_COLOR (BGR uint8), cv.IMREAD_ANYDEPTH (BGR float32) |

cv.resize Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| src | Var | (required) | Input image Var |
| dsize | tuple | (required) | Target size as (width, height) |
| fx | float | 0 | Horizontal scale factor; if 0, computed from dsize |
| fy | float | 0 | Vertical scale factor; if 0, computed from dsize |
| interpolation | int | cv.INTER_LINEAR | Interpolation method: INTER_NEAREST, INTER_LINEAR, INTER_CUBIC, INTER_AREA, INTER_LANCZOS4 |
| code | int | -1 | Color conversion code (e.g., cv.COLOR_BGR2RGB); -1 means no conversion |
| mean | [float] | [] | Per-channel mean subtraction values; also triggers cast to float32 |
| norm | [float] | [] | Per-channel normalization scale factors (multiplied after mean subtraction) |
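The relationship between dsize, fx, and fy can be sketched as follows. This follows OpenCV's documented convention, where scale factors are derived from dsize when it is given and dsize is derived from the scale factors otherwise; whether MNN's cv.resize accepts a zero dsize is an assumption not confirmed on this page. The helper resolve_resize is hypothetical and does not call MNN.

```python
def resolve_resize(src_w, src_h, dsize=(0, 0), fx=0.0, fy=0.0):
    """Return ((dst_w, dst_h), (fx, fy)) from whichever inputs are given."""
    dst_w, dst_h = dsize
    if dst_w > 0 and dst_h > 0:
        # dsize takes priority; scale factors are derived from it.
        return (dst_w, dst_h), (dst_w / src_w, dst_h / src_h)
    if fx > 0 and fy > 0:
        # No usable dsize; derive it from the scale factors.
        return (round(src_w * fx), round(src_h * fy)), (fx, fy)
    raise ValueError("provide a non-zero dsize or non-zero fx and fy")


# A 640x480 source resized to a fixed 224x224 target.
fixed_size, fixed_scale = resolve_resize(640, 480, dsize=(224, 224))

# The same source halved in both directions.
half_size, half_scale = resolve_resize(640, 480, fx=0.5, fy=0.5)
print(fixed_size, fixed_scale, half_size, half_scale)
```

Note that dsize is ordered (width, height), the reverse of the [H, W, C] shape order of the image Var.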

expr.convert Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| x | Var | (required) | Input variable to convert |
| format | data_format | (required) | Target format: expr.NCHW, expr.NHWC, or expr.NC4HW4 |

Inputs

  • Raw image file (JPEG, PNG, BMP) on disk, or image bytes in memory
  • Alternatively, a numpy array or MNN Var containing raw pixel data

Outputs

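  • Preprocessed Var in the target data format (typically NC4HW4), with:
    • Shape: [1, C, H, W] after conversion (the batch dimension is added first, giving [1, H, W, C] in NHWC)
    • dtype: float32, provided mean/norm are supplied to cv.resize or the image is cast manually
    • Pixel values normalized according to the specified mean and norm parameters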
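The shape changes across the pipeline can be traced with plain lists. This sketch assumes that a Var converted to NC4HW4 reports its shape in N, C, H, W order, as the list above indicates; the helper names are hypothetical and nothing here calls MNN.

```python
def expand_dims_shape(shape, axis=0):
    """Return the shape after inserting a length-1 axis at `axis`."""
    result = list(shape)
    result.insert(axis, 1)
    return result


def nhwc_to_nchw_shape(shape):
    """Reorder a 4-D NHWC shape into N, C, H, W order."""
    n, h, w, c = shape
    return [n, c, h, w]


hwc = [224, 224, 3]                # after cv.imread and cv.resize
nhwc = expand_dims_shape(hwc, 0)   # after adding the batch dimension
nchw = nhwc_to_nchw_shape(nhwc)    # logical shape after format conversion
print(hwc, nhwc, nchw)
```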

Code Example

import MNN.cv as cv
import MNN.numpy as np
import MNN.expr as expr

# Step 1: Load image (returns Var with shape [H, W, 3], dtype uint8, NHWC)
image = cv.imread('cat.jpg')

# Step 2: Resize to 224x224 with fused normalization
#   - Resizes to 224x224
#   - Subtracts mean [103.94, 116.78, 123.68] per channel (BGR)
#   - Multiplies by norm [0.017, 0.017, 0.017] per channel
#   - Automatically casts to float32
image = cv.resize(image, (224, 224),
                  mean=[103.94, 116.78, 123.68],
                  norm=[0.017, 0.017, 0.017])

# Step 3: Add batch dimension: [224, 224, 3] -> [1, 224, 224, 3]
input_var = np.expand_dims(image, 0)

# Step 4: Convert from NHWC to NC4HW4 for the inference engine
input_var = expr.convert(input_var, expr.NC4HW4)

# input_var is now ready for model.forward(input_var)

Alternative: Manual Step-by-Step Preprocessing

import MNN.cv as cv
import MNN.numpy as np
import MNN.expr as expr

# Load image
img = cv.imread('cat.jpg')

# Resize without normalization
img = cv.resize(img, (224, 224))

# Manual type conversion and normalization
imgf = img.astype(np.float32)
imgf = (imgf - np.array([103.94, 116.78, 123.68])) * np.array([0.017, 0.017, 0.017])

# Add batch dimension and convert format
input_var = np.expand_dims(imgf, 0)
input_var = expr.convert(input_var, expr.NC4HW4)

Edge Cases and Limitations

  • cv.imread is not available on mobile by default: enable the PYMNN_IMGCODECS build flag to include image codec support on mobile platforms.
  • Do not use transpose for layout changes: a numpy-style transpose from HWC to CHW produces incorrect results. Always use expr.convert, which handles the internal memory layout correctly.
  • Color order: cv.imread returns BGR images by default (matching the OpenCV convention). If the model expects RGB input, convert with the code parameter of cv.resize or with cv.cvtColor.
  • Empty mean/norm: when mean and norm are both empty (the default), cv.resize does not cast to float32, so the output remains uint8.
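On the color-order point, per-channel constants have to be listed in the same channel order as the data they are applied to. The sketch below reorders the BGR mean used on this page into RGB order with plain Python; the helper bgr_to_rgb is hypothetical. This page does not state whether cv.resize applies mean before or after the code conversion, so check that against the MNN documentation before relying on a particular order.

```python
def bgr_to_rgb(values):
    """Reverse a three-element per-channel list from BGR to RGB order."""
    if len(values) != 3:
        raise ValueError("expected exactly three channel values")
    return list(reversed(values))


bgr_mean = [103.94, 116.78, 123.68]  # B, G, R
rgb_mean = bgr_to_rgb(bgr_mean)      # R, G, B
print(rgb_mean)
```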
