Implementation:Alibaba MNN PyMNN CV Preprocessing

Field	Value
implementation_name	PyMNN_CV_Preprocessing
schema_version	0.1.0
workflow	Python_Model_Inference
implementation_type	API_Doc
domain	Deep_Learning_Inference
scope	Image loading, resizing, normalization, and data format conversion for model input
source_file	docs/pymnn/cv.md (Python API reference)
related_patterns	OpenCV_Compatibility, Fused_Preprocessing, Tensor_Format_Conversion
last_updated	2026-02-10 14:00 GMT

Summary

This implementation documents the MNN Python APIs that turn raw image data into model-ready tensors. The primary functions are cv.imread for loading an image from disk, cv.imdecode for decoding an image already held in memory, cv.resize for resizing with optional fused color conversion and normalization, and expr.convert for data format conversion. The cv functions follow OpenCV's interface conventions (argument names, BGR channel order, (width, height) sizes).

API Signatures

cv.imread

cv.imread(filename, flag=cv.IMREAD_COLOR) -> Var

Reads an image file from disk and returns it as a Var.

cv.imdecode

cv.imdecode(buf, flag=cv.IMREAD_COLOR) -> Var

Decodes an image from an in-memory buffer (ndarray, list, tuple, or bytes).

cv.resize

cv.resize(src, dsize, fx=0, fy=0, interpolation=cv.INTER_LINEAR, code=-1, mean=[], norm=[]) -> Var

Resizes an image. Optionally performs fused color conversion, mean subtraction, and normalization.

expr.convert

expr.convert(x, format) -> Var

Converts the data_format of a Var between NCHW, NHWC, and NC4HW4.

Parameters

cv.imread Parameters

Parameter	Type	Default	Description
filename	str	(required)	Path to the image file (JPEG, PNG, BMP, etc.)
flag	int	cv.IMREAD_COLOR	Read mode: cv.IMREAD_GRAYSCALE (grayscale uint8), cv.IMREAD_COLOR (BGR uint8), cv.IMREAD_ANYDEPTH (BGR float32)

cv.resize Parameters

Parameter	Type	Default	Description
src	Var	(required)	Input image Var
dsize	tuple	(required)	Target size as (width, height)
fx	float	0	Horizontal scale factor; if 0, computed from dsize
fy	float	0	Vertical scale factor; if 0, computed from dsize
interpolation	int	cv.INTER_LINEAR	Interpolation method: INTER_NEAREST, INTER_LINEAR, INTER_CUBIC, INTER_AREA, INTER_LANCZOS4
code	int	-1	Color conversion code (e.g., cv.COLOR_BGR2RGB); -1 means no conversion
mean	[float]	[]	Per-channel mean subtraction values; also triggers cast to float32
norm	[float]	[]	Per-channel normalization scale factors (multiplied after mean subtraction)
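
The relationship between dsize and fx/fy can be sketched in plain Python. This is an illustration only: resolve_resize is a hypothetical helper, not an MNN function, and it assumes the OpenCV convention that a non-zero dsize takes precedence and that fx/fy are used only when dsize is (0, 0). Whether MNN accepts a zero dsize in every version is not stated here and should be verified.

```python
def resolve_resize(src_w, src_h, dsize=(0, 0), fx=0.0, fy=0.0):
    """Return ((dst_w, dst_h), (fx, fy)) for a resize request.

    Hypothetical helper that mirrors the OpenCV convention:
    - a non-zero dsize wins, and the scale factors are derived from it;
    - otherwise the target size is derived from fx and fy.
    """
    dst_w, dst_h = dsize
    if dst_w > 0 and dst_h > 0:
        # Scale factors are implied by the requested size.
        return (dst_w, dst_h), (dst_w / src_w, dst_h / src_h)
    if fx > 0 and fy > 0:
        # Target size is implied by the scale factors.
        return (round(src_w * fx), round(src_h * fy)), (fx, fy)
    raise ValueError("either dsize or both fx and fy must be non-zero")
```

Note that dsize is ordered (width, height), the reverse of the [H, W, C] shape of the image Var.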

expr.convert Parameters

Parameter	Type	Default	Description
x	Var	(required)	Input variable to convert
format	data_format	(required)	Target format: expr.NCHW, expr.NHWC, or expr.NC4HW4
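
To make the difference between data formats concrete, the following pure-Python sketch compares where the same logical element lives in a flat NHWC buffer and a flat NCHW buffer. The function names are hypothetical and nothing here calls MNN. NC4HW4 is deliberately left out: it is a channel-packed layout whose exact packing is an engine detail, which is the reason to let expr.convert perform the conversion.

```python
def nhwc_offset(n, h, w, c, H, W, C):
    """Flat offset of element (n, h, w, c) when channels vary fastest."""
    return ((n * H + h) * W + w) * C + c

def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) when width varies fastest."""
    return ((n * C + c) * H + h) * W + w

def nhwc_to_nchw(flat, N, H, W, C):
    """Copy a flat NHWC buffer into a new flat NCHW buffer."""
    out = [0] * (N * C * H * W)
    for n in range(N):
        for h in range(H):
            for w in range(W):
                for c in range(C):
                    src = nhwc_offset(n, h, w, c, H, W, C)
                    dst = nchw_offset(n, c, h, w, C, H, W)
                    out[dst] = flat[src]
    return out
```

For a 1x2 image with three channels, the interleaved pixels [b0, g0, r0, b1, g1, r1] become the planar sequence [b0, b1, g0, g1, r0, r1].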

Inputs

Raw image file (JPEG, PNG, BMP) on disk, or image bytes in memory
Alternatively, a numpy array or MNN Var containing raw pixel data

Outputs

Preprocessed Var tensor in the target data format (typically NC4HW4), with:
- Shape: [1, C, H, W] (batch dimension added)
- dtype: float32
- Pixel values normalized according to the specified mean and norm parameters

Code Example

import MNN.cv as cv
import MNN.numpy as np
import MNN.expr as expr

# Step 1: Load image (returns Var with shape [H, W, 3], dtype uint8, NHWC)
image = cv.imread('cat.jpg')

# Step 2: Resize to 224x224 with fused normalization
#   - Resizes to 224x224
#   - Subtracts mean [103.94, 116.78, 123.68] per channel (BGR)
#   - Multiplies by norm [0.017, 0.017, 0.017] per channel
#   - Automatically casts to float32
image = cv.resize(image, (224, 224),
                  mean=[103.94, 116.78, 123.68],
                  norm=[0.017, 0.017, 0.017])

# Step 3: Add batch dimension: [224, 224, 3] -> [1, 224, 224, 3]
input_var = np.expand_dims(image, 0)

# Step 4: Convert from NHWC to NC4HW4 for the inference engine
input_var = expr.convert(input_var, expr.NC4HW4)

# input_var is now ready for model.forward(input_var)

Alternative: Manual Step-by-Step Preprocessing

import MNN.cv as cv
import MNN.numpy as np
import MNN.expr as expr

# Load image
img = cv.imread('cat.jpg')

# Resize without normalization
img = cv.resize(img, (224, 224))

# Manual type conversion and normalization
imgf = img.astype(np.float32)
imgf = (imgf - np.array([103.94, 116.78, 123.68])) * np.array([0.017, 0.017, 0.017])

# Add batch dimension and convert format
input_var = np.expand_dims(imgf, 0)
input_var = expr.convert(input_var, expr.NC4HW4)
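
Both paths above apply the same per-channel arithmetic, (pixel - mean) * norm. The sketch below reproduces that arithmetic in plain Python so the expected values can be checked by hand. It is an illustration under stated assumptions: normalize_pixel and normalize_image are hypothetical helpers, the image is a nested H x W x C list rather than a Var, and no MNN call is involved.

```python
# Constants from the examples above, in BGR channel order.
MEAN = [103.94, 116.78, 123.68]
NORM = [0.017, 0.017, 0.017]

def normalize_pixel(pixel, mean=MEAN, norm=NORM):
    """Subtract the per-channel mean, then multiply by the per-channel scale."""
    if not (len(pixel) == len(mean) == len(norm)):
        raise ValueError("pixel, mean and norm need the same channel count")
    return [(float(p) - m) * s for p, m, s in zip(pixel, mean, norm)]

def normalize_image(image, mean=MEAN, norm=NORM):
    """Apply normalize_pixel to every pixel of an H x W x C nested list."""
    return [[normalize_pixel(px, mean, norm) for px in row] for row in image]
```

A quick way to validate a real pipeline is to compare a few output elements of the fused cv.resize call against values computed this way from the resized uint8 image.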

Edge Cases and Limitations

cv.imread is not available on mobile by default: enable the PYMNN_IMGCODECS build flag to include image codec support on mobile platforms.
Do not use transpose for layout changes: a numpy-style transpose from HWC to CHW will produce incorrect results; always use expr.convert, which handles the internal memory layout correctly.
Color order: cv.imread returns BGR images by default (matching the OpenCV convention); if the model expects RGB input, convert with the code parameter of cv.resize or with cv.cvtColor.
Empty mean/norm: when mean and norm are both empty (the default), cv.resize does not cast to float32 and the output remains uint8.
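
The color-order point has a practical consequence: per-channel constants must be written in the same channel order as the pixels they are applied to. The pure-Python sketch below shows the reordering for a 3-channel image. Both helpers are hypothetical, and the sketch makes no claim about whether cv.resize applies mean and norm before or after the code conversion; that ordering should be checked against the MNN documentation before combining the two.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in an H x W x 3 nested list."""
    return [[list(reversed(px)) for px in row] for row in image]

def reorder_constants(values_bgr):
    """Turn per-channel constants given in BGR order into RGB order."""
    return list(reversed(values_bgr))

# The BGR means used on this page, rewritten for an RGB pipeline.
MEAN_BGR = [103.94, 116.78, 123.68]
MEAN_RGB = reorder_constants(MEAN_BGR)
```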
