Principle:Tencent Ncnn Pixel Geometry Transform

Knowledge Sources	Tencent_Ncnn
Domains	Image_Processing, Computer_Vision
Last Updated	2026-02-09 19:00 GMT

Overview

Geometric transformations applied directly to raw pixel buffers, including affine warping, rotation, resizing, and drawing, implemented without external image processing library dependencies.

Description

Pixel geometry transforms are spatial operations that remap pixel locations in an image according to a geometric function. The core operations include:

Affine transformation maps source pixel coordinates to destination coordinates through a 2x3 matrix that can express any combination of rotation, scaling, translation, and shearing. The implementation computes the inverse mapping (destination to source) so that each output pixel can be filled by sampling the input at the corresponding source location, using bilinear interpolation for sub-pixel accuracy.
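The inverse mapping described above can be sketched as follows. This is a minimal illustration, not the library's own code: the helper names `invert_affine_2x3` and `apply_affine_2x3` are hypothetical, and the matrix is assumed to be stored as six row-major floats `[a b tx; c d ty]`.

```cpp
#include <cmath>

// Hypothetical helper: invert a row-major 2x3 affine matrix tm = [a b tx; c d ty].
// Returns false when the 2x2 linear part is singular and no inverse exists.
bool invert_affine_2x3(const float* tm, float* tm_inv)
{
    const float det = tm[0] * tm[4] - tm[1] * tm[3];
    if (std::fabs(det) < 1e-12f) return false;
    const float inv_det = 1.f / det;

    // Inverse of the 2x2 linear part.
    tm_inv[0] =  tm[4] * inv_det;
    tm_inv[1] = -tm[1] * inv_det;
    tm_inv[3] = -tm[3] * inv_det;
    tm_inv[4] =  tm[0] * inv_det;

    // Inverse translation: minus the inverted linear part applied to (tx, ty).
    tm_inv[2] = -(tm_inv[0] * tm[2] + tm_inv[1] * tm[5]);
    tm_inv[5] = -(tm_inv[3] * tm[2] + tm_inv[4] * tm[5]);
    return true;
}

// Hypothetical helper: map the point (x, y) through a row-major 2x3 matrix.
void apply_affine_2x3(const float* tm, float x, float y, float* ox, float* oy)
{
    *ox = tm[0] * x + tm[1] * y + tm[2];
    *oy = tm[3] * x + tm[4] * y + tm[5];
}
```

Iterating over destination pixels and applying the inverted matrix to each one guarantees that every output pixel receives a value, which a forward (source to destination) loop cannot promise.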

Rotation is a special case of affine transformation where the matrix encodes a pure rotation about a center point, optionally combined with uniform scaling. The rotation matrix is constructed from an angle in degrees, a scale factor, and the center coordinates (dx, dy).

Resize resamples an image to a new resolution using bilinear interpolation, computing source sample coordinates for each destination pixel based on the scaling ratio.
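A bilinear resize for a single-channel (GRAY) buffer might look like the sketch below. The function name `resize_bilinear_gray` is hypothetical, and the half-pixel-center coordinate mapping is an assumption made for this illustration. The text above only states that source coordinates follow from the scaling ratio.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: bilinear resize of an 8-bit single-channel image.
// Assumes srcw, srch, w and h are all positive.
std::vector<unsigned char> resize_bilinear_gray(const unsigned char* src, int srcw, int srch,
                                                int w, int h)
{
    std::vector<unsigned char> dst(static_cast<std::size_t>(w) * h);
    const float scale_x = static_cast<float>(srcw) / w;
    const float scale_y = static_cast<float>(srch) / h;

    for (int dy = 0; dy < h; dy++)
    {
        // Assumed convention: align pixel centers, then clamp to the image.
        float sy = (dy + 0.5f) * scale_y - 0.5f;
        sy = std::min(std::max(sy, 0.f), static_cast<float>(srch - 1));
        const int y0 = static_cast<int>(sy);
        const int y1 = std::min(y0 + 1, srch - 1);
        const float fy = sy - y0;

        for (int dx = 0; dx < w; dx++)
        {
            float sx = (dx + 0.5f) * scale_x - 0.5f;
            sx = std::min(std::max(sx, 0.f), static_cast<float>(srcw - 1));
            const int x0 = static_cast<int>(sx);
            const int x1 = std::min(x0 + 1, srcw - 1);
            const float fx = sx - x0;

            // Blend horizontally on the two rows, then vertically.
            const float top = src[y0 * srcw + x0] * (1.f - fx) + src[y0 * srcw + x1] * fx;
            const float bot = src[y1 * srcw + x0] * (1.f - fx) + src[y1 * srcw + x1] * fx;
            const float v = top * (1.f - fy) + bot * fy;
            dst[static_cast<std::size_t>(dy) * w + dx] = static_cast<unsigned char>(v + 0.5f);
        }
    }
    return dst;
}
```

Multi-channel formats (RGB, BGRA, and so on) would apply the same weights to each channel of the four neighboring pixels.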

Drawing operations render geometric primitives (lines, rectangles, circles, text) directly onto pixel buffers, enabling visualization of inference results without pulling in a full graphics library.

Android integration provides specialized pixel format conversion paths for Android camera preview buffers (NV21, NV12) that are common on mobile deployment targets.

All operations work on raw unsigned byte pixel data in various channel arrangements (GRAY, RGB, BGR, RGBA, BGRA) and are optionally accelerated with ARM NEON SIMD intrinsics.

Usage

Apply these transforms in the preprocessing stage of an inference pipeline (resizing input images, applying affine alignment for face recognition) and in the postprocessing stage (drawing detection boxes, keypoints, or segmentation overlays on output images). They eliminate the need for OpenCV or similar libraries on resource-constrained devices.

Theoretical Basis

Affine transformation matrix: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & b & t_{x} \\ c & d & t_{y} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$

Rotation matrix construction:

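#include <math.h>

// Fill tm (6 floats, row-major 2x3) with a rotation by `angle` degrees
// about the center (dx, dy), combined with a uniform `scale`.
void get_rotation_matrix(float angle, float scale,
                         float dx, float dy, float* tm)
{
    // Degrees to radians; a literal is used because M_PI is not standard C/C++.
    angle *= (float)(3.14159265358979323846 / 180);
    float alpha = cosf(angle) * scale;
    float beta  = sinf(angle) * scale;
    // Row 0: x' = alpha * x + beta * y + tm[2]
    tm[0] =  alpha;  tm[1] = beta;
    tm[2] = (1.f - alpha) * dx - beta * dy;
    // Row 1: y' = -beta * x + alpha * y + tm[5]
    tm[3] = -beta;   tm[4] = alpha;
    tm[5] = beta * dx + (1.f - alpha) * dy;
}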
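As a check on the translation terms, the matrix can be written out and applied to the center point. The notation here is introduced only for this derivation: $\theta$ is the angle in radians, $s$ is the scale, $(d_x, d_y)$ is the center passed as (dx, dy), $\alpha = s\cos\theta$, and $\beta = s\sin\theta$.

$\begin{bmatrix} a & b & t_{x} \\ c & d & t_{y} \end{bmatrix} = \begin{bmatrix} \alpha & \beta & (1 - \alpha) d_x - \beta d_y \\ -\beta & \alpha & \beta d_x + (1 - \alpha) d_y \end{bmatrix}$

$x' = \alpha d_x + \beta d_y + (1 - \alpha) d_x - \beta d_y = d_x$

$y' = -\beta d_x + \alpha d_y + \beta d_x + (1 - \alpha) d_y = d_y$

The center maps to itself for every angle and scale, which is what rotation and scaling about that point require.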

Affine transform from point correspondences:

Given N point pairs (source -> destination), the optimal affine matrix is found by solving the least-squares system:

$\mathbf{M} = (A^{T} A)^{-1} A^{T} \mathbf{b}$

where A is assembled from the source coordinates, b stacks the destination coordinates, and the solution vector M holds the entries of the 2x3 transformation matrix.
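One way to solve the normal equations is sketched below for a general six-parameter affine fit. This is a textbook formulation written for illustration, and it is not claimed to match how the library parameterizes or solves the problem. The function name `fit_affine_least_squares` and the interleaved `x0, y0, x1, y1, ...` point layout are assumptions. Because the two output rows share the same 3x3 matrix $A^{T}A$, it is inverted once and applied to two right-hand sides.

```cpp
#include <cmath>

// Hypothetical helper: least-squares fit of a row-major 2x3 affine matrix tm
// mapping n points in `from` to `to` (both interleaved as x0, y0, x1, y1, ...).
// Returns false for fewer than 3 points or a degenerate (e.g. collinear) set.
bool fit_affine_least_squares(const float* from, const float* to, int n, float* tm)
{
    if (n < 3) return false;

    double g[3][3] = {{0.0}};       // A^T A, built from rows [x, y, 1]
    double ru[3] = {0.0};           // A^T b for the x' row
    double rv[3] = {0.0};           // A^T b for the y' row
    for (int i = 0; i < n; i++)
    {
        const double r[3] = {from[2 * i], from[2 * i + 1], 1.0};
        const double u = to[2 * i];
        const double v = to[2 * i + 1];
        for (int j = 0; j < 3; j++)
        {
            for (int k = 0; k < 3; k++) g[j][k] += r[j] * r[k];
            ru[j] += r[j] * u;
            rv[j] += r[j] * v;
        }
    }

    // Invert the 3x3 matrix with the adjugate (cofactor) formula.
    const double c00 = g[1][1] * g[2][2] - g[1][2] * g[2][1];
    const double c01 = g[1][2] * g[2][0] - g[1][0] * g[2][2];
    const double c02 = g[1][0] * g[2][1] - g[1][1] * g[2][0];
    const double det = g[0][0] * c00 + g[0][1] * c01 + g[0][2] * c02;
    if (std::fabs(det) < 1e-12) return false;

    double inv[3][3];
    inv[0][0] = c00 / det;
    inv[0][1] = (g[0][2] * g[2][1] - g[0][1] * g[2][2]) / det;
    inv[0][2] = (g[0][1] * g[1][2] - g[0][2] * g[1][1]) / det;
    inv[1][0] = c01 / det;
    inv[1][1] = (g[0][0] * g[2][2] - g[0][2] * g[2][0]) / det;
    inv[1][2] = (g[0][2] * g[1][0] - g[0][0] * g[1][2]) / det;
    inv[2][0] = c02 / det;
    inv[2][1] = (g[0][1] * g[2][0] - g[0][0] * g[2][1]) / det;
    inv[2][2] = (g[0][0] * g[1][1] - g[0][1] * g[1][0]) / det;

    // Solve both rows: [a b tx] and [c d ty].
    for (int j = 0; j < 3; j++)
    {
        tm[j]     = static_cast<float>(inv[j][0] * ru[0] + inv[j][1] * ru[1] + inv[j][2] * ru[2]);
        tm[3 + j] = static_cast<float>(inv[j][0] * rv[0] + inv[j][1] * rv[1] + inv[j][2] * rv[2]);
    }
    return true;
}
```

With exactly three non-collinear pairs the fit is exact. With more pairs it minimizes the summed squared distance between mapped source points and their destinations.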

Bilinear interpolation for sub-pixel sampling:

src_x, src_y = inverse_affine(dst_x, dst_y)
x0, y0 = floor(src_x), floor(src_y)
fx, fy = src_x - x0, src_y - y0
pixel = (1-fx)*(1-fy)*src[y0][x0]   + fx*(1-fy)*src[y0][x0+1]
      + (1-fx)*fy*src[y0+1][x0]     + fx*fy*src[y0+1][x0+1]
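The pseudocode above can be turned into a runnable warp for a single-channel buffer, as in the sketch below. The function name `warp_affine_bilinear_gray`, the `fill` parameter for destination pixels that map outside the source, and the clamping of the `+1` neighbor at the last row and column are assumptions added for this illustration. The caller is assumed to pass the already inverted (destination to source) matrix.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical helper: warp an 8-bit single-channel image with bilinear sampling.
// tm_inv is a row-major 2x3 matrix mapping destination coordinates to source coordinates.
void warp_affine_bilinear_gray(const unsigned char* src, int srcw, int srch,
                               unsigned char* dst, int w, int h,
                               const float* tm_inv, unsigned char fill)
{
    for (int dy = 0; dy < h; dy++)
    {
        for (int dx = 0; dx < w; dx++)
        {
            // Inverse mapping: where in the source does this output pixel come from?
            const float sx = tm_inv[0] * dx + tm_inv[1] * dy + tm_inv[2];
            const float sy = tm_inv[3] * dx + tm_inv[4] * dy + tm_inv[5];
            unsigned char* out = dst + static_cast<std::size_t>(dy) * w + dx;

            // Outside the source image: use the fill value.
            if (sx < 0.f || sy < 0.f || sx > srcw - 1 || sy > srch - 1)
            {
                *out = fill;
                continue;
            }

            const int x0 = static_cast<int>(std::floor(sx));
            const int y0 = static_cast<int>(std::floor(sy));
            const int x1 = std::min(x0 + 1, srcw - 1); // clamp neighbor at the border
            const int y1 = std::min(y0 + 1, srch - 1);
            const float fx = sx - x0;
            const float fy = sy - y0;

            const float v = (1.f - fx) * (1.f - fy) * src[y0 * srcw + x0]
                          + fx * (1.f - fy) * src[y0 * srcw + x1]
                          + (1.f - fx) * fy * src[y1 * srcw + x0]
                          + fx * fy * src[y1 * srcw + x1];
            *out = static_cast<unsigned char>(v + 0.5f);
        }
    }
}
```

The four weights always sum to 1, so the result stays within the range of the neighboring pixel values and needs no further clamping before the cast.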

Related Pages

Implemented By
