
Principle:Ggml org Ggml Image Preprocessing

From Leeroopedia
Revision as of 17:43, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Ggml_org_Ggml_Image_Preprocessing.md)

Image Preprocessing

Preparing raw images for neural network input through resize, normalization, and format conversion.

Vision models expect fixed-size, normalized float tensors as input. Raw images arrive at arbitrary resolutions and in integer pixel formats, so they must be transformed into the exact layout and value range each model was trained on. Mistakes at this stage rarely raise an error: an incorrect normalization or resize strategy still produces a valid tensor, and the model silently loses accuracy.

Key Operations

The canonical preprocessing pipeline follows this sequence:

  1. Decode image (JPEG/PNG) into raw pixel buffer
  2. Resize with aspect-ratio preservation to the model's expected spatial dimensions
  3. Normalize pixel values to the model-specific floating-point range
  4. Convert to the required float tensor layout (e.g., planar CHW vs. interleaved HWC)

Each step must match the exact conventions used during model training; deviations cause distribution shift at inference time.
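Steps 3 and 4 can be sketched in a few lines. The following is a minimal illustration, not code from ggml: it assumes the image has already been decoded and resized into an interleaved HWC buffer of 8-bit values, uses division by 255 as one example normalization, and converts to planar CHW. The function name `hwc_u8_to_chw_f32` is hypothetical, and plain Python lists stand in for flat pixel buffers.

```python
def hwc_u8_to_chw_f32(pixels, width, height, channels=3):
    """Convert an interleaved HWC 8-bit buffer to a planar CHW float list in [0, 1].

    Assumption: `pixels` is a flat sequence of length width * height * channels.
    """
    if len(pixels) != width * height * channels:
        raise ValueError("buffer size does not match width * height * channels")
    plane = width * height
    out = [0.0] * len(pixels)
    for y in range(height):
        for x in range(width):
            for c in range(channels):
                src = (y * width + x) * channels + c  # interleaved (HWC) index
                dst = c * plane + y * width + x       # planar (CHW) index
                out[dst] = pixels[src] / 255.0        # example normalization
    return out


# 2x1 RGB image: one red pixel followed by one blue pixel.
chw = hwc_u8_to_chw_f32([255, 0, 0, 0, 0, 255], width=2, height=1)
```

In the result, all red samples come first, then all green, then all blue. Feeding such a buffer to a model that expects the interleaved layout (or the reverse) is the kind of mismatch that goes unnoticed until predictions degrade.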

Model-Specific Pipelines

SAM (Segment Anything Model)

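  • Resize: Bilinear interpolation with aspect-ratio preservation (the longer side is scaled to 1024 and the shorter side is scaled proportionally, so the image fits inside a 1024x1024 frame)
  • Normalization: ImageNet statistics on the 0-255 pixel scale, applied per channel as (pixel - mean) / std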
    • mean = [123.675, 116.28, 103.53]
    • std = [58.395, 57.12, 57.375]
  • Padding: Zero-padding applied to fill the remaining area after aspect-preserving resize
  • Layout: HWC interleaved float32
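The SAM steps listed above might look roughly like the sketch below. It is an illustration under stated assumptions rather than the ggml implementation: the helper names are hypothetical, the rounding rule in `sam_resized_dims` is an assumption, the resized image is assumed to sit in the top-left corner with padding to the right and bottom, and the bilinear resize itself is omitted (the input to `sam_normalize_and_pad` is assumed to be already resized).

```python
SAM_SIZE = 1024
SAM_MEAN = (123.675, 116.28, 103.53)  # per-channel mean, 0-255 scale
SAM_STD = (58.395, 57.12, 57.375)     # per-channel std, 0-255 scale


def sam_resized_dims(width, height, target=SAM_SIZE):
    """Scale so the longer side equals `target`; the other side follows proportionally."""
    scale = target / max(width, height)
    # Rounding to nearest is an assumption of this sketch.
    return int(width * scale + 0.5), int(height * scale + 0.5)


def sam_normalize_and_pad(resized, rw, rh, target=SAM_SIZE):
    """Normalize an already-resized HWC 8-bit buffer and zero-pad to target x target.

    Padding is left at 0.0 in normalized space; the image is placed at the
    top-left corner (an assumption of this sketch).
    """
    out = [0.0] * (target * target * 3)
    for y in range(rh):
        for x in range(rw):
            for c in range(3):
                v = resized[(y * rw + x) * 3 + c]
                out[(y * target + x) * 3 + c] = (v - SAM_MEAN[c]) / SAM_STD[c]
    return out


# A 2048x1024 input is scaled by 0.5 to fit the 1024x1024 frame.
dims = sam_resized_dims(2048, 1024)
```

Note that the padded region holds zeros in normalized space, which is not the same as black pixels: a black pixel normalizes to a negative value.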

YOLO

  • Resize: Letterbox resize to 416x416 (the image is scaled to fit while preserving aspect ratio, and the remaining area is filled with gray, value 0.5)
  • Normalization: Pixel values scaled to the [0, 1] range
  • Layout: CHW planar float32
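A letterbox step along these lines can be sketched as follows. This is a hedged illustration, not the ggml code: the function names are hypothetical, integer division for the scaled side and centering of the image on the canvas are assumptions, and the resize itself is left out (the input to `letterbox_canvas` is assumed to be an already-resized CHW float image with values in [0, 1]).

```python
def letterbox_geometry(width, height, target=416):
    """Return (new_w, new_h, off_x, off_y) for an aspect-preserving fit.

    The longer side is mapped to `target`; offsets center the image
    (centering is an assumption of this sketch).
    """
    if width >= height:
        new_w, new_h = target, max(1, height * target // width)
    else:
        new_w, new_h = max(1, width * target // height), target
    return new_w, new_h, (target - new_w) // 2, (target - new_h) // 2


def letterbox_canvas(resized_chw, new_w, new_h, target=416, fill=0.5):
    """Paste a resized CHW float image onto a gray CHW canvas of target x target."""
    off_x = (target - new_w) // 2
    off_y = (target - new_h) // 2
    plane = target * target
    out = [fill] * (3 * plane)  # gray padding everywhere by default
    for c in range(3):
        for y in range(new_h):
            for x in range(new_w):
                src = c * new_w * new_h + y * new_w + x
                dst = c * plane + (y + off_y) * target + (x + off_x)
                out[dst] = resized_chw[src]
    return out


# A 640x480 frame becomes 416x312, with 52 rows of gray above and below.
geom = letterbox_geometry(640, 480)
```

Because the fill value is 0.5 in the already-normalized [0, 1] range, the padding corresponds to mid-gray rather than black.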

General Observation

There is no universal preprocessing step: each model family defines its own input contract (target size, resize strategy, padding value, normalization, and tensor layout). Applying one model's pipeline to another, for example SAM's mean/std normalization to a model that expects values in [0, 1], is a common source of inference errors.
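One way to make the per-model input contract explicit is to record it as data and look it up by model name. The sketch below is purely illustrative: the `InputContract` type, its field names, and the `CONTRACTS` table are hypothetical, and the values are the ones listed on this page. The [0, 1] scaling is expressed as mean 0 and std 255 so both models share one normalization formula.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InputContract:
    """Hypothetical description of what a model expects as input."""
    size: int                        # square input side length in pixels
    layout: str                      # "HWC" (interleaved) or "CHW" (planar)
    pad_value: float                 # fill value for the padded area
    mean: Tuple[float, float, float]
    std: Tuple[float, float, float]

    def normalize(self, value, channel):
        """Map one 0-255 pixel sample into the model's input range."""
        return (value - self.mean[channel]) / self.std[channel]


CONTRACTS = {
    "sam": InputContract(
        size=1024, layout="HWC", pad_value=0.0,
        mean=(123.675, 116.28, 103.53), std=(58.395, 57.12, 57.375),
    ),
    "yolo": InputContract(
        size=416, layout="CHW", pad_value=0.5,
        mean=(0.0, 0.0, 0.0), std=(255.0, 255.0, 255.0),
    ),
}

# The same white pixel maps to different values under each contract.
white_sam = CONTRACTS["sam"].normalize(255, 0)
white_yolo = CONTRACTS["yolo"].normalize(255, 0)
```

Keeping the contract in one place makes a mismatch visible at the call site instead of surfacing later as degraded predictions.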
