Implementation:Ggml org Ggml Sam model load

From Leeroopedia


Summary

Vision model weight loading in the GGML ecosystem is implemented by two loaders: sam_model_load for SAM and load_model for YOLO.

API

SAM Model Loader

bool sam_model_load(const sam_params & params, sam_model & model)

  • Source: examples/sam/sam.cpp:L508-1129 (621 lines)
  • Repository: https://github.com/ggml-org/ggml
  • Parameters:
    • params: sam_params struct containing the model file path and runtime configuration.
    • model: output sam_model struct, populated by the function.
  • Returns: bool, true on success. On success, the sam_model is fully populated with all encoder and decoder tensors, and a backend buffer has been allocated.

YOLO Model Loader

bool load_model(const std::string & fname, yolo_model & model)

Behavior

Architecture Auto-Detection (SAM)

The SAM loader auto-detects the ViT variant from the n_enc_state dimension read from the model file:

n_enc_state | Detected Variant | Derived Parameters
768 | ViT-B (Base) | 12 heads, 12 encoder blocks
1024 | ViT-L (Large) | 16 heads, 24 encoder blocks
1280 | ViT-H (Huge) | 16 heads, 32 encoder blocks
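
The mapping in the table above can be sketched as a small lookup. This is a minimal illustration, not the code in sam.cpp: the VitVariant struct and the detect_vit_variant function are hypothetical names introduced here, and only the three widths listed in the table are handled.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper type for illustration; not part of sam.cpp.
struct VitVariant {
    std::string name;     // variant label, e.g. "ViT-B"
    int32_t     n_heads;  // attention heads per encoder block
    int32_t     n_blocks; // number of encoder blocks
};

// Map the encoder state width to the variant listed in the table.
// Returns false (leaving `out` untouched) for widths the table does not cover.
bool detect_vit_variant(int32_t n_enc_state, VitVariant & out) {
    switch (n_enc_state) {
        case  768: out = VitVariant{"ViT-B", 12, 12}; return true;
        case 1024: out = VitVariant{"ViT-L", 16, 24}; return true;
        case 1280: out = VitVariant{"ViT-H", 16, 32}; return true;
        default:   return false;
    }
}
```

The bool return with an output parameter mirrors the calling convention of sam_model_load, so a caller can reject an unrecognized width before allocating any tensors.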

Components Loaded (SAM)

The function loads weights for all three major SAM components:

Image Encoder:

  • Patch embedding (convolution weights and biases)
  • Transformer blocks (attention QKV, projection, MLP layers, layer norms)
  • Neck (convolution layers for channel reduction)

Prompt Encoder:

  • Positional encoding (PE) matrix
  • Point embeddings (foreground, background, top-left corner, bottom-right corner)
  • Not-a-point embedding
  • Mask downscaling layers

Mask Decoder:

  • Two-way transformer (self-attention, cross-attention, MLP per layer)
  • Upscaling layers (transposed convolutions, layer norms)
  • Hypernetwork MLPs (one per output mask token)
  • IoU prediction head

Components Loaded (YOLO)

The YOLO loader reads convolutional backbone weights and detection head parameters from a GGUF file, mapping them onto sequential conv/bn layers in the yolo_model struct.
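
Mapping named tensors onto sequential layers can be sketched as follows. This is an illustration under stated assumptions rather than the loader's actual code: a std::map stands in for the tensors read from the GGUF file, the Tensor and ConvLayer types are hypothetical, and the "l<index>_<field>" naming scheme is assumed for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in; the real yolo_model holds ggml tensors.
struct Tensor { std::vector<float> data; };

// Hypothetical per-layer view of the tensors a conv/bn layer needs.
struct ConvLayer {
    const Tensor * weights = nullptr; // convolution kernel
    const Tensor * biases  = nullptr; // per-channel bias
    const Tensor * scales  = nullptr; // batch-norm scale; null if the layer has no bn
};

// Look up a tensor by name; returns nullptr when the file does not contain it.
static const Tensor * find_tensor(const std::map<std::string, Tensor> & file,
                                  const std::string & name) {
    auto it = file.find(name);
    return it == file.end() ? nullptr : &it->second;
}

// Bind tensors to layers in order, using the assumed "l<index>_<field>" names.
// Returns false as soon as a layer has no weights tensor.
bool bind_layers(const std::map<std::string, Tensor> & file,
                 std::vector<ConvLayer> & layers) {
    for (std::size_t i = 0; i < layers.size(); ++i) {
        const std::string prefix = "l" + std::to_string(i) + "_";
        layers[i].weights = find_tensor(file, prefix + "weights");
        layers[i].biases  = find_tensor(file, prefix + "biases");
        layers[i].scales  = find_tensor(file, prefix + "scales");
        if (layers[i].weights == nullptr) {
            return false;
        }
    }
    return true;
}
```

Sizing the layer vector up front and deriving each tensor name from the layer index keeps the file layout and the in-memory layer order in lockstep, so a missing tensor is detected at load time instead of during inference.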

Dependencies

Header | Purpose | Used By
ggml.h | Core tensor operations and graph construction | SAM, YOLO
ggml-alloc.h | Tensor memory allocation | SAM, YOLO
ggml-backend.h | Backend buffer management | SAM, YOLO
gguf.h | GGUF file format reader | YOLO
ggml-cpu.h | CPU backend initialization | SAM, YOLO

Language

  • C++

Metadata

  • Last updated: 2025-05-15 12:00 GMT
