Implementation:Ggml org Ggml Sam model load

Summary

This page documents how vision model weights are loaded in the GGML examples, covering two loaders: the SAM model loader and the YOLO model loader.

API

SAM Model Loader

bool sam_model_load(const sam_params & params, sam_model & model)

Source: examples/sam/sam.cpp:L508-1129 (622 lines, inclusive)
Repository: https://github.com/ggml-org/ggml
Parameters:
- params: sam_params struct containing the model file path and runtime configuration.
- model: output sam_model struct, populated by the function.
Returns: bool. true on success, false if loading fails. On success, the sam_model is populated with all encoder and decoder tensors, and a backend buffer has been allocated for them.

YOLO Model Loader

bool load_model(const std::string & fname, yolo_model & model)

Source: examples/yolo/yolov3-tiny.cpp:L77-138
Repository: https://github.com/ggml-org/ggml
Note: Uses the GGUF reader for weight deserialization.
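
To give a sense of what a GGUF reader sees first, the following is a minimal sketch that parses only the fixed-size file header from an in-memory byte buffer. It is not the library's reader and does not use `gguf.h`. The names `GgufHeader`, `read_le`, and `parse_gguf_header` are hypothetical, and the sketch assumes a little-endian file of GGUF version 2 or later, where both counts are 64-bit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the fixed-size part of a GGUF header.
struct GgufHeader {
    uint32_t version;    // format version
    uint64_t n_tensors;  // number of tensor records in the file
    uint64_t n_kv;       // number of metadata key/value pairs
};

// Reads an unsigned little-endian integer of n bytes starting at pos.
static uint64_t read_le(const std::vector<uint8_t> & buf, size_t pos, size_t n) {
    uint64_t v = 0;
    for (size_t i = 0; i < n; ++i) {
        v |= static_cast<uint64_t>(buf[pos + i]) << (8 * i);
    }
    return v;
}

// Returns true and fills out if buf starts with a plausible GGUF header.
bool parse_gguf_header(const std::vector<uint8_t> & buf, GgufHeader & out) {
    const size_t header_size = 4 + 4 + 8 + 8;  // magic, version, two counts
    if (buf.size() < header_size) {
        return false;
    }
    // The magic is the four ASCII bytes "GGUF".
    if (buf[0] != 'G' || buf[1] != 'G' || buf[2] != 'U' || buf[3] != 'F') {
        return false;
    }
    out.version = static_cast<uint32_t>(read_le(buf, 4, 4));
    if (out.version < 2) {
        return false;  // assumption: this sketch only handles 64-bit counts
    }
    out.n_tensors = read_le(buf, 8, 8);
    out.n_kv      = read_le(buf, 16, 8);
    return true;
}
```

In the actual example, the header, metadata, and tensor data are handled by the GGUF reader from `gguf.h`; the sketch only illustrates the kind of validation that happens before any tensor is touched.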

Behavior

Architecture Auto-Detection (SAM)

The SAM loader auto-detects the ViT variant from the n_enc_state dimension read from the model file:

n_enc_state	Detected Variant	Derived Parameters
768	ViT-B (Base)	12 heads, 12 encoder blocks
1024	ViT-L (Large)	16 heads, 24 encoder blocks
1280	ViT-H (Huge)	16 heads, 32 encoder blocks
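
The mapping in the table can be sketched as a small lookup. This is an illustration of the table only, not the code in sam.cpp; `VitVariant`, `detect_vit_variant`, and `head_dim` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical summary of the parameters derived from n_enc_state.
struct VitVariant {
    std::string name;
    int32_t     n_heads;
    int32_t     n_blocks;
};

// Fills out and returns true for the three widths listed in the table;
// returns false for any other value.
bool detect_vit_variant(int32_t n_enc_state, VitVariant & out) {
    switch (n_enc_state) {
        case  768: out = {"ViT-B", 12, 12}; return true;
        case 1024: out = {"ViT-L", 16, 24}; return true;
        case 1280: out = {"ViT-H", 16, 32}; return true;
        default:   return false;
    }
}

// Per-head width: the encoder state is split evenly across attention heads.
int32_t head_dim(int32_t n_enc_state, const VitVariant & v) {
    assert(v.n_heads > 0 && n_enc_state % v.n_heads == 0);
    return n_enc_state / v.n_heads;
}
```

Returning false for an unrecognized width lets the caller decide whether to abort loading or fall back to a default, rather than silently guessing a variant.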

Components Loaded (SAM)

The function loads weights for all three major SAM components:

Image Encoder:

Patch embedding (convolution weights and biases)
Transformer blocks (attention QKV, projection, MLP layers, layer norms)
Neck (convolution layers for channel reduction)

Prompt Encoder:

Positional encoding (PE) matrix
Point embeddings (foreground, background, top-left corner, bottom-right corner)
Not-a-point embedding
Mask downscaling layers

Mask Decoder:

Two-way transformer (self-attention, cross-attention, MLP per layer)
Upscaling layers (transposed convolutions, layer norms)
Hypernetwork MLPs (one per output mask token)
IoU prediction head
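
One common way to populate this many tensors is to register each destination under a name and then match records from the file against that registry. The sketch below shows that pattern under stated assumptions: `Tensor` is a stand-in type rather than the ggml tensor, `WeightRegistry` is a hypothetical helper, and the tensor name in the usage note is illustrative rather than taken from the model file.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in for a tensor: a shape plus flat float storage (not the ggml type).
struct Tensor {
    std::vector<int64_t> shape;
    std::vector<float>   data;

    // Product of all dimensions.
    size_t n_elements() const {
        size_t n = 1;
        for (int64_t d : shape) {
            n *= static_cast<size_t>(d);
        }
        return n;
    }
};

// Hypothetical name-to-tensor registry used while reading a weight file.
struct WeightRegistry {
    std::map<std::string, Tensor *> by_name;

    void add(const std::string & name, Tensor & t) {
        by_name[name] = &t;
    }

    // Copies values into the registered tensor.
    // Fails on an unknown name or an element-count mismatch.
    bool load(const std::string & name, const std::vector<float> & values) {
        auto it = by_name.find(name);
        if (it == by_name.end()) {
            return false;
        }
        if (it->second->n_elements() != values.size()) {
            return false;
        }
        it->second->data = values;
        return true;
    }
};
```

For example, registering a 2x3 tensor under a name such as "prompt_encoder.pe" and then calling `load` with six values succeeds, while five values or an unregistered name is rejected. Checking the element count before copying catches files exported for a different variant early.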

Components Loaded (YOLO)

The YOLO loader reads convolutional backbone weights and detection head parameters from a GGUF file, mapping them onto the sequential convolution and batch-normalization (conv/bn) layers of the yolo_model struct.
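
The per-layer mapping can be sketched as a function that lists which tensors a layer expects, given whether it is batch-normalized. This is an assumption-laden illustration: `ConvLayerSpec` and `expected_tensor_names` are hypothetical, and the `l<index>_<field>` naming scheme is chosen for the example and may not match the names stored in the actual GGUF file. The set of fields follows the usual Darknet convention of storing scales, a rolling mean, and a rolling variance for batch-normalized layers.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one convolution layer.
struct ConvLayerSpec {
    bool batch_normalize;  // whether batch-norm statistics accompany the weights
};

// Lists the tensor names a layer at the given position would look up.
// The "l<index>_<field>" scheme is an assumption made for this sketch.
std::vector<std::string> expected_tensor_names(int index, const ConvLayerSpec & layer) {
    const std::string prefix = "l" + std::to_string(index) + "_";
    std::vector<std::string> names;
    names.push_back(prefix + "weights");
    names.push_back(prefix + "biases");
    if (layer.batch_normalize) {
        names.push_back(prefix + "scales");
        names.push_back(prefix + "rolling_mean");
        names.push_back(prefix + "rolling_variance");
    }
    return names;
}
```

A loader built this way can iterate over the layers in order, request each expected name from the file, and report exactly which tensor is missing when a lookup fails.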

Dependencies

Header	Purpose	Used By
`ggml.h`	Core tensor operations and graph construction	SAM, YOLO
`ggml-alloc.h`	Tensor memory allocation	SAM, YOLO
`ggml-backend.h`	Backend buffer management	SAM, YOLO
`gguf.h`	GGUF file format reader	YOLO
`ggml-cpu.h`	CPU backend initialization	SAM, YOLO

Language

C++

Domain

Source

GGML

Metadata

Last updated: 2025-05-15 12:00 GMT
