Principle:Ggml org Llama cpp Model Architecture Registry

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Model_Architecture
Last Updated	2026-02-15 00:00 GMT

Overview

The Model Architecture Registry principle describes how llama.cpp maintains a single catalog of supported model architectures, together with the structural parameters each architecture requires.

Description

This principle covers the system that maps model architecture identifiers (such as "llama", "falcon", "gemma", "qwen2") to their corresponding implementation details. For each architecture, the registry describes the layer structure, the attention mechanism type, the normalization approach, and the hyperparameters that must be read from the model file. When a GGUF model is loaded, the architecture name stored in its metadata (the `general.architecture` key) is looked up in the registry to determine how the file's tensors should be interpreted and how the computation graph should be constructed.
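The lookup step can be sketched as follows. This is an illustration in Python rather than the project's C++, and the names `Arch`, `ARCH_NAMES`, `arch_from_string`, and `arch_from_metadata` are assumptions made for the example, not llama.cpp identifiers. The metadata is modeled as a plain dictionary; only the `general.architecture` key is taken from the GGUF format.

```python
from enum import Enum, auto
from typing import Dict


class Arch(Enum):
    """Hypothetical set of known architectures (illustrative subset)."""
    LLAMA = auto()
    FALCON = auto()
    GEMMA = auto()
    QWEN2 = auto()
    UNKNOWN = auto()


# Maps each known architecture to the string identifier stored in the file.
ARCH_NAMES: Dict[Arch, str] = {
    Arch.LLAMA: "llama",
    Arch.FALCON: "falcon",
    Arch.GEMMA: "gemma",
    Arch.QWEN2: "qwen2",
}


def arch_from_string(name: str) -> Arch:
    """Return the architecture whose identifier equals `name`, else UNKNOWN."""
    for arch, arch_name in ARCH_NAMES.items():
        if arch_name == name:
            return arch
    return Arch.UNKNOWN


def arch_from_metadata(metadata: Dict[str, object]) -> Arch:
    """Resolve the architecture from a dict standing in for GGUF metadata."""
    name = metadata.get("general.architecture")
    if not isinstance(name, str):
        return Arch.UNKNOWN
    return arch_from_string(name)
```

Returning an explicit `UNKNOWN` value instead of raising lets the caller decide how to report an unsupported model; a real loader would typically fail with an error message at that point.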

Usage

Apply this principle when adding support for a new model architecture, or when the model loading pipeline needs to determine the correct graph construction logic based on the architecture identifier stored in a GGUF file.

Theoretical Basis

The architecture registry maps string identifiers to structured descriptions. Each entry specifies three things: the hyperparameters the model needs (hidden size, number of layers, number of attention heads, vocabulary size, etc.), the names and shapes of the tensors the file is expected to contain, and the logic that builds the computation graph. Because the loader consults the registry instead of hard-coding knowledge of any one architecture, model loading is decoupled from architecture-specific details, and a new architecture can be supported by registering a new entry rather than rewriting the generic loading code.
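A minimal sketch of the registry pattern, written in Python for brevity. Everything here is hypothetical and chosen for illustration: the `ArchEntry` fields, the `register` and `load_model` helpers, and the "toy" architecture with its keys and tensor names. It does not reproduce llama.cpp's actual data structures; it only shows how a generic loader can stay free of architecture-specific branches.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ArchEntry:
    """One registry entry (hypothetical layout)."""
    name: str
    hparam_keys: List[str]    # metadata keys the loader must read
    tensor_names: List[str]   # name templates; "{i}" marks a per-layer tensor
    build_graph: Callable[[Dict[str, int]], List[str]]


REGISTRY: Dict[str, ArchEntry] = {}


def register(entry: ArchEntry) -> None:
    """Add an entry, refusing duplicate identifiers."""
    if entry.name in REGISTRY:
        raise ValueError(f"architecture already registered: {entry.name!r}")
    REGISTRY[entry.name] = entry


def expected_tensors(entry: ArchEntry, n_layer: int) -> List[str]:
    """Expand per-layer templates into concrete tensor names."""
    names: List[str] = []
    for template in entry.tensor_names:
        if "{i}" in template:
            names.extend(template.format(i=i) for i in range(n_layer))
        else:
            names.append(template)
    return names


def load_model(metadata: Dict[str, object]) -> Tuple[List[str], List[str]]:
    """Generic loader: contains no architecture-specific branches."""
    arch = metadata.get("general.architecture")
    entry = REGISTRY.get(arch) if isinstance(arch, str) else None
    if entry is None:
        raise ValueError(f"unknown architecture: {arch!r}")
    hparams = {key: int(metadata[key]) for key in entry.hparam_keys}
    n_layer = hparams[f"{entry.name}.block_count"]
    return expected_tensors(entry, n_layer), entry.build_graph(hparams)


def build_toy_graph(hparams: Dict[str, int]) -> List[str]:
    """Stand-in graph builder: returns an ordered list of operation labels."""
    ops = ["embed"]
    for i in range(hparams["toy.block_count"]):
        ops += [f"blk.{i}.attn", f"blk.{i}.ffn"]
    ops.append("output")
    return ops


# Supporting a new architecture is a single registration call.
register(ArchEntry(
    name="toy",
    hparam_keys=["toy.block_count", "toy.embedding_length"],
    tensor_names=["token_embd.weight", "blk.{i}.attn_q.weight", "output.weight"],
    build_graph=build_toy_graph,
))
```

Calling `load_model` with metadata for the "toy" architecture returns the expanded tensor names and the operation list, while an unregistered identifier raises `ValueError`. The graph builder is stored as a callable on the entry so that the loader never needs a per-architecture `if` chain.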
