Principle:Ggml org Llama cpp Model Architecture Registry
| Knowledge Sources | |
|---|---|
| Domains | Model_Architecture |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
The Model Architecture Registry principle is to maintain a central catalog of supported model architectures and their structural parameters.
Description
This principle covers the system that maps model architecture identifiers (such as "llama", "falcon", "gemma", "qwen2") to their corresponding implementation details. The registry defines each architecture's layer structure, attention mechanism type, normalization approach, and other hyperparameters. It serves as the central lookup point when loading a GGUF model to determine how its tensors should be interpreted and how its computation graph should be constructed.
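The identifier-to-implementation lookup can be illustrated with a minimal C++ sketch. It assumes the identifier string has already been read from the GGUF metadata (the GGUF format stores it under the `general.architecture` key). The `Arch` enum and the `arch_from_string` function are hypothetical names chosen for this example and are not taken from the llama.cpp source.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical architecture tags; the names are illustrative only.
enum class Arch { Llama, Falcon, Gemma, Qwen2, Unknown };

// Map the identifier string read from the GGUF metadata to an Arch tag.
// Unrecognized identifiers map to Arch::Unknown so the loader can reject them.
inline Arch arch_from_string(const std::string & name) {
    static const std::unordered_map<std::string, Arch> table = {
        {"llama",  Arch::Llama},
        {"falcon", Arch::Falcon},
        {"gemma",  Arch::Gemma},
        {"qwen2",  Arch::Qwen2},
    };
    const auto it = table.find(name);
    return it == table.end() ? Arch::Unknown : it->second;
}
```

Returning an explicit `Unknown` value, rather than throwing inside the lookup, lets the caller decide how to report an unsupported model file.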
Usage
Apply this principle when adding support for a new model architecture, or when the model loading pipeline needs to determine the correct graph construction logic based on the architecture identifier stored in a GGUF file.
Theoretical Basis
The architecture registry maps string identifiers to structured descriptions. Each entry specifies the model's hyperparameters (hidden size, number of layers, number of attention heads, vocabulary size, and so on), the names and shapes of the expected tensors, and the logic that builds the computation graph. This decouples model loading from architecture-specific knowledge: a new architecture is added by registering a new entry rather than by modifying the loading code itself.
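The registry pattern described here can be sketched in C++ as follows. This is a simplified illustration under stated assumptions, not the llama.cpp implementation: `HParams`, `ArchEntry`, `ArchRegistry`, and `load_model` are hypothetical names, and the graph builder returns a string as a stand-in for a real computation graph.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical hyperparameters; a real loader would read these from the file.
struct HParams {
    int n_embd  = 0;  // hidden size
    int n_layer = 0;  // number of layers
    int n_head  = 0;  // number of attention heads
    int n_vocab = 0;  // vocabulary size
};

// One registry entry: expected tensor names plus the graph-building logic.
struct ArchEntry {
    std::vector<std::string> tensor_names;
    // Returns a textual description as a stand-in for a real graph.
    std::function<std::string(const HParams &)> build_graph;
};

class ArchRegistry {
public:
    // Adding support for an architecture is a single call to add().
    void add(const std::string & name, ArchEntry entry) {
        entries_[name] = std::move(entry);
    }

    // Look up the entry for an identifier; throws if it was never registered.
    const ArchEntry & get(const std::string & name) const {
        const auto it = entries_.find(name);
        if (it == entries_.end()) {
            throw std::runtime_error("unknown architecture: " + name);
        }
        return it->second;
    }

private:
    std::map<std::string, ArchEntry> entries_;
};

// The loader depends only on the registry, not on any particular architecture.
inline std::string load_model(const ArchRegistry & reg,
                              const std::string & arch,
                              const HParams & hp) {
    return reg.get(arch).build_graph(hp);
}
```

In this sketch, supporting a new architecture means calling `add` with a new `ArchEntry`; `load_model` itself never changes, which is the decoupling the principle describes.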