Implementation:Ggml org Llama cpp Arch Header

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Model_Architecture
Last Updated	2026-02-15 00:00 GMT

Overview

Declares enumerations and helper types for model architectures, GGUF metadata keys, and tensor identifiers used throughout the llama.cpp codebase.

Description

Defines three enums: `llm_arch` (100+ architecture variants, including LLaMA, Falcon, GPT-2, Qwen, Gemma, Mamba, and RWKV), `llm_kv` (GGUF metadata keys such as context length, embedding length, and expert count), and `llm_tensor` (tensor identifiers for attention, FFN, SSM, and other components). Also provides the `LLM_KV` helper struct, which builds architecture-prefixed GGUF key strings (e.g., "llama.context_length"), and the `LLM_TN` / `LLM_TN_IMPL` helpers, which build layer-indexed tensor name strings (e.g., "blk.3.attn_norm.weight").
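
The architecture-prefixed key construction can be pictured with a small standalone sketch. It does not reproduce the code in src/llama-arch.h: the names `demo_arch`, `demo_kv`, `demo_key_builder`, and the `{arch}` placeholder convention are all hypothetical, chosen only to show how an enum pair might be mapped to a key string.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins for llm_arch and llm_kv (not the real enums).
enum demo_arch { DEMO_ARCH_LLAMA, DEMO_ARCH_FALCON };
enum demo_kv   { DEMO_KV_GENERAL_NAME, DEMO_KV_CONTEXT_LENGTH };

// Assumed lookup tables: architecture names and key templates.
static const std::map<demo_arch, std::string> DEMO_ARCH_NAMES = {
    { DEMO_ARCH_LLAMA,  "llama"  },
    { DEMO_ARCH_FALCON, "falcon" },
};

static const std::map<demo_kv, std::string> DEMO_KV_NAMES = {
    { DEMO_KV_GENERAL_NAME,   "general.name"          }, // no architecture prefix
    { DEMO_KV_CONTEXT_LENGTH, "{arch}.context_length" }, // prefixed per architecture
};

// Hypothetical analogue of the LLM_KV helper: bind an architecture once,
// then turn key enum values into strings.
struct demo_key_builder {
    demo_arch arch;

    explicit demo_key_builder(demo_arch a) : arch(a) {}

    std::string operator()(demo_kv kv) const {
        std::string key = DEMO_KV_NAMES.at(kv);
        const std::string placeholder = "{arch}";
        const std::string::size_type pos = key.find(placeholder);
        if (pos != std::string::npos) {
            // Substitute the architecture name for the placeholder.
            key.replace(pos, placeholder.size(), DEMO_ARCH_NAMES.at(arch));
        }
        return key;
    }
};
```

With this sketch, `demo_key_builder(DEMO_ARCH_LLAMA)(DEMO_KV_CONTEXT_LENGTH)` yields "llama.context_length", while keys without a placeholder are returned unchanged.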

Usage

This is a core header included by many other source files in the project. Use these enums and helpers to identify model components in a type-safe manner when working with GGUF files and model tensors.

Code Reference

Source Location

Repository: Ggml_org_Llama_cpp
File: src/llama-arch.h
Lines: 1-614

Signature

// Architecture enumeration (100+ variants)
enum llm_arch {
    LLM_ARCH_CLIP,
    LLM_ARCH_LLAMA,
    LLM_ARCH_LLAMA4,
    LLM_ARCH_FALCON,
    LLM_ARCH_GPT2,
    LLM_ARCH_QWEN2,
    LLM_ARCH_GEMMA,
    LLM_ARCH_MAMBA,
    // ... 100+ more
    LLM_ARCH_UNKNOWN,
};

// Metadata key enumeration
enum llm_kv {
    LLM_KV_GENERAL_TYPE,
    LLM_KV_GENERAL_ARCHITECTURE,
    LLM_KV_GENERAL_NAME,
    LLM_KV_CONTEXT_LENGTH,
    LLM_KV_EMBEDDING_LENGTH,
    LLM_KV_BLOCK_COUNT,
    // ... many more
};

// Tensor name enumeration
enum llm_tensor {
    LLM_TENSOR_TOKEN_EMBD,
    LLM_TENSOR_OUTPUT_NORM,
    LLM_TENSOR_OUTPUT,
    LLM_TENSOR_ATTN_NORM,
    LLM_TENSOR_ATTN_Q,
    LLM_TENSOR_ATTN_K,
    LLM_TENSOR_ATTN_V,
    LLM_TENSOR_ATTN_OUT,
    LLM_TENSOR_FFN_GATE,
    LLM_TENSOR_FFN_DOWN,
    LLM_TENSOR_FFN_UP,
    // ... many more
};

// Helper for constructing architecture-prefixed GGUF keys
struct LLM_KV {
    LLM_KV(llm_arch arch);
    std::string operator()(llm_kv kv) const;
};

// Helper for constructing layer-indexed tensor names
struct LLM_TN_IMPL { ... };
struct LLM_TN {
    LLM_TN(llm_arch arch);
    LLM_TN_IMPL operator()(llm_tensor tensor, const char * suffix = nullptr, int bid = -1, int xid = -1) const;
};

// Tensor metadata
struct llm_tensor_info {
    llm_tensor_layer layer;
    ggml_op op;
};

Import

#pragma once
#include "ggml.h"
#include <string>
#include <set>

I/O Contract

Inputs

Name	Type	Required	Description
arch	llm_arch	Yes	Architecture enum value for constructing prefixed keys or tensor names
kv	llm_kv	Yes	Metadata key enum value (for LLM_KV helper)
tensor	llm_tensor	Yes	Tensor name enum value (for LLM_TN helper)
bid	int	No	Block/layer index for layer-specific tensor names (default: -1 for non-layered)
xid	int	No	Expert index for MoE tensor names (default: -1 for non-expert)

Outputs

Name	Type	Description
key_string	std::string	Architecture-prefixed GGUF metadata key string (e.g., "llama.context_length")
tensor_name	LLM_TN_IMPL	Layer-indexed tensor name, convertible to std::string (e.g., "blk.3.attn_norm.weight")
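
As an illustration of how the optional `suffix` and `bid` inputs shape the resulting tensor name, here is a minimal standalone sketch. The function `demo_tensor_name` and its `{bid}` placeholder are hypothetical and are not part of llama.cpp; the expert index `xid` is left out for brevity.

```cpp
#include <string>

// Hypothetical helper (not the real LLM_TN): expands a "{bid}" placeholder
// in a base tensor name and appends an optional suffix such as "weight".
inline std::string demo_tensor_name(std::string base,
                                    const char * suffix = nullptr,
                                    int bid = -1) {
    const std::string placeholder = "{bid}";
    const std::string::size_type pos = base.find(placeholder);
    if (pos != std::string::npos && bid >= 0) {
        // Layer-specific tensor: substitute the block index.
        base.replace(pos, placeholder.size(), std::to_string(bid));
    }
    if (suffix != nullptr) {
        // The suffix is joined to the base name with a dot.
        base += ".";
        base += suffix;
    }
    return base;
}
```

For example, `demo_tensor_name("blk.{bid}.attn_norm", "weight", 3)` produces "blk.3.attn_norm.weight", and a non-layered name such as "token_embd" is returned with only the suffix appended.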

Usage Examples

// Construct architecture-prefixed GGUF key
LLM_KV kv(LLM_ARCH_LLAMA);
std::string ctx_key = kv(LLM_KV_CONTEXT_LENGTH);
// Result: "llama.context_length"

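// Construct layer-indexed tensor name
LLM_TN tn(LLM_ARCH_LLAMA);
auto name = tn(LLM_TENSOR_ATTN_Q, "weight", 3);
// `name` is an LLM_TN_IMPL; converted to std::string it reads "blk.3.attn_q.weight"

// Check if an architecture enum is known
// (`arch` is an llm_arch value obtained elsewhere, e.g. while loading a GGUF file)
if (arch != LLM_ARCH_UNKNOWN) {
    // Valid architecture found in GGUF file
}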
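
The `LLM_ARCH_UNKNOWN` check above presupposes that an architecture name read from a file has been mapped to an enum value, with unrecognized names falling back to the sentinel. A minimal sketch of that pattern follows; `sketch_arch` and `sketch_arch_from_string` are hypothetical names used for illustration and do not reproduce the lookup code in llama.cpp.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for llm_arch, with a sentinel for unrecognized names.
enum sketch_arch { SKETCH_ARCH_LLAMA, SKETCH_ARCH_FALCON, SKETCH_ARCH_UNKNOWN };

// Map an architecture name to its enum value; any name that is not in the
// table yields SKETCH_ARCH_UNKNOWN instead of throwing.
inline sketch_arch sketch_arch_from_string(const std::string & name) {
    static const std::map<std::string, sketch_arch> names = {
        { "llama",  SKETCH_ARCH_LLAMA  },
        { "falcon", SKETCH_ARCH_FALCON },
    };
    const auto it = names.find(name);
    return it == names.end() ? SKETCH_ARCH_UNKNOWN : it->second;
}
```

Returning a sentinel keeps the caller's control flow simple: a single comparison against the unknown value decides whether loading can proceed.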

Related Pages

Principle:Ggml_org_Llama_cpp_Model_Architecture_Support
