

Implementation:Ollama Ollama Llama Arch Header

From Leeroopedia
Knowledge Sources
Domains Model Architecture, GGUF
Last Updated 2025-02-15 00:00 GMT

Overview

Header declaring all model architecture enums, GGUF key-value identifiers, tensor name enums, and helper classes for architecture-aware name construction.

Description

Defines the llm_arch enum with an entry for every supported architecture (LLaMA, Falcon, GPT-2, BERT, Qwen, Gemma, DeepSeek, Mamba, T5, and many more). Defines the llm_kv enum for GGUF metadata keys (context length, embedding size, attention heads, rope parameters, tokenizer configuration, etc.) and the llm_tensor enum for tensor types (token embeddings, attention weights, FFN layers, SSM components, etc.). Provides the LLM_KV helper, which formats architecture-prefixed key strings, and LLM_TN / LLM_TN_IMPL, which construct tensor names from a tensor type, an optional suffix, and layer and expert indices.
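To make the idea of architecture-prefixed keys concrete, here is a minimal, self-contained sketch. It does not use the real header: the mini_* enums, the two lookup tables, and the "%s" template convention are all assumptions made for illustration, and the real tables are much larger.

```cpp
#include <map>
#include <string>

// Hypothetical miniature enums; the real header declares many more entries.
enum mini_arch { MINI_ARCH_LLAMA, MINI_ARCH_FALCON };
enum mini_kv   { MINI_KV_GENERAL_ARCHITECTURE, MINI_KV_CONTEXT_LENGTH, MINI_KV_BLOCK_COUNT };

// Architecture names used as the key prefix (assumed for this sketch).
static const std::map<mini_arch, const char *> MINI_ARCH_NAMES = {
    { MINI_ARCH_LLAMA,  "llama"  },
    { MINI_ARCH_FALCON, "falcon" },
};

// Key templates; "%s" marks where the architecture name is substituted.
static const std::map<mini_kv, const char *> MINI_KV_NAMES = {
    { MINI_KV_GENERAL_ARCHITECTURE, "general.architecture" },
    { MINI_KV_CONTEXT_LENGTH,       "%s.context_length"    },
    { MINI_KV_BLOCK_COUNT,          "%s.block_count"       },
};

// Functor bound to one architecture, mirroring the shape of LLM_KV.
struct mini_kv_fmt {
    explicit mini_kv_fmt(mini_arch arch) : arch(arch) {}

    std::string operator()(mini_kv kv) const {
        std::string key = MINI_KV_NAMES.at(kv);
        // Replace the first "%s" (if any) with the architecture name.
        const std::string::size_type pos = key.find("%s");
        if (pos != std::string::npos) {
            key.replace(pos, 2, MINI_ARCH_NAMES.at(arch));
        }
        return key;
    }

    mini_arch arch;
};
```

In this sketch, mini_kv_fmt(MINI_ARCH_LLAMA)(MINI_KV_CONTEXT_LENGTH) yields "llama.context_length", while keys without a placeholder, such as "general.architecture", come back unchanged for every architecture.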

Usage

This is the central architecture definition header that every part of llama.cpp's model loading, saving, and graph building depends on. It is the single source of truth for what architectures exist and how their components are named.
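One way to picture the "single source of truth" property is a single table that drives both directions of the enum/name mapping, so loaders and writers cannot disagree about spelling. The sketch below is hypothetical: demo_arch, DEMO_ARCH_NAMES, and both functions are illustrative names, not declarations copied from the header.

```cpp
#include <map>
#include <string>

// Hypothetical miniature architecture enum.
enum demo_arch { DEMO_ARCH_LLAMA, DEMO_ARCH_GEMMA, DEMO_ARCH_UNKNOWN };

// The only place where names are spelled out.
static const std::map<demo_arch, const char *> DEMO_ARCH_NAMES = {
    { DEMO_ARCH_LLAMA,   "llama"     },
    { DEMO_ARCH_GEMMA,   "gemma"     },
    { DEMO_ARCH_UNKNOWN, "(unknown)" },
};

// enum -> name; entries missing from the table map to "unknown".
const char * demo_arch_name(demo_arch arch) {
    const auto it = DEMO_ARCH_NAMES.find(arch);
    return it == DEMO_ARCH_NAMES.end() ? "unknown" : it->second;
}

// name -> enum; unrecognized names fall back to DEMO_ARCH_UNKNOWN.
demo_arch demo_arch_from_string(const std::string & name) {
    for (const auto & entry : DEMO_ARCH_NAMES) {
        if (entry.second == name) {
            return entry.first;
        }
    }
    return DEMO_ARCH_UNKNOWN;
}
```

Because both functions read the same table, adding an architecture in this sketch means adding one enum entry and one table row, and a round trip such as demo_arch_from_string(demo_arch_name(DEMO_ARCH_GEMMA)) returns the original value.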

Code Reference

Source Location

  • Repository: Ollama
  • File: llama/llama.cpp/src/llama-arch.h
  • Lines: 1-581

Signature

enum llm_arch {
    LLM_ARCH_LLAMA, LLM_ARCH_LLAMA4, LLM_ARCH_FALCON,
    LLM_ARCH_GPT2, LLM_ARCH_BERT, LLM_ARCH_QWEN2,
    LLM_ARCH_GEMMA, LLM_ARCH_DEEPSEEK2, LLM_ARCH_MAMBA,
    LLM_ARCH_T5, LLM_ARCH_RWKV7,
    // ... 100+ architectures
    LLM_ARCH_UNKNOWN,
};

enum llm_kv {
    LLM_KV_GENERAL_ARCHITECTURE, LLM_KV_CONTEXT_LENGTH,
    LLM_KV_EMBEDDING_LENGTH, LLM_KV_BLOCK_COUNT,
    LLM_KV_ATTENTION_HEAD_COUNT, LLM_KV_ROPE_FREQ_BASE,
    // ... 200+ keys
};

enum llm_tensor {
    LLM_TENSOR_TOKEN_EMBD, LLM_TENSOR_OUTPUT,
    LLM_TENSOR_ATTN_Q, LLM_TENSOR_ATTN_K, LLM_TENSOR_ATTN_V,
    LLM_TENSOR_FFN_GATE, LLM_TENSOR_FFN_DOWN, LLM_TENSOR_FFN_UP,
    // ... 80+ tensor types
};

struct LLM_KV {
    LLM_KV(llm_arch arch);
    std::string operator()(llm_kv kv) const;
};

struct LLM_TN {
    LLM_TN(llm_arch arch);
    // Returns an LLM_TN_IMPL, which converts implicitly to std::string
    LLM_TN_IMPL operator()(llm_tensor tensor, const char * suffix, int bid = -1, int xid = -1) const;
};

Import

#include "llama-arch.h"

I/O Contract

Inputs

Name | Type | Required | Description
arch | llm_arch | Yes | Architecture to construct names for
kv | llm_kv | Yes | Key-value identifier to look up
tensor | llm_tensor | Yes | Tensor type to construct a name for
suffix | const char * | Yes | Suffix appended to the tensor name (for example "weight")
bid | int | No | Block/layer index (default: -1 for no layer)
xid | int | No | Expert index (default: -1 for no expert)

Outputs

Name | Type | Description
key_string | std::string | Architecture-prefixed GGUF key string
tensor_name | std::string | Tensor name with layer and expert indices
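The contract above can be illustrated with a small stand-alone sketch of how a tensor name might be assembled from a tensor type, a suffix, a block index, and an expert index. The toy_tensor enum, the toy_tensor_name function, and the name templates are assumptions chosen for illustration; the real header looks templates up per architecture.

```cpp
#include <cstdio>
#include <string>

// Hypothetical miniature tensor enum.
enum toy_tensor { TOY_TENSOR_TOKEN_EMBD, TOY_TENSOR_ATTN_Q, TOY_TENSOR_FFN_GATE_EXP };

// Builds "<base>[.<suffix>]", substituting the block index (bid) and the
// expert index (xid) where the assumed template has a slot for them.
std::string toy_tensor_name(toy_tensor tensor, const char * suffix, int bid = -1, int xid = -1) {
    char buf[128] = "";
    switch (tensor) {
        case TOY_TENSOR_TOKEN_EMBD:
            // Not tied to a layer: bid and xid are ignored.
            std::snprintf(buf, sizeof(buf), "token_embd");
            break;
        case TOY_TENSOR_ATTN_Q:
            std::snprintf(buf, sizeof(buf), "blk.%d.attn_q", bid);
            break;
        case TOY_TENSOR_FFN_GATE_EXP:
            std::snprintf(buf, sizeof(buf), "blk.%d.ffn_gate.%d", bid, xid);
            break;
    }

    std::string name = buf;
    if (suffix != nullptr) {
        name += ".";
        name += suffix;
    }
    return name;
}
```

With these assumed templates, toy_tensor_name(TOY_TENSOR_ATTN_Q, "weight", 5) produces "blk.5.attn_q.weight", and passing a null suffix returns the base name alone.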

Usage Examples

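#include "llama-arch.h"

#include <string>

// Illustrative wrapper; `arch` would normally be read from the model's GGUF metadata.
void example(llm_arch arch) {
    // Check architecture type
    if (arch == LLM_ARCH_LLAMA) {
        // LLaMA-specific handling
    }

    // Construct key names, e.g. "llama.context_length" for LLM_ARCH_LLAMA
    const LLM_KV kv(arch);
    std::string ctx_key = kv(LLM_KV_CONTEXT_LENGTH);

    // Construct tensor names for layer 5, e.g. "blk.5.attn_q.weight"
    const LLM_TN tn(arch);
    std::string q_weight = tn(LLM_TENSOR_ATTN_Q, "weight", /*bid=*/5);
}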
