Implementation:Ollama Ollama Llama Model Header
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Model Loading |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Header that declares the llama_model struct, the per-layer tensor structures, and the llm_type model-size enumeration shared by all supported LLM architectures.
Description
Defines the llm_type enum, with an entry for every supported model size (14M through 671B+, plus MoE types such as 8x7B). Declares the layer structures: llama_layer holds tensor pointers for every possible layer component (attention Q/K/V/O, FFN gate/up/down, normalization, MoE expert weights, SSM states, etc.), and specialized structures cover PosNet, ConvNext, ShortConv, and NextN layers. llama_model aggregates the architecture, hyperparameters, vocabulary, layers vector, embeddings, memory mappings, and backend devices.
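The "one struct, many architectures" layout described above can be illustrated with a minimal sketch. It does not include the real header: `tensor_stub`, `layer_sketch`, `model_sketch`, `count_present`, and `layer_at` are hypothetical stand-ins introduced only for illustration, and the eight fields are the ones shown in the signature on this page. The point is that every component is a pointer defaulting to nullptr, so a loader can fill in only the tensors a given architecture uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ggml_tensor (assumption: the real type lives in ggml).
struct tensor_stub {};

// Hypothetical, trimmed-down mirror of llama_layer. Every component is a
// pointer that defaults to nullptr, so unused components simply stay null.
struct layer_sketch {
    tensor_stub * attn_norm = nullptr;
    tensor_stub * wq        = nullptr;
    tensor_stub * wk        = nullptr;
    tensor_stub * wv        = nullptr;
    tensor_stub * wo        = nullptr;
    tensor_stub * ffn_gate  = nullptr;
    tensor_stub * ffn_down  = nullptr;
    tensor_stub * ffn_up    = nullptr;
};

// Hypothetical mirror of llama_model, reduced to the layers vector.
struct model_sketch {
    std::vector<layer_sketch> layers;
};

// Count how many of the eight sketched components a layer actually has.
inline std::size_t count_present(const layer_sketch & l) {
    const tensor_stub * all[] = {
        l.attn_norm, l.wq, l.wk, l.wv, l.wo, l.ffn_gate, l.ffn_down, l.ffn_up,
    };
    std::size_t n = 0;
    for (const tensor_stub * t : all) {
        if (t != nullptr) {
            ++n;
        }
    }
    return n;
}

// Bounds-checked access to one layer of the sketched model.
inline const layer_sketch & layer_at(const model_sketch & m, std::size_t il) {
    assert(il < m.layers.size());
    return m.layers[il];
}
```

Code that consumes a layer would check a pointer against nullptr before using it, rather than assuming every architecture provides every tensor.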
Usage
Include this header when working with the model structure, accessing layer tensors, or implementing new architectures.
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/llama-model.h
- Lines: 1-536
Signature
enum llm_type {
    LLM_TYPE_UNKNOWN,
    LLM_TYPE_14M, LLM_TYPE_17M, /* ... */ LLM_TYPE_671B,
    LLM_TYPE_8x7B, LLM_TYPE_8x22B, /* ... MoE types */
};

struct llama_layer {
    struct ggml_tensor * attn_norm = nullptr;
    struct ggml_tensor * wq = nullptr;
    struct ggml_tensor * wk = nullptr;
    struct ggml_tensor * wv = nullptr;
    struct ggml_tensor * wo = nullptr;
    struct ggml_tensor * ffn_gate = nullptr;
    struct ggml_tensor * ffn_down = nullptr;
    struct ggml_tensor * ffn_up = nullptr;
    // ... many more tensor pointers for all architectures
};

struct llama_model {
    llama_hparams hparams;
    llama_vocab vocab;
    std::vector<llama_layer> layers;
    struct ggml_tensor * tok_embd = nullptr;
    struct ggml_tensor * output = nullptr;
    // ...
};
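As an illustration of how a size enumeration like llm_type is typically consumed, here is a minimal sketch that maps a size tag to a short label, for example for logging. It deliberately uses a stand-in enum rather than the real one: `llm_type_sketch`, its `SKETCH_TYPE_*` entries, and the `type_label` helper are hypothetical names, and only four of the many entries are mirrored.

```cpp
#include <string>

// Hypothetical subset of llm_type; the real enum lists many more sizes.
enum llm_type_sketch {
    SKETCH_TYPE_UNKNOWN,
    SKETCH_TYPE_14M,
    SKETCH_TYPE_671B,
    SKETCH_TYPE_8x7B,  // mixture-of-experts entry
};

// Map a size tag to a short human-readable label (hypothetical helper).
inline std::string type_label(llm_type_sketch t) {
    switch (t) {
        case SKETCH_TYPE_14M:     return "14M";
        case SKETCH_TYPE_671B:    return "671B";
        case SKETCH_TYPE_8x7B:    return "8x7B";
        case SKETCH_TYPE_UNKNOWN: break;
    }
    return "unknown";
}
```

Listing every enumerator in the switch without a `default` case lets most compilers warn when a new size is added to the enum but not to the mapping.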
Import
#include "llama-model.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| N/A | N/A | N/A | Header defines types only; populated during model loading |
Outputs
| Name | Type | Description |
|---|---|---|
| llama_model | struct | Complete model with hparams, vocab, layers, and tensors |
| llama_layer | struct | Per-layer tensor pointers for all architecture types |
Usage Examples
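#include "llama-model.h"

// Hypothetical helper (not part of the header): read tensors of layer il.
// Assumes model is already loaded and 0 <= il < model.layers.size().
static void inspect_layer(const llama_model & model, int il) {
    const auto & hparams = model.hparams;     // model-wide hyperparameters
    const auto & layer   = model.layers[il];  // tensor pointers for layer il
    ggml_tensor * q = layer.wq;  // attention Q weight; nullptr if unused
    ggml_tensor * k = layer.wk;  // attention K weight; nullptr if unused
    (void) hparams; (void) q; (void) k;  // silence unused-variable warnings
}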
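Because layer tensor pointers default to nullptr, code that walks the layers vector usually needs a presence check. The sketch below shows one way to find the layers that lack a given tensor, using a C++ pointer to data member to select the field. It compiles without the real header: `ggml_tensor_stub`, `layer_view`, `model_view`, `layer_field`, and `layers_missing` are hypothetical names, and only three of the fields from the signature are mirrored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ggml_tensor (assumption: the real type lives in ggml).
struct ggml_tensor_stub {};

// Hypothetical reduced mirror of llama_layer.
struct layer_view {
    ggml_tensor_stub * wq       = nullptr;
    ggml_tensor_stub * wk       = nullptr;
    ggml_tensor_stub * ffn_gate = nullptr;
};

// Hypothetical reduced mirror of llama_model.
struct model_view {
    std::vector<layer_view> layers;
};

// Pointer to one of the tensor fields of layer_view.
using layer_field = ggml_tensor_stub * layer_view::*;

// Return the indices of all layers whose selected tensor is not set.
inline std::vector<std::size_t> layers_missing(const model_view & model,
                                               layer_field field) {
    assert(field != nullptr);
    std::vector<std::size_t> missing;
    for (std::size_t il = 0; il < model.layers.size(); ++il) {
        if (model.layers[il].*field == nullptr) {
            missing.push_back(il);
        }
    }
    return missing;
}
```

A call such as `layers_missing(model, &layer_view::ffn_gate)` returns an empty vector when every layer carries the tensor, which makes it usable as a quick sanity check after loading.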