Implementation: Ollama Llama Models Registry
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Model Architecture |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Central header file declaring all model-architecture graph-builder structs for llama.cpp; it serves as the registry of every supported LLM architecture.
Description
Declares over 100 llm_build_* structs, each inheriting from llm_graph_context (or specialized bases like llm_graph_context_mamba, llm_build_rwkv6_base, llm_build_rwkv7_base). Each struct has a constructor taking a model and graph parameters, which builds the ggml computation graph for that specific architecture. Also defines shared base classes for Mamba SSM layers, RWKV6, and RWKV7 recurrent models with their specialized graph-building methods.
Usage
Every new model architecture added to the inference engine requires a declaration in this header, making it the core component of the model dispatch system.
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/models/models.h
- Lines: 1-549
Signature
struct llm_graph_context_mamba : public llm_graph_context {
llm_graph_context_mamba(const llm_graph_params & params);
ggml_tensor * build_mamba_layer(llm_graph_input_rs * inp, ggml_tensor * cur, const llama_model & model, const llama_ubatch & ubatch, int il);
ggml_tensor * build_mamba2_layer(llm_graph_input_rs * inp, ggml_tensor * cur, const llama_model & model, const llama_ubatch & ubatch, int il) const;
};
struct llm_build_rwkv6_base : public llm_graph_context {
ggml_tensor * build_rwkv6_channel_mix(const llama_layer * layer, ggml_tensor * cur, ggml_tensor * x_prev, llm_arch arch) const;
ggml_tensor * build_rwkv6_time_mix(llm_graph_input_rs * inp, ggml_tensor * cur, ggml_tensor * x_prev, const llama_ubatch & ubatch, int il) const;
};
// Over 100 architecture builders:
struct llm_build_llama : public llm_graph_context { /* ... */ };
struct llm_build_gemma : public llm_graph_context { /* ... */ };
struct llm_build_qwen2 : public llm_graph_context { /* ... */ };
struct llm_build_deepseek2 : public llm_graph_context { /* ... */ };
// ... etc.
Import
#include "models.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | const llama_model & | Yes | The loaded model with tensors and hyperparameters |
| params | const llm_graph_params & | Yes | Graph construction parameters |
Outputs
| Name | Type | Description |
|---|---|---|
| ggml graph | ggml_cgraph | Complete computation graph for the architecture |
Usage Examples
#include "models.h"
// Graph builders are invoked through llama_model::build_graph()
// which dispatches to the correct architecture:
auto graph_builder = llm_build_llama(model, params);
// The constructor builds the full graph