Implementation:Ollama Ollama Llama Models Registry

From Leeroopedia
Knowledge Sources
Domains: LLM Inference, Model Architecture
Last Updated: 2025-02-15 00:00 GMT

Overview

Central header file that declares every model architecture graph builder struct for llama.cpp; it serves as the registry of supported LLM architectures.

Description

Declares over 100 llm_build_* structs, each inheriting from llm_graph_context or from a specialized base (llm_graph_context_mamba, llm_build_rwkv6_base, llm_build_rwkv7_base). Each struct's constructor takes a model and graph parameters and builds the ggml computation graph for that architecture. The header also defines the shared base classes for Mamba SSM layers and for the RWKV6 and RWKV7 recurrent models, together with their specialized graph-building methods.
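The pattern described above (a shared context base, optional specialized bases, and builders whose constructors assemble the graph) can be sketched in isolation. This is a minimal illustration only: every `toy_*` name is hypothetical and does not exist in llama.cpp, and a vector of node names stands in for a real ggml computation graph.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; these are NOT the real llama.cpp types.
struct toy_model  { int n_layer; };
struct toy_params { int n_tokens; };

// Plays the role of llm_graph_context: shared state and helpers.
struct toy_graph_context {
    std::vector<std::string> nodes;  // stand-in for a ggml computation graph
    int n_tokens;

    explicit toy_graph_context(const toy_params & params) : n_tokens(params.n_tokens) {}

    void add_node(const std::string & name) { nodes.push_back(name); }
};

// Plays the role of a specialized base such as llm_graph_context_mamba:
// it adds a layer-building helper that derived builders can reuse.
struct toy_graph_context_recurrent : public toy_graph_context {
    explicit toy_graph_context_recurrent(const toy_params & params) : toy_graph_context(params) {}

    void build_recurrent_layer(int il) {
        assert(il >= 0);
        add_node("recurrent_" + std::to_string(il));
    }
};

// Plays the role of an llm_build_* struct: the constructor builds the whole graph.
struct toy_build_attention : public toy_graph_context {
    toy_build_attention(const toy_model & model, const toy_params & params)
        : toy_graph_context(params) {
        add_node("embed");
        for (int il = 0; il < model.n_layer; ++il) {
            add_node("attn_" + std::to_string(il));
            add_node("ffn_" + std::to_string(il));
        }
        add_node("output");
    }
};

// A builder that derives from the specialized base and reuses its helper.
struct toy_build_recurrent : public toy_graph_context_recurrent {
    toy_build_recurrent(const toy_model & model, const toy_params & params)
        : toy_graph_context_recurrent(params) {
        add_node("embed");
        for (int il = 0; il < model.n_layer; ++il) {
            build_recurrent_layer(il);
        }
        add_node("output");
    }
};
```

In this sketch, constructing `toy_build_attention` or `toy_build_recurrent` is enough to populate `nodes`; no separate build call is needed, which mirrors the constructor-builds-the-graph convention the header describes.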

Usage

Every new model architecture added to the inference engine must be declared in this header, which makes the header a central part of the model dispatch system.
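To illustrate why each architecture needs a declaration, here is a hedged sketch of an architecture dispatch. It assumes a switch over an architecture enum that constructs the matching builder; the `toy_*` types, the enum values, and `toy_build_graph` are hypothetical and are not taken from the llama.cpp sources.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; these are NOT the real llama.cpp types.
enum class toy_arch { LLAMA, GEMMA, UNKNOWN };

struct toy_model  { toy_arch arch; };
struct toy_params { int n_tokens; };

struct toy_graph_context {
    std::string built_for;  // records which builder ran, for illustration only

    explicit toy_graph_context(const toy_params & params) { assert(params.n_tokens > 0); }
    virtual ~toy_graph_context() = default;
};

struct toy_build_llama : public toy_graph_context {
    toy_build_llama(const toy_model &, const toy_params & params) : toy_graph_context(params) {
        built_for = "llama";
    }
};

struct toy_build_gemma : public toy_graph_context {
    toy_build_gemma(const toy_model &, const toy_params & params) : toy_graph_context(params) {
        built_for = "gemma";
    }
};

// Dispatch: a builder can only be selected here if its struct has been declared.
// Supporting a new architecture means declaring a new struct and adding one case.
std::unique_ptr<toy_graph_context> toy_build_graph(const toy_model & model, const toy_params & params) {
    switch (model.arch) {
        case toy_arch::LLAMA: return std::make_unique<toy_build_llama>(model, params);
        case toy_arch::GEMMA: return std::make_unique<toy_build_gemma>(model, params);
        default:              throw std::runtime_error("unsupported architecture");
    }
}
```

Calling `toy_build_graph(toy_model{toy_arch::GEMMA}, toy_params{8})` returns the gemma builder in this sketch, while an architecture with no declared builder falls through to the error path.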

Code Reference

Source Location

  • Repository: Ollama
  • File: llama/llama.cpp/src/models/models.h
  • Lines: 1-549

Signature

struct llm_graph_context_mamba : public llm_graph_context {
    llm_graph_context_mamba(const llm_graph_params & params);
    ggml_tensor * build_mamba_layer(llm_graph_input_rs * inp, ggml_tensor * cur, const llama_model & model, const llama_ubatch & ubatch, int il);
    ggml_tensor * build_mamba2_layer(llm_graph_input_rs * inp, ggml_tensor * cur, const llama_model & model, const llama_ubatch & ubatch, int il) const;
};

struct llm_build_rwkv6_base : public llm_graph_context {
    ggml_tensor * build_rwkv6_channel_mix(const llama_layer * layer, ggml_tensor * cur, ggml_tensor * x_prev, llm_arch arch) const;
    ggml_tensor * build_rwkv6_time_mix(llm_graph_input_rs * inp, ggml_tensor * cur, ggml_tensor * x_prev, const llama_ubatch & ubatch, int il) const;
};

// Over 100 architecture builders:
struct llm_build_llama : public llm_graph_context { /* ... */ };
struct llm_build_gemma : public llm_graph_context { /* ... */ };
struct llm_build_qwen2 : public llm_graph_context { /* ... */ };
struct llm_build_deepseek2 : public llm_graph_context { /* ... */ };
// ... etc.

Import

#include "models.h"

I/O Contract

Inputs

Name | Type | Required | Description
model | const llama_model & | Yes | The loaded model with tensors and hyperparameters
params | const llm_graph_params & | Yes | Graph construction parameters

Outputs

Name | Type | Description
ggml graph | ggml_cgraph | Complete computation graph for the architecture

Usage Examples

#include "models.h"

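// Graph builders are normally not constructed by hand: llama_model::build_graph()
// dispatches to the builder for the loaded architecture.
// Constructing the llama builder directly looks like this
// (model is a const llama_model &, params is a const llm_graph_params &):
auto graph_builder = llm_build_llama(model, params);
// The constructor has already built the full graph at this point.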
