
Implementation:Ollama Ollama Llama Model Header

From Leeroopedia
Knowledge Sources
Domains LLM Inference, Model Loading
Last Updated 2025-02-15 00:00 GMT

Overview

Header declaring the llama_model struct, layer structures, and model size type enumeration for all supported LLM architectures.

Description

Defines the llm_type enum, with entries for every supported model size (14M through 671B+, plus MoE types). Declares the layer structures: llama_layer, which holds tensor pointers for all possible layer components (attention Q/K/V/O, FFN gate/up/down, normalization, MoE expert weights, SSM states, etc.), plus specialized structures for PosNet, ConvNext, ShortConv, and NextN layers. The llama_model struct aggregates the architecture, hyperparameters, vocabulary, layers vector, embeddings, memory mappings, and backend devices.
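The "one wide struct with optional slots" layout described above can be sketched without the real headers. The block below is a minimal illustration, not the actual definitions: tensor_stub, layer_sketch, model_sketch, and every helper function are hypothetical stand-ins that only mirror the field names listed in this page. It shows why defaulting every pointer to nullptr is convenient: code can test which components a layer actually has.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ggml_tensor (assumption: only pointer identity matters here).
struct tensor_stub { const char * name; };

// Hypothetical, trimmed-down mirror of llama_layer: every slot defaults to nullptr.
struct layer_sketch {
    tensor_stub * attn_norm = nullptr;
    tensor_stub * wq        = nullptr;
    tensor_stub * wk        = nullptr;
    tensor_stub * wv        = nullptr;
    tensor_stub * wo        = nullptr;
    tensor_stub * ffn_gate  = nullptr;
    tensor_stub * ffn_down  = nullptr;
    tensor_stub * ffn_up    = nullptr;
};

// Hypothetical mirror of the llama_model aggregate, reduced to its layers vector.
struct model_sketch {
    std::vector<layer_sketch> layers;
};

// Bounds-checked layer access (hypothetical helper).
static const layer_sketch & layer_at(const model_sketch & m, std::size_t il) {
    assert(il < m.layers.size());
    return m.layers[il];
}

// A layer has a gated FFN only if all three FFN slots were filled.
static bool has_gated_ffn(const layer_sketch & l) {
    return l.ffn_gate && l.ffn_up && l.ffn_down;
}

// Count layers whose attention projections (Q, K, V, O) are all present.
static int count_attention_layers(const model_sketch & m) {
    int n = 0;
    for (const auto & l : m.layers) {
        if (l.wq && l.wk && l.wv && l.wo) {
            n++;
        }
    }
    return n;
}

// Build a two-layer demo: layer 0 is fully populated, layer 1 is left empty.
static model_sketch make_demo_model() {
    static tensor_stub t = { "dummy" };
    model_sketch m;
    m.layers.resize(2);
    layer_sketch & l0 = m.layers[0];
    l0.wq = l0.wk = l0.wv = l0.wo = &t;
    l0.ffn_gate = l0.ffn_up = l0.ffn_down = &t;
    return m;
}
```

With the demo model, count_attention_layers reports one attention layer and has_gated_ffn is false for the empty second layer. The same null-check pattern is what lets a single layer type serve architectures that use different subsets of tensors.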

Usage

Include this header when working with the model structure, accessing layer tensors, or implementing new architectures.

Code Reference

Source Location

  • Repository: Ollama
  • File: llama/llama.cpp/src/llama-model.h
  • Lines: 1-536

Signature

enum llm_type {
    LLM_TYPE_UNKNOWN,
    LLM_TYPE_14M, LLM_TYPE_17M, /* ... */ LLM_TYPE_671B,
    LLM_TYPE_8x7B, LLM_TYPE_8x22B, /* ... MoE types */
};

struct llama_layer {
    struct ggml_tensor * attn_norm       = nullptr;
    struct ggml_tensor * wq              = nullptr;
    struct ggml_tensor * wk              = nullptr;
    struct ggml_tensor * wv              = nullptr;
    struct ggml_tensor * wo              = nullptr;
    struct ggml_tensor * ffn_gate        = nullptr;
    struct ggml_tensor * ffn_down        = nullptr;
    struct ggml_tensor * ffn_up          = nullptr;
    // ... many more tensor pointers for all architectures
};

struct llama_model {
    llama_hparams hparams;
    llama_vocab   vocab;
    std::vector<llama_layer> layers;
    struct ggml_tensor * tok_embd  = nullptr;
    struct ggml_tensor * output    = nullptr;
    // ...
};
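As a rough illustration of how a size enumeration like llm_type can be consumed, the sketch below maps a handful of tags to printable labels. It is an assumption-laden example: llm_type_sketch is a hypothetical subset of the real enum, type_label is a hypothetical helper rather than a function taken from the header, and the "?B" fallback is only a placeholder.

```cpp
#include <cstring>

// Hypothetical subset of llm_type; the real enum has many more entries.
enum llm_type_sketch {
    TYPE_UNKNOWN,
    TYPE_14M,
    TYPE_17M,
    TYPE_671B,
    TYPE_8x7B,
    TYPE_8x22B,
};

// Map a size tag to a human-readable label (hypothetical helper).
static const char * type_label(llm_type_sketch t) {
    switch (t) {
        case TYPE_14M:     return "14M";
        case TYPE_17M:     return "17M";
        case TYPE_671B:    return "671B";
        case TYPE_8x7B:    return "8x7B";
        case TYPE_8x22B:   return "8x22B";
        case TYPE_UNKNOWN: break;
    }
    return "?B";  // placeholder label for unknown or unlisted sizes
}

// Compare two tags by label; handy when logging or filtering by model size.
static bool same_label(llm_type_sketch a, llm_type_sketch b) {
    return std::strcmp(type_label(a), type_label(b)) == 0;
}
```

Keeping the switch free of a default case lets most compilers warn when a new enumerator is added but not handled, which matters for an enum that grows with each supported model size.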

Import

#include "llama-model.h"

I/O Contract

Inputs

Name | Type | Required | Description
N/A | N/A | N/A | The header only defines types; the structs are populated during model loading

Outputs

Name | Type | Description
llama_model | struct | Complete model with hparams, vocab, layers, and tensors
llama_layer | struct | Per-layer tensor pointers for all architecture types

Usage Examples

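#include "llama-model.h"

// Access model data for layer `il` of an already loaded model.
// inspect_layer is a hypothetical helper name used only for illustration.
static void inspect_layer(const llama_model & model, int il) {
    const auto & hparams = model.hparams;     // hyperparameters
    const auto & layer   = model.layers[il];  // tensor pointers for this layer
    ggml_tensor * q = layer.wq;  // stays nullptr if this tensor was not loaded
    ggml_tensor * k = layer.wk;
    (void) hparams; (void) q; (void) k;  // placeholders: use the values here
}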
