Implementation:Ollama Ollama Llama KV Cells

From Leeroopedia
Knowledge Sources
Domains: LLM Inference, Memory Management
Last Updated: 2025-02-15 00:00 GMT

Overview

Header-only implementation of the KV cache cell metadata tracking system, managing per-cell position, sequence assignment, and shift state.

Description

llama_kv_cell_ext stores the 2D spatial position (x, y coordinates) used by M-RoPE. llama_kv_cells manages parallel arrays of positions, extended metadata, and position shifts, together with a per-cell sequence bitset of type std::bitset<LLAMA_MAX_SEQ>. It provides methods for cell lifecycle operations, sequence management (seq_rm, seq_cp, seq_keep, seq_add, seq_div), and position queries (seq_pos_min, seq_pos_max). It also maintains a set of used cells and per-sequence position maps, so occupancy checks and position-range lookups do not require scanning every cell.
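The layout described above can be pictured with a small, self-contained model. This is a simplified sketch, not the code in llama-kv-cells.h: the type toy_kv_cells, the constant TOY_MAX_SEQ (standing in for LLAMA_MAX_SEQ), and the assign helper are hypothetical names chosen for illustration, and the convention that an empty cell has position -1 is an assumption.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-ins for llama_pos, llama_seq_id and LLAMA_MAX_SEQ.
using toy_pos    = int32_t;
using toy_seq_id = int32_t;
constexpr std::size_t TOY_MAX_SEQ = 64;

struct toy_kv_cells {
    std::vector<toy_pos>                  pos;  // position per cell, -1 = empty (assumed convention)
    std::vector<std::bitset<TOY_MAX_SEQ>> seq;  // which sequences own each cell
    std::set<uint32_t>                    used; // indices of occupied cells

    void resize(uint32_t n) {
        pos.assign(n, -1);
        seq.assign(n, std::bitset<TOY_MAX_SEQ>());
        used.clear();
    }

    bool is_empty(uint32_t i) const {
        assert(i < pos.size());
        return pos[i] == -1;
    }

    uint32_t get_used() const { return (uint32_t) used.size(); }

    // Hypothetical helper: record that cell i holds position p for sequence s.
    void assign(uint32_t i, toy_pos p, toy_seq_id s) {
        assert(i < pos.size() && s >= 0 && (std::size_t) s < TOY_MAX_SEQ);
        pos[i] = p;
        seq[i].set((std::size_t) s);
        used.insert(i);
    }

    // Remove sequence s from every used cell whose position lies in [p0, p1).
    // A cell that no longer belongs to any sequence becomes empty.
    void seq_rm(toy_seq_id s, toy_pos p0, toy_pos p1) {
        assert(s >= 0 && (std::size_t) s < TOY_MAX_SEQ);
        for (auto it = used.begin(); it != used.end(); ) {
            const uint32_t i = *it;
            if (seq[i].test((std::size_t) s) && pos[i] >= p0 && pos[i] < p1) {
                seq[i].reset((std::size_t) s);
                if (seq[i].none()) {
                    pos[i] = -1;
                    it = used.erase(it); // erase returns the next valid iterator
                    continue;
                }
            }
            ++it;
        }
    }
};
```

Keeping the sequence membership as a bitset means a cell shared by several sequences is freed only when the last owner is removed, which is the behavior the model's seq_rm demonstrates.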

Usage

Used internally by llama_kv_cache to track metadata about which cells are occupied, what sequences they belong to, and their positions. This is the low-level bookkeeping layer that enables efficient cache eviction and position updates.
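To make the position-update side of this bookkeeping concrete, the sketch below shows one way a per-cell shift accumulator and a has-shift flag could be maintained when a range of positions is moved. It is a standalone illustration under assumptions: toy_shift_cells, pos_add_range, and reset_shift are hypothetical names, the model tracks a single sequence only, and it does not handle positions that would become negative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-sequence model: one position and one pending shift per cell.
struct toy_shift_cells {
    std::vector<int32_t> pos;   // current position per cell, -1 = empty (assumed convention)
    std::vector<int32_t> shift; // accumulated shift that a consumer has not applied yet
    bool has_shift = false;     // true if any cell has a pending shift

    explicit toy_shift_cells(uint32_t n) : pos(n, -1), shift(n, 0) {}

    // Move every occupied cell with position in [p0, p1) by delta.
    void pos_add_range(int32_t p0, int32_t p1, int32_t delta) {
        assert(p0 >= 0 && p0 <= p1); // empty cells (-1) never match the range
        for (std::size_t i = 0; i < pos.size(); ++i) {
            if (pos[i] >= p0 && pos[i] < p1) {
                pos[i]   += delta;
                shift[i] += delta;
                has_shift = true;
            }
        }
    }

    // Clear pending shifts once the consumer has applied them.
    void reset_shift() {
        shift.assign(shift.size(), 0);
        has_shift = false;
    }
};
```

Accumulating the shift separately from the position lets several moves be batched before the cached data is touched; the flag gives the owner a cheap way to skip that work when nothing moved.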

Code Reference

Source Location

  • Repository: Ollama
  • File: llama/llama.cpp/src/llama-kv-cells.h
  • Lines: 1-533

Signature

struct llama_kv_cell_ext {
    llama_pos x = 0;
    llama_pos y = 0;
    bool is_2d_gt(llama_pos ox, llama_pos oy) const;
    void reset();
};

class llama_kv_cells {
public:
    void reset();
    void resize(uint32_t n);
    bool is_empty(uint32_t i) const;
    uint32_t get_used() const;
    uint32_t used_min() const;
    uint32_t used_max_p1() const;
    bool get_has_shift() const;

    llama_kv_cells cp(uint32_t i, uint32_t n) const;
    void set(uint32_t i, const llama_kv_cells & other);

    bool seq_rm(llama_seq_id seq_id, llama_pos p0, llama_pos p1);
    void seq_cp(llama_seq_id seq_id_src, llama_seq_id seq_id_dst, llama_pos p0, llama_pos p1);
    void seq_keep(llama_seq_id seq_id);
    void seq_add(llama_seq_id seq_id, llama_pos p0, llama_pos p1, llama_pos shift);
};

Import

#include "llama-kv-cells.h"

I/O Contract

Inputs

Name    Type          Required  Description
n       uint32_t      Yes       Number of cells to allocate on resize
seq_id  llama_seq_id  Yes       Sequence identifier for operations
p0, p1  llama_pos     Yes       Position range for sequence operations

Outputs

Name             Type       Description
get_used()       uint32_t   Number of occupied cells
is_empty(i)      bool       Whether cell i is unoccupied
seq_pos_min/max  llama_pos  Min/max position for a sequence
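The min/max position queries can be answered without scanning cells if each sequence keeps an ordered map from position to the number of cells holding it. The following is a minimal sketch of that idea, not the header's code: toy_seq_index and its methods are hypothetical names, and returning -1 for a sequence with no cells is an assumed convention.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical per-sequence index: position -> number of cells holding that position.
struct toy_seq_index {
    std::map<int32_t, int> count;

    // Record one more cell at position p.
    void add(int32_t p) { ++count[p]; }

    // Forget one cell at position p; drop the key when no cell is left there.
    void remove(int32_t p) {
        auto it = count.find(p);
        assert(it != count.end());
        if (--it->second == 0) {
            count.erase(it);
        }
    }

    // std::map keeps keys sorted, so the extremes are the first and last entries.
    // -1 signals "no cells for this sequence" (assumed convention).
    int32_t pos_min() const { return count.empty() ? -1 : count.begin()->first; }
    int32_t pos_max() const { return count.empty() ? -1 : count.rbegin()->first; }
};
```

The reference count matters because more than one cell may hold the same position for a sequence; the key should disappear only when the last such cell is removed.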

Usage Examples

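#include "llama-kv-cells.h"

void example() {
    llama_kv_cells cells;
    cells.resize(4096); // allocate metadata for 4096 cells

    // Check cell state
    if (cells.is_empty(0)) {
        // Cell 0 is available
    }

    // Track occupied cells
    uint32_t used = cells.get_used();
    (void) used; // silence the unused-variable warning in this snippet
}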
