Implementation:Ollama Ollama Llama KV Cells
| Knowledge Sources | |
|---|---|
| Domains | LLM Inference, Memory Management |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Header-only implementation of the KV cache cell metadata tracking system, managing per-cell position, sequence assignment, and shift state.
Description
llama_kv_cell_ext stores the 2D spatial position (x, y) of a cell, as used by M-RoPE. llama_kv_cells manages parallel arrays of positions, extended metadata, and pending position shifts, plus a per-cell set of sequence IDs stored as std::bitset<LLAMA_MAX_SEQ>. It provides cell lifecycle operations, sequence management (seq_rm, seq_cp, seq_keep, seq_add, seq_div), and position queries (seq_pos_min, seq_pos_max). It also maintains a set of occupied cell indices and a per-sequence map of positions, so these queries can be answered efficiently.
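The bookkeeping described above can be illustrated with a small toy model. This is a minimal sketch, not the code in llama-kv-cells.h: the names toy_kv_cells, toy_pos, toy_seq_id, TOY_MAX_SEQ, and occupy are hypothetical, the per-cell seq_rm and seq_add signatures are assumptions chosen for the example, and -1 is assumed to mark an empty cell or an empty sequence.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-ins for the real llama.cpp typedefs and constant.
using toy_pos    = int32_t;
using toy_seq_id = int32_t;
constexpr int TOY_MAX_SEQ = 64;

// Toy model of per-cell metadata tracking; not the real llama_kv_cells.
class toy_kv_cells {
public:
    void resize(uint32_t n) {
        pos.assign(n, -1);                            // -1 marks an empty cell
        seq.assign(n, std::bitset<TOY_MAX_SEQ>());    // no sequences assigned
        used.clear();
        for (auto & m : seq_pos) m.clear();
    }

    bool     is_empty(uint32_t i) const { return pos[i] == -1; }
    uint32_t get_used()           const { return (uint32_t) used.size(); }

    // Occupy an empty cell i at position p on behalf of sequence s.
    void occupy(uint32_t i, toy_pos p, toy_seq_id s) {
        assert(is_empty(i) && p >= 0 && s >= 0 && s < TOY_MAX_SEQ);
        pos[i] = p;
        seq[i].set(s);
        used.insert(i);
        seq_pos[s][p]++;
    }

    // Share an occupied cell with one more sequence.
    void seq_add(uint32_t i, toy_seq_id s) {
        assert(!is_empty(i) && !seq[i].test(s));
        seq[i].set(s);
        seq_pos[s][pos[i]]++;
    }

    // Detach sequence s from cell i. Returns true if the cell became empty.
    bool seq_rm(uint32_t i, toy_seq_id s) {
        assert(seq[i].test(s));
        seq[i].reset(s);
        auto it = seq_pos[s].find(pos[i]);
        if (--it->second == 0) seq_pos[s].erase(it);
        if (seq[i].none()) {
            pos[i] = -1;
            used.erase(i);
            return true;
        }
        return false;
    }

    // The ordered map gives min/max without scanning all cells.
    toy_pos seq_pos_min(toy_seq_id s) const {
        return seq_pos[s].empty() ? -1 : seq_pos[s].begin()->first;
    }
    toy_pos seq_pos_max(toy_seq_id s) const {
        return seq_pos[s].empty() ? -1 : seq_pos[s].rbegin()->first;
    }

private:
    std::vector<toy_pos>                  pos;   // position per cell
    std::vector<std::bitset<TOY_MAX_SEQ>> seq;   // sequences using each cell
    std::set<uint32_t>                    used;  // indices of occupied cells
    std::map<toy_pos, int>                seq_pos[TOY_MAX_SEQ]; // position -> cell count, per sequence
};
```

Keeping a count per position, rather than a plain set, lets several cells of one sequence share a position without corrupting the min/max answer when one of them is removed.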
Usage
Used internally by llama_kv_cache to track metadata about which cells are occupied, what sequences they belong to, and their positions. This is the low-level bookkeeping layer that enables efficient cache eviction and position updates.
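To make the link between position updates and eviction concrete, here is a hedged sketch of shift tracking. It assumes that a shift is accumulated per cell until the cache applies it, and that a cell whose position would become negative is freed. The type toy_shift_cells and its members are hypothetical and only illustrate the idea.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy type; the real class tracks more state than this.
struct toy_shift_cells {
    std::vector<int32_t> pos;    // current position per cell, -1 if empty
    std::vector<int32_t> shift;  // accumulated, not-yet-applied delta per cell
    bool has_shift = false;      // true once any cell has been shifted

    explicit toy_shift_cells(uint32_t n) : pos(n, -1), shift(n, 0) {}

    // Move occupied cell i by d positions.
    // Returns true if the cell was evicted because its position went negative.
    bool pos_add(uint32_t i, int32_t d) {
        assert(pos[i] != -1);
        pos[i]   += d;
        shift[i] += d;
        has_shift = true;
        if (pos[i] < 0) {
            pos[i]   = -1;   // free the cell
            shift[i] = 0;    // nothing left to apply for an empty cell
            return true;
        }
        return false;
    }

    // Clear pending shifts, e.g. after the owner has applied them.
    void reset_shift() {
        has_shift = false;
        shift.assign(shift.size(), 0);
    }
};
```

Recording the delta separately from the position lets the owner of the cache batch the actual data update and skip it entirely while has_shift is false.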
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/llama-kv-cells.h
- Lines: 1-533
Signature
struct llama_kv_cell_ext {
    llama_pos x = 0;
    llama_pos y = 0;
    bool is_2d_gt(llama_pos ox, llama_pos oy) const;
    void reset();
};

class llama_kv_cells {
public:
    void reset();
    void resize(uint32_t n);
    bool is_empty(uint32_t i) const;
    uint32_t get_used() const;
    uint32_t used_min() const;
    uint32_t used_max_p1() const;
    bool get_has_shift() const;
    llama_kv_cells cp(uint32_t i, uint32_t n) const;
    void set(uint32_t i, const llama_kv_cells & other);
    bool seq_rm(llama_seq_id seq_id, llama_pos p0, llama_pos p1);
    void seq_cp(llama_seq_id seq_id_src, llama_seq_id seq_id_dst, llama_pos p0, llama_pos p1);
    void seq_keep(llama_seq_id seq_id);
    void seq_add(llama_seq_id seq_id, llama_pos p0, llama_pos p1, llama_pos shift);
};
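The is_2d_gt declaration does not say how two 2D positions are ordered. One plausible reading, assumed here and not confirmed by this page, is a row-major comparison: y is compared first, and x breaks ties. The struct toy_cell_ext below is a hypothetical mirror of llama_kv_cell_ext that uses int32_t in place of llama_pos.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of llama_kv_cell_ext for illustration only.
struct toy_cell_ext {
    int32_t x = 0;
    int32_t y = 0;

    // Assumed ordering: compare rows (y) first, then columns (x).
    bool is_2d_gt(int32_t ox, int32_t oy) const {
        return (y > oy) || (y == oy && x > ox);
    }

    // Return the cell to its default (0, 0) position.
    void reset() {
        x = 0;
        y = 0;
    }
};
```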
Import
#include "llama-kv-cells.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| n | uint32_t | Yes | Number of cells to allocate on resize |
| seq_id | llama_seq_id | Yes | Sequence identifier for operations |
| p0, p1 | llama_pos | Yes | Position range for sequence operations |
Outputs
| Name | Type | Description |
|---|---|---|
| get_used() | uint32_t | Number of occupied cells |
| is_empty(i) | bool | Whether cell i is unoccupied |
| seq_pos_min/max | llama_pos | Min/max position for a sequence |
Usage Examples
llama_kv_cells cells;
cells.resize(4096); // allocate metadata for 4096 cells

// Check cell state
if (cells.is_empty(0)) {
    // Cell 0 is available
}

// Count occupied cells
uint32_t used = cells.get_used();