Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Implementation:Ggml org Llama cpp Ngram Mod

From Leeroopedia
Knowledge Sources
Domains Speculative_Decoding, Hashing
Last Updated 2026-02-15 00:00 GMT

Overview

Implements a simple hash-table-based n-gram store that maps n-gram token sequences to a single predicted next token.

Description

Uses a fixed-size vector as a hash table with open addressing and no collision handling (entries simply overwrite). The hash function multiplies tokens by the LCG constant 6364136223846793005 and takes modulo of the table size. The `add()` method stores the token following the n-gram at the computed index, `get()` retrieves it (returning -1 if empty), and `reset()` fills all entries with the EMPTY sentinel value (-1).

Usage

Use this module when a lightweight, memory-efficient n-gram lookup is needed for speculative decoding. The simplicity (no collision resolution) trades accuracy for speed and bounded memory, making it suitable for real-time draft token prediction.

Code Reference

Source Location

Signature

common_ngram_mod::common_ngram_mod(uint16_t n, size_t size);
size_t  common_ngram_mod::idx(const entry_t * tokens) const;
void    common_ngram_mod::add(const entry_t * tokens);
entry_t common_ngram_mod::get(const entry_t * tokens) const;
void    common_ngram_mod::reset();
size_t  common_ngram_mod::get_n() const;
size_t  common_ngram_mod::get_used() const;
size_t  common_ngram_mod::size() const;
size_t  common_ngram_mod::size_bytes() const;

Import

#include "ngram-mod.h"

I/O Contract

Inputs

Name Type Required Description
n uint16_t Yes N-gram size (number of tokens to hash)
size size_t Yes Number of entries in the hash table
tokens const entry_t * Yes Pointer to an array of at least n+1 tokens (n for the key, +1 for the value to store via add())

Outputs

Name Type Description
idx return size_t Hash table index for the given n-gram
get return entry_t (int32_t) Predicted next token, or -1 (EMPTY) if not found
get_n return size_t The configured n-gram size
get_used return size_t Number of occupied entries in the hash table
size return size_t Total number of entries in the hash table
size_bytes return size_t Memory usage of the entries vector in bytes

Usage Examples

#include "ngram-mod.h"

// Create a 4-gram hash table with 1M entries
common_ngram_mod ngram(4, 1000000);

// Add an n-gram: tokens[0..3] are the key, tokens[4] is the predicted next token
int32_t tokens[] = {101, 202, 303, 404, 505};
ngram.add(tokens);

// Look up the predicted next token for a key
int32_t key[] = {101, 202, 303, 404};
int32_t prediction = ngram.get(key);  // returns 505

// Reset the table
ngram.reset();

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment