Implementation:Ggml org Llama cpp Lookup Stats

From Leeroopedia
Knowledge Sources
Domains Speculative_Decoding, Analysis
Last Updated 2026-02-15 00:00 GMT

Overview

Evaluates the effectiveness of prompt lookup decoding by computing acceptance statistics for drafted tokens against the known next tokens of the input text.

Description

The tool loads the static and dynamic n-gram caches, tokenizes the input, and processes the tokens in context-sized chunks to simulate sequential generation. At each position it drafts candidate tokens by n-gram lookup and compares them against the known-correct next tokens, counting how many tokens were drafted and how many were accepted. It then reports the overall acceptance rate and timing statistics for the drafting step.
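The draft-and-compare loop described above can be sketched in a few dozen lines. This is a simplified illustration, not the code in llama.cpp: the names `ngram_cache`, `cache_update`, `draft_tokens`, `evaluate`, and `lookup_stats` are hypothetical, the cache is a plain `std::map` keyed by a fixed-size n-gram, and it is built only from the tokens already seen (there is no separate static cache and no chunking by context size).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

using token = std::int32_t;  // stand-in for a model token id (assumption)

// Hypothetical simplified cache: n-gram -> counts of the tokens that followed it.
using ngram_cache = std::map<std::vector<token>, std::map<token, int>>;

struct lookup_stats {
    std::size_t n_drafted  = 0;
    std::size_t n_accepted = 0;
    double acceptance_rate() const {
        return n_drafted == 0 ? 0.0 : static_cast<double>(n_accepted) / n_drafted;
    }
};

// Record that tokens[pos] followed the n tokens immediately before it.
static void cache_update(ngram_cache & cache, const std::vector<token> & tokens,
                         std::size_t pos, std::size_t n) {
    if (pos < n) return;  // not enough history to form an n-gram yet
    auto first = tokens.begin() + static_cast<std::ptrdiff_t>(pos - n);
    auto last  = tokens.begin() + static_cast<std::ptrdiff_t>(pos);
    cache[std::vector<token>(first, last)][tokens[pos]]++;
}

// Draft up to n_draft tokens by repeatedly looking up the last n tokens of the context.
static std::vector<token> draft_tokens(const ngram_cache & cache, std::vector<token> context,
                                       std::size_t n, std::size_t n_draft) {
    std::vector<token> out;
    while (out.size() < n_draft && context.size() >= n) {
        std::vector<token> key(context.end() - static_cast<std::ptrdiff_t>(n), context.end());
        auto it = cache.find(key);
        if (it == cache.end()) break;  // unseen n-gram: stop drafting
        token best = it->second.begin()->first;
        int best_count = 0;
        for (const auto & kv : it->second) {  // pick the most frequent continuation
            if (kv.second > best_count) { best = kv.first; best_count = kv.second; }
        }
        out.push_back(best);
        context.push_back(best);
    }
    return out;
}

// Walk the token sequence as if it were being generated, drafting at each step and
// checking the draft against the tokens that actually come next.
static lookup_stats evaluate(const std::vector<token> & tokens, std::size_t n, std::size_t n_draft) {
    assert(n >= 1);
    ngram_cache cache;
    lookup_stats stats;
    std::size_t pos = 0;
    while (pos < tokens.size()) {
        std::vector<token> context(tokens.begin(), tokens.begin() + static_cast<std::ptrdiff_t>(pos));
        std::size_t max_draft = std::min(n_draft, tokens.size() - pos);
        std::vector<token> d = draft_tokens(cache, context, n, max_draft);
        stats.n_drafted += d.size();

        std::size_t accepted = 0;
        while (accepted < d.size() && d[accepted] == tokens[pos + accepted]) accepted++;
        stats.n_accepted += accepted;

        // Advance past the accepted tokens plus one token that the model would supply.
        std::size_t advance = accepted + 1;
        for (std::size_t i = 0; i < advance && pos < tokens.size(); ++i, ++pos) {
            cache_update(cache, tokens, pos, n);
        }
    }
    return stats;
}
```

Calling `evaluate(tokens, 2, 2)` on a tokenized text returns the drafted and accepted counts, and `acceptance_rate()` gives their ratio. Highly repetitive input should push the rate toward 1, while text with few repeated n-grams produces few drafts at all.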

Usage

Use this diagnostic tool to evaluate how well prompt lookup decoding would perform for a specific text and model combination, guiding parameter tuning and cache construction strategies.

Code Reference

Source Location

Signature

int main(int argc, char ** argv);

Import

#include "arg.h"
#include "common.h"
#include "log.h"
#include "ngram-cache.h"
#include "llama.h"
#include "ggml.h"

#include <cstdint>
#include <cstdio>
#include <cinttypes>
#include <fstream>
#include <string>
#include <vector>

I/O Contract

Inputs

Name | Type | Required | Description
-m | string | Yes | Path to the GGUF model file
-p | string | Yes | Prompt text to evaluate lookup decoding against
--lookup-cache-static | string | No | Path to a static n-gram cache file
--lookup-cache-dynamic | string | No | Path to a dynamic n-gram cache file
-n | int | No | Maximum number of draft tokens per step

Outputs

Name | Type | Description
stdout | text | Acceptance rate statistics: number of drafted tokens, accepted tokens, and timing
return | int | Exit code: 0 on success, 1 on failure

Usage Examples

# Evaluate lookup stats with static cache
./build/bin/llama-lookup-stats \
  -m model.gguf \
  -p "Your input text here..." \
  --lookup-cache-static static_cache.bin

Related Pages
