Implementation:Ggml org Llama cpp Passkey Example

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Context_Extension, Example
Last Updated	2026-02-15 00:00 GMT

Overview

Evaluates a model's ability to retrieve a hidden passkey from a long context filled with irrelevant "junk" text.

Description

Generates a prompt containing a random passkey (1-50000) hidden at a configurable position among many repetitions of irrelevant sentences ("The grass is green. The sky is blue..."). The model is then asked to recall the passkey. Uses group attention scaling (`grp_attn_n`) to extend the effective context window beyond the model's training length. Processes the long prompt in chunks using KV cache shifting.
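The prompt construction described above can be sketched roughly as follows. This is a minimal illustration, not the code in `passkey.cpp`: the helper name `build_passkey_prompt` is hypothetical, and the prefix, filler, and question wording are shortened placeholders.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not taken from passkey.cpp): builds a prompt made of
// n_junk filler repetitions, with the passkey sentence inserted just before
// repetition number i_pos (0-based).
std::string build_passkey_prompt(int n_junk, int i_pos, int passkey) {
    assert(n_junk > 0 && i_pos >= 0 && i_pos < n_junk);

    // Placeholder wording; the real example uses its own prefix text.
    std::string prompt =
        "There is important information hidden inside a lot of irrelevant text.";

    for (int i = 0; i < n_junk; ++i) {
        if (i == i_pos) {
            // The hidden fact the model must recall later.
            prompt += " The pass key is " + std::to_string(passkey) + ". Remember it.";
        }
        // Irrelevant filler that pads the context.
        prompt += " The grass is green. The sky is blue.";
    }

    // The trailing question leaves the answer for the model to complete.
    prompt += " What is the pass key? The pass key is";
    return prompt;
}
```

Calling `build_passkey_prompt(250, 90, 12345)` would place the passkey sentence after 90 filler repetitions, which mirrors the roles of `--junk` and `--pos` under the assumptions stated above.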

Usage

Use this example to test long-context capabilities. It is particularly useful for evaluating RoPE scaling and attention mechanisms that extend context length beyond the model's training limit.

Code Reference

Source Location

Repository: Ggml_org_Llama_cpp
File: examples/passkey/passkey.cpp
Lines: 1-274

Signature

static void print_usage(int argc, char ** argv);
int main(int argc, char ** argv);

Import

#include "arg.h"
#include "common.h"
#include "log.h"
#include "llama.h"

#include <cmath>
#include <cstdio>
#include <string>
#include <vector>
#include <algorithm>

I/O Contract

Inputs

Name	Type	Required	Description
-m	string	Yes	Path to the GGUF model file
--junk	int	No	Number of junk text repetitions (default: 250)
--pos	int	No	Position to insert the passkey (-1 for random, default: -1)
--keep	int	No	Number of initial tokens to keep in KV cache (default: 32)
--grp-attn-n	int	No	Group attention factor for context extension (default: 1)
--seed	int	No	Random seed for reproducibility
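To give a feel for what the `--grp-attn-n` factor does, the sketch below models group attention as integer division of token positions, so that `n_grp` consecutive tokens share one position. This is a simplified assumption for illustration only: the function names are hypothetical, and the example itself applies the compression through KV cache operations while processing the prompt in chunks, which this sketch does not reproduce.

```cpp
#include <cassert>

// Simplified model (assumption): a token at position pos is mapped to
// floor(pos / n_grp), so n_grp neighbouring tokens share one position.
int grouped_position(int pos, int n_grp) {
    assert(pos >= 0 && n_grp >= 1);
    return pos / n_grp;
}

// Number of distinct positions occupied by n_tokens tokens after grouping.
int positions_used(int n_tokens, int n_grp) {
    assert(n_tokens >= 0 && n_grp >= 1);
    return (n_tokens + n_grp - 1) / n_grp; // ceiling division
}
```

Under this model, a factor of 2 lets 8192 prompt tokens occupy 4096 positions, which is why a larger factor can fit a longer prompt into the position range the model was trained on. A factor of 1 leaves positions unchanged.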

Outputs

Name	Type	Description
stdout	text	Model's generated response attempting to recall the hidden passkey
return	int	Exit code: 0 on success, 1 on failure

Usage Examples

# Run passkey test with group attention scaling
./build/bin/llama-passkey \
  -m model.gguf \
  --junk 250 \
  --pos 90 \
  --keep 32 \
  --grp-attn-n 2 \
  --seed 1234

Related Pages

Principle:Ggml_org_Llama_cpp_Context_Extension
