Principle:Ggml org Llama cpp Context Extension Testing
| Knowledge Sources | |
|---|---|
| Domains | Context_Extension |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Context Extension Testing is the principle of validating that models can correctly attend to and retrieve information from extended context windows.
Description
This principle covers testing methodologies for verifying that context extension techniques (such as RoPE scaling, YaRN, or other positional encoding modifications) actually let the model make effective use of longer contexts. The passkey test is a standard evaluation in which a random passkey is embedded in a long document and the model must retrieve it, demonstrating that information at arbitrary positions in the extended context is accessible.
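As a rough illustration of the passkey setup described above, the following sketch builds a single test prompt. The filler sentences, the wording of the passkey statement and question, and the helper name `build_passkey_prompt` are all assumptions chosen for this example, not text or functions taken from any particular tool.

```python
import random

# Hypothetical filler and wording; any neutral, repetitive text would do.
FILLER = "The grass is green. The sky is blue. The sun is yellow. "
QUESTION = "What is the pass key? The pass key is"


def build_passkey_prompt(n_filler, insert_at, passkey):
    """Return a prompt made of `n_filler` filler blocks, with the passkey
    statement placed after the first `insert_at` blocks."""
    if not 0 <= insert_at <= n_filler:
        raise ValueError("insert_at must be in [0, n_filler]")
    needle = f"The pass key is {passkey}. Remember it. "
    parts = [FILLER] * insert_at + [needle] + [FILLER] * (n_filler - insert_at)
    return "".join(parts) + QUESTION


rng = random.Random(0)                   # fixed seed so a run can be repeated
passkey = rng.randint(10000, 99999)      # five-digit random passkey
n_filler = 200                           # controls the document length
insert_at = rng.randint(0, n_filler)     # random position for the passkey
prompt = build_passkey_prompt(n_filler, insert_at, passkey)
```

The prompt ends with an open-ended question so that the model's next few generated tokens can be compared directly against `passkey`. Increasing `n_filler` lengthens the document until it reaches the context size under test.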
Usage
Apply this principle when evaluating context extension techniques, validating that RoPE scaling parameters are correctly configured, or benchmarking a model's effective context length.
Theoretical Basis
The passkey retrieval test constructs a long input document in which a short random passkey (e.g., a number) is embedded at a random position within filler text, then asks the model to recall the passkey. A correct answer shows that the model can access information at that position in the context; an incorrect answer marks a combination of length and position where retrieval fails. Repeating the test across document lengths and passkey positions maps the effective context window of the model, which may be shorter than the configured one. This is a more practical evaluation than perplexity alone, as a model may have low perplexity on long contexts while still failing to retrieve specific details from distant positions.
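The sweep over document length and passkey position can be sketched as a small pass/fail grid. Everything here is illustrative: `sweep`, `build_prompt`, `extract_answer`, the filler text, and the fixed passkey are hypothetical names and values, and `toy_model` is a stand-in that only "sees" the last 2000 characters of its input, simulating a model whose effective window is shorter than the document. In practice `generate` would wrap a call to a real model.

```python
import re

FILLER = "The grass is green. The sky is blue. "  # hypothetical filler text


def build_prompt(n_filler, depth, passkey):
    """Place the passkey at fractional `depth` (0.0 = start, 1.0 = end)."""
    k = round(depth * n_filler)
    needle = f"The pass key is {passkey}. "
    return FILLER * k + needle + FILLER * (n_filler - k) + "What is the pass key?"


def extract_answer(text):
    """Return the first run of digits in the model output, or None."""
    m = re.search(r"\d+", text)
    return m.group(0) if m else None


def sweep(generate, lengths, depths, passkey=48213):
    """Map (n_filler, depth) to True/False: did `generate` retrieve the passkey?"""
    results = {}
    for n in lengths:
        for d in depths:
            reply = generate(build_prompt(n, d, passkey))
            results[(n, d)] = extract_answer(reply) == str(passkey)
    return results


def toy_model(prompt, window_chars=2000):
    """Stand-in for a real model: only the last `window_chars` characters are visible."""
    visible = prompt[-window_chars:]
    m = re.search(r"The pass key is (\d+)", visible)
    return m.group(1) if m else "I do not know."


grid = sweep(toy_model, lengths=[20, 200], depths=[0.0, 0.5, 1.0])
for (n, d), ok in sorted(grid.items()):
    print(f"n_filler={n:4d} depth={d:.1f} retrieved={ok}")
```

With the toy model, every position succeeds for the short document, while the long document succeeds only when the passkey sits near the end. A grid like this is what reveals an effective context window smaller than the nominal one.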