Principle:Ggml org Llama cpp Speculative Decoding
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| ggml-org/llama.cpp | Speculative Execution, Draft Models | 2026-02-15 |
Overview
Description
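Speculative Decoding is a design principle in the llama.cpp project covering speculative execution and draft models. In general terms, a cheap drafter (a smaller draft model or, as the linked implementation pages suggest, an n-gram-based predictor) proposes several tokens ahead, and the target model then verifies those proposals, keeping the longest prefix it agrees with. Every accepted draft token saves a sequential step of the target model, and under greedy sampling the output is the same as decoding with the target model alone.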
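The draft-then-verify idea can be illustrated with a minimal sketch. This is not llama.cpp code: the `NextToken` callable interface, `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical names invented for illustration, and the sketch assumes greedy (argmax) sampling. A real implementation would score all drafted positions in one batched target pass; the loop below calls the target once per position only to keep the logic readable.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a "model" maps a token context to its greedy next token.
NextToken = Callable[[Sequence[int]], int]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: Sequence[int], n_new: int,
                       n_draft: int = 4) -> List[int]:
    """Greedy draft-and-verify decoding (illustrative sketch only)."""
    out = list(prompt)
    goal = len(out) + n_new
    while len(out) < goal:
        # 1. Draft: the cheap model proposes up to n_draft tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(n_draft, goal - len(out))):
            proposal.append(draft(out + proposal))

        # 2. Verify: accept drafted tokens while they match the target's choice.
        all_accepted = True
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # First mismatch: keep the target's token, discard the rest.
                out.append(expected)
                all_accepted = False
                break
            out.append(tok)

        # 3. Bonus token: if every draft was accepted, the target's prediction
        #    for the next position is also usable.
        if all_accepted and len(out) < goal:
            out.append(target(out))
    return out


def toy_target(ctx: Sequence[int]) -> int:
    # Toy "large" model: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: Sequence[int]) -> int:
    # Toy "small" model: agrees with the target except right after a 3.
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [0], n_new=12))
```

Because a mismatch is always replaced by the target's own token, the result is identical to plain greedy decoding with the target; the drafter only affects how many tokens are accepted per verification round.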
Usage
See the implementation pages listed under Related Pages for concrete usage details.
Related Pages
- Implementation:Ggml_org_Llama_cpp_Lookahead_Decoding
- Implementation:Ggml_org_Llama_cpp_Lookup_Decoding
- Implementation:Ggml_org_Llama_cpp_Lookup_Stats
- Implementation:Ggml_org_Llama_cpp_Ngram_Cache_Header
- Implementation:Ggml_org_Llama_cpp_Ngram_Map_Header
- Implementation:Ggml_org_Llama_cpp_Ngram_Mod
- Implementation:Ggml_org_Llama_cpp_Ngram_Mod_Header
- Implementation:Ggml_org_Llama_cpp_Speculative_Header