Principle:Ggml org Llama cpp Speculative Decoding

Knowledge Sources	Domains	Last Updated
ggml-org/llama.cpp	Speculative Execution, Draft Models	2026-02-15

Overview

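Speculative Decoding is a design principle in the llama.cpp project covering speculative execution with draft models: a small, fast draft model proposes several tokens ahead, and the larger target model verifies those proposals together instead of generating each token in its own pass. Proposals the target model agrees with are accepted; at the first disagreement the target model's own token is used and drafting resumes from that point.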
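
As an illustration of the draft-then-verify idea, the following is a minimal sketch of greedy speculative decoding in Python. It is not llama.cpp code: the names `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical, the "models" are toy functions over integer tokens, and verification is done one position at a time for clarity, whereas a real engine would score all drafted positions in one batched forward pass.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model,
                     prefix: List[int], n_draft: int) -> List[int]:
    """Run one draft-then-verify round and return the tokens to append."""
    # 1. The draft model proposes n_draft tokens autoregressively.
    proposed: List[int] = []
    ctx = list(prefix)
    for _ in range(n_draft):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target model checks each proposed position in order.
    accepted: List[int] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target(ctx)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every proposal matched, so the target contributes one extra token.
    accepted.append(target(ctx))
    return accepted


def generate(target: Model, draft: Model, prompt: List[int],
             n_new: int, n_draft: int = 4) -> List[int]:
    """Generate n_new tokens after prompt using repeated speculative rounds."""
    out = list(prompt)
    while len(out) - len(prompt) < n_new:
        out.extend(speculative_step(target, draft, out, n_draft))
    return out[: len(prompt) + n_new]


def greedy(target: Model, prompt: List[int], n_new: int) -> List[int]:
    """Plain greedy decoding with the target model only, for comparison."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def toy_target(ctx: List[int]) -> int:
    # Toy target: counts upward modulo 10.
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    # Toy draft: agrees with the target except when the true next token is 7.
    nxt = (ctx[-1] + 1) % 10
    return 0 if nxt == 7 else nxt
```

With greedy verification as sketched here, the output matches what the target model would have produced on its own; the draft model only affects how many tokens each round yields, not which tokens are chosen.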

See linked implementation pages for concrete usage details.
