Implementation:Ggml org Llama cpp Grammar Header

From Leeroopedia
Knowledge Sources
Domains Grammar, Constrained_Generation
Last Updated 2026-02-15 00:00 GMT

Overview

Declares the grammar types, parser, and state machine structures for GBNF grammar-constrained generation.

Description

This header defines the `llama_gretype` enum (rule element kinds: characters, ranges, alternates, rule references, tokens), the `llama_grammar_element` struct, `llama_partial_utf8` for incremental UTF-8 decoding, and `llama_grammar_candidate` for matching token candidates against the grammar. The `llama_grammar_parser` struct parses GBNF text into rule vectors and manages symbol IDs. The `llama_grammar` struct holds the active grammar state: the rules, the pushdown stacks, the partial UTF-8 state, and the lazy trigger configuration (tokens, regex patterns, buffer).
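Incremental decoding matters because a single token can end in the middle of a multi-byte UTF-8 sequence. The sketch below illustrates the idea with a state shaped like `llama_partial_utf8` (`value`, `n_remain`). It is a simplified stand-alone illustration, not the library's decoder: `partial_utf8` and `decode_utf8_chunk` are hypothetical names, and the use of `n_remain == -1` to flag invalid input is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical mirror of llama_partial_utf8: bits accumulated so far and
// the number of continuation bytes still expected.
struct partial_utf8 {
    uint32_t value;
    int      n_remain; // 0 = nothing pending; -1 = invalid input (assumption)
};

// Decode `bytes`, resuming from `state`. Returns the code points completed
// in this chunk and the state to carry into the next call.
static std::pair<std::vector<uint32_t>, partial_utf8>
decode_utf8_chunk(const std::string & bytes, partial_utf8 state) {
    assert(state.n_remain >= 0 && state.n_remain <= 3);
    std::vector<uint32_t> out;
    for (unsigned char b : bytes) {
        if (state.n_remain > 0) {
            if ((b & 0xC0) != 0x80) {            // continuation must be 10xxxxxx
                return { out, partial_utf8{ 0, -1 } };
            }
            state.value = (state.value << 6) | (b & 0x3F);
            if (--state.n_remain == 0) {         // sequence complete
                out.push_back(state.value);
                state.value = 0;
            }
        } else if (b < 0x80) {                   // ASCII
            out.push_back(b);
        } else if ((b & 0xE0) == 0xC0) {         // 2-byte lead
            state = { uint32_t(b & 0x1F), 1 };
        } else if ((b & 0xF0) == 0xE0) {         // 3-byte lead
            state = { uint32_t(b & 0x0F), 2 };
        } else if ((b & 0xF8) == 0xF0) {         // 4-byte lead
            state = { uint32_t(b & 0x07), 3 };
        } else {                                 // stray continuation or bad lead
            return { out, partial_utf8{ 0, -1 } };
        }
    }
    return { out, state };
}
```

Feeding `"a\xC3"` and then `"\xA9"` yields `U+0061` from the first call, leaves one byte pending, and completes `U+00E9` in the second call. For brevity the sketch does not reject overlong encodings or surrogate code points.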

Usage

Include this header when working with grammar-constrained generation. It defines the grammar infrastructure used by the grammar sampler and the server's tool-calling/JSON mode features.

Code Reference

Source Location

Signature

enum llama_gretype { /* LLAMA_GRETYPE_END, _ALT, _RULE_REF, _CHAR, ... */ };

typedef struct llama_grammar_element {
    enum llama_gretype type;
    uint32_t           value;
} llama_grammar_element;

struct llama_partial_utf8 { uint32_t value; int n_remain; };
struct llama_grammar_candidate { size_t index; const uint32_t * code_points; /* ... */ };

struct llama_grammar_parser {
    const llama_vocab * vocab;
    std::map<std::string, uint32_t> symbol_ids;
    llama_grammar_rules rules;
    bool parse(const char * src);
    void print(FILE * file);
};

struct llama_grammar_trigger_pattern {
    std::string pattern;
    std::regex regex;
    size_t find(const std::string & input) const;
};

struct llama_grammar {
    const llama_vocab * vocab;
    const llama_grammar_rules rules;
    llama_grammar_stacks stacks;
    llama_partial_utf8 partial_utf8;
    bool lazy;
    bool awaiting_trigger;
    std::vector<llama_token> trigger_tokens;
    std::vector<llama_grammar_trigger_pattern> trigger_patterns;
};

// Internal API
struct llama_grammar * llama_grammar_init_impl(/* ... */);
void llama_grammar_free_impl(struct llama_grammar * grammar);
struct llama_grammar * llama_grammar_clone_impl(const struct llama_grammar & grammar);
void llama_grammar_apply_impl(const struct llama_grammar & grammar, llama_token_data_array * cur_p);
void llama_grammar_accept_impl(struct llama_grammar & grammar, llama_token token);
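
The lazy fields (`lazy`, `awaiting_trigger`, `trigger_patterns`) suggest a gate: generated text is buffered unconstrained until a trigger matches, and only then does the grammar start consuming input. The following is a minimal sketch of that idea under stated assumptions. `trigger_pattern` and `lazy_gate` are hypothetical stand-ins, `find` is assumed to return the offset of the first regex match (or `std::string::npos`), and the actual library may choose the activation point differently, for example from a capture group.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for llama_grammar_trigger_pattern.
struct trigger_pattern {
    std::string pattern;
    std::regex  regex;

    // Offset of the first match in `input`, or std::string::npos if none.
    size_t find(const std::string & input) const {
        std::smatch m;
        if (std::regex_search(input, m, regex)) {
            return static_cast<size_t>(m.position(0));
        }
        return std::string::npos;
    }
};

// Hypothetical lazy gate: buffers generated text until a pattern matches.
struct lazy_gate {
    std::vector<trigger_pattern> patterns;
    std::string buffer;
    bool awaiting_trigger = true;

    // Feed one decoded piece of text. Returns the text the grammar should
    // start consuming (from the match onward), or "" while still waiting.
    std::string feed(const std::string & piece) {
        assert(awaiting_trigger); // once triggered, text goes to the grammar
        buffer += piece;
        for (const auto & p : patterns) {
            const size_t pos = p.find(buffer);
            if (pos != std::string::npos) {
                awaiting_trigger = false;
                std::string constrained = buffer.substr(pos);
                buffer.clear();
                return constrained;
            }
        }
        return "";
    }
};
```

Buffering the whole unconstrained prefix lets a trigger match even when it is split across several tokens, such as `<tool_` followed by `call>`.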

Import

#include "llama-grammar.h"
// Dependencies:
#include "llama.h"
#include <map>
#include <regex>
#include <string>
#include <vector>

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| src | const char * | Yes | GBNF grammar source string for parsing |
| grammar_str | const char * | Yes | Grammar string for init_impl |
| grammar_root | const char * | Yes | Root rule name for the grammar |
| lazy | bool | No | Whether to use lazy grammar triggering |
| trigger_patterns | const char ** | No | Regex patterns that trigger lazy grammar activation |
| trigger_tokens | const llama_token * | No | Tokens that trigger lazy grammar activation |
| cur_p | llama_token_data_array * | Yes | Token candidates to filter via grammar_apply_impl |
| token | llama_token | Yes | Token to accept into the grammar state |

Outputs

| Name | Type | Description |
|---|---|---|
| llama_grammar * | pointer | Initialized grammar state machine |
| parse return | bool | Whether GBNF parsing succeeded |
| cur_p (modified) | llama_token_data_array * | Token candidates with the logits of grammar-invalid tokens set to -INFINITY |
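
Filtering the candidates amounts to masking: a token the grammar cannot accept gets a logit of negative infinity, so it receives zero probability after softmax. The sketch below shows only that masking step, under stated assumptions. `token_data` and `mask_rejected` are hypothetical stand-ins for `llama_token_data` and the internals of `llama_grammar_apply_impl`, and the `allowed` predicate replaces the real grammar-stack check.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical stand-in for llama_token_data.
struct token_data {
    int   id;
    float logit;
    float p;
};

// Mask every candidate the predicate rejects by setting its logit to -inf.
// Returns the number of masked candidates.
static size_t mask_rejected(std::vector<token_data> & cands,
                            const std::function<bool(int)> & allowed) {
    assert(static_cast<bool>(allowed) && "predicate must be callable");
    size_t n_rejected = 0;
    for (auto & c : cands) {
        if (!allowed(c.id)) {
            c.logit = -std::numeric_limits<float>::infinity();
            ++n_rejected;
        }
    }
    return n_rejected;
}
```

Accepted candidates keep their original logits, so the sampler's relative preferences among grammar-valid tokens are unchanged.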

Usage Examples

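#include "llama-grammar.h"

// Parse a GBNF grammar; parse() returns false if the grammar is malformed
const char * grammar_str = "root ::= \"hello\" | \"world\"";
llama_grammar_parser parser;
if (!parser.parse(grammar_str)) { /* handle parse failure */ }

// Initialize grammar state (not lazy, so no trigger patterns or tokens);
// `vocab` is the const llama_vocab * of the loaded model
auto * grammar = llama_grammar_init_impl(vocab, grammar_str, "root",
    false, nullptr, 0, nullptr, 0);
if (grammar == nullptr) { /* handle init failure */ }

// Apply grammar constraints during sampling;
// `candidates` is a llama_token_data_array
llama_grammar_apply_impl(*grammar, &candidates);

// Accept the sampled token to advance the grammar state
llama_grammar_accept_impl(*grammar, selected_token);

llama_grammar_free_impl(grammar);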
