Implementation:Ggml org Llama cpp Peg Parser
| Knowledge Sources | |
|---|---|
| Domains | Parsing, Grammar |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Implements a PEG (Parsing Expression Grammar) parser used for constrained generation. It supports partial matching, so input that is still incomplete because tokens are being generated can be evaluated without being rejected outright.
Description
This module builds a trie for matching multiple literals efficiently. The parser_executor recursively evaluates PEG parsers (sequences, choices, repetitions, literals, character classes, JSON strings, etc.) against input text. It supports three result types: success, fail, and need_more_input (for streaming). The parser integrates with the grammar builder to also generate GBNF rules from the same PEG definitions, and includes specialized parsers for JSON strings, schema validation, and "until" patterns with delimiter exclusion.
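To make the trie behaviour concrete, here is a minimal sketch of a literal trie that follows the interface shown under Signature. It is an illustration written under assumptions, not the code in common/peg-parser.cpp: the name `sketch_trie`, the construction loop, and the rule that a match is reported as soon as the shortest word is reached are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch: mirrors the declared trie interface, not the real implementation.
struct sketch_trie {
    struct node {
        std::map<unsigned char, size_t> children;
        bool is_word = false;
    };
    enum match_result { NO_MATCH, PARTIAL_MATCH, COMPLETE_MATCH };

    std::vector<node> nodes;

    explicit sketch_trie(const std::vector<std::string> & words) {
        nodes.emplace_back(); // index 0 is the root
        for (const auto & w : words) {
            size_t cur = 0;
            for (unsigned char c : w) {
                auto it = nodes[cur].children.find(c);
                if (it == nodes[cur].children.end()) {
                    const size_t next = nodes.size();
                    nodes[cur].children[c] = next; // record the edge before growing the vector
                    nodes.emplace_back();
                    cur = next;
                } else {
                    cur = it->second;
                }
            }
            nodes[cur].is_word = true;
        }
    }

    // Walk the trie from start_pos; stop at the first complete word.
    match_result check_at(std::string_view sv, size_t start_pos) const {
        size_t cur = 0;
        for (size_t pos = start_pos; pos < sv.size(); ++pos) {
            auto it = nodes[cur].children.find(static_cast<unsigned char>(sv[pos]));
            if (it == nodes[cur].children.end()) {
                return NO_MATCH;
            }
            cur = it->second;
            if (nodes[cur].is_word) {
                return COMPLETE_MATCH;
            }
        }
        // Input ended: still on a valid path means a word may complete later.
        return cur != 0 ? PARTIAL_MATCH : NO_MATCH;
    }
};

// Usage example for the sketch above.
inline void sketch_trie_demo() {
    sketch_trie t({"</tool_call>", "</function>"});
    assert(t.check_at("abc</tool_call>", 3) == sketch_trie::COMPLETE_MATCH);
    assert(t.check_at("abc</tool", 3) == sketch_trie::PARTIAL_MATCH);
    assert(t.check_at("abc<x", 3) == sketch_trie::NO_MATCH);
}
```

The PARTIAL_MATCH case is what makes the structure useful for streaming: the caller can hold back the trailing bytes until more tokens arrive instead of committing to either outcome.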
Usage
Use this module as the central component of the constrained decoding pipeline. It lets the server parse and validate partially generated text against grammar rules in real time, so that model output conforms to the specified format while it is still being generated.
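The streaming behaviour can be illustrated with a single literal matcher that distinguishes the three outcomes. This is a sketch only: `sketch_result`, `match_literal`, and `feed_chunks` are hypothetical names standing in for the real result type and executor, and the sketch assumes the input is always treated as possibly incomplete.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the parser's three result types.
enum class sketch_result { fail, success, need_more_input };

// Match one literal at `pos`. Running out of input before a mismatch
// is reported as need_more_input rather than fail.
inline sketch_result match_literal(std::string_view input, size_t pos, std::string_view literal) {
    for (size_t i = 0; i < literal.size(); ++i) {
        if (pos + i >= input.size()) {
            return sketch_result::need_more_input;
        }
        if (input[pos + i] != literal[i]) {
            return sketch_result::fail;
        }
    }
    return sketch_result::success;
}

// Append chunks to a buffer and re-check after each one, stopping
// as soon as the outcome is definite.
inline std::vector<sketch_result> feed_chunks(const std::vector<std::string> & chunks,
                                              std::string_view literal) {
    std::string buffer;
    std::vector<sketch_result> history;
    for (const auto & chunk : chunks) {
        buffer += chunk;
        history.push_back(match_literal(buffer, 0, literal));
        if (history.back() != sketch_result::need_more_input) {
            break;
        }
    }
    return history;
}

// Usage example: the literal arrives in three pieces.
inline void feed_chunks_demo() {
    const auto history = feed_chunks({"<to", "ol_", "call>"}, "<tool_call>");
    assert(history.size() == 3);
    assert(history[0] == sketch_result::need_more_input);
    assert(history[2] == sketch_result::success);
}
```

Re-parsing the whole buffer after every chunk keeps the sketch short; a real pipeline would likely want to avoid repeating work on text that has already been accepted.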
Code Reference
Source Location
- Repository: Ggml_org_Llama_cpp
- File: common/peg-parser.cpp
- Lines: 1-1712
Signature
const char * common_peg_parse_result_type_name(common_peg_parse_result_type type);

struct trie {
    struct node {
        size_t depth = 0;
        std::map<unsigned char, size_t> children;
        bool is_word;
    };
    std::vector<node> nodes;
    trie(const std::vector<std::string> & words);
    enum match_result { NO_MATCH, PARTIAL_MATCH, COMPLETE_MATCH };
    match_result check_at(std::string_view sv, size_t start_pos) const;
};

struct parser_executor {
    // Recursively evaluates PEG parsers against input
    // Returns common_peg_parse_result with fail/success/need_more_input
};
Import
#include "common.h"
#include "peg-parser.h"
#include "json-schema-to-grammar.h"
#include "unicode.h"
#include <nlohmann/json.hpp>
#include <algorithm>
#include <map>
#include <memory>
#include <regex>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| words | std::vector<std::string> | Yes | List of literal strings to build the trie from |
| sv | std::string_view | Yes | Input text to parse or match against |
| start_pos | size_t | Yes | Position in input where matching should begin |
| parser_id | common_peg_parser_id | Yes | ID of the parser to execute against input |
Outputs
| Name | Type | Description |
|---|---|---|
| result_type | common_peg_parse_result_type | Parse result: fail (0), success (1), or need_more_input (2) |
| match_result | trie::match_result | Trie match result: NO_MATCH, PARTIAL_MATCH, or COMPLETE_MATCH |
| ast | common_peg_ast_node | AST node produced by successful parse |
Usage Examples
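#include "peg-parser.h"
#include <string>
#include <vector>

// Assumes `trie` is visible in this translation unit.
// Illustrative input: the delimiter starts at index 6.
std::string input_text  = "result</tool_call>";
size_t      current_pos = 6;

// Build a trie for delimiter matching
std::vector<std::string> delimiters = {"</tool_call>", "</function>"};
trie delimiter_trie(delimiters);

// Check for a delimiter at the current position
auto result = delimiter_trie.check_at(input_text, current_pos);
if (result == trie::COMPLETE_MATCH) {
    // Found a complete delimiter match
} else if (result == trie::PARTIAL_MATCH) {
    // Need more input to determine whether a delimiter matches
}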
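The "until" pattern with delimiter exclusion mentioned in the Description can be sketched on top of the same idea: consume text up to the first delimiter, and hold back a trailing fragment that might still grow into one. The sketch below uses plain prefix comparisons instead of the trie to stay self-contained; `until_result` and `scan_until` are hypothetical names, not part of the module.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result of an "until" scan.
struct until_result {
    size_t end;       // one past the last byte that is safe to consume
    bool   found;     // a complete delimiter starts at `end`
    bool   need_more; // input ended inside a possible delimiter at `end`
};

inline until_result scan_until(std::string_view input, size_t start,
                               const std::vector<std::string> & delimiters) {
    for (size_t pos = start; pos < input.size(); ++pos) {
        const std::string_view rest = input.substr(pos);
        // A complete delimiter at this position ends the scan.
        for (const auto & d : delimiters) {
            const std::string_view dv(d);
            if (!dv.empty() && rest.size() >= dv.size() && rest.substr(0, dv.size()) == dv) {
                return {pos, true, false};
            }
        }
        // The remaining input is a proper prefix of a delimiter: wait for more.
        for (const auto & d : delimiters) {
            const std::string_view dv(d);
            if (rest.size() < dv.size() && dv.substr(0, rest.size()) == rest) {
                return {pos, false, true};
            }
        }
    }
    return {input.size(), false, false};
}

// Usage example: the tail "</too" is held back until more input arrives.
inline void scan_until_demo() {
    const std::vector<std::string> delims = {"</tool_call>", "</function>"};
    const until_result r = scan_until("hello</too", 0, delims);
    assert(r.end == 5 && !r.found && r.need_more);
}
```

Checking every delimiter at every position is quadratic in the worst case, which is the cost a trie lookup such as `check_at` is meant to avoid.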