Implementation:Ggml org Llama cpp Peg Parser

From Leeroopedia
Knowledge Sources
Domains: Parsing, Grammar
Last Updated: 2026-02-15 00:00 GMT

Overview

Implements a PEG (Parsing Expression Grammar) parser used for constrained generation. It supports partial matching: because tokens are still being generated, the input may be incomplete, and the parser can report that it needs more input instead of failing.
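To make the partial-matching idea concrete, here is a minimal sketch of a single literal matcher that distinguishes the three outcomes. The names `sketch_result`, `match_literal`, and the `is_partial` flag are hypothetical and chosen for illustration; they are not taken from the llama.cpp source.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for common_peg_parse_result_type (illustration only).
enum class sketch_result { fail, success, need_more_input };

// Try to match `literal` at `pos` in `input`.
// `is_partial` is an assumed flag meaning "more input may still arrive".
inline sketch_result match_literal(std::string_view input, size_t pos,
                                   std::string_view literal, bool is_partial) {
    assert(pos <= input.size());
    for (size_t i = 0; i < literal.size(); ++i) {
        if (pos + i >= input.size()) {
            // Input ran out partway through the literal. If the stream is
            // still open this is not a failure yet.
            return is_partial ? sketch_result::need_more_input
                              : sketch_result::fail;
        }
        if (input[pos + i] != literal[i]) {
            return sketch_result::fail;  // definite mismatch
        }
    }
    return sketch_result::success;
}
```

With this sketch, matching "<tool_call>" against the truncated input "<tool" yields need_more_input while generation is ongoing, and fail once the input is known to be complete.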

Description

This module builds a trie for matching multiple literals efficiently. The parser_executor recursively evaluates PEG parsers (sequences, choices, repetitions, literals, character classes, JSON strings, etc.) against input text and returns one of three result types: success, fail, or need_more_input (used when streaming input is still incomplete). The parser also integrates with the grammar builder, so the same PEG definitions can generate GBNF rules. It includes specialized parsers for JSON strings, schema validation, and "until" patterns with delimiter exclusion.
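The trie-based literal matching described above can be sketched as follows. The struct name `sketch_trie` and the method bodies are assumptions written for illustration; only the member names and the three match results mirror the signature listed on this page, and the real implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Illustrative trie; bodies are assumptions, not the llama.cpp source.
struct sketch_trie {
    struct node {
        std::map<unsigned char, size_t> children;  // byte -> child node index
        bool is_word = false;                      // a literal ends here
    };
    enum match_result { NO_MATCH, PARTIAL_MATCH, COMPLETE_MATCH };

    std::vector<node> nodes;

    explicit sketch_trie(const std::vector<std::string> & words) {
        nodes.emplace_back();  // index 0 is the root
        for (const auto & w : words) {
            insert(w);
        }
    }

    void insert(const std::string & word) {
        size_t cur = 0;
        for (unsigned char ch : word) {
            auto it = nodes[cur].children.find(ch);
            if (it == nodes[cur].children.end()) {
                size_t next = nodes.size();
                nodes.emplace_back();
                nodes[cur].children[ch] = next;
                cur = next;
            } else {
                cur = it->second;
            }
        }
        nodes[cur].is_word = true;
    }

    // Walk the trie from start_pos; stop at the first complete literal.
    match_result check_at(std::string_view sv, size_t start_pos) const {
        assert(start_pos <= sv.size());
        size_t cur = 0;
        for (size_t pos = start_pos; pos < sv.size(); ++pos) {
            auto it = nodes[cur].children.find(static_cast<unsigned char>(sv[pos]));
            if (it == nodes[cur].children.end()) {
                return NO_MATCH;
            }
            cur = it->second;
            if (nodes[cur].is_word) {
                return COMPLETE_MATCH;
            }
        }
        // Input ended below the root: a longer input could still match.
        return cur != 0 ? PARTIAL_MATCH : NO_MATCH;
    }
};
```

Storing nodes in a vector and linking them by index keeps the structure compact and avoids pointer ownership. The PARTIAL_MATCH result is what lets a caller map an unfinished literal to need_more_input rather than fail.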

Usage

Use this module as the central component of the constrained decoding pipeline. It lets the server parse and validate partially generated text against grammar rules in real time, so that model outputs conform to the specified format during generation.

Code Reference

Source Location

Signature

const char * common_peg_parse_result_type_name(common_peg_parse_result_type type);

struct trie {
    struct node {
        size_t depth = 0;
        std::map<unsigned char, size_t> children;
        bool is_word;
    };
    std::vector<node> nodes;
    trie(const std::vector<std::string> & words);
    enum match_result { NO_MATCH, PARTIAL_MATCH, COMPLETE_MATCH };
    match_result check_at(std::string_view sv, size_t start_pos) const;
};

struct parser_executor {
    // Recursively evaluates PEG parsers against input
    // Returns common_peg_parse_result with fail/success/need_more_input
};

Import

#include "common.h"
#include "peg-parser.h"
#include "json-schema-to-grammar.h"
#include "unicode.h"
#include <nlohmann/json.hpp>
#include <algorithm>
#include <map>
#include <memory>
#include <regex>

I/O Contract

Inputs

Name | Type | Required | Description
words | std::vector<std::string> | Yes | List of literal strings to build the trie from
sv | std::string_view | Yes | Input text to parse or match against
start_pos | size_t | Yes | Position in input where matching should begin
parser_id | common_peg_parser_id | Yes | ID of the parser to execute against input

Outputs

Name | Type | Description
result_type | common_peg_parse_result_type | Parse result: fail (0), success (1), or need_more_input (2)
match_result | trie::match_result | Trie match result: NO_MATCH, PARTIAL_MATCH, or COMPLETE_MATCH
ast | common_peg_ast_node | AST node produced by successful parse

Usage Examples

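#include "peg-parser.h"

#include <string>
#include <string_view>
#include <vector>

// Assumes the trie type shown under Signature is visible in this translation unit.
// check_delimiter is a hypothetical wrapper; the caller supplies input_text and current_pos.
void check_delimiter(std::string_view input_text, size_t current_pos) {
    // Build a trie for delimiter matching
    std::vector<std::string> delimiters = {"</tool_call>", "</function>"};
    trie delimiter_trie(delimiters);

    // Check for a delimiter starting at current_pos
    auto result = delimiter_trie.check_at(input_text, current_pos);
    if (result == trie::COMPLETE_MATCH) {
        // Found a complete delimiter match
    } else if (result == trie::PARTIAL_MATCH) {
        // Input ended inside a possible delimiter; more input is needed to decide
    } else {
        // trie::NO_MATCH: no delimiter starts at this position
    }
}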
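The same delimiter check can drive an "until" pattern, which consumes text up to, but not including, the first delimiter. The sketch below is self-contained and uses plain string comparison rather than the trie; `until_result` and `scan_until` are hypothetical names, and the behavior on a truncated delimiter is an assumption about how such a parser might signal that it needs more input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result type for illustration.
struct until_result {
    size_t end;        // consumed content is input[start, end)
    bool   found;      // a complete delimiter starts at `end`
    bool   need_more;  // input ended inside a possible delimiter at `end`
};

// Scan forward from `start` until any delimiter begins.
inline until_result scan_until(std::string_view input, size_t start,
                               const std::vector<std::string> & delimiters) {
    assert(start <= input.size());
    for (size_t pos = start; pos < input.size(); ++pos) {
        std::string_view rest = input.substr(pos);
        bool partial = false;
        for (const auto & d : delimiters) {
            std::string_view dv = d;
            if (dv.empty()) {
                continue;  // ignore empty delimiters
            }
            if (rest.size() >= dv.size()) {
                if (rest.substr(0, dv.size()) == dv) {
                    return {pos, true, false};  // delimiter excluded from content
                }
            } else if (dv.substr(0, rest.size()) == rest) {
                partial = true;  // remaining input is a prefix of this delimiter
            }
        }
        if (partial) {
            return {pos, false, true};
        }
    }
    return {input.size(), false, false};  // no delimiter seen so far
}
```

Stopping at the earliest possible delimiter start, even when it is only a prefix, avoids emitting characters such as "</fun" as content when they may turn out to be the start of "</function>".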
