
Implementation:Ggml org Llama cpp Chat Parser

From Leeroopedia
Revision as of 12:38, 16 February 2026 by Admin (Auto-imported from implementations/Ggml_org_Llama_cpp_Chat_Parser.md)
Knowledge Sources
Domains: Chat, Parsing
Last Updated: 2026-02-15 00:00 GMT

Overview

Implements the model output parser that extracts tool calls, content, and reasoning from LLM-generated text across many model-specific formats.

Description

This file is central to the llama.cpp chat infrastructure. The common_chat_msg_parser class keeps a cursor position over the input string and provides methods to consume literals, regex patterns, JSON objects, and whitespace. It offers several parsing strategies: parse_json_tool_calls for regex-delimited JSON blocks, parse_prefixed_json_tool_call_array for JSON arrays introduced by a prefix, and format-specific parsers for DeepSeek, Llama 3.x, Hermes, Functionary, MiniMax, Granite, and others. Partial (streaming) output is handled through healing markers, which are random IDs used to detect where the output was truncated.
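
The cursor-based consumption described above can be illustrated with a small, self-contained sketch. The class cursor_parser and its method names are hypothetical and exist only for this illustration; they are not the real common_chat_msg_parser interface, and std::regex stands in for the project's common_regex.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical illustration only; this is NOT the real common_chat_msg_parser.
class cursor_parser {
  public:
    explicit cursor_parser(std::string input) : input_(std::move(input)) {}

    std::size_t pos() const { return pos_; }
    bool at_end() const { return pos_ >= input_.size(); }

    // Skip any run of whitespace starting at the cursor.
    void consume_spaces() {
        while (pos_ < input_.size() &&
               std::isspace(static_cast<unsigned char>(input_[pos_]))) {
            ++pos_;
        }
    }

    // Advance past `literal` only if it occurs exactly at the cursor.
    bool try_consume_literal(const std::string & literal) {
        if (input_.compare(pos_, literal.size(), literal) != 0) {
            return false;
        }
        pos_ += literal.size();
        return true;
    }

    // Advance past a regex match that begins exactly at the cursor and
    // return the matched text; leave the cursor untouched on failure.
    std::optional<std::string> try_consume_regex(const std::regex & re) {
        assert(pos_ <= input_.size());
        std::smatch m;
        const auto first = input_.cbegin() + static_cast<std::ptrdiff_t>(pos_);
        if (!std::regex_search(first, input_.cend(), m, re,
                               std::regex_constants::match_continuous)) {
            return std::nullopt;
        }
        pos_ += static_cast<std::size_t>(m.length(0));
        return m.str(0);
    }

  private:
    std::string input_;
    std::size_t pos_ = 0;
};
```

Every "try" method either advances the cursor past what it matched or leaves it where it was, so a format-specific parser can probe for one delimiter after another without manual backtracking.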

Usage

The llama.cpp server uses this parser to convert raw model output into structured tool call objects. It is essential for providing streaming function-calling responses across all supported chat formats.

Code Reference

Source Location

Signature

static void parse_prefixed_json_tool_call_array(
    common_chat_msg_parser & builder,
    const common_regex &     prefix,
    size_t                   rstrip_prefix = 0);

static std::string wrap_code_as_arguments(
    common_chat_msg_parser & builder,
    const std::string & code);

static void parse_json_tool_calls(
    common_chat_msg_parser &            builder,
    const std::optional<common_regex> & block_open,
    const std::optional<common_regex> & function_regex_start_only,
    const std::optional<common_regex> & function_regex,
    const common_regex &                close_regex,
    const std::optional<common_regex> & block_close,
    bool                                allow_raw_python = false,
    const std::function<std::string(...)> & get_function_name = nullptr);

Import

#include "chat-parser.h"

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| builder | common_chat_msg_parser & | Yes | Parser state containing the model output text and cursor position |
| prefix/regex | common_regex | Yes | Regex patterns to match tool call boundaries in the output text |
| is_partial | bool | No | Whether the input is a partial streaming response (enables healing markers) |
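
To make the healing-marker idea concrete, here is a rough sketch of one way a truncated JSON fragment could be closed with a marker and the marker later used to recover the partial value. All names (make_healing_marker, heal_truncated_string, strip_from_marker) are hypothetical, the sketch handles only truncation inside a string value, and it is not taken from the llama.cpp sources.

```cpp
#include <cassert>
#include <cctype>
#include <optional>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: build a marker that is unlikely to occur in model output.
std::string make_healing_marker(std::mt19937 & rng) {
    return "$HEAL" + std::to_string(rng()) + "$";
}

// Close a JSON fragment that was cut off inside a string VALUE by appending
// the marker, a closing quote, and every pending bracket. Returns nullopt for
// any other truncation point (this sketch deliberately handles one case only).
std::optional<std::string> heal_truncated_string(const std::string & partial,
                                                 const std::string & marker) {
    assert(!marker.empty());
    std::vector<char> closers;  // pending closing brackets, innermost last
    bool in_string = false, escaped = false, string_is_key = false;
    char last_sig = '\0';       // last significant character outside strings
    for (char c : partial) {
        if (in_string) {
            if (escaped)        { escaped = false; }
            else if (c == '\\') { escaped = true; }
            else if (c == '"')  { in_string = false; last_sig = '"'; }
            continue;
        }
        switch (c) {
            case '"':
                in_string = true;
                // Inside an object, a string right after '{' or ',' is a key.
                string_is_key = !closers.empty() && closers.back() == '}' &&
                                (last_sig == '{' || last_sig == ',');
                break;
            case '{': closers.push_back('}'); last_sig = c; break;
            case '[': closers.push_back(']'); last_sig = c; break;
            case '}':
            case ']':
                if (!closers.empty()) { closers.pop_back(); }
                last_sig = c;
                break;
            default:
                if (!std::isspace(static_cast<unsigned char>(c))) { last_sig = c; }
        }
    }
    if (!in_string || escaped || string_is_key) {
        return std::nullopt;
    }
    std::string healed = partial + marker + '"';
    for (auto it = closers.rbegin(); it != closers.rend(); ++it) {
        healed += *it;
    }
    return healed;
}

// After the healed JSON has been parsed, cut a string value at the marker to
// recover only the text the model had actually produced.
std::string strip_from_marker(const std::string & value, const std::string & marker) {
    const auto idx = value.find(marker);
    return idx == std::string::npos ? value : value.substr(0, idx);
}
```

The design point is that the healed text is valid JSON, so an ordinary JSON parser can read it, while the marker records exactly where the real output ended.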

Outputs

| Name | Type | Description |
|---|---|---|
| tool_calls | vector<common_chat_tool_call> | Extracted tool calls with function names and JSON argument strings |
| content | string | Non-tool-call text content from the model response |
| reasoning | string | Reasoning/thinking text if the model outputs it separately |

Usage Examples

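#include <cstdio>
#include "chat-parser.h"

// The parser is typically used by the server's chat completion handler.
// model_output (the raw generated text) and is_partial (true while streaming)
// are assumed to be supplied by the caller:
common_chat_msg_parser parser(model_output, is_partial);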

// Different formats use different parsing strategies:
// For Llama 3.x style: parse_json_tool_calls with <|python_tag|> prefix
// For DeepSeek style: parse with <tool_call> delimiters
// For Hermes style: parse with <tool_call> JSON blocks

// After parsing, extract results:
auto & msg = parser.result();
for (const auto & tc : msg.tool_calls) {
    printf("Function: %s, Args: %s\n", tc.name.c_str(), tc.arguments.c_str());
}
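
As a standalone illustration of the "JSON blocks between <tool_call> delimiters" strategy mentioned in the comments above, the following sketch separates plain content from tool-call payloads using only the standard library. The struct extracted_output and the function extract_tool_call_blocks are hypothetical names, std::regex stands in for common_regex, and the payloads are returned as raw JSON text rather than parsed objects.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type for this sketch (not a llama.cpp type).
struct extracted_output {
    std::string              content;         // text outside tool-call blocks
    std::vector<std::string> tool_call_json;  // raw JSON payload of each block
};

// Split `text` into plain content and the payloads found between
// <tool_call> ... </tool_call> delimiters. Only complete blocks are matched.
extracted_output extract_tool_call_blocks(const std::string & text) {
    static const std::regex block_re(R"(<tool_call>\s*([\s\S]*?)\s*</tool_call>)");
    extracted_output out;
    std::size_t last = 0;  // end offset of the previous block
    for (auto it = std::sregex_iterator(text.begin(), text.end(), block_re);
         it != std::sregex_iterator(); ++it) {
        const std::smatch & m = *it;
        const auto start = static_cast<std::size_t>(m.position(0));
        out.content += text.substr(last, start - last);  // text before the block
        out.tool_call_json.push_back(m.str(1));          // payload, whitespace-trimmed
        last = start + static_cast<std::size_t>(m.length(0));
    }
    assert(last <= text.size());
    out.content += text.substr(last);  // trailing text after the final block
    return out;
}
```

A block whose closing delimiter has not arrived yet is left in content by this sketch; handling that case during streaming is where partial-input support comes in.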

Related Pages
