Implementation:Ggml org Llama cpp Chat Parser Header

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Chat, Parsing
Last Updated	2026-02-15 00:00 GMT

Overview

Declares the `common_chat_msg_parser` class that provides a cursor-based API for parsing model-generated text into structured chat messages with tool calls.

Description

The parser holds the input string, a current position (the cursor), and the `common_chat_msg` result under construction. It exposes methods for consuming literals, regex patterns, JSON (including partial JSON), whitespace, and the rest of the input. The `add_content`, `add_reasoning_content`, and `add_tool_call` methods populate the result. A `healing_marker` (a random string) lets the parser handle partial or streaming JSON gracefully. `common_chat_msg_partial_exception` signals incomplete output during streaming. The `consume_json_with_dumped_args` method converts selected JSON subtrees into stringified arguments.
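The healing-marker idea can be illustrated without a JSON library. The sketch below is a simplified stand-in, not the code in json-partial.h: `make_marker`, `heal_json`, and `cut_at_marker` are hypothetical names, and the healing step assumes the fragment was cut off inside a string value (not inside a key or between tokens). It closes the fragment so a regular JSON parser could accept it, then uses the marker to recover where the real text ended.

```cpp
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Hypothetical helper: pick a numeric marker that does not occur in the input.
std::string make_marker(const std::string & input, std::mt19937 & rng) {
    for (;;) {
        std::string candidate = std::to_string(rng());
        if (input.find(candidate) == std::string::npos) {
            return candidate;
        }
    }
}

// Hypothetical helper: close a JSON fragment that was truncated inside a
// string value. If the cut is not inside a string, only the missing
// closing braces/brackets are appended and no marker is inserted.
std::string heal_json(const std::string & partial, const std::string & marker) {
    assert(!marker.empty());
    std::vector<char> closers;  // closing characters still owed, innermost last
    bool in_string = false;
    bool escaped   = false;
    for (char c : partial) {
        if (in_string) {
            if (escaped)        { escaped = false; }
            else if (c == '\\') { escaped = true; }
            else if (c == '"')  { in_string = false; }
            continue;
        }
        if (c == '"')      { in_string = true; }
        else if (c == '{') { closers.push_back('}'); }
        else if (c == '[') { closers.push_back(']'); }
        else if ((c == '}' || c == ']') && !closers.empty()) { closers.pop_back(); }
    }
    std::string healed = partial;
    if (in_string) {
        if (escaped) { healed.pop_back(); }  // drop a dangling backslash
        healed += marker;
        healed += '"';
    }
    for (auto it = closers.rbegin(); it != closers.rend(); ++it) {
        healed += *it;
    }
    return healed;
}

// Hypothetical helper: cut text at the marker to recover the partial prefix.
std::string cut_at_marker(const std::string & text, const std::string & marker) {
    const auto idx = text.find(marker);
    return idx == std::string::npos ? text : text.substr(0, idx);
}
```

A caller would typically generate the marker once per parse, heal the fragment, hand the healed text to a JSON parser, and cut any re-serialized string at the marker so that only text the model actually produced is reported.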

Usage

Use this class when implementing format-specific parsing logic for model output. It is the parser interface used by all format-specific implementations in chat-parser.cpp, enabling uniform tool call extraction from diverse model output formats (Llama, DeepSeek, Hermes, Functionary, etc.).
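To make the "one cursor interface, many formats" idea concrete, here is a minimal sketch. Everything in it is hypothetical: `cursor`, `message`, `tool_call`, and the two toy formats (`<tool_call>name|args</tool_call>` and `[CALL name(args)]`) are invented for illustration and do not correspond to any real model format or to the actual `common_chat_msg_parser` API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct tool_call { std::string name; std::string arguments; };
struct message   { std::string content; std::vector<tool_call> tool_calls; };

// Hypothetical, simplified cursor over an input string.
class cursor {
    std::string input_;
    std::size_t pos_ = 0;

  public:
    explicit cursor(std::string input) : input_(std::move(input)) {}

    // Return the text before `stop` and move past `stop`.
    // If `stop` is absent, return nullopt and leave the cursor unchanged.
    std::optional<std::string> consume_until(const std::string & stop) {
        assert(!stop.empty());
        const auto idx = input_.find(stop, pos_);
        if (idx == std::string::npos) {
            return std::nullopt;
        }
        std::string text = input_.substr(pos_, idx - pos_);
        pos_ = idx + stop.size();
        return text;
    }

    // Return everything from the cursor to the end and move to the end.
    std::string consume_rest() {
        std::string rest = input_.substr(pos_);
        pos_ = input_.size();
        return rest;
    }
};

// Shared extraction logic: text before `open` is content, then
// `name`, `sep`, `arguments`, `close`; anything left over is content too.
message parse_with(const std::string & input, const std::string & open,
                   const std::string & sep, const std::string & close) {
    cursor c(input);
    message msg;
    if (auto before = c.consume_until(open)) {
        msg.content = *before;
        auto name = c.consume_until(sep);
        auto args = c.consume_until(close);
        if (name && args) {
            msg.tool_calls.push_back({*name, *args});
        }
    }
    msg.content += c.consume_rest();
    return msg;
}

// Two format-specific entry points that differ only in their delimiters.
message parse_tagged(const std::string & input) {
    return parse_with(input, "<tool_call>", "|", "</tool_call>");
}
message parse_bracketed(const std::string & input) {
    return parse_with(input, "[CALL ", "(", ")]");
}
```

Both entry points return the same `message` shape, which is the property the format-specific implementations rely on: callers can treat tool calls uniformly regardless of how the model spelled them.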

Code Reference

Source Location

Repository: Ggml_org_Llama_cpp
File: common/chat-parser.h
Lines: 1-133

Signature

class common_chat_msg_partial_exception : public std::runtime_error {
  public:
    common_chat_msg_partial_exception(const std::string & message);
};

class common_chat_msg_parser {
    std::string input_;
    bool is_partial_;
    common_chat_parser_params syntax_;
    std::string healing_marker_;
    size_t pos_ = 0;
    common_chat_msg result_;

  public:
    common_chat_msg_parser(const std::string & input, bool is_partial,
        const common_chat_parser_params & syntax);

    const std::string & input() const;
    size_t pos() const;
    const std::string & healing_marker() const;
    const bool & is_partial() const;
    const common_chat_msg & result() const;

    void move_to(size_t pos);
    void move_back(size_t n);
    std::string str(const common_string_range & rng) const;

    void add_content(const std::string & content);
    void add_reasoning_content(const std::string & reasoning_content);
    bool add_tool_call(const std::string & name, const std::string & id,
        const std::string & arguments);
    bool add_tool_call(const nlohmann::ordered_json & tool_call);
    bool add_tool_calls(const nlohmann::ordered_json & arr);
    bool add_tool_call_short_form(const nlohmann::ordered_json & tool_call);

    void finish();
    bool consume_spaces();
    void consume_literal(const std::string & literal);
    bool try_parse_reasoning(const std::string & start_think,
        const std::string & end_think);
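    std::string consume_rest();
    // The regex and JSON consumption methods mentioned in the Description
    // (e.g. consume_json_with_dumped_args) are not shown in this excerpt.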
};

Import

#include "chat.h"
#include "chat-parser-xml-toolcall.h"
#include "json-partial.h"
#include "regex-partial.h"
#include <nlohmann/json_fwd.hpp>
#include <optional>
#include <string>
#include <vector>

I/O Contract

Inputs

Name	Type	Required	Description
input	string	Yes	Raw model-generated text to parse
is_partial	bool	Yes	Whether the input is incomplete (streaming mode)
syntax	common_chat_parser_params	Yes	Parser parameters including format-specific settings

Outputs

Name	Type	Description
result	common_chat_msg	Structured chat message with role, content, reasoning_content, and tool_calls
exception	common_chat_msg_partial_exception	Thrown when input is incomplete during streaming
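The exception row above can be illustrated with a small streaming loop. This is a sketch under stated assumptions, not the library's behavior: `partial_error` stands in for `common_chat_msg_partial_exception`, and `parse_reasoning` and `run_stream` are hypothetical helpers. It shows one plausible way a caller might react to an "incomplete input" signal: keep buffering chunks and retry until the parse succeeds.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for common_chat_msg_partial_exception.
struct partial_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Extract the text between <think> and </think> at the start of the input.
// Returns "" when the input clearly has no reasoning block, and throws
// partial_error when more input is needed to decide.
std::string parse_reasoning(const std::string & input) {
    const std::string open  = "<think>";
    const std::string close = "</think>";
    // Shorter than the opening tag but consistent with it: cannot decide yet.
    if (input.size() < open.size() && open.compare(0, input.size(), input) == 0) {
        throw partial_error("opening tag incomplete");
    }
    if (input.compare(0, open.size(), open) != 0) {
        return "";
    }
    const auto end = input.find(close, open.size());
    if (end == std::string::npos) {
        throw partial_error("closing tag not seen yet");
    }
    assert(end >= open.size());
    return input.substr(open.size(), end - open.size());
}

struct stream_result {
    int incomplete_attempts = 0;           // parses that hit partial_error
    std::optional<std::string> reasoning;  // set once a parse succeeds
};

// Append chunks to a buffer and retry the parse after each one.
stream_result run_stream(const std::vector<std::string> & chunks) {
    stream_result out;
    std::string buffer;
    for (const auto & chunk : chunks) {
        buffer += chunk;
        try {
            out.reasoning = parse_reasoning(buffer);
            break;
        } catch (const partial_error &) {
            ++out.incomplete_attempts;
        }
    }
    return out;
}
```

Using a dedicated exception type keeps "not enough input yet" distinguishable from genuine parse errors, which a caller would normally not want to retry.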

Usage Examples

#include "chat-parser.h"

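// Parse complete model output (is_partial = false).
// `model_output` (std::string) and `syntax_params` (common_chat_parser_params)
// are assumed to be defined by the caller.
common_chat_msg_parser parser(model_output, false, syntax_params);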

// Try to parse reasoning block
parser.try_parse_reasoning("<think>", "</think>");

// Consume expected literal prefix
parser.consume_literal("<tool_call>");

// Add extracted tool call
parser.add_tool_call("get_weather", "call_123", "{\"city\": \"Paris\"}");

// Consume remaining content
std::string remaining = parser.consume_rest();
parser.add_content(remaining);

// Finalize and get result
parser.finish();
common_chat_msg msg = parser.result();

Related Pages

Principle:Ggml_org_Llama_cpp_Chat_Parsing
