Implementation:Ggml org Llama cpp Chat Parser Header
| Knowledge Sources | |
|---|---|
| Domains | Chat, Parsing |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Declares the `common_chat_msg_parser` class that provides a cursor-based API for parsing model-generated text into structured chat messages with tool calls.
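To make the "cursor-based" idea concrete, here is a minimal, self-contained sketch. The class `mini_cursor_parser` and all of its members are hypothetical stand-ins written for illustration; it is not the llama.cpp implementation and it models only a plain string result rather than a `common_chat_msg`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical, simplified illustration of a cursor-based parser.
// Names and behavior are assumptions, not the real common_chat_msg_parser.
class mini_cursor_parser {
    std::string input_;
    std::size_t pos_ = 0;   // the cursor: index of the next unread character
    std::string content_;   // stands in for the result message being built

public:
    explicit mini_cursor_parser(std::string input) : input_(std::move(input)) {}

    std::size_t pos() const { return pos_; }
    const std::string & content() const { return content_; }

    // Advance past `literal` only if the input continues with it at the cursor.
    bool try_consume_literal(const std::string & literal) {
        if (input_.compare(pos_, literal.size(), literal) != 0) {
            return false;
        }
        pos_ += literal.size();
        return true;
    }

    // Skip whitespace and report whether anything was skipped.
    bool consume_spaces() {
        const std::size_t start = pos_;
        while (pos_ < input_.size() &&
               std::isspace(static_cast<unsigned char>(input_[pos_]))) {
            ++pos_;
        }
        return pos_ > start;
    }

    // Return everything from the cursor to the end and move the cursor there.
    std::string consume_rest() {
        std::string rest = input_.substr(pos_);
        pos_ = input_.size();
        return rest;
    }

    // Append text to the result being built.
    void add_content(const std::string & text) { content_ += text; }
};
```

Each call either moves the cursor forward or leaves it untouched, so format-specific code can probe for a prefix without losing its place in the input.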
Description
The parser holds the input string, a current position (the cursor), and the `common_chat_msg` result being built. It exposes methods for consuming literals, regex patterns, JSON (including partial JSON), whitespace, and the rest of the input. The `add_content`, `add_reasoning_content`, and `add_tool_call` methods populate the result. A `healing_marker` (a random string) lets the parser handle partial or streaming JSON gracefully. The `common_chat_msg_partial_exception` signals incomplete output during streaming. The `consume_json_with_dumped_args` method converts selected JSON subtrees into stringified arguments. The listing under Signature is abridged and omits the regex and JSON methods.
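The healing-marker idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: `heal_json` and `cut_at_marker` are hypothetical helpers, the marker is passed in rather than generated randomly, and only truncation inside a string value or between array/object elements is handled. The actual llama.cpp logic is more general and is not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: close a truncated JSON text so it can be parsed.
// If the text ends inside a string, the marker is appended before the
// closing quote so the truncation point can be found again later.
// Limitation: truncation inside a key or right after ':' or ',' is not handled.
std::string heal_json(const std::string & partial, const std::string & marker) {
    std::vector<char> closers;   // pending '}' / ']' in opening order
    bool in_string = false;
    bool escaped   = false;
    for (char c : partial) {
        if (in_string) {
            if (escaped)        { escaped = false; }
            else if (c == '\\') { escaped = true; }
            else if (c == '"')  { in_string = false; }
        } else if (c == '"') {
            in_string = true;
        } else if (c == '{') {
            closers.push_back('}');
        } else if (c == '[') {
            closers.push_back(']');
        } else if ((c == '}' || c == ']') && !closers.empty()) {
            closers.pop_back();
        }
    }
    std::string healed = partial;
    if (in_string) {
        if (escaped) {
            healed.pop_back();   // drop a dangling backslash
        }
        healed += marker;
        healed += '"';
    }
    for (auto it = closers.rbegin(); it != closers.rend(); ++it) {
        healed += *it;
    }
    return healed;
}

// Hypothetical helper: after re-serializing the healed JSON, keep only the
// part that was really present in the stream (everything before the marker).
std::string cut_at_marker(const std::string & dumped, const std::string & marker) {
    const std::size_t at = dumped.find(marker);
    return at == std::string::npos ? dumped : dumped.substr(0, at);
}
```

A random marker matters because it must not collide with text the model actually produced; a fixed marker such as `$M$` is used in this sketch only to keep it readable.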
Usage
Use this class when implementing format-specific parsing logic for model output. It is the parser interface used by all format-specific implementations in `chat-parser.cpp`, which gives uniform tool-call extraction across diverse model output formats (Llama, DeepSeek, Hermes, Functionary, etc.).
Code Reference
Source Location
- Repository: Ggml_org_Llama_cpp
- File: common/chat-parser.h
- Lines: 1-133
Signature
class common_chat_msg_partial_exception : public std::runtime_error {
  public:
    common_chat_msg_partial_exception(const std::string & message);
};

class common_chat_msg_parser {
    std::string input_;
    bool is_partial_;
    common_chat_parser_params syntax_;
    std::string healing_marker_;
    size_t pos_ = 0;
    common_chat_msg result_;

  public:
    common_chat_msg_parser(const std::string & input, bool is_partial,
                           const common_chat_parser_params & syntax);

    const std::string & input() const;
    size_t pos() const;
    const std::string & healing_marker() const;
    const bool & is_partial() const;
    const common_chat_msg & result() const;

    void move_to(size_t pos);
    void move_back(size_t n);
    std::string str(const common_string_range & rng) const;

    void add_content(const std::string & content);
    void add_reasoning_content(const std::string & reasoning_content);
    bool add_tool_call(const std::string & name, const std::string & id,
                       const std::string & arguments);
    bool add_tool_call(const nlohmann::ordered_json & tool_call);
    bool add_tool_calls(const nlohmann::ordered_json & arr);
    bool add_tool_call_short_form(const nlohmann::ordered_json & tool_call);

    void finish();
    bool consume_spaces();
    void consume_literal(const std::string & literal);
    bool try_parse_reasoning(const std::string & start_think,
                             const std::string & end_think);
    std::string consume_rest();
};
Import
#include "chat.h"
#include "chat-parser-xml-toolcall.h"
#include "json-partial.h"
#include "regex-partial.h"
#include <nlohmann/json_fwd.hpp>
#include <optional>
#include <string>
#include <vector>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input | std::string | Yes | Raw model-generated text to parse |
| is_partial | bool | Yes | Whether the input is incomplete (streaming mode) |
| syntax | common_chat_parser_params | Yes | Parser parameters including format-specific settings |
Outputs
| Name | Type | Description |
|---|---|---|
| result | common_chat_msg | Structured chat message with role, content, reasoning_content, and tool_calls |
| exception | common_chat_msg_partial_exception | Thrown when input is incomplete during streaming |
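The contract above says an exception is thrown when streamed input is incomplete. The sketch below shows one plausible way a caller could react to such a signal. `partial_exception`, `expect_literal`, `parse_call`, and the toy `<call>...</call>` format are all hypothetical; the policy of rethrowing for complete input is an assumption chosen for the example, not a description of what llama.cpp does.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for common_chat_msg_partial_exception.
struct partial_exception : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Require `literal` at offset `pos`; throw if the input does not contain it
// there. Returns the offset just past the literal.
std::size_t expect_literal(const std::string & input, std::size_t pos,
                           const std::string & literal) {
    if (input.compare(pos, literal.size(), literal) != 0) {
        throw partial_exception("expected literal: " + literal);
    }
    return pos + literal.size();
}

struct parse_outcome {
    bool        complete;
    std::string body;
};

// Toy format: "<call>" BODY "</call>".
parse_outcome parse_call(const std::string & input, bool is_partial) {
    try {
        const std::size_t begin = expect_literal(input, 0, "<call>");
        const std::size_t end   = input.find("</call>", begin);
        if (end == std::string::npos) {
            throw partial_exception("missing </call>");
        }
        return {true, input.substr(begin, end - begin)};
    } catch (const partial_exception &) {
        if (!is_partial) {
            throw;            // complete input that does not parse: propagate
        }
        return {false, ""};   // streaming: wait for more tokens and retry
    }
}
```

In a streaming loop, the caller would invoke the parse again each time new text arrives and treat an incomplete outcome as "nothing new to emit yet".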
Usage Examples
#include "chat-parser.h"

// `model_output` (std::string) and `syntax_params` (common_chat_parser_params)
// are assumed to be provided by the caller.

// Parse complete model output (is_partial = false)
common_chat_msg_parser parser(model_output, /* is_partial = */ false, syntax_params);

// Try to parse a reasoning block delimited by the given tags
parser.try_parse_reasoning("<think>", "</think>");

// Consume the expected literal prefix
parser.consume_literal("<tool_call>");

// Add the extracted tool call: name, id, and arguments as a JSON string
parser.add_tool_call("get_weather", "call_123", "{\"city\": \"Paris\"}");

// Consume the remaining input and store it as content
std::string remaining = parser.consume_rest();
parser.add_content(remaining);

// Finalize and get the result
parser.finish();
common_chat_msg msg = parser.result();
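The `try_parse_reasoning` call above takes a start tag and an end tag. This page does not show how the real method treats whitespace, unclosed blocks, or format settings, so the following is only a sketch of the general idea of separating a tagged reasoning block from the remaining content. `split_message` and `split_reasoning` are hypothetical names, and treating an unclosed block as all reasoning is an assumption.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: reasoning text and regular content, kept apart.
struct split_message {
    std::string reasoning;
    std::string content;
};

// Hypothetical helper: if `text` starts with `start_think`, everything up to
// `end_think` is treated as reasoning and the remainder as content.
split_message split_reasoning(const std::string & text,
                              const std::string & start_think,
                              const std::string & end_think) {
    split_message out;
    if (text.compare(0, start_think.size(), start_think) != 0) {
        out.content = text;                 // no reasoning block at the start
        return out;
    }
    const std::size_t body = start_think.size();
    const std::size_t end  = text.find(end_think, body);
    if (end == std::string::npos) {
        out.reasoning = text.substr(body);  // unclosed block: all reasoning (assumption)
        return out;
    }
    out.reasoning = text.substr(body, end - body);
    out.content   = text.substr(end + end_think.size());
    return out;
}
```

In terms of the class documented here, the two fields would correspond to what is passed to `add_reasoning_content` and `add_content` respectively.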