
Implementation:ClickHouse ClickHouse Poco MessageHeader

From Leeroopedia


base/poco/Net/src/MessageHeader.cpp:1-424 ClickHouse_ClickHouse ClickHouse_ClickHouse_HTTP_Client_Communication

Purpose

Implements the `Poco::Net::MessageHeader` class, which handles parsing and serialization of HTTP message headers. This class manages reading RFC 2822-style name-value header fields from input streams, writing them to output streams, splitting header values into elements and parameters, quoting values, and decoding RFC 2047 encoded words. It is the foundation for both `HTTPRequest` and `HTTPResponse` header handling.

Code Reference

Header Serialization (write)

void MessageHeader::write(std::ostream& ostr) const
{
    NameValueCollection::ConstIterator it = begin();
    while (it != end())
    {
        ostr << it->first << ": " << it->second << "\r\n";
        ++it;
    }
}
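To make the wire format concrete, here is a minimal standalone sketch of the same loop over a plain `std::vector` of name/value pairs. It uses only the standard library; `HeaderList`, `writeHeaders`, and `serializeHeaders` are hypothetical names chosen for illustration and are not part of Poco.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for NameValueCollection: an ordered list of pairs.
using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Emits each field as "Name: Value" terminated by CRLF, in insertion order.
void writeHeaders(const HeaderList& headers, std::ostream& ostr)
{
    for (const auto& field : headers)
        ostr << field.first << ": " << field.second << "\r\n";
}

// Convenience wrapper that serializes into a string.
std::string serializeHeaders(const HeaderList& headers)
{
    std::ostringstream out;
    writeHeaders(headers, out);
    return out.str();
}
```

As in the excerpt above, only the fields themselves are written. A caller that needs the blank line ending an HTTP header section would have to emit that final CRLF separately.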

Header Parsing (read)

Reads headers character by character from the stream's underlying `std::streambuf`, extracting each field name and value, joining folded lines (continuation lines starting with a space or tab), and enforcing the configurable limits. The excerpt is abridged; the elided steps appear as comments:

void MessageHeader::read(std::istream& istr)
{
    static const int eof = std::char_traits<char>::eof();
    std::streambuf& buf = *istr.rdbuf();
    std::string name;
    std::string value;
    int ch = buf.sbumpc();
    int fields = 0;
    while (ch != eof && ch != '\r' && ch != '\n')
    {
        if (_fieldLimit > 0 && fields == _fieldLimit)
            throw MessageException("Too many header fields");
        // Parse name up to ':'
        // Parse value up to CRLF
        // Handle line folding (continuation with SP/HT)
        Poco::trimRightInPlace(value);
        add(name, decodeWord(value));
        ++fields;
    }
    istr.putback(ch);
}
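The steps elided in the excerpt can be sketched with the standard library alone. The following is an illustrative approximation, not the Poco source: `parseHeaders` and `HeaderList` are hypothetical names, only the field-count limit is modeled, no RFC 2047 decoding is applied, and the terminating character is not put back into the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <streambuf>
#include <string>
#include <utility>
#include <vector>

using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Parses "Name: value" lines until a blank line or end of input.
// fieldLimit == 0 means "no limit" (an assumption of this sketch).
HeaderList parseHeaders(std::istream& istr, std::size_t fieldLimit = 0)
{
    const int eof = std::char_traits<char>::eof();
    std::streambuf& buf = *istr.rdbuf();
    HeaderList fields;
    int ch = buf.sbumpc();
    while (ch != eof && ch != '\r' && ch != '\n')
    {
        if (fieldLimit > 0 && fields.size() == fieldLimit)
            throw std::runtime_error("Too many header fields");

        std::string name, value;
        // Field name: everything up to ':'.
        while (ch != eof && ch != ':' && ch != '\n')
        {
            name += static_cast<char>(ch);
            ch = buf.sbumpc();
        }
        if (ch == '\n') { ch = buf.sbumpc(); continue; }  // no colon: skip the line
        if (ch == eof) throw std::runtime_error("No colon found");
        ch = buf.sbumpc();                                 // consume ':'
        while (ch == ' ' || ch == '\t') ch = buf.sbumpc(); // skip leading blanks

        // Field value, plus any folded continuation lines.
        bool more = true;
        while (more)
        {
            while (ch != eof && ch != '\r' && ch != '\n')
            {
                value += static_cast<char>(ch);
                ch = buf.sbumpc();
            }
            if (ch == '\r') ch = buf.sbumpc();
            if (ch == '\n') ch = buf.sbumpc();
            more = (ch == ' ' || ch == '\t');  // continuation starts with SP/HT
        }
        // Trim trailing blanks from the assembled value.
        while (!value.empty() && (value.back() == ' ' || value.back() == '\t'))
            value.pop_back();
        fields.emplace_back(name, value);
    }
    return fields;
}
```

In this sketch the leading whitespace of a continuation line is kept as part of the value, so a folded field is joined with the blanks that introduced the continuation.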

Element Splitting (splitElements)

Splits a comma-separated header value (e.g., `Accept`, `Cache-Control`) into individual elements, respecting quoted strings and backslash escapes:

void MessageHeader::splitElements(const std::string& s,
    std::vector<std::string>& elements, bool ignoreEmpty)
{
    // Handles: quoted strings ("..."), backslash escapes, comma delimiters
    // Trims whitespace from each element
}
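Since the body is elided above, a small standalone sketch may help show what quote-aware comma splitting involves. `splitCommaList` and `trimCopy` are hypothetical helpers written for this page. Unlike a faithful port, this version copies quotes and backslashes through unchanged, which may differ from what `splitElements` does with escapes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Strips leading and trailing spaces/tabs.
static std::string trimCopy(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits on commas that are outside double quotes. Quotes and backslash
// escapes are copied through unchanged so each element stays verbatim.
std::vector<std::string> splitCommaList(const std::string& s, bool ignoreEmpty = true)
{
    std::vector<std::string> elements;
    std::string current;
    bool inQuotes = false;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        const char c = s[i];
        if (c == '\\' && i + 1 < s.size())
        {
            current += c;       // keep the backslash ...
            current += s[++i];  // ... and the escaped character
        }
        else if (c == '"')
        {
            inQuotes = !inQuotes;
            current += c;
        }
        else if (c == ',' && !inQuotes)
        {
            const std::string element = trimCopy(current);
            if (!element.empty() || !ignoreEmpty) elements.push_back(element);
            current.clear();
        }
        else
        {
            current += c;
        }
    }
    const std::string last = trimCopy(current);
    if (!last.empty() || !ignoreEmpty) elements.push_back(last);
    return elements;
}
```

The key design point is the `inQuotes` flag: a comma only acts as a delimiter while the flag is clear, so a value such as `a="x, y", b` yields two elements rather than three.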

Parameter Splitting (splitParameters)

Splits a header value into a primary value and semicolon-separated parameters (e.g., `Content-Type: text/html; charset=utf-8`):

void MessageHeader::splitParameters(const std::string& s,
    std::string& value, NameValueCollection& parameters)
{
    // Extract value before first ';'
    // Parse key=value pairs separated by ';'
    // Handles quoted parameter values
}
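The three commented steps can be illustrated with a standalone sketch. `splitValueAndParams` and `trimBlanks` are hypothetical names, and a `std::map` stands in for `NameValueCollection`, so parameter names are case-sensitive here and duplicates overwrite each other. Backslash escapes inside quoted values are not interpreted in this simplified version.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Strips leading and trailing spaces/tabs.
static std::string trimBlanks(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits "value; k1=v1; k2=\"v2\"" into the primary value and a name->value map.
void splitValueAndParams(const std::string& s, std::string& value,
                         std::map<std::string, std::string>& params)
{
    value.clear();
    params.clear();

    // Cut the input into ';'-separated pieces, ignoring ';' inside quotes.
    std::vector<std::string> pieces;
    std::string current;
    bool inQuotes = false;
    for (char c : s)
    {
        if (c == '"') inQuotes = !inQuotes;
        if (c == ';' && !inQuotes) { pieces.push_back(current); current.clear(); }
        else current += c;
    }
    pieces.push_back(current);

    value = trimBlanks(pieces[0]);  // everything before the first ';'
    for (std::size_t i = 1; i < pieces.size(); ++i)
    {
        const std::string piece = trimBlanks(pieces[i]);
        if (piece.empty()) continue;
        const auto eq = piece.find('=');
        const std::string name = trimBlanks(piece.substr(0, eq));
        std::string val = (eq == std::string::npos) ? "" : trimBlanks(piece.substr(eq + 1));
        // Strip one pair of surrounding double quotes, if present.
        if (val.size() >= 2 && val.front() == '"' && val.back() == '"')
            val = val.substr(1, val.size() - 2);
        params[name] = val;
    }
}
```

Applied to `text/html; charset=utf-8; boundary="----abc"`, the sketch yields the primary value `text/html` with `charset` mapped to `utf-8` and `boundary` mapped to `----abc`.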

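RFC 2047 Decoding (decodeWord, decodeRFC2047)

`decodeWord` resolves encoded words of the form `=?charset?encoding?text?=` within a header value. Each word is decoded by `decodeRFC2047`, shown abridged below, which supports both Base64 (`B`) and Quoted-Printable (`Q`) encodings: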

void MessageHeader::decodeRFC2047(const std::string& ins, std::string& outs,
    const std::string& charset_to)
{
    StringTokenizer tokens(ins, "?");
    std::string charset = toUpper(tokens[0]);
    std::string encoding = toUpper(tokens[1]);
    std::string text = tokens[2];
    if (encoding == "B")
    {
        // Base64 decode
    }
    else if (encoding == "Q")
    {
        // Quoted-Printable decode ('_' maps to space, '=' introduces hex pair)
    }
    // Character set conversion if needed
}
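The two elided branches can be illustrated with a self-contained sketch that decodes a single encoded word. `decodeEncodedWord`, `decodeQ`, `decodeB`, and `hexValue` are hypothetical names written for this page. The sketch performs no character set conversion and returns the raw decoded bytes, so it should be read as an approximation of the technique rather than of `decodeRFC2047` itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the value of a hex digit, or -1 if c is not one.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Decodes the "Q" encoding: '_' is a space, "=XX" is a hex-encoded byte.
std::string decodeQ(const std::string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char c = text[i];
        if (c == '_')
        {
            out += ' ';
        }
        else if (c == '=' && i + 2 < text.size()
                 && hexValue(text[i + 1]) >= 0 && hexValue(text[i + 2]) >= 0)
        {
            out += static_cast<char>(hexValue(text[i + 1]) * 16 + hexValue(text[i + 2]));
            i += 2;
        }
        else
        {
            out += c;
        }
    }
    return out;
}

// Decodes the "B" (Base64) encoding; stops at padding or an invalid character.
std::string decodeB(const std::string& text)
{
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    unsigned int bits = 0;
    int bitCount = 0;
    for (char c : text)
    {
        const auto pos = alphabet.find(c);
        if (pos == std::string::npos) break;  // '=' padding or invalid input
        bits = (bits << 6) | static_cast<unsigned int>(pos);
        bitCount += 6;
        if (bitCount >= 8)
        {
            bitCount -= 8;
            out += static_cast<char>((bits >> bitCount) & 0xFF);
        }
    }
    return out;
}

// Decodes one encoded word of the form "=?charset?encoding?text?=".
// Returns the input unchanged if it does not match that shape.
std::string decodeEncodedWord(const std::string& word)
{
    if (word.size() < 8 || word.compare(0, 2, "=?") != 0
        || word.compare(word.size() - 2, 2, "?=") != 0)
        return word;
    const std::string inner = word.substr(2, word.size() - 4);  // charset?enc?text
    const auto q1 = inner.find('?');
    if (q1 == std::string::npos) return word;
    const auto q2 = inner.find('?', q1 + 1);
    if (q2 == std::string::npos) return word;
    const std::string encoding = inner.substr(q1 + 1, q2 - q1 - 1);
    const std::string text = inner.substr(q2 + 1);
    if (encoding == "B" || encoding == "b") return decodeB(text);
    if (encoding == "Q" || encoding == "q") return decodeQ(text);
    return word;  // unknown encoding: leave untouched
}
```

For example, `=?UTF-8?Q?Hello_World?=` decodes to `Hello World`, and `=?UTF-8?B?SGVsbG8=?=` decodes to `Hello`.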

I/O Contract

| Input | Output | Side Effects |
|---|---|---|
| `std::ostream&` via `write` | Serialized headers as `Name: Value\r\n` lines | Writes to output stream |
| `std::istream&` via `read` | Populated header collection | Reads from input stream; throws `MessageException` on malformed headers or limit violations |
| `std::string` via `splitElements` | `std::vector<std::string>` of elements | None |
| `std::string` via `splitParameters` | Primary value + `NameValueCollection` of parameters | None |
| `std::string` via `decodeWord` | Decoded string with RFC 2047 words resolved | None; character set conversion may silently fall back on unsupported encodings |
| `std::string` + `std::string&` via `quote` | Quoted string appended to result | Adds surrounding double-quotes if value contains special characters |
| `std::string` via `hasToken` | `bool` indicating token presence | None |

Usage Examples

// Writing headers (outputStream is any std::ostream)
Poco::Net::MessageHeader outHeader;
outHeader.add("Content-Type", "text/html; charset=utf-8");
outHeader.add("X-Custom", "value");
outHeader.write(outputStream);

// Reading headers from a stream (inputStream is any std::istream);
// set the limits before calling read()
Poco::Net::MessageHeader inHeader;
inHeader.setFieldLimit(100);
inHeader.setNameLengthLimit(256);
inHeader.setValueLengthLimit(8192);
inHeader.read(inputStream);

// Splitting comma-separated elements
std::vector<std::string> elements;
Poco::Net::MessageHeader::splitElements("gzip, deflate, br", elements, true);
// elements: {"gzip", "deflate", "br"}

// Splitting parameters
std::string value;
Poco::Net::NameValueCollection params;
Poco::Net::MessageHeader::splitParameters(
    "text/html; charset=utf-8; boundary=\"----abc\"", value, params);
// value: "text/html", params: {charset: "utf-8", boundary: "----abc"}

// Token check (case-insensitive) on the headers read above
bool hasGzip = inHeader.hasToken("Accept-Encoding", "gzip");

Internal Details

  • The `read` method operates directly on the `std::streambuf` (via `sbumpc`) for performance, avoiding the overhead of `std::istream::get`.
  • Three configurable limits protect against malicious input: `_fieldLimit` (default: `DFL_FIELD_LIMIT`), `_nameLengthLimit` (default: `DFL_NAME_LENGTH_LIMIT`), and `_valueLengthLimit` (default: `DFL_VALUE_LENGTH_LIMIT`).
  • Line folding per RFC 2822 is supported: continuation lines beginning with SP or HT are appended to the previous header value.
  • Invalid header lines (lines without a colon) are silently ignored (the parser skips to the next line).
  • The `decodeWord` method iterates to decode multiple RFC 2047 encoded words within a single header value.
  • The static `quote` method adds surrounding double quotes only if the value contains a character that is neither alphanumeric nor one of `.`, `_`, `-`; spaces are also left unquoted when the caller allows them.
  • The `hasToken` method performs case-insensitive comparison of comma-separated tokens.
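The quoting rule and the case-insensitive token check described in the list above can be sketched as follows. `quoteIfNeeded`, `equalsIgnoreCase`, and `listHasToken` are hypothetical standalone helpers, not Poco functions. The token check here splits on plain commas only and does not handle quoted elements.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Appends value to result, wrapped in double quotes only when it contains a
// character outside [A-Za-z0-9._-] (whitespace is tolerated if allowSpace is true).
void quoteIfNeeded(const std::string& value, std::string& result, bool allowSpace = false)
{
    bool mustQuote = false;
    for (unsigned char c : value)
    {
        const bool plain = std::isalnum(c) || c == '.' || c == '_' || c == '-';
        const bool okSpace = allowSpace && std::isspace(c);
        if (!plain && !okSpace) { mustQuote = true; break; }
    }
    if (mustQuote) result += '"';
    result += value;
    if (mustQuote) result += '"';
}

// Case-insensitive comparison of two strings.
static bool equalsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::tolower(static_cast<unsigned char>(a[i]))
            != std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    return true;
}

// Returns true if token appears in the comma-separated list, ignoring case
// and surrounding blanks.
bool listHasToken(const std::string& list, const std::string& token)
{
    std::size_t start = 0;
    while (start <= list.size())
    {
        std::size_t comma = list.find(',', start);
        if (comma == std::string::npos) comma = list.size();
        const std::string item = list.substr(start, comma - start);
        const auto first = item.find_first_not_of(" \t");
        const auto last = item.find_last_not_of(" \t");
        if (first != std::string::npos
            && equalsIgnoreCase(item.substr(first, last - first + 1), token))
            return true;
        start = comma + 1;
    }
    return false;
}
```

With these helpers, `utf-8` is appended unquoted, `a b` is quoted unless spaces are allowed, and `listHasToken("gzip, Deflate , br", "deflate")` returns `true`.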
