Implementation:ClickHouse ClickHouse Poco MessageHeader
base/poco/Net/src/MessageHeader.cpp:1-424
Purpose
Implements the `Poco::Net::MessageHeader` class, which handles parsing and serialization of HTTP message headers. This class manages reading RFC 2822-style name-value header fields from input streams, writing them to output streams, splitting header values into elements and parameters, quoting values, and decoding RFC 2047 encoded words. It is the foundation for both `HTTPRequest` and `HTTPResponse` header handling.
Code Reference
Header Serialization (write)
```cpp
void MessageHeader::write(std::ostream& ostr) const
{
    NameValueCollection::ConstIterator it = begin();
    while (it != end())
    {
        ostr << it->first << ": " << it->second << "\r\n";
        ++it;
    }
}
```
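As a rough illustration of the wire format this loop produces, the following standalone sketch serializes an ordered list of fields using only the standard library. `HeaderList` and `serializeHeaders` are hypothetical names introduced here and are not part of Poco. The loop shown above emits no terminating blank line, so the caller presumably writes the final CRLF that ends the header block.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the header collection: an ordered list of pairs.
using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Serializes each field as "Name: Value\r\n", mirroring the loop in write().
// No blank line is appended after the last field.
std::string serializeHeaders(const HeaderList& headers)
{
    std::ostringstream out;
    for (const auto& field : headers)
        out << field.first << ": " << field.second << "\r\n";
    return out.str();
}
```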
Header Parsing (read)
Reads headers character by character from a `std::streambuf`, handling field name/value extraction, line folding (continuation lines starting with space or tab), and enforcing configurable limits:
```cpp
void MessageHeader::read(std::istream& istr)
{
    static const int eof = std::char_traits<char>::eof();
    std::streambuf& buf = *istr.rdbuf();
    std::string name;
    std::string value;
    int ch = buf.sbumpc();
    int fields = 0;
    while (ch != eof && ch != '\r' && ch != '\n')
    {
        if (_fieldLimit > 0 && fields == _fieldLimit)
            throw MessageException("Too many header fields");
        // Elided in this excerpt (each step advances ch via buf.sbumpc()):
        //   - parse name up to ':'
        //   - parse value up to CRLF
        //   - handle line folding (continuation with SP/HT)
        Poco::trimRightInPlace(value);
        add(name, decodeWord(value));
        ++fields;
    }
    istr.putback(ch);
}
```
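The parsing rules described above (stop at a blank line, join folded lines, skip lines without a colon, enforce a field limit) can be sketched with the standard library alone. This is a simplified illustration, not the Poco implementation: it uses `std::getline` for brevity instead of a `sbumpc` loop, omits the name and value length limits and RFC 2047 decoding, and throws `std::runtime_error` where the real code throws `MessageException`. `HeaderList` and `parseHeaders` are hypothetical names.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Parses "Name: Value" lines until a blank line or end of input.
// A fieldLimit of 0 disables the field-count check.
HeaderList parseHeaders(std::istream& in, std::size_t fieldLimit)
{
    HeaderList headers;
    std::string line;
    while (std::getline(in, line))
    {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;                      // blank line ends the header block
        if (line[0] == ' ' || line[0] == '\t')
        {
            // Folded line: append it, whitespace included, to the previous value.
            if (!headers.empty()) headers.back().second += line;
            continue;
        }
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;     // no colon: skip the line
        if (fieldLimit > 0 && headers.size() == fieldLimit)
            throw std::runtime_error("Too many header fields");
        const std::size_t start = line.find_first_not_of(" \t", colon + 1);
        headers.emplace_back(line.substr(0, colon),
                             start == std::string::npos ? std::string() : line.substr(start));
    }
    // Trim trailing whitespace from every value once folding is complete.
    for (auto& field : headers)
    {
        const std::size_t last = field.second.find_last_not_of(" \t");
        field.second.erase(last == std::string::npos ? 0 : last + 1);
    }
    return headers;
}
```

Checking the limit before a new field is stored means that a message with exactly `fieldLimit` fields is accepted and the next field triggers the exception, which matches the `fields == _fieldLimit` test in the excerpt.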
Element Splitting (splitElements)
Splits a comma-separated header value (e.g., `Accept`, `Cache-Control`) into individual elements, respecting quoted strings and backslash escapes:
```cpp
void MessageHeader::splitElements(const std::string& s,
    std::vector<std::string>& elements, bool ignoreEmpty)
{
    // Handles: quoted strings ("..."), backslash escapes, comma delimiters
    // Trims whitespace from each element
}
```
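Since the body is elided above, here is a minimal standalone sketch of comma splitting that respects quoted strings and backslash escapes. It is an assumption-laden approximation rather than the Poco code: `splitOnCommas` and `trimCopy` are hypothetical names, quotes are kept in the output element, and an escaped character is copied without its backslash.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Trims ASCII spaces and tabs from both ends (hypothetical helper).
static std::string trimCopy(const std::string& s)
{
    const std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits s on commas that are outside double-quoted strings.
std::vector<std::string> splitOnCommas(const std::string& s, bool ignoreEmpty)
{
    std::vector<std::string> elements;
    std::string current;
    bool inQuotes = false;
    auto flush = [&]() {
        const std::string elem = trimCopy(current);
        if (!ignoreEmpty || !elem.empty()) elements.push_back(elem);
        current.clear();
    };
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        const char c = s[i];
        if (c == '\\' && i + 1 < s.size())
            current += s[++i];       // escaped character is copied literally
        else if (c == '"')
        {
            inQuotes = !inQuotes;
            current += c;            // quotes stay part of the element
        }
        else if (c == ',' && !inQuotes)
            flush();                 // unquoted comma ends the element
        else
            current += c;
    }
    if (!current.empty()) flush();
    return elements;
}
```

With `ignoreEmpty` set to `false`, an input such as `a,,b` yields three elements, the middle one empty.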
Parameter Splitting (splitParameters)
Splits a header value into a primary value and semicolon-separated parameters (e.g., `Content-Type: text/html; charset=utf-8`):
```cpp
void MessageHeader::splitParameters(const std::string& s,
    std::string& value, NameValueCollection& parameters)
{
    // Extract value before first ';'
    // Parse key=value pairs separated by ';'
    // Handles quoted parameter values
}
```
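The three steps in the comments above can be sketched as follows. This is a simplified stand-in under stated assumptions: `splitParams` and `trimWs` are hypothetical names, `std::map` replaces `NameValueCollection` (so parameter names here are case-sensitive and unique), and backslash escapes inside quoted values are not handled.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Trims ASCII spaces and tabs from both ends (hypothetical helper).
static std::string trimWs(const std::string& s)
{
    const std::size_t first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits "value; k1=v1; k2=\"v2\"" into the primary value and its parameters.
void splitParams(const std::string& s, std::string& value,
                 std::map<std::string, std::string>& params)
{
    params.clear();
    std::size_t pos = s.find(';');
    value = trimWs(s.substr(0, pos));           // text before the first ';'
    while (pos != std::string::npos)
    {
        const std::size_t start = pos + 1;
        std::size_t end = start;
        bool inQuotes = false;
        for (; end < s.size(); ++end)           // find the next ';' outside quotes
        {
            if (s[end] == '"') inQuotes = !inQuotes;
            else if (s[end] == ';' && !inQuotes) break;
        }
        const std::string piece = s.substr(start, end - start);
        const std::size_t eq = piece.find('=');
        if (eq != std::string::npos)
        {
            const std::string name = trimWs(piece.substr(0, eq));
            std::string val = trimWs(piece.substr(eq + 1));
            if (val.size() >= 2 && val.front() == '"' && val.back() == '"')
                val = val.substr(1, val.size() - 2);   // strip surrounding quotes
            if (!name.empty()) params[name] = val;
        }
        pos = (end < s.size()) ? end : std::string::npos;
    }
}
```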
RFC 2047 Decoding (decodeWord)
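`decodeWord` resolves encoded words of the form `=?charset?encoding?text?=`, supporting both Base64 (`B`) and Quoted-Printable (`Q`) encodings. The excerpt below shows the helper `decodeRFC2047`, which decodes a single encoded word. It indexes the `?`-separated tokens from 0, which suggests the input arrives with the `=?` and `?=` delimiters already removed: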
```cpp
void MessageHeader::decodeRFC2047(const std::string& ins, std::string& outs,
    const std::string& charset_to)
{
    StringTokenizer tokens(ins, "?");
    std::string charset = toUpper(tokens[0]);
    std::string encoding = toUpper(tokens[1]);
    std::string text = tokens[2];
    if (encoding == "B")
    {
        // Base64 decode
    }
    else if (encoding == "Q")
    {
        // Quoted-Printable decode ('_' maps to space, '=' introduces hex pair)
    }
    // Character set conversion if needed
}
```
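To make the `Q` branch concrete, the following standalone sketch decodes the text portion of a Q-encoded word: `_` becomes a space and `=XY` becomes the byte with hexadecimal value `XY`. It performs no character set conversion, and passing malformed escapes through unchanged is a choice of this sketch rather than documented Poco behavior. `decodeQ` and `hexValue` are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Returns the numeric value of a hex digit, or -1 if c is not one.
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Decodes the text part of a Q-encoded word into raw bytes.
std::string decodeQ(const std::string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char c = text[i];
        if (c == '_')
        {
            out += ' ';                              // '_' stands for a space
        }
        else if (c == '=' && i + 2 < text.size()
                 && hexValue(text[i + 1]) >= 0 && hexValue(text[i + 2]) >= 0)
        {
            // "=XY" encodes one byte as two hex digits.
            out += static_cast<char>(hexValue(text[i + 1]) * 16 + hexValue(text[i + 2]));
            i += 2;
        }
        else
        {
            out += c;                                // anything else is copied as-is
        }
    }
    return out;
}
```

For example, the text `caf=C3=A9` decodes to the four bytes `c`, `a`, `f` followed by the two-byte UTF-8 sequence `0xC3 0xA9`.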
I/O Contract
| Input | Output | Side Effects |
|---|---|---|
| `std::ostream&` via `write` | Serialized headers as `Name: Value\r\n` lines | Writes to output stream |
| `std::istream&` via `read` | Populated header collection | Reads from input stream; throws `MessageException` when a configured limit is exceeded; lines without a colon are skipped rather than rejected |
| `std::string` via `splitElements` | `std::vector<std::string>` of elements | None |
| `std::string` via `splitParameters` | Primary value + `NameValueCollection` of parameters | None |
| `std::string` via `decodeWord` | Decoded string with RFC 2047 words resolved | None; character set conversion may silently fall back on unsupported encodings |
| `std::string` + `std::string&` via `quote` | Quoted string appended to result | Adds surrounding double-quotes if value contains special characters |
| `std::string` via `hasToken` | `bool` indicating token presence | None |
Usage Examples
```cpp
// Writing headers (outputStream is assumed to be an existing std::ostream)
Poco::Net::MessageHeader header;
header.add("Content-Type", "text/html; charset=utf-8");
header.add("X-Custom", "value");
header.write(outputStream);

// Reading headers from a stream (inputStream is assumed to be an existing std::istream)
Poco::Net::MessageHeader incoming;
incoming.setFieldLimit(100);
incoming.setNameLengthLimit(256);
incoming.setValueLengthLimit(8192);
incoming.read(inputStream);

// Splitting comma-separated elements
std::vector<std::string> elements;
Poco::Net::MessageHeader::splitElements("gzip, deflate, br", elements, true);
// elements: {"gzip", "deflate", "br"}

// Splitting parameters
std::string value;
Poco::Net::NameValueCollection params;
Poco::Net::MessageHeader::splitParameters(
    "text/html; charset=utf-8; boundary=\"----abc\"", value, params);
// value: "text/html", params: {charset: "utf-8", boundary: "----abc"}

// Token check on the headers read above
bool hasGzip = incoming.hasToken("Accept-Encoding", "gzip");
```
Internal Details
- The `read` method operates directly on the `std::streambuf` (via `sbumpc`) for performance, avoiding the overhead of `std::istream::get`.
- Three configurable limits protect against malicious input: `_fieldLimit` (default: `DFL_FIELD_LIMIT`), `_nameLengthLimit` (default: `DFL_NAME_LENGTH_LIMIT`), and `_valueLengthLimit` (default: `DFL_VALUE_LENGTH_LIMIT`).
- Line folding per RFC 2822 is supported: continuation lines beginning with SP or HT are appended to the previous header value.
- Invalid header lines (lines without a colon) are silently ignored (the parser skips to the next line).
- The `decodeWord` method iterates to decode multiple RFC 2047 encoded words within a single header value.
- The `quote` static method only adds quotes if the value contains non-alphanumeric characters other than `.`, `_`, `-`, and optionally spaces.
- The `hasToken` method performs case-insensitive comparison of comma-separated tokens.
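The last two rules in this list can be illustrated with a standalone sketch. It assumes plain ASCII input and uses only the standard library; `quoteIfNeeded`, `containsToken`, and `equalsIgnoreCase` are hypothetical names, and `containsToken` takes an already extracted header value instead of looking up a field by name as `hasToken` does.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Quotes value only if it contains a character other than an alphanumeric,
// '.', '_', '-', or (when allowSpace is true) whitespace.
std::string quoteIfNeeded(const std::string& value, bool allowSpace)
{
    bool mustQuote = false;
    for (unsigned char c : value)
    {
        const bool plain = std::isalnum(c) || c == '.' || c == '_' || c == '-'
                           || (allowSpace && std::isspace(c));
        if (!plain) { mustQuote = true; break; }
    }
    return mustQuote ? '"' + value + '"' : value;
}

// Case-insensitive comparison of two ASCII strings.
static bool equalsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (std::tolower(static_cast<unsigned char>(a[i]))
            != std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    return true;
}

// Returns true if token matches one of the comma-separated tokens in headerValue.
bool containsToken(const std::string& headerValue, const std::string& token)
{
    std::size_t start = 0;
    while (start <= headerValue.size())
    {
        std::size_t comma = headerValue.find(',', start);
        if (comma == std::string::npos) comma = headerValue.size();
        std::size_t b = start, e = comma;
        while (b < e && (headerValue[b] == ' ' || headerValue[b] == '\t')) ++b;
        while (e > b && (headerValue[e - 1] == ' ' || headerValue[e - 1] == '\t')) --e;
        if (equalsIgnoreCase(headerValue.substr(b, e - b), token)) return true;
        start = comma + 1;
    }
    return false;
}
```

Matching whole tokens rather than substrings means that a value of `gzipped` does not satisfy a check for `gzip`.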