Implementation:ClickHouse ClickHouse Poco MultipartReader

From Leeroopedia


base/poco/Net/src/MultipartReader.cpp:1-305 ClickHouse_ClickHouse ClickHouse_ClickHouse_MIME_Multipart_Processing

Purpose

Implements the `Poco::Net::MultipartReader` and its supporting classes (`MultipartStreamBuf`, `MultipartIOS`, `MultipartInputStream`) for reading MIME multipart messages as defined by RFC 2046. The reader parses boundary-delimited message parts from an input stream, providing access to each part's headers and body as a sub-stream.

Code Reference

MultipartStreamBuf -- Boundary-Aware Reading

The core parsing logic resides in `readFromDevice`, which reads data from the underlying stream and detects boundary lines to delimit parts:

int MultipartStreamBuf::readFromDevice(char* buffer, std::streamsize length)
{
    poco_assert(!_boundary.empty() && _boundary.length() < length - 6);

    static const int eof = std::char_traits<char>::eof();
    std::streambuf& buf = *_istr.rdbuf();

    int n = 0;
    int ch = buf.sbumpc();
    if (ch == eof) return -1;
    *buffer++ = (char) ch; ++n;
    if (ch == '\n' || (ch == '\r' && buf.sgetc() == '\n'))
    {
        // After newline, check for "--" + boundary
        // If boundary matches and followed by CRLF: return 0 (next part)
        // If boundary matches and followed by "--": set _lastPart, return 0
    }
    // Otherwise read until next newline
    return n;
}
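The three outcomes named in the comments above (ordinary content, start of the next part, closing delimiter) can be illustrated with a small standalone sketch. It uses only the standard library and works on a line that has already been extracted, whereas `readFromDevice` inspects the stream one character at a time. The names `LineKind` and `classifyLine` are hypothetical and not part of Poco.

```cpp
#include <string>

// Hypothetical names for illustration; not part of Poco.
enum class LineKind { Content, PartDelimiter, ClosingDelimiter };

// Classifies one line (without its trailing CR/LF) against a boundary.
LineKind classifyLine(const std::string& line, const std::string& boundary)
{
    const std::string delimiter = "--" + boundary;

    // A delimiter line must start with "--" followed by the boundary.
    if (line.compare(0, delimiter.size(), delimiter) != 0)
        return LineKind::Content;

    const std::string rest = line.substr(delimiter.size());
    if (rest.empty())
        return LineKind::PartDelimiter;     // "--boundary" then end of line
    if (rest == "--")
        return LineKind::ClosingDelimiter;  // "--boundary--"
    return LineKind::Content;               // e.g. "--boundaryXYZ"
}
```

This sketch treats any trailing characters other than `--` as ordinary content, which is a simplification chosen for the example.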

MultipartReader -- Part Iteration

void MultipartReader::nextPart(MessageHeader& messageHeader)
{
    if (!_pMPI)
    {
        if (_boundary.empty())
            guessBoundary();
        else
            findFirstBoundary();
    }
    else if (_pMPI->lastPart())
    {
        throw MultipartException("No more parts available");
    }
    parseHeader(messageHeader);
    _pMPI = std::make_unique<MultipartInputStream>(_istr, _boundary);
}

bool MultipartReader::hasNextPart()
{
    return (!_pMPI || !_pMPI->lastPart()) && _istr.good();
}

Boundary Discovery

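If no boundary is provided, `guessBoundary` skips leading whitespace and reads the boundary from the first delimiter line of the stream (the text after the leading `--`, up to 128 characters):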

void MultipartReader::guessBoundary()
{
    static const int eof = std::char_traits<char>::eof();
    int ch = _istr.get();
    while (Poco::Ascii::isSpace(ch))
        ch = _istr.get();
    if (ch == '-' && _istr.peek() == '-')
    {
        _istr.get();
        ch = _istr.peek();
        while (ch != eof && ch != '\r' && ch != '\n' && _boundary.size() < 128)
        {
            _boundary += (char) _istr.get();
            ch = _istr.peek();
        }
        // validate and consume line ending
    }
    else throw MultipartException("No boundary line found");
}
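To experiment with the same logic outside of Poco, the excerpt can be approximated with a free function over `std::istream`. This is a sketch under stated assumptions: `guessBoundarySketch` is a hypothetical name, `std::runtime_error` stands in for `MultipartException`, and the line-ending handling at the end is a guess at the step the excerpt elides.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical free function mirroring the excerpt; not the Poco API.
std::string guessBoundarySketch(std::istream& in)
{
    const int eof = std::char_traits<char>::eof();
    const std::size_t maxBoundaryLength = 128; // same cap as the excerpt

    int ch = in.get();
    // Skip leading whitespace before the first delimiter line.
    while (ch != eof && std::isspace(static_cast<unsigned char>(ch)))
        ch = in.get();

    if (ch != '-' || in.peek() != '-')
        throw std::runtime_error("No boundary line found");
    in.get(); // consume the second '-'

    std::string boundary;
    ch = in.peek();
    while (ch != eof && ch != '\r' && ch != '\n' && boundary.size() < maxBoundaryLength)
    {
        boundary += static_cast<char>(in.get());
        ch = in.peek();
    }

    // Assumed handling of the line ending (elided in the excerpt).
    if (ch != '\r' && ch != '\n')
        throw std::runtime_error("Invalid boundary line found");
    in.get();
    if (ch == '\r' && in.peek() == '\n')
        in.get();
    return boundary;
}
```

After a successful call the stream is positioned at the first header line of the first part.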

First Boundary Search

void MultipartReader::findFirstBoundary()
{
    std::string expect("--");
    expect.append(_boundary);
    std::string line;
    bool ok = true;
    do
    {
        ok = readLine(line, expect.length());
    }
    while (ok && line != expect);
    if (!ok) throw MultipartException("No boundary line found");
}
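The loop above relies on a `readLine` helper that is not shown on this page. The following standalone sketch shows one way such a helper could behave, assuming that its second argument caps how many characters of the line are stored and that lines longer than 1024 characters are rejected. `readLineSketch` and `skipToFirstBoundary` are hypothetical names, and the function returns `false` where the original throws.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line, storing at most `keep` characters and
// giving up after `maxLength` characters without a line ending.
bool readLineSketch(std::istream& in, std::string& line,
                    std::string::size_type keep, int maxLength = 1024)
{
    const int eof = std::char_traits<char>::eof();
    line.clear();
    int length = 0;
    int ch = in.peek();
    while (ch != eof && ch != '\r' && ch != '\n' && length < maxLength)
    {
        const char c = static_cast<char>(in.get());
        if (line.size() < keep) line += c; // extra characters are discarded
        ch = in.peek();
        ++length;
    }
    if (ch != eof) in.get();                        // consume CR or LF
    if (ch == '\r' && in.peek() == '\n') in.get();  // consume LF of a CRLF pair
    return ch != eof && length < maxLength;
}

// Skips preamble lines until one starts with "--" + boundary.
bool skipToFirstBoundary(std::istream& in, const std::string& boundary)
{
    const std::string expect = "--" + boundary;
    std::string line;
    bool ok = true;
    do
    {
        ok = readLineSketch(in, line, expect.length());
    }
    while (ok && line != expect);
    return ok;
}
```

Because only `expect.length()` characters are stored, this sketch matches any line that begins with the delimiter, and it requires the delimiter line to end with a line break.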

I/O Contract

| Input | Output | Side Effects |
| --- | --- | --- |
| `std::istream&` + optional `boundary` string | `MultipartReader` object | None |
| `MessageHeader&` via `nextPart` | Populated header for the next part | Advances stream past boundary and part headers; creates new `MultipartInputStream`; throws `MultipartException` if no more parts |
| `hasNextPart` | `bool` | None (read-only check) |
| `stream` | `std::istream&` reference to current part body | Throws if `nextPart` has not been called |
| `boundary` | `const std::string&` | None (accessor) |

Usage Examples

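#include <sstream>
#include <string>
#include "Poco/Net/MessageHeader.h"
#include "Poco/Net/MultipartReader.h"
#include "Poco/StreamCopier.h"

// multipartData is assumed to be a std::string holding the raw multipart body.

// Reading a multipart message with known boundary
std::istringstream body(multipartData);
Poco::Net::MultipartReader reader(body, "----boundary123");

while (reader.hasNextPart())
{
    Poco::Net::MessageHeader partHeader;
    reader.nextPart(partHeader);

    // Headers of the current part; the second argument is the default value
    std::string contentType = partHeader.get("Content-Type", "");
    std::istream& partStream = reader.stream();

    // Read part body; the stream reports EOF at the next boundary
    std::string partBody;
    Poco::StreamCopier::copyToString(partStream, partBody);
}

// Auto-detecting boundary from the stream
std::istringstream body2(multipartData);
Poco::Net::MultipartReader reader2(body2);
// The boundary is guessed from the first delimiter line on the first call to nextPart()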

Internal Details

  • `MultipartStreamBuf` extends `Poco::BufferedStreamBuf` with a fixed buffer size of `STREAM_BUFFER_SIZE`. It reads one character at a time from the underlying `std::streambuf` to detect boundary lines.
  • The boundary detection algorithm checks for `\r\n--boundary` (or `\n--boundary`) at line beginnings. If the boundary is followed by `\r\n`, a new part begins. If followed by `--`, it is the closing boundary.
  • The `readLine` helper limits line length to 1024 characters to prevent excessive memory usage from malformed input.
  • `guessBoundary` accepts boundaries of up to 128 characters. This is more than the 70-character maximum that RFC 2046 sets for boundary delimiters, so the reader stays compatible with senders that exceed the limit.
  • A `MultipartInputStream` is created for each part, wrapping the underlying stream with the boundary-aware `MultipartStreamBuf`. When the stream buffer detects a boundary, it returns 0 bytes, causing the stream to reach EOF for that part.
  • The `_lastPart` flag is set when the closing boundary (`--boundary--`) is detected, preventing further iteration.
  • The `parseHeader` method delegates to `MessageHeader::read` and consumes the blank line separator between headers and body.
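The behavior described in the list above (each part ends where the next delimiter begins, and iteration stops at the closing delimiter) can be sketched on an in-memory string with the standard library alone. `splitParts` is a hypothetical helper, not part of Poco, and it returns each part as raw text (headers plus body) instead of exposing a sub-stream.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits an in-memory multipart body into raw parts,
// stopping at the closing "--boundary--" delimiter.
std::vector<std::string> splitParts(const std::string& data, const std::string& boundary)
{
    const std::string delimiter = "--" + boundary;
    std::vector<std::string> parts;
    std::string current;
    bool inPart = false; // false while skipping the preamble
    std::string::size_type pos = 0;

    while (pos < data.size())
    {
        // Extract one line, including its line ending.
        std::string::size_type end = data.find('\n', pos);
        end = (end == std::string::npos) ? data.size() : end + 1;
        const std::string raw = data.substr(pos, end - pos);
        pos = end;

        // Strip the trailing LF or CRLF before comparing.
        std::string line = raw;
        while (!line.empty() && (line.back() == '\n' || line.back() == '\r'))
            line.pop_back();

        const bool closing = (line == delimiter + "--");
        if (line == delimiter || closing)
        {
            if (inPart)
            {
                // The line break before a delimiter belongs to the delimiter.
                if (!current.empty() && current.back() == '\n') current.pop_back();
                if (!current.empty() && current.back() == '\r') current.pop_back();
                parts.push_back(current);
            }
            current.clear();
            inPart = !closing;
            if (closing) break; // plays the role of the _lastPart flag
        }
        else if (inPart)
        {
            current += raw;
        }
    }
    return parts;
}
```

Text before the first delimiter and after the closing delimiter is ignored, and a final part without a closing delimiter is discarded in this sketch.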
