Principle:Openclaw Openclaw Reply Delivery

Knowledge Sources	OpenClaw
Domains	Agent_Runtime, Messaging
Last Updated	2026-02-06 12:00 GMT

Overview

Reply delivery is the process of transforming raw LLM output into channel-appropriate payloads, managing multi-part message ordering, applying response prefixes and formatting, handling media attachments and voice notes, and dispatching the final messages to the originating channel in the correct sequence.

Description

After the inference layer produces assistant text, tool results, reasoning tokens, and error messages, these raw outputs must be transformed into deliverable messages. This transformation involves several concerns:

Payload construction: The raw assistant output, which may span multiple text segments, tool metadata entries, reasoning blocks, and error messages, is assembled into an ordered array of reply payloads. Each payload carries optional text, media URLs, reply-to references, error flags, and audio-as-voice hints. Silent replies (those matching the SILENT_REPLY_TOKEN) are filtered out. A raw API error is suppressed as a duplicate when the formatted error text already covers it.
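
A minimal sketch of payload assembly, under stated assumptions: the `ReplyPayload` field names, the `buildPayloads` helper, and the value given to `SILENT_REPLY_TOKEN` are all hypothetical and chosen for illustration only. The sketch shows the ordering (error text first, then assistant text) and the silent-reply filter, not the real implementation.

```typescript
// Hypothetical payload shape; field names are illustrative only.
interface ReplyPayload {
  text?: string;
  mediaUrls?: string[];
  replyToId?: string;
  isError?: boolean;
  audioAsVoice?: boolean;
}

// Placeholder value; the real token is defined by the runtime.
const SILENT_REPLY_TOKEN = "<silent>";

// Assemble an ordered payload list: error text first, then each
// non-empty, non-silent assistant text segment in its original order.
function buildPayloads(segments: string[], errorText?: string): ReplyPayload[] {
  const payloads: ReplyPayload[] = [];
  if (errorText) {
    payloads.push({ text: errorText, isError: true });
  }
  for (const segment of segments) {
    const text = segment.trim();
    if (text === "" || text === SILENT_REPLY_TOKEN) {
      continue; // silent and empty segments never reach the channel
    }
    payloads.push({ text });
  }
  return payloads;
}
```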

Reply directive parsing: Assistant text may contain embedded directives (e.g., media URL tags, reply-to tags, audio-as-voice flags) that are extracted during payload construction. These directives allow the LLM to control delivery behavior declaratively within its text output.
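
The directive syntax is not specified here, so the following sketch invents one purely for illustration: `[[media:URL]]`, `[[reply_to:ID]]`, and `[[audio_as_voice]]` are assumed tag forms, and `parseDirectives` is a hypothetical helper. It shows the general idea of extracting directives and returning the cleaned text alongside them.

```typescript
interface ParsedDirectives {
  text: string;
  mediaUrls: string[];
  replyToId?: string;
  audioAsVoice: boolean;
}

// Extract the assumed [[...]] tags from the assistant text and strip them
// from what the user will see. The tag syntax is an assumption.
function parseDirectives(raw: string): ParsedDirectives {
  const parsed: ParsedDirectives = { text: "", mediaUrls: [], audioAsVoice: false };

  parsed.text = raw
    .replace(/\[\[media:([^\]\s]+)\]\]/g, (_match: string, url: string) => {
      parsed.mediaUrls.push(url);
      return "";
    })
    .replace(/\[\[reply_to:([^\]\s]+)\]\]/g, (_match: string, id: string) => {
      parsed.replyToId = id; // the last reply-to tag wins
      return "";
    })
    .replace(/\[\[audio_as_voice\]\]/g, () => {
      parsed.audioAsVoice = true;
      return "";
    })
    .replace(/[ \t]{2,}/g, " ") // collapse gaps left by removed tags
    .trim();

  return parsed;
}
```

Because the directives are removed before the text is placed in a payload, a reply consisting only of directives yields empty text with the media and flags still attached.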

Response normalization: Before dispatch, each payload passes through normalization that applies response prefixes (configurable text prepended to the first reply), strips heartbeat tokens, and filters empty/silent payloads. The normalization layer also supports dynamic prefix context (e.g., interpolating the actual model name into the prefix template).
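
The normalization step can be sketched as follows. The `{model}` placeholder syntax, the token values, and the names `normalizePayloads` and `renderPrefix` are assumptions made for this example. The sketch applies the prefix to the first payload that carries text, which is one plausible reading of "the first reply".

```typescript
interface Payload {
  text?: string;
  mediaUrls?: string[];
}

interface PrefixContext {
  model?: string;
}

// Placeholder token values; the real ones are defined by the runtime.
const HEARTBEAT_TOKEN = "<heartbeat>";
const SILENT_TOKEN = "<silent>";

// Interpolate dynamic context into the prefix template (assumed {model} syntax).
function renderPrefix(template: string, context: PrefixContext): string {
  return template.split("{model}").join(context.model ?? "unknown-model");
}

function normalizePayloads(
  payloads: Payload[],
  prefixTemplate: string,
  context: PrefixContext,
  onSkip?: (payload: Payload) => void,
): Payload[] {
  const normalized: Payload[] = [];
  let prefixPending = prefixTemplate !== "";
  for (const payload of payloads) {
    // Strip heartbeat tokens before deciding whether anything is left to send.
    const text = (payload.text ?? "").split(HEARTBEAT_TOKEN).join("").trim();
    const hasMedia = (payload.mediaUrls?.length ?? 0) > 0;
    if (text === SILENT_TOKEN || (text === "" && !hasMedia)) {
      onSkip?.(payload); // dropped payloads are reported, not delivered
      continue;
    }
    const next: Payload = { ...payload };
    if (text === "") {
      delete next.text; // media-only payload
    } else if (prefixPending) {
      next.text = `${renderPrefix(prefixTemplate, context)} ${text}`;
      prefixPending = false;
    } else {
      next.text = text;
    }
    normalized.push(next);
  }
  return normalized;
}
```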

Ordered dispatch: The reply dispatcher serializes outbound messages to preserve the ordering of tool results, block replies, and final replies. It uses a promise chain to ensure messages arrive at the channel in the correct sequence, even when tool results and block replies are emitted concurrently from different stages of the inference loop.
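
As a rough illustration of the promise-chain idea, the sketch below serializes deliveries by appending each one to a single chain. `createSendChain`, `DispatchKind`, and `Deliver` are hypothetical names, and swallowing delivery errors here is a simplification so that one failure does not stall later sends.

```typescript
type DispatchKind = "tool" | "block" | "final";
type Deliver = (kind: DispatchKind, text: string) => Promise<void>;

// Returns an enqueue function. Every call appends to the same promise
// chain, so a delivery starts only after the previous one has settled.
function createSendChain(deliver: Deliver): (kind: DispatchKind, text: string) => Promise<void> {
  let chain: Promise<void> = Promise.resolve();
  return function enqueue(kind: DispatchKind, text: string): Promise<void> {
    chain = chain
      .then(() => deliver(kind, text))
      .catch(() => {
        // Simplification: ignore the failure and keep the chain alive.
      });
    return chain;
  };
}
```

Even if a tool result takes longer to send than a later block reply, the block reply waits its turn, so arrival order matches enqueue order.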

Human-like delay: For block replies (multi-message streaming), the dispatcher can inject configurable random delays between messages to simulate human typing rhythm, making the agent's responses feel more natural in real-time conversations.
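
One way to picture the delay injection, assuming a simple uniform distribution between configurable bounds: `HumanDelayConfig`, `pickDelayMs`, and `sendBlocksWithDelay` are hypothetical names, and the actual distribution and configuration keys may differ.

```typescript
interface HumanDelayConfig {
  minMs: number;
  maxMs: number;
}

// Pick a whole-millisecond delay uniformly from [min, max].
// The random source is injectable so the function can be tested.
function pickDelayMs(config: HumanDelayConfig, random: () => number = Math.random): number {
  const low = Math.max(0, Math.round(Math.min(config.minMs, config.maxMs)));
  const high = Math.max(0, Math.round(Math.max(config.minMs, config.maxMs)));
  return Math.floor(low + random() * (high - low + 1));
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Send block replies one at a time, pausing between messages
// (but not before the first one).
async function sendBlocksWithDelay(
  blocks: string[],
  send: (text: string) => Promise<void>,
  config: HumanDelayConfig,
  random: () => number = Math.random,
): Promise<void> {
  let first = true;
  for (const block of blocks) {
    if (!first) {
      await sleep(pickDelayMs(config, random));
    }
    first = false;
    await send(block);
  }
}
```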

Error and idle signaling: The dispatcher provides error handling callbacks for failed deliveries and an idle signal that fires when all pending deliveries have completed. Channel handlers use the idle signal to finalize typing indicators and clean up resources.
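
A sketch of how the error callback and idle signal could be layered on a send chain. The option names (`deliver`, `onError`, `onIdle`) and `createDispatcher` are assumptions for illustration, and the callbacks are assumed not to throw.

```typescript
interface DispatcherOptions {
  deliver: (text: string) => Promise<void>;
  onError?: (error: unknown, text: string) => void;
  onIdle?: () => void;
}

function createDispatcher(options: DispatcherOptions) {
  let chain: Promise<void> = Promise.resolve();
  let pending = 0;

  function enqueue(text: string): void {
    pending += 1;
    chain = chain
      .then(() => options.deliver(text))
      .catch((error: unknown) => {
        // A failed delivery is reported but does not block later ones.
        options.onError?.(error, text);
      })
      .then(() => {
        pending -= 1;
        if (pending === 0) {
          options.onIdle?.(); // queue drained: safe to stop typing indicators
        }
      });
  }

  return {
    enqueue,
    waitForIdle: (): Promise<void> => chain,
    getPending: (): number => pending,
  };
}
```

In this sketch the idle callback can fire more than once over the life of a dispatcher, because the queue may drain and later receive new items.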

Usage

Apply this principle whenever:

Adding support for new media types or delivery modes (e.g., voice notes, reactions).
Modifying how multi-part messages are chunked or ordered.
Changing response prefix behavior or adding new prefix template variables.
Debugging message ordering issues in multi-turn streaming scenarios.
Implementing human-like delay tuning for a specific channel.

Theoretical Basis

Reply delivery implements a pipeline-to-queue pattern:

Payload assembly (pipeline): Raw outputs flow through a linear transformation pipeline: error text extraction, inline tool result formatting, reasoning text formatting, assistant text extraction with directive parsing, and tool error surfacing. Each stage appends to an ordered list of reply items.
Normalization (filter): Each item passes through normalization that applies prefixes, strips heartbeat tokens, and filters out empty or silent payloads. Items that fail normalization are dropped, and an optional skip callback is invoked for each dropped item.
Dispatch (serialized queue): Normalized payloads enter a promise-chain queue that ensures sequential delivery. The queue tracks pending count and emits idle when drained.

The error suppression logic uses a fingerprint deduplication strategy: raw API error messages are fingerprinted and compared against the formatted user-facing error text to avoid showing the same error twice in different formats.
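
The fingerprint function itself is not described here, so the sketch below substitutes a simple normalization (lowercase, punctuation removed, whitespace collapsed) and a containment check. `fingerprint` and `shouldSuppressRawError` are hypothetical names, and the real comparison may be stricter or use a different key.

```typescript
// Reduce an error message to a comparison key: lowercase,
// non-alphanumeric runs replaced by single spaces, then trimmed.
function fingerprint(message: string): string {
  return message.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Suppress the raw API error when the formatted, user-facing text
// already carries the same content (or the other way around).
function shouldSuppressRawError(rawError: string, formattedError: string): boolean {
  const raw = fingerprint(rawError);
  const formatted = fingerprint(formattedError);
  if (raw === "" || formatted === "") {
    return false; // nothing to compare, so show what is available
  }
  return formatted.includes(raw) || raw.includes(formatted);
}
```

Comparing normalized keys rather than exact strings lets the check tolerate differences in casing and punctuation between the raw and formatted versions of the same error.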

The dispatcher's send-chain design (a single promise chain for all dispatch kinds) provides a simple concurrency model: even though tool results, block replies, and final replies may be enqueued from different async contexts, they are delivered in enqueue order with no interleaving. This avoids the complexity of a full priority queue while maintaining correct ordering.

Related Pages

Implemented By

Implementation:Openclaw_Openclaw_BuildEmbeddedRunPayloads
