
Workflow:Anthropic Python SDK Streaming Message Interaction

From Leeroopedia
Knowledge Sources
Domains LLMs, Streaming, Real_Time_Processing
Last Updated 2026-02-15 12:00 GMT

Overview

End-to-end process for streaming Claude responses in real-time using Server-Sent Events (SSE) via the Anthropic Python SDK streaming helpers.

Description

This workflow demonstrates how to receive Claude's responses incrementally as they are generated, rather than waiting for the complete response. The SDK provides a high-level streaming interface through context managers (MessageStreamManager) that wrap raw SSE events into typed event objects. It supports both synchronous and asynchronous streaming, with convenience accessors like text_stream for simple text output and full event iteration for advanced use cases including tool input streaming and thinking content.

Usage

Execute this workflow when building interactive applications that require real-time response display (chatbots, IDEs, CLI tools), when handling long responses where waiting for completion would degrade user experience, or when you need to process partial results as they arrive (e.g., streaming tool input JSON deltas).

Execution Steps

Step 1: Client Initialization

Create an Anthropic client instance (synchronous or asynchronous). For streaming use cases, the AsyncAnthropic client is often preferred as it allows non-blocking I/O while processing stream events.

Key considerations:

  • AsyncAnthropic is recommended for streaming in async applications
  • The standard Anthropic client also supports streaming via synchronous iteration
  • Client configuration (retries, timeouts) applies to the initial connection; stream reading has its own timeout behavior

Step 2: Stream Context Manager Setup

Initiate a streaming request using the client.messages.stream() method as a context manager. This returns a MessageStreamManager that yields a MessageStream object. The stream begins receiving SSE events immediately upon entering the context.

Key considerations:

  • Use `with client.messages.stream(...)` for sync or `async with client.messages.stream(...)` for async
  • The stream method accepts the same parameters as create() (model, max_tokens, messages, etc.)
  • The context manager ensures proper cleanup of the HTTP connection on exit
  • The stream can be closed early via stream.close() to abort generation

Step 3: Event Iteration

Iterate over the stream events to process response content as it arrives. Events are typed objects that indicate what kind of content is being delivered: text deltas, tool use input JSON, thinking content, content block boundaries, and message-level events.

Key considerations:

  • Use `for event in stream` (sync) or `async for event in stream` (async) for full event access
  • Event types include: "text", "input_json", "thinking", "content_block_start", "content_block_stop", "message_start", "message_delta", "message_stop"
  • Text events provide both delta (incremental text) and snapshot (accumulated text so far)
  • For simple text-only streaming, use the stream.text_stream property as a shortcut

Step 4: Real-Time Content Processing

Handle each event type appropriately for the application. Text deltas are typically printed or displayed immediately. Tool input JSON deltas can be accumulated for partial parsing. Thinking content can be displayed separately to show the model's reasoning process.

Key considerations:

  • Print text deltas with flush=True for immediate display
  • Tool input JSON arrives as partial_json (delta) and snapshot (accumulated) on "input_json" events
  • Thinking events only appear when the thinking parameter is enabled in the request
  • Content block start/stop events delineate different content types in the response
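One way to route events to display channels is a small dispatcher. The event attribute names (`text`, `partial_json`, `thinking`) follow the SDK's typed events as described above; the channel labels and routing are an illustrative choice, not part of the SDK.

```python
from typing import Optional


def route_event(event) -> Optional[tuple[str, str]]:
    """Map a stream event to a (channel, delta) pair, or None for structural events."""
    if event.type == "text":
        return ("text", event.text)                # incremental assistant text
    if event.type == "input_json":
        return ("tool_input", event.partial_json)  # partial tool-input JSON delta
    if event.type == "thinking":
        return ("thinking", event.thinking)        # only emitted when thinking is enabled
    return None  # content_block_start/stop and message-level events carry no display text
```

A consumer can then print `text` deltas with `flush=True`, buffer `tool_input` deltas for partial JSON parsing, and render `thinking` deltas in a separate pane.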

Step 5: Final Message Retrieval

After the stream completes (all events consumed), retrieve the fully accumulated message object using stream.get_final_message(). This provides the complete Message object identical to what a non-streaming create() call would return.

Key considerations:

  • get_final_message() waits for the stream to finish, consuming any remaining events before returning the accumulated message
  • get_final_text() is a convenience method that concatenates all text content blocks
  • The final message includes complete usage statistics (input_tokens, output_tokens)
  • The stream context manager must still be active when calling these methods

Execution Diagram

GitHub URL

Workflow Repository