Principle:Anthropics Anthropic sdk python Stream Context Manager Setup

From Leeroopedia
Knowledge Sources
Domains: Streaming, LLM
Last Updated: 2026-02-15 00:00 GMT

Overview

The Stream Context Manager Setup principle describes the deferred execution pattern used by the Anthropic Python SDK to manage streaming HTTP connections. Instead of immediately dispatching an HTTP request when the user calls the stream() method, the SDK returns a context manager object. The actual network call is deferred until the caller enters the with block (i.e., when __enter__ is invoked). This pattern provides deterministic resource cleanup through the corresponding __exit__ method, ensuring HTTP connections and response bodies are always properly closed regardless of whether the stream completes successfully or an exception occurs.

Core Concepts

Deferred Execution

Deferred execution means that calling client.messages.stream(...) does not immediately open an HTTP connection. Instead, the sync client captures the request parameters in a callable (using functools.partial) and wraps that callable in a MessageStreamManager. The HTTP request is issued only when the context manager's __enter__ method is called:

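import anthropic

client = anthropic.Anthropic()  # by default reads the API key from ANTHROPIC_API_KEY

# No HTTP request is made here -- only parameters are captured
manager = client.messages.stream(model="claude-sonnet-4-20250514", max_tokens=1024, messages=[...])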

# The HTTP request fires HERE, when entering the `with` block
with manager as stream:
    # `stream` is a live MessageStream connected to the server
    for text in stream.text_stream:
        print(text, end="", flush=True)
# Connection is closed HERE, when exiting the `with` block

This two-phase approach decouples request construction from request execution, giving the caller control over exactly when the network call happens.
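The two phases can be illustrated without any network access. The following is a minimal sketch, not the SDK's implementation: fake_post, DeferredManager, and the stream helper are hypothetical stand-ins, and the only behavior taken from the text above is that functools.partial captures the call and __enter__ runs it.

```python
from functools import partial

calls = []  # records each time the (fake) request actually runs


def fake_post(path, *, body):
    """Hypothetical stand-in for the SDK's internal request call."""
    calls.append((path, body))
    return iter(["Hel", "lo"])  # pretend these are streamed text chunks


class DeferredManager:
    """Illustrative only: stores a zero-argument callable and invokes it on enter."""

    def __init__(self, api_request):
        self._api_request = api_request
        self._stream = None

    def __enter__(self):
        self._stream = self._api_request()  # phase 2: the request is issued here
        return self._stream

    def __exit__(self, exc_type, exc, exc_tb):
        self._stream = None  # a real stream would be closed here
        return None  # do not suppress exceptions


def stream(**params):
    # Phase 1: capture parameters only; nothing is sent yet.
    return DeferredManager(partial(fake_post, "/v1/messages", body=params))


manager = stream(model="example-model", max_tokens=16)
calls_before_enter = len(calls)  # still 0: construction did not call fake_post

with manager as chunks:
    text = "".join(chunks)

calls_after_exit = len(calls)  # 1: the call happened inside __enter__
```

Because the manager only holds a callable, it can be created in one place and entered in another, which is the control over timing described above.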

Context Manager Protocol

Python's context manager protocol defines two dunder methods:

  • __enter__(self): Called when entering a with block. Returns the object to be bound to the as variable.
  • __exit__(self, exc_type, exc, exc_tb): Called when leaving the with block, whether normally or via exception. Responsible for cleanup.

The SDK uses this protocol at two layers:

  1. MessageStreamManager: The outer context manager returned by .stream(). Its __enter__ fires the HTTP request and constructs a MessageStream. Its __exit__ closes that stream.
  2. MessageStream: The inner stream object itself also implements the context manager protocol (via its own __enter__/__exit__); its close() closes the underlying httpx response.
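The two layers can be sketched with plain classes to make the call order visible. Everything here is an assumption made for illustration: FakeResponse, InnerStream, and OuterManager are hypothetical names, and the event log exists only to show which method runs when.

```python
events = []  # ordered log of which methods ran


class FakeResponse:
    """Hypothetical stand-in for the underlying HTTP response."""

    def close(self):
        events.append("response.close")


class InnerStream:
    """Stand-in for the inner stream object; it is itself a context manager."""

    def __init__(self, response):
        self._response = response

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, exc_tb):
        self.close()

    def close(self):
        events.append("stream.close")
        self._response.close()  # delegate to the underlying response


class OuterManager:
    """Stand-in for the outer manager: builds the stream on enter, closes it on exit."""

    def __init__(self, make_response):
        self._make_response = make_response
        self._stream = None

    def __enter__(self):
        events.append("manager.__enter__")
        self._stream = InnerStream(self._make_response())
        return self._stream

    def __exit__(self, exc_type, exc, exc_tb):
        events.append("manager.__exit__")
        if self._stream is not None:
            self._stream.close()


# Layer 1: the outer manager creates and later closes the stream.
with OuterManager(FakeResponse) as s:
    events.append("body")
outer_events = list(events)

# Layer 2: an already-constructed stream can also be used as a context manager.
events.clear()
with InnerStream(FakeResponse()) as direct:
    events.append("body")
inner_events = list(events)
```

In both cases the response is closed after the body runs; the outer layer additionally controls when the stream comes into existence.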

Resource Lifecycle Management

Streaming HTTP connections hold open sockets and accumulate response data. Without proper cleanup, these resources can leak. The context manager pattern guarantees cleanup by tying the connection's lifetime to a lexical scope:

[stream() called] --> MessageStreamManager created (no HTTP yet)
        |
        v
[with ... as stream:] --> __enter__: HTTP request fires, MessageStream created
        |
        v
[loop over events] --> SSE events consumed from live connection
        |
        v
[exit with block] --> __exit__: stream.close() -> httpx response closed

This ensures that even if the caller breaks out of the loop early or an exception is raised mid-stream, the underlying HTTP connection is properly terminated.
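Both exit paths can be checked with a small simulation. This is a sketch under stated assumptions: FakeStream and Manager are hypothetical stand-ins with a closed flag added for observation, and no real connection is involved.

```python
class FakeStream:
    """Hypothetical stream that records whether close() was called."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self.closed = False

    def __iter__(self):
        return self._chunks

    def close(self):
        self.closed = True


class Manager:
    """Illustrative manager: opens the stream on enter, closes it on exit."""

    def __init__(self, make_stream):
        self._make_stream = make_stream
        self._stream = None

    def __enter__(self):
        self._stream = self._make_stream()
        return self._stream

    def __exit__(self, exc_type, exc, exc_tb):
        if self._stream is not None:
            self._stream.close()
        return None  # let any exception propagate


def open_stream():
    return FakeStream(["a", "b", "c"])


# Case 1: the caller breaks out of the loop after the first chunk.
with Manager(open_stream) as s:
    for chunk in s:
        break
early_closed = s.closed

# Case 2: the caller's code raises while consuming the stream.
try:
    with Manager(open_stream) as s2:
        for chunk in s2:
            raise RuntimeError("consumer failed mid-stream")
except RuntimeError:
    pass
error_closed = s2.closed
```

In both cases close() runs because __exit__ is invoked on every path out of the with block once __enter__ has returned.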

Sync vs. Async Variants

The SDK provides both synchronous and asynchronous versions of this pattern:

  • Sync: MessageStreamManager.__enter__ calls a synchronous partial function that returns a Stream[RawMessageStreamEvent].
  • Async: AsyncMessageStreamManager.__aenter__ awaits an Awaitable that resolves to an AsyncStream[RawMessageStreamEvent].

The async variant uses __aenter__/__aexit__ and async with syntax, but the principle is identical: defer the network call, scope the connection lifetime.
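The async shape can be sketched in the same way. The classes below are hypothetical stand-ins, not the SDK's types; the sketch assumes only what is stated above, namely that the manager holds an awaitable and awaits it in __aenter__. A coroutine object does not start running until it is awaited, which is what defers the call.

```python
import asyncio

sent = []  # records when the (fake) request body actually runs


class FakeAsyncStream:
    """Hypothetical async stream with a closed flag for observation."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.closed = False

    async def __aiter__(self):
        for chunk in self._chunks:
            yield chunk

    async def close(self):
        self.closed = True


async def fake_post(**params):
    """Hypothetical stand-in for the async request call."""
    sent.append(params)
    return FakeAsyncStream(["Hel", "lo"])


class AsyncManager:
    """Illustrative manager: awaits the stored awaitable on enter."""

    def __init__(self, api_request):
        self._api_request = api_request  # created but not yet awaited
        self._stream = None

    async def __aenter__(self):
        self._stream = await self._api_request  # the request runs here
        return self._stream

    async def __aexit__(self, exc_type, exc, exc_tb):
        if self._stream is not None:
            await self._stream.close()


async def main():
    manager = AsyncManager(fake_post(model="example-model"))
    sent_before = len(sent)  # 0: the coroutine has not started
    async with manager as stream:
        text = "".join([chunk async for chunk in stream])
    return sent_before, len(sent), text, stream.closed


result = asyncio.run(main())
```

One practical difference from the sync sketch: a coroutine object can be awaited only once, so a manager built this way is single-use.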

Why Not Return the Stream Directly?

An alternative design would be to have .stream() immediately make the HTTP call and return a MessageStream. The deferred context manager pattern is preferred for several reasons:

  • Guaranteed cleanup: Once __enter__ has returned, the with statement guarantees __exit__ is called, preventing connection leaks.
  • Lazy execution: The caller can construct the manager, pass it around, and decide when to actually issue the request.
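  • Consistent error scoping: Errors from issuing the request are raised at the with statement (during __enter__), not at the point of .stream() construction, so a single try/except around the with statement covers both connection failures and errors raised mid-stream.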
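The error-scoping point can be made concrete with a request that always fails. This is a sketch with hypothetical names (FakeConnectionError, failing_request, DeferredManager); it relies only on standard with-statement semantics: an exception raised in __enter__ propagates from the with statement itself, the body never runs, and __exit__ is not called.

```python
from functools import partial


class FakeConnectionError(Exception):
    """Hypothetical stand-in for a network failure."""


def failing_request(path):
    raise FakeConnectionError(f"could not reach {path}")


class DeferredManager:
    """Illustrative manager that issues the (fake) request on enter."""

    def __init__(self, api_request):
        self._api_request = api_request
        self.exit_called = False

    def __enter__(self):
        return self._api_request()  # raises here, not at construction

    def __exit__(self, exc_type, exc, exc_tb):
        self.exit_called = True


# Construction succeeds even though the request is doomed to fail.
manager = DeferredManager(partial(failing_request, "/v1/messages"))

body_ran = False
caught = None
try:
    with manager as stream:
        body_ran = True  # never reached: __enter__ raised first
except FakeConnectionError as err:
    caught = err
```

The try/except has to wrap the with statement rather than sit inside its body; placed there, the same handler also sees exceptions raised while the stream is being consumed.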

Design Rationale

This pattern follows the same approach used by httpx.Client.stream() and other Python networking libraries. By making resource management explicit through context managers, the SDK avoids a class of bugs where HTTP connections are left open indefinitely. The functools.partial technique for capturing the deferred call is lightweight and avoids the overhead of a coroutine or task scheduler for the sync variant.
