Principle:Anthropics Anthropic sdk python Stream Context Manager Setup
| Knowledge Sources | |
|---|---|
| Domains | Streaming, LLM |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
The Stream Context Manager Setup principle describes the deferred execution pattern used by the Anthropic Python SDK to manage streaming HTTP connections. Instead of immediately dispatching an HTTP request when the user calls the stream() method, the SDK returns a context manager object. The actual network call is deferred until the caller enters the with block (i.e., when __enter__ is invoked). This pattern provides deterministic resource cleanup through the corresponding __exit__ method, ensuring HTTP connections and response bodies are always properly closed regardless of whether the stream completes successfully or an exception occurs.
Core Concepts
Deferred Execution
Deferred execution means that calling client.messages.stream(...) does not immediately open an HTTP connection. Instead, it captures the request parameters into a callable (using functools.partial) and wraps them inside a MessageStreamManager. The HTTP request is only issued when the context manager's __enter__ method is called:
from anthropic import Anthropic
client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment by default
# No HTTP request is made here -- only parameters are captured
manager = client.messages.stream(model="claude-sonnet-4-20250514", max_tokens=1024, messages=[...])
# The HTTP request fires HERE, when entering the `with` block
with manager as stream:
    # `stream` is a live MessageStream connected to the server
    for text in stream.text_stream:
        print(text, end="", flush=True)
# Connection is closed HERE, when exiting the `with` block
This two-phase approach decouples request construction from request execution, giving the caller control over exactly when the network call happens.
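The capture-then-call idea can be illustrated without the SDK or a network. The following is a minimal sketch, assuming a hypothetical fake_post function in place of the real HTTP call and a placeholder model name; it only shows that functools.partial stores arguments without running anything.

```python
import functools

call_log = []  # records each time the (fake) request actually runs


def fake_post(path, *, model, max_tokens):
    """Hypothetical stand-in for an HTTP call; not part of the SDK."""
    call_log.append((path, model, max_tokens))
    return ["Hello", ", ", "world"]


# Phase 1: capture the parameters. The wrapped function is NOT called here.
make_request = functools.partial(
    fake_post, "/v1/messages", model="example-model", max_tokens=1024
)
calls_before = len(call_log)

# Phase 2: invoke the captured call, which is when the work happens.
chunks = make_request()
calls_after = len(call_log)
```

In this sketch the caller decides when phase 2 runs; in the pattern described above, that decision is tied to entering the with block.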
Context Manager Protocol
Python's context manager protocol defines two dunder methods:
- __enter__(self): Called when entering a with block. Returns the object to be bound to the as variable.
- __exit__(self, exc_type, exc, exc_tb): Called when leaving the with block, whether normally or via exception. Responsible for cleanup.
The SDK uses this protocol at two layers:
- MessageStreamManager: The outer context manager returned by .stream(). Its __enter__ fires the HTTP request and constructs a MessageStream. Its __exit__ closes that stream.
- MessageStream: The inner stream object itself also implements the context manager protocol (via its own __enter__/__exit__), delegating close to the underlying httpx stream.
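The two layers can be sketched with plain classes. This is an illustrative approximation, not the SDK's source: SketchStreamManager, SketchStream, and FakeResponse are hypothetical names, and FakeResponse stands in for the underlying httpx response.

```python
class FakeResponse:
    """Hypothetical stand-in for the underlying HTTP response."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SketchStream:
    """Inner layer: a context manager that delegates close to the response."""

    def __init__(self, response):
        self.response = response

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, exc_tb):
        self.close()

    def close(self):
        self.response.close()


class SketchStreamManager:
    """Outer layer: issues the deferred request on entry, closes on exit."""

    def __init__(self, api_request):
        self._api_request = api_request  # zero-argument callable
        self._stream = None

    def __enter__(self):
        self._stream = SketchStream(self._api_request())
        return self._stream

    def __exit__(self, exc_type, exc, exc_tb):
        if self._stream is not None:
            self._stream.close()


response = FakeResponse()
manager = SketchStreamManager(lambda: response)
with manager as stream:
    open_inside = not response.closed  # still open inside the block
closed_after = response.closed  # closed once the block exits
```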
Resource Lifecycle Management
Streaming HTTP connections hold open sockets and accumulate response data. Without proper cleanup, these resources can leak. The context manager pattern guarantees cleanup by tying the connection's lifetime to a lexical scope:
[stream() called] --> MessageStreamManager created (no HTTP yet)
|
v
[with ... as stream:] --> __enter__: HTTP request fires, MessageStream created
|
v
[loop over events] --> SSE events consumed from live connection
|
v
[exit with block] --> __exit__: stream.close() -> httpx response closed
This ensures that even if the caller breaks out of the loop early or an exception is raised mid-stream, the underlying HTTP connection is properly terminated.
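Both exit paths can be checked with a small simulation. The sketch below assumes a hypothetical fake_stream helper built with contextlib.contextmanager rather than the SDK's classes; it records open and close events so the ordering is visible.

```python
from contextlib import contextmanager

events = []  # ordered record of what happened


@contextmanager
def fake_stream(chunks):
    """Hypothetical stream: 'opens' on entry and always 'closes' on exit."""
    events.append("open")
    try:
        yield iter(chunks)
    finally:
        # Runs on normal exit, early break, and exceptions alike.
        events.append("close")


# Case 1: the caller breaks out of the loop early.
with fake_stream(["a", "b", "c"]) as stream:
    for chunk in stream:
        if chunk == "b":
            break

# Case 2: the caller's code raises mid-stream.
try:
    with fake_stream(["a", "b"]) as stream:
        for chunk in stream:
            raise RuntimeError("consumer failed")
except RuntimeError:
    events.append("handled")
```

In both cases the close event is recorded before control leaves the with statement.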
Sync vs. Async Variants
The SDK provides both synchronous and asynchronous versions of this pattern:
- Sync: MessageStreamManager.__enter__ calls a synchronous partial function that returns a Stream[RawMessageStreamEvent].
- Async: AsyncMessageStreamManager.__aenter__ awaits an Awaitable that resolves to an AsyncStream[RawMessageStreamEvent].
The async variant uses __aenter__/__aexit__ and async with syntax, but the principle is identical: defer the network call, scope the connection lifetime.
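The async shape can be sketched in the same spirit. SketchAsyncManager, FakeAsyncStream, and fake_request below are hypothetical names used for illustration only; the manager holds an un-awaited awaitable and awaits it in __aenter__, mirroring the description above.

```python
import asyncio


class FakeAsyncStream:
    """Hypothetical async stream that yields text chunks."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.closed = False

    def __aiter__(self):
        return self._iterate()

    async def _iterate(self):
        for chunk in self._chunks:
            yield chunk

    async def close(self):
        self.closed = True


class SketchAsyncManager:
    """Holds an awaitable; awaits it on entry and closes the stream on exit."""

    def __init__(self, api_request):
        self._api_request = api_request  # awaitable, not yet awaited
        self._stream = None

    async def __aenter__(self):
        self._stream = await self._api_request
        return self._stream

    async def __aexit__(self, exc_type, exc, exc_tb):
        if self._stream is not None:
            await self._stream.close()


async def fake_request():
    """Stand-in for the network call; runs only when awaited."""
    return FakeAsyncStream(["Hello", " ", "world"])


async def main():
    manager = SketchAsyncManager(fake_request())
    async with manager as stream:
        text = "".join([chunk async for chunk in stream])
    return text, stream.closed


result = asyncio.run(main())
```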
Why Not Return the Stream Directly?
An alternative design would be to have .stream() immediately make the HTTP call and return a MessageStream. The deferred context manager pattern is preferred for several reasons:
- Guaranteed cleanup: Once __enter__ has succeeded, the with statement guarantees that __exit__ is called, preventing connection leaks.
- Lazy execution: The caller can construct the manager, pass it around, and decide when to actually issue the request.
- Consistent error scoping: Network errors surface when the with block is entered or while the stream is consumed, so a single try/except around the with statement can handle them, rather than at the point of .stream() construction.
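The error-scoping point can be illustrated with a request that always fails. This is a sketch under stated assumptions: SketchManager and failing_request are hypothetical, and the built-in ConnectionError stands in for whatever exception a real client would raise.

```python
import functools

request_attempts = []  # records each attempted (fake) request


class SketchManager:
    """Hypothetical manager that runs the deferred request on entry."""

    def __init__(self, api_request):
        self._api_request = api_request

    def __enter__(self):
        return self._api_request()

    def __exit__(self, exc_type, exc, exc_tb):
        return None  # do not suppress exceptions


def failing_request(url):
    """Stand-in for a network call that cannot connect."""
    request_attempts.append(url)
    raise ConnectionError(f"cannot reach {url}")


# Construction succeeds: the failing call has not been attempted yet.
manager = SketchManager(functools.partial(failing_request, "https://example.invalid"))
attempts_after_construction = len(request_attempts)

caught = None
try:
    with manager as stream:
        pass  # never reached, because entry itself raises
except ConnectionError as err:
    caught = str(err)
```

Note that when __enter__ raises, __exit__ is not invoked, since there is nothing to clean up yet; the exception propagates from the with statement itself.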
Design Rationale
This pattern follows the same approach used by httpx.Client.stream() and other Python networking libraries. By making resource management explicit through context managers, the SDK avoids a class of bugs where HTTP connections are left open indefinitely. The functools.partial technique for capturing the deferred call is lightweight and avoids the overhead of a coroutine or task scheduler for the sync variant.