Principle: Anthropic Python SDK API Request Execution
| Knowledge Sources | |
|---|---|
| Domains | API_Client, LLM |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
The API Request Execution principle describes how the Anthropic Python SDK dispatches HTTP POST requests to the Messages API endpoint, handling parameter transformation, timeout calculation, response parsing, streaming dispatch, and automatic retry with exponential back-off. The Messages.create() method is the central point where typed parameters become a network request and a parsed response.
Theoretical Basis
HTTP POST with Typed Parameter Transformation
The Messages.create() method accepts keyword arguments that match the MessageCreateParams TypedDict shape. Before the request leaves the process, the SDK applies maybe_transform() to convert the Python-typed parameter dict into its JSON-wire-format equivalent. This transformation:
- Strips fields set to the sentinel omit value (the SDK's representation of "not provided")
- Materializes Iterable types into concrete lists
- Recursively transforms nested TypedDicts
The resulting dict is passed as the JSON body of an HTTP POST to /v1/messages.
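The transformation rules above can be sketched in a few lines. This is a hypothetical simplification, not the SDK's actual maybe_transform() implementation; the Omit sentinel class here stands in for the SDK's omit marker:

```python
from collections.abc import Iterable
from typing import Any


class Omit:
    """Hypothetical sentinel standing in for the SDK's `omit` marker."""


OMIT = Omit()


def transform(params: dict[str, Any]) -> dict[str, Any]:
    """Sketch of a wire-format transform: drop omitted fields,
    materialize iterables into lists, and recurse into nested dicts."""
    out: dict[str, Any] = {}
    for key, value in params.items():
        if isinstance(value, Omit):
            continue  # field was "not provided"; leave it off the wire
        if isinstance(value, dict):
            out[key] = transform(value)  # nested TypedDict-shaped dict
        elif isinstance(value, Iterable) and not isinstance(value, (str, bytes, list)):
            # e.g. a generator or tuple of message dicts becomes a JSON array
            out[key] = [transform(v) if isinstance(v, dict) else v for v in value]
        else:
            out[key] = value
    return out
```

The result is a plain dict ready to serialize as the POST body.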
Non-Streaming vs Streaming Dispatch
The stream parameter controls the return type through Python's @overload mechanism:
- stream=False (default) -- The SDK sends a standard POST, waits for the complete response, and parses it into a Message Pydantic model.
- stream=True -- The SDK sends the same POST but with "stream": true in the JSON body and wraps the response in a Stream[RawMessageStreamEvent] object that yields server-sent events incrementally.
The type system enforces this at the caller's site: the three @overload signatures ensure that stream=Literal[False] returns Message, stream=Literal[True] returns Stream[RawMessageStreamEvent], and stream=bool returns the union.
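The overload pattern can be illustrated with a stripped-down stand-in. The Message and Stream classes here are placeholders, not the SDK's real types; only the @overload/Literal structure mirrors what the section describes:

```python
from typing import Literal, Union, overload


class Message:
    """Placeholder for the parsed non-streaming response model."""


class Stream:
    """Placeholder for the SSE stream wrapper."""


@overload
def create(*, stream: Literal[False] = False) -> Message: ...
@overload
def create(*, stream: Literal[True]) -> Stream: ...
@overload
def create(*, stream: bool) -> Union[Message, Stream]: ...


def create(*, stream: bool = False) -> Union[Message, Stream]:
    # Runtime dispatch mirrors the wire behavior: streaming requests
    # get a stream wrapper, non-streaming requests get a parsed model.
    return Stream() if stream else Message()
```

A type checker narrows the return type at each call site, so `create(stream=True)` is known statically to be a Stream without any cast.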
Automatic Timeout Adjustment
For non-streaming requests, the SDK implements intelligent timeout scaling. When the user has not provided an explicit timeout and the client is using the default timeout (10 minutes), the SDK calls _calculate_nonstreaming_timeout() with the requested max_tokens and a model-specific token rate from MODEL_NONSTREAMING_TOKENS. This accounts for the fact that:
- Non-streaming requests must complete fully before any data is returned
- Larger max_tokens values require longer wall-clock time
- Certain models (e.g., claude-opus-4-20250514) have known token generation rates
This ensures that long-generation requests do not spuriously time out while still maintaining tight timeouts for small requests.
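The scaling logic can be sketched as follows. The rate table and the 1.5x safety margin are illustrative assumptions, not the SDK's actual _calculate_nonstreaming_timeout() internals or MODEL_NONSTREAMING_TOKENS values:

```python
# Hypothetical per-model generation rates (tokens per minute); the SDK
# keeps its real table in MODEL_NONSTREAMING_TOKENS.
TOKENS_PER_MINUTE = {"example-model": 128_000}

DEFAULT_TIMEOUT = 600.0  # the client default: 10 minutes


def calculate_nonstreaming_timeout(max_tokens: int, model: str) -> float:
    """Sketch: scale the timeout with the expected generation time,
    never dropping below the 10-minute default."""
    rate = TOKENS_PER_MINUTE.get(model)
    if rate is None:
        return DEFAULT_TIMEOUT  # unknown model: keep the default
    expected_minutes = max_tokens / rate
    # 1.5x safety margin (assumed) over the expected wall-clock time
    return max(DEFAULT_TIMEOUT, expected_minutes * 60 * 1.5)
```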
Automatic Retry with Exponential Back-off
The SDK inherits retry behavior from the base client. Transient errors (HTTP 429 rate limits, 5xx server errors, connection failures) trigger automatic retries up to max_retries times (default 2). The retry delay follows exponential back-off:
- Initial delay: 0.5 seconds
- Maximum delay: 8.0 seconds
- Jitter is applied to prevent thundering herd
The retry count is configured at the client level and can be overridden per-request by using client.with_options(max_retries=N).
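The back-off schedule above can be expressed directly. The 25% jitter factor here is an illustrative choice, not the SDK's exact jitter formula:

```python
import random

INITIAL_DELAY = 0.5  # seconds, first retry
MAX_DELAY = 8.0      # seconds, cap on the doubling


def retry_delay(attempt: int) -> float:
    """Exponential back-off: 0.5 s doubling per retry, capped at 8 s,
    with jitter (assumed: up to 25% reduction) so that many clients
    retrying at once do not hit the server in lockstep."""
    base = min(MAX_DELAY, INITIAL_DELAY * 2 ** attempt)
    return base * (1 - 0.25 * random.random())
```

With these constants, retries sleep roughly 0.5 s, 1 s, 2 s, 4 s, then 8 s for every attempt after that, minus the random jitter.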
Deprecated Model Warnings
The create() method checks the requested model against a DEPRECATED_MODELS dictionary. If the model is scheduled for end-of-life, a DeprecationWarning is emitted with the deprecation date and a link to migration documentation. This provides a proactive migration signal without breaking existing code.
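A minimal sketch of this check, assuming a hypothetical model name and message; the SDK keeps the real table in DEPRECATED_MODELS:

```python
import warnings

# Hypothetical entry; the real dictionary lives in the SDK as
# DEPRECATED_MODELS and maps model IDs to deprecation details.
DEPRECATED_MODELS = {"example-legacy-model": "reaches end-of-life on 2026-01-01"}


def warn_if_deprecated(model: str) -> None:
    """Emit a DeprecationWarning if the model is scheduled for removal."""
    detail = DEPRECATED_MODELS.get(model)
    if detail is not None:
        warnings.warn(
            f"Model {model!r} is deprecated and {detail}; "
            "see the migration documentation for a replacement.",
            DeprecationWarning,
            stacklevel=3,  # attribute the warning to the caller of create()
        )
```

Because the check only warns, existing code keeps working while callers get an actionable signal in their logs and test suites.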
Design Constraints
- The @required_args decorator enforces at runtime that max_tokens, messages, and model are always provided, complementing the static type checking.
- The endpoint is always /v1/messages -- there is no per-model endpoint routing.
- The cast_to=Message parameter instructs the base client to parse the JSON response into the Message Pydantic model.
- When stream=True, the stream_cls=Stream[RawMessageStreamEvent] parameter tells the base client which stream wrapper to use.
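The runtime enforcement in the first constraint can be sketched with a simplified decorator. This is a hypothetical reduction, not the SDK's actual @required_args (which supports multiple valid argument combinations):

```python
import functools


def required_args(*names: str):
    """Simplified sketch: reject calls missing any required keyword
    argument, raising TypeError before a request is ever built."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(**kwargs):
            missing = [n for n in names if n not in kwargs]
            if missing:
                raise TypeError(f"Missing required arguments: {missing}")
            return fn(**kwargs)
        return wrapper
    return decorate


@required_args("max_tokens", "messages", "model")
def create(**kwargs):
    # Stand-in for Messages.create(); just echoes the validated params.
    return kwargs
```

This catches callers who bypass static type checking (e.g. by building kwargs dynamically) with a clear error instead of a malformed API request.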