Principle: Anthropic Python SDK Response Processing
| Knowledge Sources | |
|---|---|
| Domains | API_Client, LLM |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
The Response Processing principle describes how the Anthropic Python SDK parses raw JSON API responses into structured, type-safe Pydantic models. The response model hierarchy uses discriminated unions to represent heterogeneous content blocks, Pydantic's BaseModel for automatic validation, and dedicated types for token usage tracking.
Theoretical Basis
Structured Response Parsing with Pydantic Models
When the API returns a JSON response, the SDK deserializes it into a Message Pydantic model rather than returning a raw dictionary. This provides:
- Attribute access -- Fields are accessed as Python attributes (`message.content`, `message.usage.input_tokens`) rather than through dictionary key lookups, improving readability and IDE support.
- Type validation -- Pydantic validates the response shape against the model schema. If the API returns unexpected data and strict validation is enabled, an `APIResponseValidationError` is raised.
- Immutability by convention -- While Pydantic models are technically mutable, the SDK treats response objects as read-only data carriers.
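As a rough illustration of the attribute-access pattern, the sketch below uses plain dataclasses as stand-ins for the SDK's Pydantic models (the real SDK uses `pydantic.BaseModel` with full validation; the field set here is trimmed for brevity):

```python
import json
from dataclasses import dataclass

# Minimal stand-ins for the SDK's response models (illustrative only).
@dataclass
class Usage:
    input_tokens: int
    output_tokens: int

@dataclass
class TextBlock:
    type: str
    text: str

@dataclass
class Message:
    id: str
    role: str
    content: list
    usage: Usage

# A raw JSON API response is deserialized into typed objects...
raw = json.loads(
    '{"id": "msg_123", "role": "assistant",'
    ' "content": [{"type": "text", "text": "Hello!"}],'
    ' "usage": {"input_tokens": 10, "output_tokens": 3}}'
)
msg = Message(
    id=raw["id"],
    role=raw["role"],
    content=[TextBlock(**b) for b in raw["content"]],
    usage=Usage(**raw["usage"]),
)

# ...so callers use attribute access instead of dict lookups.
print(msg.content[0].text)     # Hello!
print(msg.usage.input_tokens)  # 10
```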
ContentBlock as a Discriminated Union
The content field of a Message is a list of ContentBlock items. ContentBlock is defined as a discriminated union -- a union type where each variant is identified by a unique value in a shared discriminator field:
ContentBlock = Union[
TextBlock, # type = "text"
ThinkingBlock, # type = "thinking"
RedactedThinkingBlock, # type = "redacted_thinking"
ToolUseBlock, # type = "tool_use"
ServerToolUseBlock, # type = "server_tool_use"
WebSearchToolResultBlock # type = "web_search_tool_result"
]
The type field serves as the discriminator. When Pydantic deserializes the JSON array, it inspects each element's "type" value to select the correct union variant. This design:
- Avoids try/except parsing -- No need to attempt multiple parsers; the discriminator field deterministically selects the correct model.
- Supports extensibility -- New content block types can be added by extending the union without modifying existing variants.
- Enables pattern matching -- Callers can switch on the `type` attribute to handle each variant appropriately.
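The discriminator-driven selection can be sketched as follows. This is not the SDK's implementation (Pydantic handles discriminated unions internally via `Field(discriminator=...)`); `BLOCK_TYPES` and `parse_block` are hypothetical names used to show the mechanism with plain dataclasses:

```python
from dataclasses import dataclass

@dataclass
class TextBlock:
    text: str

@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: dict

# Hypothetical registry mapping each "type" discriminator to a variant.
BLOCK_TYPES = {
    "text": TextBlock,
    "tool_use": ToolUseBlock,
}

def parse_block(data: dict):
    # The discriminator deterministically selects the variant --
    # no try/except cascade over candidate parsers.
    data = dict(data)  # avoid mutating the caller's dict
    block_cls = BLOCK_TYPES[data.pop("type")]
    return block_cls(**data)

blocks = [
    parse_block({"type": "text", "text": "Let me check."}),
    parse_block({"type": "tool_use", "id": "tu_1",
                 "name": "get_weather", "input": {"city": "Paris"}}),
]
print([type(b).__name__ for b in blocks])  # ['TextBlock', 'ToolUseBlock']
```

Extending the union is then a matter of registering a new class under a new discriminator value, without touching existing variants.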
StopReason as a Literal Union
The stop_reason field uses a TypeAlias of string literals to represent why generation stopped:
- `"end_turn"` -- Natural completion
- `"max_tokens"` -- Token limit reached
- `"stop_sequence"` -- Custom stop sequence matched
- `"tool_use"` -- Model invoked a tool
- `"pause_turn"` -- Long-running turn paused
- `"refusal"` -- Streaming classifier intervention
This is Optional because in streaming mode the initial message_start event has stop_reason=None.
Token Usage Tracking
The Usage model provides comprehensive token accounting for billing and rate limiting:
- `input_tokens` -- Tokens consumed by the input prompt
- `output_tokens` -- Tokens generated in the response
- `cache_creation_input_tokens` -- Tokens used to create a prompt cache entry (optional)
- `cache_read_input_tokens` -- Tokens read from an existing prompt cache entry (optional)
The total input tokens for billing purposes is the sum of input_tokens, cache_creation_input_tokens, and cache_read_input_tokens. Token counts do not directly correspond to visible text because the API applies internal transformations between the request format and the model's native format.
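The summation described above can be sketched as a small helper (`total_input_tokens` is a hypothetical function, shown here over a plain dict; the SDK exposes these fields as attributes of its `Usage` model):

```python
def total_input_tokens(usage: dict) -> int:
    """Sum uncached, cache-creation, and cache-read input tokens.

    The cache fields are optional and may be absent or None,
    so they default to 0.
    """
    return (
        usage.get("input_tokens", 0)
        + (usage.get("cache_creation_input_tokens") or 0)
        + (usage.get("cache_read_input_tokens") or 0)
    )

usage = {
    "input_tokens": 120,
    "cache_read_input_tokens": 2048,
    "output_tokens": 57,
}
print(total_input_tokens(usage))  # 2168
```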
Nested Model Composition
The response model graph is composed through nested Pydantic models:
- `Message` contains `Usage` and a list of `ContentBlock`
- `TextBlock` optionally contains a list of `TextCitation`
- `ToolUseBlock` contains the tool name, ID, and a dict of input parameters
Each level of nesting is independently validated and independently accessible, allowing callers to extract exactly the data they need.
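For example, a caller interested only in the visible reply can filter the content list by discriminator and ignore every other nesting level (a sketch using plain dicts in place of the SDK's block models):

```python
# A content list mixing several block variants.
content = [
    {"type": "thinking", "thinking": "(internal reasoning)"},
    {"type": "text", "text": "The answer "},
    {"type": "text", "text": "is 42."},
]

# Extract only the text blocks; other variants are skipped.
reply = "".join(b["text"] for b in content if b["type"] == "text")
print(reply)  # The answer is 42.
```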
Design Constraints
- The `Message.role` field is always `Literal["assistant"]` -- the API only returns assistant messages.
- The `Message.type` field is always `Literal["message"]` -- this is the object type discriminator for the API's polymorphic response envelope.
- `output_tokens` is always non-zero, even for empty string responses, because the model's internal processing consumes tokens.
- Citation support in `TextBlock` is optional and depends on whether the request included document sources.