Workflow:Anthropics Anthropic sdk python Structured Output Extraction
| Knowledge Sources | |
|---|---|
| Domains | LLMs, Data_Extraction, Structured_Output |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
End-to-end process for extracting structured, typed data from Claude's responses by parsing them into Pydantic models using the Anthropic Python SDK's parse and streaming parse capabilities.
Description
This workflow demonstrates how to obtain structured, validated output from Claude by specifying a Pydantic BaseModel as the desired output format. The SDK's parse() method and streaming equivalent automatically constrain the model's output to match the provided schema and deserialize the response into a typed Python object. This eliminates manual JSON parsing and validation, providing type-safe access to extracted data. The workflow supports both synchronous one-shot parsing and incremental streaming with partial snapshots.
Usage
Execute this workflow when you need to extract structured data from natural language (e.g., extracting order details, parsing entities, classifying text), when building pipelines that require typed inputs from LLM outputs, or when you want guaranteed schema compliance from Claude's responses.
Execution Steps
Step 1: Output Schema Definition
Define the desired output structure as a Pydantic BaseModel. The model's fields, types, and optional validators define the schema that Claude's response must conform to. Nested models, lists, enums, and optional fields are all supported.
Key considerations:
- Use Pydantic BaseModel with typed fields to define the output structure
- Field names and types directly translate to the JSON Schema sent to the API
- Nested models (BaseModel within BaseModel) create nested object schemas
- Optional fields, default values, and Field validators are respected
Step 2: Parse Request Execution
Call client.messages.parse() instead of client.messages.create(), passing the Pydantic model class as the output_format parameter. The SDK automatically converts the model to a JSON Schema, instructs Claude to produce conforming output, and deserializes the response.
Key considerations:
- The parse() method accepts the same parameters as create() plus output_format
- output_format takes a Pydantic BaseModel class (not an instance)
- The SDK handles JSON Schema generation from the model class automatically
- The returned object is a ParsedMessage with an additional parsed_output field
Step 3: Parsed Output Access
Access the structured result through the parsed_output property of the returned ParsedMessage. This is a fully instantiated Pydantic model with validated, typed fields ready for use in application logic.
Key considerations:
- parsed_output is an instance of the specified Pydantic model
- All Pydantic validation rules apply to the parsed output
- If parsing fails, the raw text response is still accessible through the message content blocks
- The message also contains standard fields: stop_reason, usage, model, etc.
Step 4: Streaming Structured Output
For real-time parsing, use client.messages.stream() with the output_format parameter. During streaming, call stream.parsed_snapshot() on text events to get partially parsed objects that update incrementally as more data arrives.
Key considerations:
- Streaming parse uses the same output_format parameter as non-streaming
- parsed_snapshot() returns a partial Pydantic model instance (fields may be None until populated)
- After the stream completes, get_final_message().parsed_output provides the complete parsed object
- This enables progressive UI updates as structured data fills in