Workflow:Anthropics Anthropic sdk python Structured Output Extraction

Knowledge Sources	Anthropic SDK Python Anthropic API Docs
Domains	LLMs, Data_Extraction, Structured_Output
Last Updated	2026-02-15 12:00 GMT

Overview

End-to-end process for extracting structured, typed data from Claude's responses by parsing them into Pydantic models using the Anthropic Python SDK's parse and streaming parse capabilities.

Description

This workflow demonstrates how to obtain structured, validated output from Claude by specifying a Pydantic BaseModel as the desired output format. The SDK's parse() method and streaming equivalent automatically constrain the model's output to match the provided schema and deserialize the response into a typed Python object. This eliminates manual JSON parsing and validation, providing type-safe access to extracted data. The workflow supports both synchronous one-shot parsing and incremental streaming with partial snapshots.

Usage

Execute this workflow when you need to extract structured data from natural language (e.g., extracting order details, parsing entities, classifying text), when building pipelines that require typed inputs from LLM outputs, or when you want guaranteed schema compliance from Claude's responses.

Execution Steps

Step 1: Output Schema Definition

Define the desired output structure as a Pydantic BaseModel. The model's fields, types, and optional validators define the schema that Claude's response must conform to. Nested models, lists, enums, and optional fields are all supported.

Key considerations:

Use Pydantic BaseModel with typed fields to define the output structure
Field names and types directly translate to the JSON Schema sent to the API
Nested models (BaseModel within BaseModel) create nested object schemas
Optional fields, default values, and Field validators are respected

Step 2: Parse Request Execution

Call client.messages.parse() instead of client.messages.create(), passing the Pydantic model class as the output_format parameter. The SDK automatically converts the model to a JSON Schema, instructs Claude to produce conforming output, and deserializes the response.

Key considerations:

The parse() method accepts the same parameters as create() plus output_format
output_format takes a Pydantic BaseModel class (not an instance)
The SDK handles JSON Schema generation from the model class automatically
The returned object is a ParsedMessage with an additional parsed_output field

Step 3: Parsed Output Access

Access the structured result through the parsed_output property of the returned ParsedMessage. This is a fully instantiated Pydantic model with validated, typed fields ready for use in application logic.

Key considerations:

parsed_output is an instance of the specified Pydantic model
All Pydantic validation rules apply to the parsed output
If parsing fails, the raw text response is still accessible through the message content blocks
The message also contains standard fields: stop_reason, usage, model, etc.

Step 4: Streaming Structured Output

For real-time parsing, use client.messages.stream() with the output_format parameter. During streaming, call stream.parsed_snapshot() on text events to get partially parsed objects that update incrementally as more data arrives.

Key considerations:

Streaming parse uses the same output_format parameter as non-streaming
parsed_snapshot() returns a partial Pydantic model instance (fields may be None until populated)
After the stream completes, get_final_message().parsed_output provides the complete parsed object
This enables progressive UI updates as structured data fills in

Execution Diagram

GitHub URL

Workflow Repository