
Principle: anthropics/anthropic-sdk-python Multi-turn Thinking Conversations

From Leeroopedia
Knowledge Sources
Domains Extended_Thinking, LLM, Reasoning
Last Updated 2026-02-15 00:00 GMT

Overview

Multi-turn Thinking Conversations is the principle of preserving the model's reasoning context across multiple conversation turns by echoing back thinking blocks with exact signatures and handling redacted thinking blocks transparently. This ensures that the model can verify the integrity of its prior reasoning and maintain coherent chain-of-thought across a multi-turn dialogue.

Theory: Preserving Reasoning Context Across Turns

In a multi-turn conversation with extended thinking, each assistant response may contain thinking blocks that capture the model's internal reasoning. When the developer sends a follow-up message, the API requires that these thinking blocks be included in the conversation history so the model can:

  • Verify reasoning integrity: The cryptographic signature on each thinking block allows the API to confirm that the thinking content has not been tampered with since it was generated.
  • Maintain coherent reasoning: By seeing its own prior reasoning, the model can build on previous analysis rather than starting from scratch on each turn.
  • Preserve safety invariants: Redacted thinking blocks (where content was safety-filtered) must be echoed back exactly, allowing the API to account for filtered reasoning without exposing it.

The Requirement to Echo Back Thinking Blocks

When constructing the message list for a multi-turn conversation, the developer must include all content blocks from prior assistant responses, including thinking blocks. This creates a round-trip pattern:

  1. First turn: The API returns a Message with ThinkingBlock and TextBlock items in its content list
  2. Preparing next turn: The developer includes the entire content list from the previous response as the content of an assistant message
  3. Second turn: The API receives the echoed thinking blocks, verifies their signatures, and generates a new response informed by the prior reasoning
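The three steps above can be sketched with plain dicts that mirror the SDK's block shapes (`ThinkingBlock`, `TextBlock`). The message text, thinking content, and signature values here are placeholders, not real API output:

```python
# Step 1: the first turn's user message, and the content list the API
# would return in its Message response (placeholder values).
history = [{"role": "user", "content": "Is 9091 prime?"}]

first_content = [
    {"type": "thinking", "thinking": "Check divisibility up to sqrt(9091)...",
     "signature": "sig-abc"},  # placeholder signature
    {"type": "text", "text": "Yes, 9091 is prime."},
]

# Step 2: echo the ENTIRE content list back as an assistant message,
# then append the new user message.
history += [
    {"role": "assistant", "content": first_content},
    {"role": "user", "content": "What about 9093?"},
]

# Step 3: send `history` in the next request; the API verifies the echoed
# signatures before generating a response informed by the prior reasoning.
```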

Signature Integrity

Each ThinkingBlock carries a signature field containing a cryptographic signature. This signature must be preserved exactly when echoing the block back. If the signature is modified, the API will reject the request. If the thinking text is modified but the signature is not, the API will detect the mismatch and reject the request.

This integrity mechanism ensures that:

  • Developers cannot fabricate thinking blocks to influence the model's reasoning
  • The model can trust that the thinking it sees in the history is genuinely its own prior output
  • The conversation history is tamper-evident

Exact Preservation Requirement

Both the thinking text and the signature must be preserved exactly as received. No modifications, truncation, or reformatting are permitted. The simplest pattern is to pass the entire content list from the response directly into the next request's assistant message content.

Handling Redacted Thinking Blocks Transparently

Redacted thinking blocks are a special case that requires careful handling:

  • The original thinking content has been removed by safety filters and is not accessible to the developer
  • The block contains only an opaque data field with encrypted content
  • This data field must be preserved exactly when echoing back in multi-turn conversations
  • The API uses the opaque data to reconstruct its understanding of the filtered reasoning, maintaining conversation continuity

The principle of transparent handling means:

  • The developer does not need to understand or interpret the redacted data
  • The developer simply passes it through unchanged, treating it as an opaque blob
  • The same code path that handles normal thinking blocks can handle redacted thinking blocks -- just include all content blocks from the response
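A minimal sketch of that transparent pass-through, using a placeholder `data` value (real redacted blocks carry encrypted content you should never inspect or alter). The key point is that the assistant turn is built from the full content list, so redacted blocks ride along unchanged:

```python
def assistant_turn(response_content):
    """Wrap a prior response's content list as the next request's
    assistant message, preserving every block unchanged. A common bug
    is to filter down to text blocks here; don't."""
    return {"role": "assistant", "content": response_content}

content = [
    {"type": "redacted_thinking", "data": "opaque-placeholder"},  # encrypted; pass through as-is
    {"type": "text", "text": "The answer is 42."},
]
turn = assistant_turn(content)
```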

The Conversion Pattern

In practice, the SDK provides two parallel type hierarchies:

  • Response types (ThinkingBlock, RedactedThinkingBlock): Pydantic models returned in API responses
  • Param types (ThinkingBlockParam, RedactedThinkingBlockParam): TypedDict types accepted in API requests

The conversion from response to param is straightforward because both hierarchies share the same field names and types. The Anthropic Python SDK handles this conversion automatically when the developer passes the content list from a Message response into the content field of a MessageParam.
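The field-level correspondence can be illustrated with a hand-rolled converter. This is a sketch, not SDK code: the SDK performs this conversion automatically, and plain dicts stand in for the param TypedDicts so the snippet runs without the package installed. It assumes the documented field names (`thinking`/`signature` for thinking blocks, `data` for redacted ones):

```python
def block_to_param(block):
    """Convert a response block (an object with attributes) into the dict
    shape accepted by the request API. In practice, passing Message.content
    straight into an assistant MessageParam achieves the same thing."""
    if block.type == "thinking":
        return {"type": "thinking", "thinking": block.thinking,
                "signature": block.signature}
    if block.type == "redacted_thinking":
        return {"type": "redacted_thinking", "data": block.data}
    if block.type == "text":
        return {"type": "text", "text": block.text}
    raise ValueError(f"unexpected block type: {block.type}")
```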

Multi-turn Conversation Pattern

The recommended pattern for multi-turn thinking conversations is:

  1. Send the first user message with thinking enabled
  2. Receive the response containing thinking and text blocks
  3. Construct the next message list by including:
    • The original user message
    • An assistant message with the complete content list from the response (including all thinking and redacted thinking blocks)
    • The new user message
  4. Send the second request with the same thinking configuration

This pattern naturally extends to any number of turns. Each turn's assistant content (including all thinking blocks) must be preserved in the conversation history.
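The full pattern can be sketched end to end with the Anthropic Python SDK. The model name and token budgets below are assumptions to adjust for your account; the helper that threads the history is the part that embodies the principle:

```python
THINKING = {"type": "enabled", "budget_tokens": 10_000}

def with_follow_up(history, response_content, follow_up):
    """Extend the history with the assistant's complete content list
    (thinking and redacted thinking blocks included) plus the next
    user message."""
    return history + [
        {"role": "assistant", "content": response_content},
        {"role": "user", "content": follow_up},
    ]

def main():
    from anthropic import Anthropic
    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    history = [{"role": "user", "content": "Is 9091 prime?"}]
    first = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed model name
        max_tokens=16_000,                 # must exceed the thinking budget
        thinking=THINKING,
        messages=history,
    )
    # Echo the entire content list back verbatim, and reuse the same
    # thinking configuration on the follow-up turn.
    history = with_follow_up(history, first.content, "What about 9093?")
    second = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=16_000,
        thinking=THINKING,
        messages=history,
    )
    print(second.content[-1].text)

if __name__ == "__main__":
    main()
```

Extending to a third turn is the same move again: call `with_follow_up(history, second.content, ...)` before the next request.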

Design Rationale

  • Security by default: The signature verification mechanism prevents prompt injection through fabricated thinking blocks.
  • Simplicity of use: Developers do not need to manually convert between response and param types -- passing the content list directly works.
  • Graceful degradation: Redacted thinking blocks maintain conversation continuity even when safety filters remove reasoning content.
  • Stateless API: The API does not maintain server-side conversation state. All context is passed in each request through the message list, including thinking blocks.
