
Principle: anthropics/anthropic-sdk-python Tool Call Detection

From Leeroopedia
Knowledge Sources
Domains Tool_Use, LLM, Function_Calling
Last Updated 2026-02-15 00:00 GMT

Overview

Tool Call Detection is the process of inspecting a Claude API response to determine whether the model is requesting one or more tool invocations. This is step three in the Tool Use Integration workflow: after sending tool definitions in a request, the application must parse the response to identify tool call requests and route them appropriately.

Theory: Inspecting LLM Responses for Tool Invocations

Claude's response (Message object) contains a content array that can hold multiple heterogeneous blocks. When the model decides to use a tool, it emits tool_use blocks alongside optional text blocks. The application must:

  1. Check the top-level stop_reason field for a quick signal
  2. Iterate over message.content to find blocks of type "tool_use"
  3. Extract the tool name, input arguments, and correlation ID from each block

This two-level detection (stop_reason check plus content filtering) provides both a fast path and a precise extraction mechanism.
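The two-level detection can be sketched as follows. This is a minimal illustration using `SimpleNamespace` stand-ins in place of the SDK's `Message` and block models, so the helper names (`wants_tool_use`, `extract_tool_calls`) are illustrative, not SDK API:

```python
from types import SimpleNamespace

def wants_tool_use(message) -> bool:
    """Fast path: the stop_reason field signals tool use directly."""
    return message.stop_reason == "tool_use"

def extract_tool_calls(message) -> list:
    """Precise path: filter the content array by block type."""
    return [block for block in message.content if block.type == "tool_use"]

# Stand-in for an SDK Message that requests one tool call.
message = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="text", text="Let me check the weather."),
        SimpleNamespace(type="tool_use", id="toolu_01", name="get_weather",
                        input={"city": "Paris"}),
    ],
)

calls = extract_tool_calls(message) if wants_tool_use(message) else []
print([(c.name, c.input) for c in calls])  # [('get_weather', {'city': 'Paris'})]
```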

The stop_reason Signal

The Message.stop_reason field is a string literal indicating why the model stopped generating:

Value Meaning
"end_turn" Model finished its response naturally (no tool call)
"tool_use" Model wants to invoke one or more tools
"max_tokens" Response was truncated due to token limit
"stop_sequence" A custom stop sequence was matched
"pause_turn" Model paused mid-turn (streaming scenarios)
"refusal" Model refused the request

The type is defined as:

StopReason: TypeAlias = Literal[
    "end_turn", "max_tokens", "stop_sequence", "tool_use", "pause_turn", "refusal"
]

Checking message.stop_reason == "tool_use" is the fastest way to determine if tool processing is needed. However, note that stop_reason can be None in streaming contexts before the response is complete.

Discriminated Union Pattern: Content Block Filtering

The message.content array contains blocks that are discriminated by their type field. Each block is a Pydantic model with a type literal:

  • TextBlock -- type: Literal["text"]
  • ToolUseBlock -- type: Literal["tool_use"]
  • ThinkingBlock -- type: Literal["thinking"]

The standard Python pattern for extracting tool calls is:

tool_use_blocks = [block for block in message.content if block.type == "tool_use"]
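Because the blocks are distinct Pydantic classes, an equivalent filter can be written with `isinstance` against the block types. The sketch below mimics that pattern with local dataclass stand-ins (in real code the classes would come from `anthropic.types`):

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union

# Local stand-ins for the SDK's Pydantic block models.
@dataclass
class TextBlock:
    text: str
    type: str = "text"

@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: Dict[str, Any]
    type: str = "tool_use"

content: List[Union[TextBlock, ToolUseBlock]] = [
    TextBlock(text="Checking..."),
    ToolUseBlock(id="toolu_x", name="lookup", input={"key": "a"}),
]

# Equivalent to filtering on block.type == "tool_use".
tool_calls = [b for b in content if isinstance(b, ToolUseBlock)]
print([b.name for b in tool_calls])  # ['lookup']
```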

Each ToolUseBlock carries three essential fields:

Field Type Purpose
id str Unique identifier for this tool call; used to correlate results
name str Name of the tool the model wants to invoke
input Dict[str, object] The arguments the model generated, conforming to the tool's input schema

Detection Flow

The recommended detection pattern follows this sequence:

  1. Quick check: If message.stop_reason != "tool_use", the model did not request any tools -- process the text response directly
  2. Extract blocks: Filter message.content for type == "tool_use"
  3. Iterate and dispatch: For each ToolUseBlock, look up the tool by name and pass input to the execution function
  4. Collect results: Build ToolResultBlockParam dicts with the matching tool_use_id
In code:

if message.stop_reason == "tool_use":
    tool_results = []
    for block in message.content:
        if block.type == "tool_use":
            # block.id    -> correlation ID
            # block.name  -> which tool to call
            # block.input -> arguments dict
            result = dispatch_tool(block.name, block.input)
            tool_results.append(
                {"type": "tool_result", "tool_use_id": block.id, "content": result}
            )

Mixed Content Responses

A single response can contain both text and tool_use blocks. The model may explain its reasoning in text before requesting a tool call. Applications should handle this by:

  • Displaying or logging any TextBlock content for transparency
  • Processing all ToolUseBlock entries for execution
  • Preserving the entire content array when appending the assistant message to conversation history
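These three points can be sketched together with plain dicts standing in for the SDK models (`history` here is just an ordinary messages list, not SDK API):

```python
# Stand-in for a mixed response: text reasoning followed by a tool request.
assistant_content = [
    {"type": "text", "text": "I'll look that up."},
    {"type": "tool_use", "id": "toolu_9", "name": "search", "input": {"q": "mcp"}},
]

history = [{"role": "user", "content": "What is MCP?"}]

# Preserve the ENTIRE content array: dropping the text block discards
# context; dropping the tool_use block breaks result correlation.
history.append({"role": "assistant", "content": assistant_content})

# Log text blocks for transparency; collect tool_use blocks for execution.
for block in assistant_content:
    if block["type"] == "text":
        print("assistant said:", block["text"])

to_execute = [b for b in assistant_content if b["type"] == "tool_use"]
```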

Edge Cases

  • Multiple tool calls: The model can emit several tool_use blocks in one response (parallel tool use). Each must be executed and each result must reference its specific tool_use_id.
  • Streaming: In streaming mode, a tool_use block's input arrives incrementally as partial JSON deltas that must be accumulated. The stop_reason is only available after the stream completes.
  • max_tokens truncation: If stop_reason == "max_tokens", the response may contain a partial or no tool_use block. The application should handle this gracefully.
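The parallel tool use case can be handled by mapping over every tool_use block and echoing each block's id into its result. A sketch with a hypothetical dispatch table (the tool names and `TOOLS` mapping are illustrative):

```python
# Hypothetical dispatch table for two tools.
TOOLS = {
    "add": lambda inp: str(inp["a"] + inp["b"]),
    "upper": lambda inp: inp["s"].upper(),
}

# Stand-in for a response containing two parallel tool_use blocks.
blocks = [
    {"type": "tool_use", "id": "toolu_1", "name": "add", "input": {"a": 2, "b": 3}},
    {"type": "tool_use", "id": "toolu_2", "name": "upper", "input": {"s": "ok"}},
]

# Each result must carry the tool_use_id of the block that requested it.
results = [
    {
        "type": "tool_result",
        "tool_use_id": b["id"],
        "content": TOOLS[b["name"]](b["input"]),
    }
    for b in blocks
    if b["type"] == "tool_use"
]

print([(r["tool_use_id"], r["content"]) for r in results])
# [('toolu_1', '5'), ('toolu_2', 'OK')]
```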
