Principle: Anthropic SDK (Python) Tool Call Detection
| Knowledge Sources | |
|---|---|
| Domains | Tool_Use, LLM, Function_Calling |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Tool Call Detection is the process of inspecting a Claude API response to determine whether the model is requesting one or more tool invocations. This is step three in the Tool Use Integration workflow: after sending tool definitions in a request, the application must parse the response to identify tool call requests and route them appropriately.
Theory: Inspecting LLM Responses for Tool Invocations
Claude's response (Message object) contains a content array that can hold multiple heterogeneous blocks. When the model decides to use a tool, it emits tool_use blocks alongside optional text blocks. The application must:
- Check the top-level `stop_reason` field for a quick signal
- Iterate over `message.content` to find blocks of type `"tool_use"`
- Extract the tool name, input arguments, and correlation ID from each block
This two-level detection (stop_reason check plus content filtering) provides both a fast path and a precise extraction mechanism.
The stop_reason Signal
The Message.stop_reason field is a string literal indicating why the model stopped generating:
| Value | Meaning |
|---|---|
| `"end_turn"` | Model finished its response naturally (no tool call) |
| `"tool_use"` | Model wants to invoke one or more tools |
| `"max_tokens"` | Response was truncated due to token limit |
| `"stop_sequence"` | A custom stop sequence was matched |
| `"pause_turn"` | Model paused mid-turn (streaming scenarios) |
| `"refusal"` | Model refused the request |
The type is defined as:
```python
StopReason: TypeAlias = Literal[
    "end_turn", "max_tokens", "stop_sequence", "tool_use", "pause_turn", "refusal"
]
```
Checking message.stop_reason == "tool_use" is the fastest way to determine if tool processing is needed. However, note that stop_reason can be None in streaming contexts before the response is complete.
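The fast-path check can be sketched as follows. `FakeMessage` is a hypothetical stand-in for the SDK's `Message` object, used here only so the snippet runs without an API call:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakeMessage:
    # Stand-in for anthropic.types.Message; only the field we inspect here.
    stop_reason: Optional[str]

def needs_tool_processing(message: FakeMessage) -> bool:
    # stop_reason can be None mid-stream, so compare explicitly
    # rather than treating any truthy value as a tool request.
    return message.stop_reason == "tool_use"

print(needs_tool_processing(FakeMessage(stop_reason="tool_use")))  # True
print(needs_tool_processing(FakeMessage(stop_reason=None)))        # False
```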
Discriminated Union Pattern: Content Block Filtering
The message.content array contains blocks that are discriminated by their type field. Each block is a Pydantic model with a type literal:
- `TextBlock` -- `type: Literal["text"]`
- `ToolUseBlock` -- `type: Literal["tool_use"]`
- `ThinkingBlock` -- `type: Literal["thinking"]`
The standard Python pattern for extracting tool calls is:
```python
tool_use_blocks = [block for block in message.content if block.type == "tool_use"]
```
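A runnable sketch of this filter, using minimal dataclass stand-ins for the blocks (in the real SDK, `TextBlock` and `ToolUseBlock` are Pydantic models in `anthropic.types`):

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TextBlock:
    text: str
    type: str = "text"

@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: Dict[str, object]
    type: str = "tool_use"

# A mixed content array, as Claude might return it.
content: List[object] = [
    TextBlock(text="I'll check the weather for you."),
    ToolUseBlock(id="toolu_01", name="get_weather", input={"city": "Paris"}),
]

# Discriminate on the `type` field to keep only tool requests.
tool_use_blocks = [block for block in content if block.type == "tool_use"]
print([b.name for b in tool_use_blocks])  # ['get_weather']
```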
Each ToolUseBlock carries three essential fields:
| Field | Type | Purpose |
|---|---|---|
| `id` | `str` | Unique identifier for this tool call; used to correlate results |
| `name` | `str` | Name of the tool the model wants to invoke |
| `input` | `Dict[str, object]` | The arguments the model generated, conforming to the tool's input schema |
Detection Flow
The recommended detection pattern follows this sequence:
- Quick check: If `message.stop_reason != "tool_use"`, the model did not request any tools -- process the text response directly
- Extract blocks: Filter `message.content` for `type == "tool_use"`
- Iterate and dispatch: For each `ToolUseBlock`, look up the tool by `name` and pass `input` to the execution function
- Collect results: Build `ToolResultBlockParam` dicts with the matching `tool_use_id`
```python
if message.stop_reason == "tool_use":
    for block in message.content:
        if block.type == "tool_use":
            # block.id    -> correlation ID
            # block.name  -> which tool to call
            # block.input -> arguments dict
            result = dispatch_tool(block.name, block.input)
```
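The full flow, including the result-collection step, can be sketched end to end. The `Message` and `ToolUseBlock` classes below are simplified stand-ins for the SDK types, and `dispatch_tool` with its `TOOLS` registry is a hypothetical lookup, not part of the library; only the shape of the result dicts mirrors `ToolResultBlockParam`:

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: Dict[str, object]
    type: str = "tool_use"

@dataclass
class Message:  # minimal stand-in for anthropic.types.Message
    stop_reason: Optional[str]
    content: List[object]

# Hypothetical registry mapping tool names to handlers.
TOOLS = {"get_weather": lambda args: f"Sunny in {args['city']}"}

def dispatch_tool(name: str, args: Dict[str, object]) -> str:
    return TOOLS[name](args)

def collect_tool_results(message: Message) -> List[Dict[str, object]]:
    results: List[Dict[str, object]] = []
    if message.stop_reason == "tool_use":          # fast-path signal
        for block in message.content:              # precise extraction
            if getattr(block, "type", None) == "tool_use":
                results.append({
                    # ToolResultBlockParam-shaped dict: tool_use_id must
                    # match the id of the originating tool_use block.
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": dispatch_tool(block.name, block.input),
                })
    return results

msg = Message(
    stop_reason="tool_use",
    content=[ToolUseBlock(id="toolu_01", name="get_weather",
                          input={"city": "Paris"})],
)
print(collect_tool_results(msg))
```

The result dicts would then be sent back as the `content` of a `user` message in the follow-up request.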
Mixed Content Responses
A single response can contain both text and tool_use blocks. The model may explain its reasoning in text before requesting a tool call. Applications should handle this by:
- Displaying or logging any `TextBlock` content for transparency
- Processing all `ToolUseBlock` entries for execution
- Preserving the entire `content` array when appending the assistant message to conversation history
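These three responsibilities can be sketched in one handler. The block classes and `handle_mixed_response` are illustrative stand-ins, not SDK API; the key point is that the whole `content` array, text blocks included, goes into history:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class TextBlock:
    text: str
    type: str = "text"

@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: Dict[str, object]
    type: str = "tool_use"

def handle_mixed_response(content: List[object],
                          history: List[Dict[str, object]]) -> List[object]:
    # 1. Surface any text the model emitted before its tool request.
    for block in content:
        if block.type == "text":
            print(f"[assistant] {block.text}")
    # 2. Preserve the ENTIRE content array, not just the text, so the
    #    follow-up request can pair tool_result blocks with their calls.
    history.append({"role": "assistant", "content": content})
    # 3. Return every tool_use block for execution.
    return [b for b in content if b.type == "tool_use"]

history: List[Dict[str, object]] = []
content = [TextBlock(text="Let me look that up."),
           ToolUseBlock(id="toolu_02", name="search", input={"q": "weather"})]
pending = handle_mixed_response(content, history)
```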
Edge Cases
- Multiple tool calls: The model can emit several `tool_use` blocks in one response (parallel tool use). Each must be executed and each result must reference its specific `tool_use_id`.
- Streaming: In streaming mode, tool_use blocks arrive incrementally. The `stop_reason` is only available after the stream completes.
- max_tokens truncation: If `stop_reason == "max_tokens"`, the response may contain a partial or no tool_use block. The application should handle this gracefully.
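One way to make the truncation case explicit is to return a flag alongside the extracted blocks; this is a sketch of one possible guard, not a pattern prescribed by the SDK:

```python
from typing import List, Optional, Tuple

def extract_tool_calls(stop_reason: Optional[str],
                       content: List[object]) -> Tuple[List[object], bool]:
    """Return (tool_use blocks, truncated flag)."""
    blocks = [b for b in content if getattr(b, "type", None) == "tool_use"]
    # max_tokens means generation was cut off: expected tool_use blocks
    # may be missing entirely, so callers should not assume the list
    # is complete when this flag is set.
    truncated = stop_reason == "max_tokens"
    return blocks, truncated

blocks, truncated = extract_tool_calls("max_tokens", [])
if truncated and not blocks:
    print("Truncated before any tool call; consider retrying with a higher max_tokens.")
```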