# Principle: Anthropic Python SDK Automated Tool Loop
| Knowledge Sources | |
|---|---|
| Domains | Tool_Use, LLM, Function_Calling |
| Last Updated | 2026-02-15 00:00 GMT |
## Overview
The Automated Tool Loop is an agent-style pattern that orchestrates the entire tool use cycle -- request, detect, execute, submit -- in an automated loop. Instead of manually writing the iteration logic for each tool call round-trip, the SDK provides a runner that handles the complete lifecycle, iterating until the model produces a final response without requesting further tool calls. This pattern is essential for building agents that perform multi-step tasks requiring several tool invocations.
## Theory: The Agent Loop Pattern
Many real-world tasks require multiple sequential tool calls. For example, a research assistant might need to:
- Search for information (tool call 1)
- Read a specific document (tool call 2)
- Verify a fact (tool call 3)
- Compose a final answer (text response)
The agent loop automates this by repeatedly:
1. Send request: Call the Messages API with the current conversation history and tool definitions
2. Check response: If the model responds with `stop_reason != "tool_use"`, the loop terminates
3. Execute tools: For each `tool_use` block, invoke the corresponding function
4. Append results: Add the assistant message and tool results to the conversation history
5. Repeat: Go back to step 1 with the updated history
```
            +-------------------+
            | Send API Request  |
            +--------+----------+
                     |
            +--------v----------+
            |  Model Response   |
            +--------+----------+
                     |
          +----------+----------+
          |                     |
   stop_reason ==         stop_reason !=
    "tool_use"              "tool_use"
          |                     |
 +--------v--------+     +------v------+
 |  Execute Tools  |     |   Return    |
 +--------+--------+     |  Final Msg  |
          |              +-------------+
 +--------v--------+
 | Append Results  |
 +--------+--------+
          |
          +---> (back to Send API Request)
```
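The cycle above can be sketched as a plain Python function. Everything here is illustrative: `run_tool_loop`, `ToolUse`, `Response`, and the `send` callable are hypothetical stand-ins for the SDK's request machinery, with `tools` mapping tool names to local callables.

```python
from dataclasses import dataclass


@dataclass
class ToolUse:
    # Hypothetical stand-in for a tool_use content block.
    id: str
    name: str
    input: dict


@dataclass
class Response:
    # Hypothetical stand-in for an API response.
    stop_reason: str
    content: list


def run_tool_loop(send, tools, messages):
    """Repeat request -> detect -> execute -> submit until no tools are requested.

    `send` stands in for the Messages API call; `tools` maps name -> callable.
    """
    while True:
        response = send(messages)  # 1. send request with current history
        messages.append({"role": "assistant", "content": response.content})
        calls = [b for b in response.content if isinstance(b, ToolUse)]
        if response.stop_reason != "tool_use" or not calls:
            return response  # 2. final response: no further tool calls
        results = []
        for call in calls:  # 3. execute each requested tool
            try:
                out = tools[call.name](**call.input)
                results.append({"type": "tool_result",
                                "tool_use_id": call.id, "content": str(out)})
            except Exception as exc:
                results.append({"type": "tool_result", "tool_use_id": call.id,
                                "content": str(exc), "is_error": True})
        messages.append({"role": "user", "content": results})  # 4. append results
        # 5. loop back to the top with the updated history
```

A stubbed `send` that requests one tool call and then finishes exercises the full cycle without touching the network.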
## Iterative Request-Execute-Respond Cycle
Each iteration of the loop is a complete round-trip:
- API call: The runner calls `client.beta.messages.parse()` (or `.stream()`) with the accumulated message history
- Response parsing: The response is parsed into a `ParsedBetaMessage`
- Tool dispatch: The runner looks up each requested tool by name in its registry, calls it with the provided input, and collects results
- Error handling: If a tool is not found or raises an exception, an error result is generated with `is_error: True`
- History update: The assistant response and tool results are appended to the messages array
- Loop control: The runner checks `_should_stop()` (max iterations) and whether tool calls were requested
The runner yields each `ParsedBetaMessage` as it is produced, making it an iterator -- callers can observe intermediate results or simply call `.until_done()` to drain the loop and get the final message.
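This observe-or-drain interface can be mimicked with a minimal iterator. `MiniRunner` and its `step` callable are hypothetical; only the iterate/`until_done` shape mirrors the runner described above.

```python
class MiniRunner:
    """Toy iterator mimicking the runner's observe-or-drain interface.

    `step` is a hypothetical stand-in for one API round-trip; it returns
    a (message, done) pair.
    """

    def __init__(self, step):
        self._step = step
        self._final = None

    def __iter__(self):
        while self._final is None:
            message, done = self._step()
            if done:
                self._final = message
            yield message  # callers can observe each intermediate message

    def until_done(self):
        for _ in self:  # drain any remaining iterations
            pass
        return self._final  # the final message once the loop terminates
```

Iterating yields every intermediate message; calling `until_done()` simply exhausts the same iterator and hands back the last one.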
## Iteration Limits
To prevent runaway loops, the runner accepts a `max_iterations` parameter. When set, the loop terminates after that many API round-trips regardless of whether the model is still requesting tools. This is a safety mechanism for production systems where unbounded execution could consume excessive resources.
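A minimal sketch of such a cap, assuming a `step` callable that performs one round-trip and returns `True` while the model is still requesting tools (all names here are hypothetical):

```python
def run_with_limit(step, max_iterations=None):
    """Run `step` until the model stops requesting tools or the cap is hit.

    Returns the number of round-trips executed after the first request.
    """
    iterations = 0
    while step():  # True means the model requested more tool calls
        iterations += 1
        if max_iterations is not None and iterations >= max_iterations:
            break  # safety stop: terminate even though tools were requested
    return iterations
```

With no cap the loop runs until `step` reports completion; with a cap it halts unconditionally after `max_iterations` round-trips.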
## Context Compaction
Long-running tool loops accumulate large conversation histories that can exceed context window limits. The SDK addresses this through compaction control:
- Threshold monitoring: After each API call, the runner checks whether total token usage (input + output) exceeds a configurable threshold (default: 100,000 tokens)
- Automatic summarization: When the threshold is exceeded, the runner makes an additional API call with a summary prompt, asking the model to compress the conversation history into a continuation summary
- History replacement: The original message history is replaced with the compact summary, allowing the loop to continue within context limits
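The threshold-check-then-replace flow might look like the following sketch; `maybe_compact` and `summarize` are hypothetical stand-ins for the runner's internal logic and the extra summarization call.

```python
def maybe_compact(messages, total_tokens, summarize, threshold=100_000):
    """Replace the history with a continuation summary once token usage
    crosses the threshold.

    `summarize` stands in for the additional API call that asks the model
    to compress the conversation.
    """
    if total_tokens <= threshold:
        return messages  # under the limit: keep the full history
    summary = summarize(messages)  # compress history into a continuation summary
    return [{"role": "user", "content": summary}]  # replaced, compact history
```

Below the threshold the history passes through untouched; above it, the loop continues from a single summary message.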
The compaction control is configured via a `CompactionControl` `TypedDict`:
| Field | Type | Default | Description |
|---|---|---|---|
| `enabled` | `bool` (required) | -- | Whether compaction is active |
| `context_token_threshold` | `int` | 100,000 | Token count threshold that triggers compaction |
| `model` | `str` | Same as runner model | Model to use for generating the summary |
| `summary_prompt` | `str` | Built-in prompt | Instructions for the summarization |
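Assuming the field names in the table, the TypedDict might be declared roughly as follows; this is a sketch, and the SDK's actual definition (e.g. its use of `Required`/`NotRequired` annotations) may differ.

```python
from typing import TypedDict


class CompactionControl(TypedDict, total=False):
    """Sketch of the compaction configuration from the table above."""

    enabled: bool                 # required in practice
    context_token_threshold: int  # defaults to 100,000
    model: str                    # defaults to the runner's model
    summary_prompt: str           # defaults to a built-in prompt


# Only `enabled` is needed; other fields fall back to their defaults.
control: CompactionControl = {"enabled": True, "context_token_threshold": 50_000}
```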
## Sync and Async Variants
The runner comes in four variants to support different execution models:
| Class | Execution | Communication | Use Case |
|---|---|---|---|
| `BetaToolRunner` | Sync | Non-streaming | Simple batch processing |
| `BetaStreamingToolRunner` | Sync | Streaming | Real-time UI updates |
| `BetaAsyncToolRunner` | Async | Non-streaming | Async applications |
| `BetaAsyncStreamingToolRunner` | Async | Streaming | Async with real-time updates |
All variants share the same core loop logic, differing only in their I/O patterns (sync vs async, parsed vs streaming).
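The shared-loop claim can be illustrated by mirroring the earlier control flow in async form; all names here are hypothetical, with only the I/O calls changing to `await`.

```python
import asyncio


async def run_tool_loop_async(send, handle_tools):
    """Async mirror of the sync loop: identical control flow, awaited I/O.

    `send` and `handle_tools` are hypothetical stand-ins for the API call
    and for tool execution plus result appending.
    """
    while True:
        stop_reason, content = await send()
        if stop_reason != "tool_use":
            return content  # final message: no further tool calls requested
        await handle_tools(content)  # execute tools, append results, loop


async def _demo():
    replies = iter([("tool_use", "call"), ("end_turn", "done")])

    async def send():
        return next(replies)

    executed = []

    async def handle_tools(content):
        executed.append(content)

    final = await run_tool_loop_async(send, handle_tools)
    return final, executed


final, executed = asyncio.run(_demo())
```

The loop body is line-for-line the same as the sync version; only the transport differs, which is what lets the four variants share one core implementation.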
## Benefits Over Manual Loops
- Reduced boilerplate: No need to write the while loop, tool dispatch, error handling, and message appending for every application
- Consistent error handling: Tool execution errors are caught, logged, and reported to the model uniformly
- Context management: Automatic compaction prevents context overflow in long conversations
- Iteration safety: `max_iterations` prevents runaway loops
- Observable: The iterator interface allows monitoring intermediate steps