
Principle:Anthropic SDK Python Automated Tool Loop

From Leeroopedia
Knowledge Sources
Domains Tool_Use, LLM, Function_Calling
Last Updated 2026-02-15 00:00 GMT

Overview

The Automated Tool Loop is an agent-style pattern that orchestrates the entire tool use cycle -- request, detect, execute, submit -- in an automated loop. Instead of manually writing the iteration logic for each tool call round-trip, the SDK provides a runner that handles the complete lifecycle, iterating until the model produces a final response without requesting further tool calls. This pattern is essential for building agents that perform multi-step tasks requiring several tool invocations.

Theory: The Agent Loop Pattern

Many real-world tasks require multiple sequential tool calls. For example, a research assistant might need to:

  1. Search for information (tool call 1)
  2. Read a specific document (tool call 2)
  3. Verify a fact (tool call 3)
  4. Compose a final answer (text response)

The agent loop automates this by repeating the following steps:

  1. Send request: Call the messages API with the current conversation history and tool definitions
  2. Check response: If the model responds with stop_reason != "tool_use", the loop terminates
  3. Execute tools: For each tool_use block, invoke the corresponding function
  4. Append results: Add the assistant message and tool results to conversation history
  5. Repeat: Go back to step 1 with the updated history
                    +-------------------+
                    | Send API Request  |
                    +--------+----------+
                             |
                    +--------v----------+
                    | Model Response    |
                    +--------+----------+
                             |
                  +----------+----------+
                  |                     |
           stop_reason ==         stop_reason !=
            "tool_use"              "tool_use"
                  |                     |
         +--------v--------+    +------v------+
         | Execute Tools   |    | Return      |
         +--------+--------+    | Final Msg   |
                  |             +-------------+
         +--------v--------+
         | Append Results  |
         +--------+--------+
                  |
                  +---> (back to Send API Request)
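The five steps and the flow chart above can be sketched as a plain loop. This is a minimal illustration using a stub client so the control flow runs without network access; the stub, the tool registry, and the dict message shapes are assumptions for demonstration, not the real SDK objects.

```python
# Minimal sketch of the agent loop: request -> check -> execute -> append -> repeat.
# The stub client, tool names, and message dicts are illustrative assumptions.

def run_agent_loop(client, tools, messages):
    """Repeat the cycle until the model stops requesting tools."""
    while True:
        # Step 1: send the current history and tool definitions
        response = client.create(messages=messages, tools=list(tools))
        # Step 2: terminate when stop_reason != "tool_use"
        if response["stop_reason"] != "tool_use":
            return response
        # Step 3: execute each requested tool
        results = []
        for block in response["content"]:
            if block["type"] == "tool_use":
                output = tools[block["name"]](**block["input"])
                results.append({"type": "tool_result",
                                "tool_use_id": block["id"],
                                "content": output})
        # Step 4: append the assistant turn and the tool results
        messages.append({"role": "assistant", "content": response["content"]})
        messages.append({"role": "user", "content": results})
        # Step 5: loop back to step 1 with the updated history


class StubClient:
    """Requests a weather lookup on the first call, then answers with text."""
    def __init__(self):
        self.calls = 0

    def create(self, messages, tools):
        self.calls += 1
        if self.calls == 1:
            return {"stop_reason": "tool_use",
                    "content": [{"type": "tool_use", "id": "t1",
                                 "name": "get_weather",
                                 "input": {"city": "Paris"}}]}
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "It is sunny in Paris."}]}


final = run_agent_loop(StubClient(),
                       {"get_weather": lambda city: f"sunny in {city}"},
                       [{"role": "user", "content": "Weather in Paris?"}])
print(final["stop_reason"])  # end_turn, after two round-trips
```

The same shape underlies the SDK runner; the runner simply packages this loop so applications do not rewrite it.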

Iterative Request-Execute-Respond Cycle

Each iteration of the loop is a complete round-trip:

  1. API call: The runner calls client.beta.messages.parse() (or .stream()) with the accumulated message history
  2. Response parsing: The response is parsed into a ParsedBetaMessage
  3. Tool dispatch: The runner looks up each requested tool by name in its registry, calls it with the provided input, and collects results
  4. Error handling: If a tool is not found or raises an exception, an error result is generated with is_error: True
  5. History update: The assistant response and tool results are appended to the messages array
  6. Loop control: The runner checks _should_stop() (max iterations) and whether tool calls were requested

The runner yields each ParsedBetaMessage as it is produced, making it an iterator -- callers can observe intermediate results or simply call .until_done() to drain the loop and get the final message.
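The tool-dispatch and error-handling steps above (3 and 4 in the list) can be sketched as a standalone function. The registry and result-dict shape here are illustrative assumptions; the point is that an unknown tool or a raised exception becomes an `is_error` result rather than crashing the loop.

```python
# Sketch of per-iteration tool dispatch with uniform error handling.
# Result-dict fields are illustrative assumptions, not the SDK's internals.

def dispatch(registry, tool_use_blocks):
    results = []
    for block in tool_use_blocks:
        fn = registry.get(block["name"])
        if fn is None:
            # Tool not found in the registry: report as an error result
            results.append({"type": "tool_result", "tool_use_id": block["id"],
                            "content": f"unknown tool: {block['name']}",
                            "is_error": True})
            continue
        try:
            results.append({"type": "tool_result", "tool_use_id": block["id"],
                            "content": fn(**block["input"])})
        except Exception as exc:
            # Tool raised: report the error to the model instead of crashing
            results.append({"type": "tool_result", "tool_use_id": block["id"],
                            "content": str(exc), "is_error": True})
    return results


registry = {"divide": lambda a, b: str(a / b)}
out = dispatch(registry, [
    {"id": "1", "name": "divide", "input": {"a": 6, "b": 3}},
    {"id": "2", "name": "divide", "input": {"a": 1, "b": 0}},  # raises ZeroDivisionError
    {"id": "3", "name": "missing", "input": {}},               # not in registry
])
print([r.get("is_error", False) for r in out])  # [False, True, True]
```

Reporting errors back to the model (rather than aborting) lets it retry with corrected input or choose a different tool.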

Iteration Limits

To prevent runaway loops, the runner accepts a max_iterations parameter. When set, the loop terminates after that many API round-trips regardless of whether the model is still requesting tools. This is a safety mechanism for production systems where unbounded execution could consume excessive resources.
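The iteration cap can be sketched as a guard at the top of the loop. The `run_bounded` helper and `step` callable below are hypothetical stand-ins for the runner's internal check, included only to show the termination behavior.

```python
# Sketch of the max_iterations safety check: a loop that would otherwise run
# forever (the stub "model" keeps requesting tools) is cut off at the cap.
# Names here are illustrative assumptions, not the SDK's internals.

def run_bounded(step, max_iterations=None):
    """Call `step` until it returns None (final answer) or the cap is hit."""
    count = 0
    while True:
        if max_iterations is not None and count >= max_iterations:
            break  # safety stop: the model is still requesting tools
        result = step()
        count += 1
        if result is None:  # model produced a final response
            break
    return count


# A "model" that never finishes on its own is stopped after 5 round-trips:
print(run_bounded(lambda: "tool_use", max_iterations=5))  # 5

# A "model" that finishes on its second turn stops early:
steps = iter(["tool_use", None])
print(run_bounded(lambda: next(steps), max_iterations=10))  # 2
```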

Context Compaction

Long-running tool loops accumulate large conversation histories that can exceed context window limits. The SDK addresses this through compaction control:

  • Threshold monitoring: After each API call, the runner checks whether total token usage (input + output) exceeds a configurable threshold (default: 100,000 tokens)
  • Automatic summarization: When the threshold is exceeded, the runner makes an additional API call with a summary prompt, asking the model to compress the conversation history into a continuation summary
  • History replacement: The original message history is replaced with the compact summary, allowing the loop to continue within context limits


The compaction control is configured via a CompactionControl TypedDict:

Field                    Type  Default               Description
enabled                  bool  (required)            Whether compaction is active
context_token_threshold  int   100,000               Token count threshold that triggers compaction
model                    str   Same as runner model  Model to use for generating the summary
summary_prompt           str   Built-in prompt       Instructions for the summarization
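The table above maps directly onto a TypedDict plus a threshold check. The field names mirror the table; the `should_compact` helper is an illustrative assumption sketching the monitoring step, not the SDK's internal implementation.

```python
# Sketch of the CompactionControl shape and the threshold check described
# above. The helper function is an illustrative assumption.
from typing import TypedDict


class CompactionControl(TypedDict, total=False):
    enabled: bool                  # required in the real control
    context_token_threshold: int   # default: 100,000
    model: str                     # defaults to the runner's model
    summary_prompt: str            # defaults to a built-in prompt


def should_compact(control: CompactionControl,
                   input_tokens: int, output_tokens: int) -> bool:
    """After each API call: compact when total usage exceeds the threshold."""
    if not control.get("enabled", False):
        return False
    threshold = control.get("context_token_threshold", 100_000)
    return input_tokens + output_tokens > threshold


ctl: CompactionControl = {"enabled": True}
print(should_compact(ctl, 90_000, 5_000))   # False: 95,000 <= 100,000
print(should_compact(ctl, 90_000, 15_000))  # True: 105,000 > 100,000
```

When the check fires, the runner issues the extra summarization call and swaps the compact summary in for the full history.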

Sync and Async Variants

The runner comes in four variants to support different execution models:

Class                         Execution  Communication  Use Case
BetaToolRunner                Sync       Non-streaming  Simple batch processing
BetaStreamingToolRunner       Sync       Streaming      Real-time UI updates
BetaAsyncToolRunner           Async      Non-streaming  Async applications
BetaAsyncStreamingToolRunner  Async      Streaming      Async with real-time updates

All variants share the same core loop logic, differing only in their I/O patterns (sync vs async, parsed vs streaming).
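The "shared core, different I/O" split can be illustrated with a toy pair of runners: the loop-control decision is one shared function, and only the sync/async wrapper around it differs. Class and method names below are illustrative stand-ins, not the SDK's classes.

```python
# Toy illustration of sync and async variants sharing one loop-control core.
# Classes and the canned response list are illustrative assumptions.
import asyncio


def next_action(response):
    """Shared loop-control logic used by every variant."""
    return "execute_tools" if response["stop_reason"] == "tool_use" else "done"


class SyncRunner:
    def __init__(self, responses):
        self.responses = responses

    def until_done(self):
        for response in self.responses:
            if next_action(response) == "done":
                return response


class AsyncRunner:
    def __init__(self, responses):
        self.responses = responses

    async def until_done(self):
        for response in self.responses:
            await asyncio.sleep(0)  # stand-in for an awaited API call
            if next_action(response) == "done":
                return response


turns = [{"stop_reason": "tool_use"}, {"stop_reason": "end_turn"}]
print(SyncRunner(turns).until_done()["stop_reason"])                # end_turn
print(asyncio.run(AsyncRunner(turns).until_done())["stop_reason"])  # end_turn
```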

Benefits Over Manual Loops

  • Reduced boilerplate: No need to write the while loop, tool dispatch, error handling, and message appending for every application
  • Consistent error handling: Tool execution errors are caught, logged, and reported to the model uniformly
  • Context management: Automatic compaction prevents context overflow in long conversations
  • Iteration safety: max_iterations prevents runaway loops
  • Observable: The iterator interface allows monitoring intermediate steps

Related Pages

Implemented By
