Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Heuristic:Infiniflow Ragflow Agent Max Rounds Strategy

From Leeroopedia
Knowledge Sources
Domains Agents, Optimization
Last Updated 2026-02-12 06:00 GMT

Overview

Agent tool-calling loops are capped at 5 rounds (configurable via max_rounds), after which the agent is force-prompted to synthesize a final answer from all gathered information.

Description

RAGFlow's agent system uses a tool-calling loop where the LLM can invoke external tools (search, SQL, API calls, etc.) iteratively. To prevent infinite loops and excessive API costs, the loop is capped at `max_rounds` (default 5). When the limit is reached, the agent receives a forced instruction to "provide a DIRECT and COMPREHENSIVE final answer" using all information gathered during the tool-calling rounds. Additionally, conversation history is compressed when it exceeds 12 messages by keeping only the first 2 messages and last 10.

Usage

This heuristic applies to all agent workflows that use tool-calling. Configure `max_rounds` in the agent component settings. Increase it for complex multi-step research tasks; decrease it for cost-sensitive applications. The history compression activates automatically for long conversations.

The Insight (Rule of Thumb)

  • Action: Set `max_rounds=5` (default). When exceeded, force a final answer prompt. Compress history at >12 messages.
  • Value: max_rounds=5, history compression threshold=12 messages (keep first 2 + last 10).
  • Trade-off: Higher max_rounds allows more thorough research but increases latency and API costs. Lower values may produce incomplete answers for complex queries.

Reasoning

Five rounds is sufficient for most tool-calling scenarios (e.g., search → refine → verify → answer). The forced final answer ensures the user always gets a response rather than an error. History compression prevents context window overflow while preserving the system prompt (message 0), initial user query (message 1), and the most recent conversation turns (last 10). Citations are only inserted when history length is less than 7 to avoid cluttering multi-turn conversations.

Code Evidence from `agent/component/agent_with_tools.py:77,417-429`:

for _ in range(self._param.max_rounds + 1):
    # ... tool calling loop ...

logging.warning(f"Exceed max rounds: {self._param.max_rounds}")
final_instruction = ("Based on ALL the information you have gathered from your "
                     "research and tool calls above, please provide a DIRECT and "
                     "COMPREHENSIVE final answer...")

History compression from `agent/component/agent_with_tools.py:479-480`:

# If history > 12 messages, compress to keep context window manageable
_hist = [hist[0], hist[1], *hist[-10:]]

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment