Principle:Run llama Llama index Agent Configuration
Overview
Agent Configuration covers the ReAct (Reasoning + Acting) agent paradigm and how to configure a ReAct agent in LlamaIndex. The ReAct framework, introduced by Yao et al. (2022), enables LLMs to interleave reasoning traces (Thought) with concrete actions (Action), observe the results (Observation), and then reason again. This creates a synergistic loop where reasoning helps the model decide what to do, and actions ground the model's reasoning in real-world information.
AI Agents, ReAct, Chain-of-Thought, LlamaIndex
The ReAct Paradigm
Traditional LLM usage follows a single-turn pattern: the user provides input, the model generates output. The ReAct paradigm transforms this into a multi-turn reasoning loop:
- Thought -- The LLM reasons about the current state, what information it needs, and what action to take next
- Action -- The LLM selects a tool and provides the input arguments
- Observation -- The tool executes and returns its output
- Repeat -- The LLM observes the result and decides whether to take another action or produce a final answer
This loop continues until the LLM determines it has enough information to answer the user's question, at which point it produces:
- Thought -- Final reasoning about the gathered information
- Answer -- The synthesized response to the user
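The loop above can be sketched in plain Python. This is only an illustration of the Thought/Action/Observation cycle, not LlamaIndex code: the `scripted_llm` function stands in for a real model, and `TOOLS`, `ACTION_RE`, and `react_loop` are hypothetical names introduced for this example.

```python
import json
import re
from typing import Callable, Dict


def multiply(a: float, b: float) -> float:
    """Multiply two numbers."""
    return a * b


# Hypothetical tool registry: plain functions keyed by tool name.
TOOLS: Dict[str, Callable[..., float]] = {"multiply": multiply}

# Matches the text-based tool call: a tool name followed by JSON arguments.
ACTION_RE = re.compile(r"Action:\s*(\w+)\s*\nAction Input:\s*(\{.*\})", re.DOTALL)


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real LLM: acts once, then answers from the observation."""
    if "Observation:" not in transcript:
        return (
            "Thought: I need to multiply the two numbers.\n"
            "Action: multiply\n"
            'Action Input: {"a": 6, "b": 7}'
        )
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I now have the product.\nAnswer: 6 times 7 is {observation}."


def react_loop(question: str, llm: Callable[[str], str], max_iterations: int = 5) -> str:
    """Run Thought -> Action -> Observation cycles until the LLM emits an Answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_iterations):
        output = llm(transcript)
        transcript += output + "\n"
        if "Answer:" in output:
            # Final step: the LLM decided it has enough information.
            return output.split("Answer:", 1)[1].strip()
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"Could not parse an action from: {output!r}")
        tool_name, raw_args = match.group(1), match.group(2)
        # Execute the selected tool and feed its result back as an observation.
        observation = TOOLS[tool_name](**json.loads(raw_args))
        transcript += f"Observation: {observation}\n"
    raise RuntimeError("Reached max_iterations without a final answer")


final_answer = react_loop("What is 6 times 7?", scripted_llm)
print(final_answer)
```

The transcript grows with each cycle, so every new Thought is conditioned on all earlier actions and observations.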
Comparison with Other Paradigms
| Paradigm | Reasoning | Action | Grounding |
|---|---|---|---|
| Standard Prompting | None | None | None -- relies entirely on parametric knowledge |
| Chain-of-Thought (CoT) | Yes | None | None -- reasons but cannot access external information |
| Act-Only (e.g., Toolformer) | None | Yes | Yes -- takes actions but without explicit reasoning |
| ReAct | Yes | Yes | Yes -- reasons about which actions to take, observes results, reasons again |
The key insight is that reasoning helps the model plan and interpret, while actions help the model gather information and verify its reasoning. Neither alone is sufficient for complex, multi-step tasks.
Configuring a ReAct Agent
A ReAct agent in LlamaIndex is configured through the ReActAgent class (which extends BaseWorkflowAgent). Configuration involves the components described below.
Core Components
- LLM -- The language model that performs reasoning and generates tool calls. Any LLM that supports text generation works with ReAct (it does not require native function calling support, since ReAct uses text-based tool invocation).
- Tools -- The set of tools available to the agent. These define what actions the agent can take. Tools can be provided as BaseTool instances or plain Python callables (which are auto-wrapped as FunctionTool).
- System Prompt -- Optional instructions that guide the agent's behavior, persona, or constraints. This is prepended to the ReAct prompt template.
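To illustrate what auto-wrapping a plain callable can involve, the sketch below derives a tool name and description from a function's signature and docstring. `SimpleTool` and `wrap_callable` are hypothetical names for this example and do not reproduce the actual FunctionTool implementation.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool wrapper; not a LlamaIndex class."""

    name: str
    description: str
    fn: Callable[..., Any]

    def __call__(self, **kwargs: Any) -> Any:
        return self.fn(**kwargs)


def wrap_callable(fn: Callable[..., Any]) -> SimpleTool:
    """Build tool metadata from the function name, signature, and docstring."""
    signature = inspect.signature(fn)
    doc = inspect.getdoc(fn) or ""
    description = f"{fn.__name__}{signature}\n{doc}"
    return SimpleTool(name=fn.__name__, description=description, fn=fn)


def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b


tool = wrap_callable(add)
print(tool.name)
print(tool.description)
```

Because the description is what the LLM reads when choosing an action, a clear docstring and typed parameters tend to matter as much as the function body.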
ReAct-Specific Configuration
- ReActChatFormatter -- Controls how the tool descriptions, chat history, and current reasoning steps are formatted into the LLM prompt. Includes the system header that instructs the LLM on the Thought/Action/Observation format.
- ReActOutputParser -- Parses the LLM's text output to extract structured reasoning steps (thoughts, actions, action inputs, and answers).
- Reasoning Key -- The key used to store the current chain of reasoning steps in the workflow context store (defaults to "current_reasoning").
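The parsing step can be pictured with a small regex-based sketch. It is not the ReActOutputParser implementation: `ParsedStep` and `parse_react_output` are hypothetical, the exact text format is assumed, and a plain dictionary stands in for the workflow context store.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ParsedStep:
    """Hypothetical container for one parsed reasoning step."""

    thought: str
    action: Optional[str] = None
    action_input: Dict[str, Any] = field(default_factory=dict)
    answer: Optional[str] = None


def parse_react_output(text: str) -> ParsedStep:
    """Extract the thought plus either a tool call or a final answer."""
    thought_match = re.search(
        r"Thought:\s*(.*?)(?=\n(?:Action|Answer):|\Z)", text, re.DOTALL
    )
    thought = thought_match.group(1).strip() if thought_match else ""

    answer_match = re.search(r"Answer:\s*(.*)", text, re.DOTALL)
    if answer_match:
        return ParsedStep(thought=thought, answer=answer_match.group(1).strip())

    action_match = re.search(
        r"Action:\s*(\S+)\s*\nAction Input:\s*(\{.*\})", text, re.DOTALL
    )
    if action_match is None:
        raise ValueError("Output contains neither an action nor an answer")
    return ParsedStep(
        thought=thought,
        action=action_match.group(1),
        action_input=json.loads(action_match.group(2)),
    )


# A plain dict stands in for the context store; the key mirrors the default above.
REASONING_KEY = "current_reasoning"
state: Dict[str, List[ParsedStep]] = {REASONING_KEY: []}

step = parse_react_output(
    'Thought: I should search.\nAction: search\nAction Input: {"query": "ReAct"}'
)
state[REASONING_KEY].append(step)

final = parse_react_output(
    "Thought: I know enough.\nAnswer: ReAct interleaves reasoning and acting."
)
state[REASONING_KEY].append(final)
```

Output that matches neither pattern raises an error here, which is the kind of parsing failure a text-based format has to tolerate.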
Execution Control
- Streaming -- Whether to stream the LLM's token-by-token output during generation (default: True).
- Early Stopping Method -- What happens when the maximum iteration count is reached:
  - "force" (default) -- Raises a WorkflowRuntimeError
  - "generate" -- Makes one final LLM call to synthesize a response from the information gathered so far
- Timeout -- Optional float specifying the maximum execution time in seconds for the entire workflow.
- Max Iterations -- The maximum number of Thought-Action-Observation cycles before early stopping (default: 20, set at runtime via run()).
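The interaction between the iteration limit and the two early stopping methods can be sketched as follows. This is a simplified model of the behavior described above, not the library's control flow: `run_with_limit`, `MaxIterationsReached`, and the `step`/`summarize` callables are hypothetical.

```python
from typing import Callable, List, Tuple


class MaxIterationsReached(RuntimeError):
    """Hypothetical error standing in for the workflow runtime error."""


def run_with_limit(
    step: Callable[[int], Tuple[bool, str]],
    summarize: Callable[[List[str]], str],
    max_iterations: int = 20,
    early_stopping_method: str = "force",
) -> str:
    """Run reasoning cycles until done or until max_iterations is exhausted."""
    if early_stopping_method not in ("force", "generate"):
        raise ValueError(f"Unknown early stopping method: {early_stopping_method!r}")

    gathered: List[str] = []
    for iteration in range(max_iterations):
        done, text = step(iteration)
        if done:
            return text  # The agent produced a final answer in time.
        gathered.append(text)

    if early_stopping_method == "generate":
        # One final call that synthesizes a response from what was gathered.
        return summarize(gathered)
    # "force": stop with an error instead of answering.
    raise MaxIterationsReached(f"No answer after {max_iterations} iterations")


def never_done(iteration: int) -> Tuple[bool, str]:
    """A step function that never reaches a final answer."""
    return False, f"observation {iteration}"


def summarize(observations: List[str]) -> str:
    return f"summary of {len(observations)} observations"


result = run_with_limit(
    never_done, summarize, max_iterations=3, early_stopping_method="generate"
)
print(result)
```

With "generate" the caller always receives some response, at the cost of one extra LLM call; with "force" the failure is explicit and must be handled by the caller.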
Advanced Configuration
- Tool Retriever -- An ObjectRetriever that dynamically selects relevant tools based on the user's query, rather than providing all tools every time. Useful when the agent has access to many tools.
- Can Handoff To -- A list of agent names this agent can hand off to in a multi-agent workflow.
- Initial State -- A dictionary of initial state values accessible via the workflow context store.
- Output Class -- An optional Pydantic BaseModel subclass for structured output generation after the agent completes.
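The idea behind a tool retriever, selecting a few relevant tools per query instead of passing all of them, can be shown with a toy keyword-overlap ranking. Real retrievers typically use embeddings; `TOOL_DESCRIPTIONS`, `tokenize`, and `retrieve_tools` are hypothetical and unrelated to the ObjectRetriever API.

```python
import re
from typing import Dict, List, Set

# Hypothetical tool catalog: tool name -> natural-language description.
TOOL_DESCRIPTIONS: Dict[str, str] = {
    "get_weather": "Look up the current weather forecast for a city",
    "convert_currency": "Convert an amount of money between two currencies",
    "search_docs": "Search the product documentation for a keyword",
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve_tools(query: str, descriptions: Dict[str, str], top_k: int = 1) -> List[str]:
    """Rank tools by word overlap with the query and keep the top_k matches."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(f"{name} {description}")), name)
        for name, description in descriptions.items()
    ]
    # Highest overlap first; ties are broken alphabetically by tool name.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked[:top_k] if score > 0]


selected = retrieve_tools("What is the weather forecast in Paris?", TOOL_DESCRIPTIONS)
print(selected)
```

Only the selected tools would then be described in the prompt, which keeps the prompt short when the full catalog is large.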
ReAct vs Function Calling Agents
LlamaIndex provides both ReAct-style and function-calling-style agents. The key differences:
| Aspect | ReAct Agent | Function Calling Agent |
|---|---|---|
| Tool Invocation | Text-based: LLM outputs "Action: tool_name" and "Action Input: {...}" | Native API: LLM uses built-in function calling (e.g., OpenAI tool_calls) |
| LLM Requirements | Any text-generation LLM | Must support native function calling |
| Reasoning Trace | Explicit "Thought:" steps visible in output | Reasoning is implicit in the model's internal processing |
| Flexibility | Works with any model, including open-source | Tied to models/providers with function calling support |
| Reliability | May occasionally produce parsing errors | More reliable tool calls via structured API responses |
Knowledge Sources
- ReAct: Synergizing Reasoning and Acting in Language Models
- LlamaIndex Agents Documentation
- LlamaIndex GitHub Repository
Implementation
Implementation:Run_llama_Llama_index_ReActAgent_From_Defaults