Implementation:Run llama Llama index ReActAgent Run

Overview

BaseWorkflowAgent.run is the entry point for executing a ReAct agent. It accepts a user message, initializes the workflow context, and runs the full ReAct loop (Thought-Action-Observation cycles) until the agent produces a final answer or the iteration limit is reached. The method returns a WorkflowHandler immediately; awaiting the handler produces the final AgentOutput.
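The Thought-Action-Observation loop can be pictured with a small standard-library sketch. This is not the LlamaIndex implementation: `TOOLS`, `fake_llm`, and `react_loop` are hypothetical names, and the "model" is a hard-coded function that requests one tool call and then answers from the observation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; a real agent receives tool objects instead.
TOOLS: Dict[str, Callable[[int, int], int]] = {"multiply": lambda a, b: a * b}


def fake_llm(history: List[str]) -> Tuple[str, tuple]:
    """Stand-in for the model: act once, then answer from the observation."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return "action", ("multiply", 6, 7)
    return "answer", (observations[-1].split(": ", 1)[1],)


def react_loop(question: str, max_iterations: int = 20) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_iterations):
        kind, payload = fake_llm(history)         # Thought
        if kind == "answer":
            return payload[0]                     # a final answer ends the loop
        name, *args = payload                     # Action
        result = TOOLS[name](*args)
        history.append(f"Observation: {result}")  # Observation
    raise RuntimeError("iteration limit reached without a final answer")
```

Calling `react_loop("What is 6 times 7?")` takes two passes through the loop: one tool call, then the final answer. With `max_iterations=1` the sketch raises instead, which is only one possible way to handle the limit.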

Principle:Run_llama_Llama_index_Agent_Execution

Source File

llama-index-core/llama_index/core/agent/workflow/base_agent.py, Lines 722-751

Method Signature

def run(
    self,
    user_msg: Optional[Union[str, ChatMessage]] = None,
    chat_history: Optional[List[ChatMessage]] = None,
    memory: Optional[BaseMemory] = None,
    ctx: Optional[Context] = None,
    max_iterations: Optional[int] = None,
    early_stopping_method: Optional[Literal["force", "generate"]] = None,
    start_event: Optional[AgentWorkflowStartEvent] = None,
    **kwargs: Any,
) -> WorkflowHandler:

Parameters

Parameter	Type	Default	Description
user_msg	`Optional[Union[str, ChatMessage]]`	`None`	The user's input message. Can be a plain string (auto-converted to `ChatMessage`) or a pre-built `ChatMessage`. Either this or `chat_history` must be provided.
chat_history	`Optional[List[ChatMessage]]`	`None`	Pre-existing chat history to initialize the memory with. Used for multi-turn conversations.
memory	`Optional[BaseMemory]`	`None`	Custom memory instance. If not provided, a `ChatMemoryBuffer` is created using the agent's LLM.
ctx	`Optional[Context]`	`None`	Pre-existing workflow context. If provided and running, enables human-in-the-loop (HITL) patterns.
max_iterations	`Optional[int]`	`None`	Maximum number of Thought-Action-Observation cycles. Defaults to 20 if not specified.
early_stopping_method	`Optional[Literal["force", "generate"]]`	`None`	Override for the agent's early stopping behavior. Falls back to the agent's configured value.
start_event	`Optional[AgentWorkflowStartEvent]`	`None`	Pre-built start event. If provided, all other parameters except `ctx` are ignored.
**kwargs	`Any`	--	Additional keyword arguments passed through to the start event.
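As a rough illustration of the defaults in the table, the sketch below normalizes a string message and fills in the iteration limit. `Message` is a hypothetical stand-in for `ChatMessage`, and `normalize` is not a LlamaIndex function; only the default of 20 comes from the table above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

DEFAULT_MAX_ITERATIONS = 20  # default stated in the parameter table


@dataclass
class Message:
    """Hypothetical stand-in for ChatMessage."""
    role: str
    content: str


def normalize(
    user_msg: Optional[Union[str, Message]],
    max_iterations: Optional[int] = None,
) -> Tuple[Optional[Message], int]:
    # A plain string is wrapped as a user message; a Message passes through.
    if isinstance(user_msg, str):
        user_msg = Message(role="user", content=user_msg)
    limit = DEFAULT_MAX_ITERATIONS if max_iterations is None else max_iterations
    return user_msg, limit
```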

Return Value

Returns a WorkflowHandler -- an async-compatible handler that manages the workflow execution. The handler can be used in two ways:

from llama_index.core.agent.workflow import AgentStream

# Both patterns must run inside an async function or another async context.

# Await the final result
handler = agent.run("What is the capital of France?")
result = await handler  # Returns AgentOutput

# Stream events during execution
handler = agent.run("Analyze this data...")
async for event in handler.stream_events():
    if isinstance(event, AgentStream):
        print(event.delta, end="")
result = await handler
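
To show how one object can be both streamed and awaited, here is a minimal asyncio sketch. `ToyHandler` is a hypothetical stand-in and says nothing about how WorkflowHandler is actually built; it only mirrors the two usage patterns above.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


class ToyHandler:
    """Hypothetical handler: stream intermediate chunks, await the final result."""

    def __init__(self, chunks: List[str]) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        # The work starts immediately, before anyone awaits the handler.
        self._task = asyncio.create_task(self._produce(chunks))

    async def _produce(self, chunks: List[str]) -> str:
        for chunk in chunks:
            await self._queue.put(chunk)
            await asyncio.sleep(0)   # let the consumer run
        await self._queue.put(None)  # sentinel: no more events
        return "".join(chunks)

    async def stream_events(self) -> AsyncIterator[str]:
        while (item := await self._queue.get()) is not None:
            yield item

    def __await__(self):
        return self._task.__await__()


async def demo() -> Tuple[List[str], str]:
    handler = ToyHandler(["Par", "is"])
    deltas = [d async for d in handler.stream_events()]
    result = await handler
    return deltas, result
```

Running `asyncio.run(demo())` collects the streamed chunks first and then the joined result, the same order used in the streaming pattern above.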

Internal Workflow Steps

The run method triggers a sequence of workflow steps, each decorated with @step:

Step 1: init_run

Source: base_agent.py, Lines 380-432

Processes the AgentWorkflowStartEvent and sets up the context:

@step
async def init_run(self, ctx: Context, ev: AgentWorkflowStartEvent) -> AgentInput:
    await self._init_context(ctx, ev)
    # Converts user_msg to ChatMessage if string
    # Adds messages to memory
    # Returns AgentInput with full message history

Step 2: setup_agent

Source: base_agent.py, Lines 434-460

Prepares the LLM input with system prompt and state:

@step
async def setup_agent(self, ctx: Context, ev: AgentInput) -> AgentSetup:
    # Prepends system prompt
    # Injects agent state into the last message
    # Returns AgentSetup with formatted LLM input
    ...  # body elided

Step 3: run_agent_step

Source: base_agent.py, Lines 462-477

Invokes the agent's take_step method (overridden by ReActAgent):

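@step
async def run_agent_step(self, ctx: Context, ev: AgentSetup) -> AgentOutput:
    # memory and user_msg_str are read from the workflow context first (lookup elided here)
    tools = await self.get_tools(user_msg_str)
    agent_output = await self.take_step(ctx, ev.input, tools, memory)
    # Publish the output on the event stream so callers of stream_events() see it
    ctx.write_event_to_stream(agent_output)
    return agent_output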

Step 4: parse_agent_output

Source: base_agent.py, Lines 517-617

Routes the agent's output to the appropriate next step:

If tool_calls are present: emits ToolCall events for each call
If retry_messages are present: loops back to AgentInput with correction context
If no tool calls (final answer): calls finalize() and emits StopEvent
In all cases: enforces max_iterations and applies the configured early stopping method once the limit is reached
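
The routing rules above can be sketched as a small decision function. `Output` and `route` are hypothetical names, the returned labels are illustrative, and the order in which the checks run is an assumption rather than something taken from base_agent.py.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Output:
    """Hypothetical, simplified stand-in for AgentOutput."""
    tool_calls: List[str] = field(default_factory=list)
    retry_messages: List[str] = field(default_factory=list)


def route(output: Output, iteration: int, max_iterations: int = 20) -> str:
    # Assumed ordering: retry first, then final answer, then the limit check.
    if output.retry_messages:
        return "retry"        # loop back with correction context
    if not output.tool_calls:
        return "finalize"     # no tool calls means a final answer
    if iteration >= max_iterations:
        return "early_stop"   # limit reached while tools are still requested
    return "call_tools"       # emit one tool call event per requested call
```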

Step 5: call_tool

Source: base_agent.py, Lines 619-654

Executes a single tool call:

@step
async def call_tool(self, ctx: Context, ev: ToolCall) -> ToolCallResult:
    # Looks up tool by name
    # Injects Context if the tool requires it
    # Calls tool.acall(**tool_kwargs)
    # Returns ToolCallResult with output (or error)
    ...  # body elided
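
A standard-library sketch of the lookup-call-capture pattern described in the comments above. The `TOOLS` registry, the `multiply` coroutine, and the dictionary result shape are all hypothetical; the real step returns a ToolCallResult event and also handles Context injection, which is omitted here.

```python
import asyncio
from typing import Any, Dict


async def multiply(a: int, b: int) -> int:
    return a * b


# Hypothetical name-to-coroutine registry.
TOOLS = {"multiply": multiply}


async def call_tool(name: str, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    tool = TOOLS.get(name)
    if tool is None:
        # An unknown tool becomes an error result instead of an exception.
        return {"tool_name": name, "output": f"Tool {name} not found", "is_error": True}
    try:
        return {"tool_name": name, "output": await tool(**kwargs), "is_error": False}
    except Exception as exc:
        # Failures are reported back so the agent can react to them.
        return {"tool_name": name, "output": str(exc), "is_error": True}
```

Returning errors as data rather than raising lets the reasoning loop see the failure as an observation and try something else.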

Step 6: aggregate_tool_results

Source: base_agent.py, Lines 656-720

Collects all tool results and loops back to the reasoning step:

@step
async def aggregate_tool_results(self, ctx: Context, ev: ToolCallResult) -> Union[AgentInput, StopEvent, None]:
    # Collects all ToolCallResult events
    # Delegates to handle_tool_call_results (adds ObservationReasoningSteps)
    # If return_direct, finalizes and stops
    # Otherwise, returns AgentInput to continue the loop
    ...  # body elided
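
The `None` in the return type suggests the step can be invoked before every result has arrived. The sketch below illustrates that buffering idea under this assumption; `Aggregator` is a hypothetical class, not the mechanism the workflow uses.

```python
from typing import List, Optional


class Aggregator:
    """Hypothetical buffer that releases results once all expected ones arrived."""

    def __init__(self, expected: int) -> None:
        self.expected = expected
        self.buffer: List[str] = []

    def add(self, result: str) -> Optional[List[str]]:
        self.buffer.append(result)
        if len(self.buffer) < self.expected:
            return None  # still waiting for the remaining tool results
        collected, self.buffer = self.buffer, []
        return collected  # all results are in; the loop can continue
```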

ReActAgent.take_step

Source: react_agent.py, Lines 117-258

This is the ReAct-specific core logic within each iteration:

async def take_step(self, ctx, llm_input, tools, memory) -> AgentOutput:
    # 1. Format input with ReActChatFormatter (tools + history + reasoning chain)
    # 2. Call LLM (streaming or non-streaming)
    # 3. Parse output with ReActOutputParser
    # 4. Handle errors: empty response or parse failure -> retry
    # 5. Append reasoning step to chain
    # 6. If ResponseReasoningStep (is_done=True) -> return final AgentOutput
    # 7. If ActionReasoningStep -> create ToolSelection and return AgentOutput with tool_calls
    ...  # body elided
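
Steps 3, 4, 6, and 7 hinge on parsing the model's text into either a final answer or a tool request. The toy parser below assumes a "Thought / Action / Action Input / Answer" text layout and is far less tolerant than ReActOutputParser; `parse_react` and its return shape are hypothetical.

```python
import json
import re
from typing import Any, Dict


def parse_react(text: str) -> Dict[str, Any]:
    # A final answer takes priority (compare step 6).
    answer = re.search(r"Answer:\s*(.+)", text, re.DOTALL)
    if answer:
        return {"is_done": True, "response": answer.group(1).strip()}
    # Otherwise look for a tool name plus JSON arguments (compare step 7).
    action = re.search(
        r"Action:\s*(\w+)\s*Action Input:\s*(\{.*\})", text, re.DOTALL
    )
    if action:
        return {
            "is_done": False,
            "tool": action.group(1),
            "kwargs": json.loads(action.group(2)),
        }
    # Unparseable output is where a retry would be requested (compare step 4).
    raise ValueError("could not parse output")
```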

Usage Examples

Basic Execution

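from llama_index.core.agent.workflow import ReActAgent

# multiply_tool, search_tool, and llm are assumed to be defined elsewhere.
agent = ReActAgent(tools=[multiply_tool, search_tool], llm=llm)

# Simple run (inside an async context)
handler = agent.run("What is 6 times 7?")
response = await handler
print(response)  # the final answer text, e.g. "42"; exact wording depends on the LLM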

Streaming Execution

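from llama_index.core.agent.workflow import AgentStream

handler = agent.run("Research the latest AI trends and summarize them.")

# Print streamed LLM deltas as they arrive
async for event in handler.stream_events():
    if isinstance(event, AgentStream):
        print(event.delta, end="", flush=True)

# Awaiting the handler afterwards returns the final AgentOutput
result = await handler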

Multi-Turn Conversation

from llama_index.core.memory import ChatMemoryBuffer

memory = ChatMemoryBuffer.from_defaults(llm=llm)

# First turn
handler = agent.run("What is the population of France?", memory=memory)
result1 = await handler

# Second turn -- memory carries over
handler = agent.run("How does that compare to Germany?", memory=memory)
result2 = await handler

With Custom Iteration Limits

handler = agent.run(
    "Solve this complex multi-step problem...",
    max_iterations=50,
    early_stopping_method="generate",
)
result = await handler

Import Statement

from llama_index.core.agent.workflow import ReActAgent
# The run() method is called on the agent instance:
# handler = agent.run("user message")
# result = await handler

2026-02-11 00:00 GMT
