Principle:Langchain ai Langgraph Agent Execution
| Attribute | Value |
|---|---|
| Concept | Running a compiled agent graph and processing its iterative tool-calling behavior |
| Workflow | ReAct_Agent_Creation |
| Type | Principle |
| Repository | Langchain_ai_Langgraph |
| Source | libs/langgraph/langgraph/pregel/main.py
|
Overview
Agent execution is the runtime phase where a compiled ReAct agent graph processes user input through its iterative reasoning-and-action loop. The execution engine, implemented in the Pregel class (the base class for CompiledStateGraph), manages the step-by-step traversal of graph nodes, state propagation between steps, checkpoint persistence, and output streaming. For ReAct agents specifically, execution involves receiving a message-based input ({"messages": [...]}), running the agent-tools loop until completion, and returning the final state containing the full conversation history and any structured responses.
Description
The Agent Execution Loop
When a ReAct agent is invoked, the execution engine performs the following cycle:
- Input processing: The input (typically
{"messages": [HumanMessage(...)]}) is merged into the graph state using the state schema's reducers (e.g.,add_messagesfor the messages key). - Agent node execution: The "agent" node calls the language model with the current message history. The model produces an
AIMessagethat may contain tool calls. - Conditional routing: The graph's conditional edges examine the
AIMessage. If tool calls are present, execution continues to the "tools" node; otherwise, it terminates (or routes to structured response generation). - Tool execution: The
ToolNodeexecutes the requested tools and producesToolMessageresults that are appended to the message history. - Loop continuation: Control returns to the agent node, and steps 2-4 repeat.
- Termination: When the model produces an
AIMessagewithout tool calls, the loop ends and the final state is returned.
Each step in this loop corresponds to a "superstep" in the Pregel execution model, where all nodes scheduled for execution in a given step run (potentially in parallel) before advancing to the next step.
Message-Based I/O
ReAct agents use a message-based I/O convention where:
- Input: A dictionary with a
"messages"key containing a list of messages. The simplest form is{"messages": [("user", "Hello")]}using tuple shorthand, or{"messages": [HumanMessage("Hello")]}using explicit message objects.
- Output: A dictionary with a
"messages"key containing the complete conversation history, including the original input, all intermediateAIMessageandToolMessageexchanges, and the finalAIMessageresponse.
The add_messages reducer on the messages field handles message accumulation, ensuring that new messages are appended rather than replacing the existing list. This enables the conversation history to grow naturally across the agent loop.
Invoke vs. Stream
The execution engine provides two primary interfaces:
- invoke: Runs the graph to completion and returns the final state. Internally,
invokecallsstreamwithstream_mode=["updates", "values"]and collects the final value. This is the simplest way to run an agent when you only need the final result.
- stream: Returns an iterator that yields intermediate results as the graph executes. The
stream_modeparameter controls what is emitted:"values": Full state after each step (including interrupts)."updates": Only the node name and its output for each step."messages": LLM tokens as they are generated, as(token, metadata)tuples."custom": Custom data emitted by nodes viaStreamWriter."debug": Detailed debug information for each step."checkpoints": Checkpoint events in the format returned byget_state()."tasks": Task start/finish events with results and errors.
Multiple stream modes can be combined by passing a list, in which case output tuples include the mode identifier.
Streaming Agent Events
For ReAct agents, streaming is particularly valuable because the agent loop may involve multiple LLM calls and tool executions. Common streaming patterns include:
- Token-level streaming: Using
stream_mode="messages"to display LLM output as it is generated, providing real-time feedback to users. - Step-level streaming: Using
stream_mode="updates"to observe each node's output, enabling progress tracking through the reasoning cycle. - Combined streaming: Passing
stream_mode=["messages", "updates"]to get both token-level and step-level events.
Checkpointing and Resumption
When a checkpointer is configured, the execution engine persists the graph state after each superstep. This enables:
- Conversation memory: Resuming a conversation by providing the same
thread_idin the config. - Human-in-the-loop: Using
interrupt_beforeorinterrupt_afterto pause execution at specific nodes, allowing external review or approval before continuing. - Error recovery: Resuming from the last successful checkpoint after a failure.
To resume execution after an interrupt, call invoke or stream with None as the input and the same thread configuration.
Recursion Limit
The execution engine enforces a recursion limit (configurable via config["recursion_limit"]) to prevent infinite loops. If the agent loop exceeds this limit, a GraphRecursionError is raised. The remaining_steps state field tracks how many steps remain, and the agent proactively stops when steps are nearly exhausted.
Usage
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool
@tool
def search(query: str) -> str:
"""Search for information."""
return f"Results for: {query}"
agent = create_react_agent("openai:gpt-4", tools=[search])
# Invoke - get final result
result = agent.invoke({"messages": [("user", "Search for LangGraph")]})
print(result["messages"][-1].content)
# Stream - observe each step
for chunk in agent.stream(
{"messages": [("user", "Search for LangGraph")]},
stream_mode="updates",
):
print(chunk)
# Stream with token-level output
for chunk in agent.stream(
{"messages": [("user", "Search for LangGraph")]},
stream_mode="messages",
):
token, metadata = chunk
print(token.content, end="", flush=True)
Theoretical Basis
The agent execution model is built on the Pregel computation framework (inspired by Google's Pregel system for large-scale graph processing). In this model, computation proceeds in synchronized "supersteps" where:
- All active nodes execute in parallel.
- Messages (state updates) from this step are collected.
- The next set of active nodes is determined based on edge conditions.
- The process repeats until no more nodes are active.
This maps naturally to the ReAct loop: each superstep corresponds to either an LLM reasoning phase or a tool execution phase. The synchronization barrier between supersteps ensures that tool results are fully available before the model's next reasoning step.
The streaming interface implements the observer pattern, allowing external consumers to react to internal graph events without modifying the execution logic. This separation of concerns enables flexible UIs, logging systems, and monitoring tools to be attached to agent execution without coupling.
The checkpoint-based resumption follows the event sourcing pattern, where the state at any point can be reconstructed from the sequence of state transitions (checkpoints). This provides durability guarantees and enables time-travel debugging of agent behavior.