Principle:Langchain ai Langgraph Agent Execution

Attribute	Value
Concept	Running a compiled agent graph and processing its iterative tool-calling behavior
Workflow	ReAct_Agent_Creation
Type	Principle
Repository	Langchain_ai_Langgraph
Source	`libs/langgraph/langgraph/pregel/main.py`

Overview

Agent execution is the runtime phase where a compiled ReAct agent graph processes user input through its iterative reasoning-and-action loop. The execution engine, implemented in the Pregel class (the base class for CompiledStateGraph), manages the step-by-step traversal of graph nodes, state propagation between steps, checkpoint persistence, and output streaming. For ReAct agents specifically, execution involves receiving a message-based input ({"messages": [...]}), running the agent-tools loop until completion, and returning the final state containing the full conversation history and any structured responses.

Description

The Agent Execution Loop

When a ReAct agent is invoked, the execution engine performs the following cycle:

Input processing: The input (typically {"messages": [HumanMessage(...)]}) is merged into the graph state using the state schema's reducers (e.g., add_messages for the messages key).
Agent node execution: The "agent" node calls the language model with the current message history. The model produces an AIMessage that may contain tool calls.
Conditional routing: The graph's conditional edges examine the AIMessage. If tool calls are present, execution continues to the "tools" node; otherwise, it terminates (or routes to structured response generation).
Tool execution: The ToolNode executes the requested tools and produces ToolMessage results that are appended to the message history.
Loop continuation: Control returns to the agent node, and steps 2-4 repeat.
Termination: When the model produces an AIMessage without tool calls, the loop ends and the final state is returned.

Each step in this loop corresponds to a "superstep" in the Pregel execution model, where all nodes scheduled for execution in a given step run (potentially in parallel) before advancing to the next step.

Message-Based I/O

ReAct agents use a message-based I/O convention where:

Input: A dictionary with a "messages" key containing a list of messages. The simplest form is {"messages": [("user", "Hello")]} using tuple shorthand, or {"messages": [HumanMessage("Hello")]} using explicit message objects.

Output: A dictionary with a "messages" key containing the complete conversation history, including the original input, all intermediate AIMessage and ToolMessage exchanges, and the final AIMessage response.

The add_messages reducer on the messages field handles message accumulation, ensuring that new messages are appended rather than replacing the existing list. This enables the conversation history to grow naturally across the agent loop.

Invoke vs. Stream

The execution engine provides two primary interfaces:

invoke: Runs the graph to completion and returns the final state. Internally, invoke calls stream with stream_mode=["updates", "values"] and collects the final value. This is the simplest way to run an agent when you only need the final result.

stream: Returns an iterator that yields intermediate results as the graph executes. The stream_mode parameter controls what is emitted:
- "values": Full state after each step (including interrupts).
- "updates": Only the node name and its output for each step.
- "messages": LLM tokens as they are generated, as (token, metadata) tuples.
- "custom": Custom data emitted by nodes via StreamWriter.
- "debug": Detailed debug information for each step.
- "checkpoints": Checkpoint events in the format returned by get_state().
- "tasks": Task start/finish events with results and errors.

Multiple stream modes can be combined by passing a list, in which case output tuples include the mode identifier.

Streaming Agent Events

For ReAct agents, streaming is particularly valuable because the agent loop may involve multiple LLM calls and tool executions. Common streaming patterns include:

Token-level streaming: Using stream_mode="messages" to display LLM output as it is generated, providing real-time feedback to users.
Step-level streaming: Using stream_mode="updates" to observe each node's output, enabling progress tracking through the reasoning cycle.
Combined streaming: Passing stream_mode=["messages", "updates"] to get both token-level and step-level events.

Checkpointing and Resumption

When a checkpointer is configured, the execution engine persists the graph state after each superstep. This enables:

Conversation memory: Resuming a conversation by providing the same thread_id in the config.
Human-in-the-loop: Using interrupt_before or interrupt_after to pause execution at specific nodes, allowing external review or approval before continuing.
Error recovery: Resuming from the last successful checkpoint after a failure.

To resume execution after an interrupt, call invoke or stream with None as the input and the same thread configuration.

Recursion Limit

The execution engine enforces a recursion limit (configurable via config["recursion_limit"]) to prevent infinite loops. If the agent loop exceeds this limit, a GraphRecursionError is raised. The remaining_steps state field tracks how many steps remain, and the agent proactively stops when steps are nearly exhausted.

Usage

from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool

@tool
def search(query: str) -> str:
    """Search for information."""
    return f"Results for: {query}"

agent = create_react_agent("openai:gpt-4", tools=[search])

# Invoke - get final result
result = agent.invoke({"messages": [("user", "Search for LangGraph")]})
print(result["messages"][-1].content)

# Stream - observe each step
for chunk in agent.stream(
    {"messages": [("user", "Search for LangGraph")]},
    stream_mode="updates",
):
    print(chunk)

# Stream with token-level output
for chunk in agent.stream(
    {"messages": [("user", "Search for LangGraph")]},
    stream_mode="messages",
):
    token, metadata = chunk
    print(token.content, end="", flush=True)

Theoretical Basis

The agent execution model is built on the Pregel computation framework (inspired by Google's Pregel system for large-scale graph processing). In this model, computation proceeds in synchronized "supersteps" where:

All active nodes execute in parallel.
Messages (state updates) from this step are collected.
The next set of active nodes is determined based on edge conditions.
The process repeats until no more nodes are active.

This maps naturally to the ReAct loop: each superstep corresponds to either an LLM reasoning phase or a tool execution phase. The synchronization barrier between supersteps ensures that tool results are fully available before the model's next reasoning step.

The streaming interface implements the observer pattern, allowing external consumers to react to internal graph events without modifying the execution logic. This separation of concerns enables flexible UIs, logging systems, and monitoring tools to be attached to agent execution without coupling.

The checkpoint-based resumption follows the event sourcing pattern, where the state at any point can be reconstructed from the sequence of state transitions (checkpoints). This provides durability guarantees and enables time-travel debugging of agent behavior.

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment