Implementation:Run llama Llama index RecursiveRetriever

Overview

The RecursiveRetriever implements a graph-based recursive retrieval strategy. Starting from a root retriever, it follows IndexNode links to other retrievers, query engines, or plain nodes, recursively expanding the retrieval graph. This allows building multi-level retrieval pipelines where results from one retriever can reference other retrievers or query engines.

Source File: llama-index-core/llama_index/core/retrievers/recursive_retriever.py (221 lines)

Module: llama_index.core.retrievers.recursive_retriever

Class Definition

class RecursiveRetriever(BaseRetriever):
    """
    Recursive retriever.

    This retriever will recursively explore links from nodes to other
    retrievers/query engines.

    For any retrieved nodes, if any of the nodes are IndexNodes,
    then it will explore the linked retriever/query engine, and query that.
    """

Type Alias

RQN_TYPE = Union[BaseRetriever, BaseQueryEngine, BaseNode]

This union type represents the three kinds of objects that can be referenced in the recursive graph: retrievers, query engines, and plain nodes.

Dependencies

Module	Import
`llama_index.core.base.base_query_engine`	`BaseQueryEngine`
`llama_index.core.base.base_retriever`	`BaseRetriever`
`llama_index.core.callbacks.base`	`CallbackManager`
`llama_index.core.callbacks.schema`	`CBEventType`, `EventPayload`
`llama_index.core.schema`	`BaseNode`, `IndexNode`, `NodeWithScore`, `QueryBundle`, `TextNode`
`llama_index.core.utils`	`print_text`

Constants

DEFAULT_QUERY_RESPONSE_TMPL = "Query: {query_str}\nResponse: {response}"

Template used to format query engine responses as text nodes.
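
As a quick illustration of how this template is filled, the sketch below applies `str.format` to it. The query and response strings are made-up sample values, not output from a real query engine.

```python
DEFAULT_QUERY_RESPONSE_TMPL = "Query: {query_str}\nResponse: {response}"

# Sample values, invented for illustration only.
node_text = DEFAULT_QUERY_RESPONSE_TMPL.format(
    query_str="What was revenue in 2021?",
    response="Revenue was 10M.",
)
print(node_text)
```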

Constructor

def __init__(
    self,
    root_id: str,
    retriever_dict: Dict[str, BaseRetriever],
    query_engine_dict: Optional[Dict[str, BaseQueryEngine]] = None,
    node_dict: Optional[Dict[str, BaseNode]] = None,
    callback_manager: Optional[CallbackManager] = None,
    query_response_tmpl: Optional[str] = None,
    verbose: bool = False,
) -> None

Parameter	Type	Default	Description
`root_id`	`str`	required	The ID of the root retriever in `retriever_dict` that serves as the entry point
`retriever_dict`	`Dict[str, BaseRetriever]`	required	Mapping of string IDs to retriever instances
`query_engine_dict`	`Optional[Dict[str, BaseQueryEngine]]`	`None`	Mapping of string IDs to query engine instances
`node_dict`	`Optional[Dict[str, BaseNode]]`	`None`	Mapping of string IDs to plain node instances
`callback_manager`	`Optional[CallbackManager]`	`None`	Optional callback manager
`query_response_tmpl`	`Optional[str]`	`None`	Template for formatting query engine responses
`verbose`	`bool`	`False`	Whether to print verbose retrieval info

Validation:

The root_id must exist in retriever_dict (raises ValueError otherwise).
Keys in retriever_dict and query_engine_dict must not overlap (raises ValueError otherwise).
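
The two checks above can be sketched as a standalone function. This is a minimal approximation, not the library's code: `check_ids` is a hypothetical helper name, the error messages are invented, and plain `object` values stand in for retrievers and query engines.

```python
from typing import Dict, Optional


def check_ids(
    root_id: str,
    retriever_dict: Dict[str, object],
    query_engine_dict: Optional[Dict[str, object]] = None,
) -> None:
    """Hypothetical helper mirroring the two constructor checks."""
    # Check 1: the entry point must be a known retriever.
    if root_id not in retriever_dict:
        raise ValueError(f"Root id {root_id!r} not in retriever_dict.")
    # Check 2: retriever IDs and query engine IDs share one namespace.
    overlap = set(retriever_dict) & set(query_engine_dict or {})
    if overlap:
        raise ValueError(f"IDs used by both dictionaries: {sorted(overlap)}")


# Passes silently: the root exists and the key sets are disjoint.
check_ids("root", {"root": object()}, {"engine": object()})
```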

Core Methods

_retrieve

def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]

Retrieval hook invoked through the public retrieve() method inherited from BaseRetriever. Calls _retrieve_rec with query_id=None, which resolves to the root retriever, and returns only the primary retrieved nodes, discarding the additional source nodes.

retrieve_all

def retrieve_all(
    self, query_bundle: QueryBundle
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]

Extended retrieval that returns both the primary retrieved nodes and additional source nodes (e.g., source nodes from sub-query engine responses).

_retrieve_rec

def _retrieve_rec(
    self,
    query_bundle: QueryBundle,
    query_id: Optional[str] = None,
    cur_similarity: Optional[float] = None,
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]

The core recursive method. Behavior depends on the type of object resolved by query_id:

Object Type	Behavior
`BaseNode`	Returns the node directly with the current similarity score
`BaseRetriever`	Calls `retrieve()`, then recursively processes results via `_query_retrieved_nodes()`
`BaseQueryEngine`	Calls `query()`, formats the response as a `TextNode` using the query response template, and collects source nodes

_query_retrieved_nodes

def _query_retrieved_nodes(
    self, query_bundle: QueryBundle, nodes_with_score: List[NodeWithScore]
) -> Tuple[List[NodeWithScore], List[NodeWithScore]]

Processes retrieved nodes. For each node:

If it is an IndexNode, follows the link by recursively calling _retrieve_rec with the node's index_id.
Otherwise (a TextNode that is not an IndexNode), keeps it as-is.

Deduplicates IndexNode references to the same index_id before recursing. After recursion, deduplicates all results by node ID.
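
The first deduplication step, dropping repeated links to the same index_id before recursing, might look like the sketch below. `Hit` is a hypothetical stand-in for a scored node, with `index_id` set only for IndexNode-like entries, and `drop_repeated_links` is an illustrative name rather than a library function.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    """Stand-in for a scored node; index_id is set only for link nodes."""
    node_id: str
    score: float
    index_id: Optional[str] = None


def drop_repeated_links(hits: List[Hit]) -> List[Hit]:
    """Keep plain hits untouched; keep only the first link per index_id."""
    visited = set()
    kept = []
    for hit in hits:
        if hit.index_id is None:
            kept.append(hit)  # plain text node: always kept
        elif hit.index_id not in visited:
            visited.add(hit.index_id)  # first link to this target
            kept.append(hit)
    return kept


hits = [
    Hit("a", 0.9, index_id="sub"),
    Hit("b", 0.8),
    Hit("c", 0.7, index_id="sub"),  # second link to "sub", dropped
]
kept = drop_repeated_links(hits)
```

Dropping repeated links before recursing means each linked retriever or query engine is queried at most once per call to _query_retrieved_nodes.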

_get_object

def _get_object(self, query_id: str) -> RQN_TYPE

Resolves a query_id to its corresponding object by searching in order: node_dict, retriever_dict, query_engine_dict. Raises ValueError if not found.
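
A minimal sketch of this lookup order, using plain dictionaries of strings in place of real nodes, retrievers, and query engines. The function name, argument layout, and error message are assumptions made for illustration.

```python
from typing import Any, Dict


def get_object(
    query_id: str,
    node_dict: Dict[str, Any],
    retriever_dict: Dict[str, Any],
    query_engine_dict: Dict[str, Any],
) -> Any:
    """Return the first match, searching nodes, then retrievers, then engines."""
    for mapping in (node_dict, retriever_dict, query_engine_dict):
        obj = mapping.get(query_id)
        if obj is not None:
            return obj
    raise ValueError(f"Query id {query_id!r} not found in any dictionary.")


nodes = {"a": "node-a"}
retrievers = {"a": "retriever-a", "r": "retriever-r"}
engines = {"q": "engine-q"}

# "a" appears in two dictionaries; the node wins because node_dict is searched first.
found = get_object("a", nodes, retrievers, engines)
```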

_deduplicate_nodes

def _deduplicate_nodes(
    self, nodes_with_score: List[NodeWithScore]
) -> List[NodeWithScore]

Removes duplicate nodes by node.id_, keeping the first occurrence of each ID. If the input is sorted by descending score, the kept occurrence is also the highest-scoring one.
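
The first-occurrence rule can be sketched with a seen-set, as below. `ScoredNode` is a hypothetical stand-in for NodeWithScore and `deduplicate` is an illustrative name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredNode:
    """Stand-in for NodeWithScore, reduced to an ID and a score."""
    node_id: str
    score: float


def deduplicate(nodes: List[ScoredNode]) -> List[ScoredNode]:
    """Keep the first occurrence of each node ID, preserving input order."""
    seen = set()
    unique = []
    for item in nodes:
        if item.node_id not in seen:
            seen.add(item.node_id)
            unique.append(item)
    return unique


unique = deduplicate(
    [ScoredNode("a", 0.9), ScoredNode("b", 0.8), ScoredNode("a", 0.5)]
)
```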

Recursion Flow

_retrieve(query)
  -> _retrieve_rec(query, root_id)
       -> retriever.retrieve(query)
            -> for each result node:
                 if IndexNode:
                   -> _retrieve_rec(query, node.index_id)
                        -> (may call another retriever, query engine, or return a node)
                 if TextNode:
                   -> return node directly
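
The flow above can be approximated with a small self-contained model. This is a simplified sketch, not the library's implementation: retrievers and query engines are modeled as plain callables, `Node` is a hypothetical stand-in for TextNode/IndexNode, the default similarity of 1.0 is an assumption of the sketch, and additional source nodes, deduplication, and callbacks are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

TMPL = "Query: {query_str}\nResponse: {response}"


@dataclass
class Node:
    """Stand-in node; a non-None index_id marks an IndexNode-like link."""
    id_: str
    text: str
    index_id: Optional[str] = None


Scored = Tuple[Node, float]
Retriever = Callable[[str], List[Scored]]  # query -> scored nodes
Engine = Callable[[str], str]              # query -> response text


def retrieve_rec(
    query: str,
    query_id: str,
    retrievers: Dict[str, Retriever],
    engines: Dict[str, Engine],
    nodes: Dict[str, Node],
    similarity: float = 1.0,  # assumed default for this sketch
) -> List[Scored]:
    if query_id in nodes:
        # Plain node: return it with the similarity of the link that led here.
        return [(nodes[query_id], similarity)]
    if query_id in retrievers:
        results: List[Scored] = []
        for node, score in retrievers[query_id](query):
            if node.index_id is not None:
                # Link node: follow it, carrying its score along.
                results.extend(
                    retrieve_rec(query, node.index_id, retrievers, engines, nodes, score)
                )
            else:
                results.append((node, score))
        return results
    if query_id in engines:
        # Query engine: wrap the formatted response in a new text node.
        text = TMPL.format(query_str=query, response=engines[query_id](query))
        return [(Node(id_=f"response-{query_id}", text=text), similarity)]
    raise ValueError(f"Query id {query_id!r} not found.")


retrievers = {
    "root": lambda q: [
        (Node("n1", "plain text"), 0.9),
        (Node("link", "summary of a table", index_id="table_engine"), 0.8),
    ],
}
engines = {"table_engine": lambda q: "42"}
results = retrieve_rec("How many?", "root", retrievers, engines, {})
```

In this run the root retriever returns one plain node and one link; the link is replaced by a text node holding the formatted response from the hypothetical `table_engine`, scored with the link's own similarity.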

Design Notes

The three dictionaries (retriever_dict, query_engine_dict, node_dict) form a flat namespace. The recursive graph structure emerges from IndexNode references in retrieval results.
Query engine responses are converted to TextNode objects so they can be uniformly handled alongside retrieved text nodes.
The callback manager wraps each retriever call in a CBEventType.RETRIEVE event, enabling observability into the recursive chain.
Verbose mode prints colored terminal output: pink when entering a retrieved node, blue for query IDs, and green for query engine responses.
