Implementation:Run llama Llama index IndexDocumentSummary

Knowledge Sources	Run_llama_Llama_index
Domains	LLM Framework, Data Structures, Document Summary
Last Updated	2026-02-11 19:00 GMT

Overview

IndexDocumentSummary is a data structure class that maintains bidirectional mappings between document summaries and their constituent nodes, serving as the internal storage model for the DocumentSummaryIndex.

Description

The IndexDocumentSummary class extends IndexStruct and holds three dictionaries. Together they allow lookup from a source document to its summary, from a summary to its content nodes, and from a content node back to its summary:

summary_id_to_node_ids -- Maps each summary node ID to the list of content node IDs it summarizes (Dict[str, List[str]]).
node_id_to_summary_id -- Reverse mapping from each content node ID back to its summary node ID (Dict[str, str]).
doc_id_to_summary_id -- Maps source document IDs (ref_doc_id) to their summary node IDs (Dict[str, str]).
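
To make the three mappings concrete, the following minimal sketch shows what they might hold for a single document, using plain dictionaries rather than the class itself. The IDs ("doc-1", "summary-1", "node-1", "node-2") and the two helper functions are illustrative assumptions, not part of the library.

```python
from typing import Dict, List

# Plain-dict picture of the three mappings for one document.
# All IDs below are made-up illustration values.
summary_id_to_node_ids: Dict[str, List[str]] = {"summary-1": ["node-1", "node-2"]}
node_id_to_summary_id: Dict[str, str] = {"node-1": "summary-1", "node-2": "summary-1"}
doc_id_to_summary_id: Dict[str, str] = {"doc-1": "summary-1"}


def nodes_for_doc(doc_id: str) -> List[str]:
    """Hypothetical helper: document -> summary -> content nodes."""
    return summary_id_to_node_ids[doc_id_to_summary_id[doc_id]]


def summary_for_node(node_id: str) -> str:
    """Hypothetical helper: content node -> summary (reverse direction)."""
    return node_id_to_summary_id[node_id]


print(nodes_for_doc("doc-1"))
print(summary_for_node("node-2"))
```

Each lookup is a constant-time dictionary access, which is why the reverse mapping is stored explicitly instead of being derived by scanning summary_id_to_node_ids.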

Key methods:

add_summary_and_nodes(summary_node, nodes) -- Registers a summary node and its associated content nodes. Extracts the ref_doc_id from the summary node (raises ValueError if it is None). Populates all three mapping dictionaries and returns the summary ID.

summary_ids (property) -- Returns a list of all summary node IDs.

delete(doc_id) -- Deletes a document and all associated mappings. Removes the doc-to-summary mapping, all node-to-summary mappings for the document's nodes, and the summary-to-node mapping.

delete_nodes(node_ids) -- Removes specific nodes from their summary's node list and cleans up the node-to-summary mapping, without deleting the entire document.

get_type() -- Returns IndexStructType.DOCUMENT_SUMMARY.
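
The method descriptions above can be sketched as a small stand-in class. This is a simplified model written for illustration, not the library's source: the class name SummaryMappingSketch is hypothetical, it takes plain string IDs instead of BaseNode objects, and its error message and edge-case behavior may differ from the real implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SummaryMappingSketch:
    """Hypothetical stand-in for IndexDocumentSummary using plain string IDs."""

    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)

    def add_summary_and_nodes(
        self, summary_id: str, ref_doc_id: Optional[str], node_ids: List[str]
    ) -> str:
        # A summary without a source document cannot be registered.
        if ref_doc_id is None:
            raise ValueError("summary node must have a ref_doc_id")
        self.doc_id_to_summary_id[ref_doc_id] = summary_id
        for node_id in node_ids:
            # Populate the forward and reverse mappings together.
            self.summary_id_to_node_ids.setdefault(summary_id, []).append(node_id)
            self.node_id_to_summary_id[node_id] = summary_id
        return summary_id

    @property
    def summary_ids(self) -> List[str]:
        return list(self.summary_id_to_node_ids)

    def delete(self, doc_id: str) -> None:
        # Remove the document entry, then every mapping tied to its summary.
        summary_id = self.doc_id_to_summary_id.pop(doc_id)
        for node_id in self.summary_id_to_node_ids.pop(summary_id):
            del self.node_id_to_summary_id[node_id]

    def delete_nodes(self, node_ids: List[str]) -> None:
        # Remove individual nodes but keep the summary and document entries.
        for node_id in node_ids:
            summary_id = self.node_id_to_summary_id.pop(node_id)
            self.summary_id_to_node_ids[summary_id].remove(node_id)


sketch = SummaryMappingSketch()
sketch.add_summary_and_nodes("summary-1", "doc-1", ["node-1", "node-2"])

sketch.delete_nodes(["node-1"])
remaining = list(sketch.summary_id_to_node_ids["summary-1"])  # copy for inspection

sketch.delete("doc-1")
print(remaining, sketch.summary_ids)
```

Note the difference in scope: delete_nodes leaves the document and summary registered with a shorter node list, while delete clears every entry tied to the document.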

Usage

This data structure is used internally by DocumentSummaryIndex to track which nodes belong to which document summaries. It is persisted as part of index storage and is not typically accessed directly by end users.
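
Because all three fields contain only strings and lists of strings, the structure is straightforward to serialize. The sketch below illustrates that property with a plain dictionary and the standard json module; it assumes nothing about how the library's storage layer actually writes the data, and the IDs are made up.

```python
import json

# Made-up mapping state, shaped like the three fields described above.
state = {
    "summary_id_to_node_ids": {"summary-1": ["node-1", "node-2"]},
    "node_id_to_summary_id": {"node-1": "summary-1", "node-2": "summary-1"},
    "doc_id_to_summary_id": {"doc-1": "summary-1"},
}

# Round-trip through JSON: every key and value is a str or a list of str,
# so the restored structure compares equal to the original.
restored = json.loads(json.dumps(state))
print(restored == state)
```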

Code Reference

Source Location

Repository: Run_llama_Llama_index
File: llama-index-core/llama_index/core/data_structs/document_summary.py
Lines: 1-74

Signature

@dataclass
class IndexDocumentSummary(IndexStruct):
    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)

    def add_summary_and_nodes(
        self,
        summary_node: BaseNode,
        nodes: List[BaseNode],
    ) -> str: ...

    @property
    def summary_ids(self) -> List[str]: ...

    def delete(self, doc_id: str) -> None: ...

    def delete_nodes(self, node_ids: List[str]) -> None: ...

    @classmethod
    def get_type(cls) -> IndexStructType: ...

Import

from llama_index.core.data_structs.document_summary import IndexDocumentSummary

I/O Contract

Inputs

Name	Type	Required	Description
summary_node	BaseNode	Yes (for add_summary_and_nodes)	The summary node representing a document summary. Must have a non-None ref_doc_id.
nodes	List[BaseNode]	Yes (for add_summary_and_nodes)	The content nodes that the summary covers.
doc_id	str	Yes (for delete)	The document ID (ref_doc_id) of the document to remove.
node_ids	List[str]	Yes (for delete_nodes)	Specific node IDs to remove from the index structure.

Outputs

Name	Type	Description
return (add_summary_and_nodes)	str	The summary node ID that was registered.
summary_ids	List[str]	All summary node IDs in the index structure.
get_type()	IndexStructType	Returns IndexStructType.DOCUMENT_SUMMARY.

Usage Examples

Basic Usage

from llama_index.core.data_structs.document_summary import IndexDocumentSummary
from llama_index.core.schema import NodeRelationship, RelatedNodeInfo, TextNode

# Create the index structure
doc_summary = IndexDocumentSummary()

# Create a summary node that points back to its source document.
# ref_doc_id is read from the SOURCE relationship, so it must be set;
# otherwise add_summary_and_nodes raises ValueError.
summary_node = TextNode(
    text="This document discusses LlamaIndex architecture.",
    id_="summary-1",
    relationships={NodeRelationship.SOURCE: RelatedNodeInfo(node_id="doc-1")},
)

# Create content nodes
content_nodes = [
    TextNode(text="LlamaIndex provides data connectors.", id_="node-1"),
    TextNode(text="LlamaIndex supports multiple index types.", id_="node-2"),
]

# Add summary and nodes
summary_id = doc_summary.add_summary_and_nodes(summary_node, content_nodes)
print(f"Summary ID: {summary_id}")
print(f"All summaries: {doc_summary.summary_ids}")

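# Delete the document by its ref_doc_id; this removes all three mappings for it
doc_summary.delete("doc-1")
print(f"All summaries after delete: {doc_summary.summary_ids}")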

Related Pages

Environment:Run_llama_Llama_index_Python_LlamaIndex_Core
