Implementation:Run llama Llama index IndexDocumentSummary

From Leeroopedia
Knowledge Sources
Domains LLM Framework, Data Structures, Document Summary
Last Updated 2026-02-11 19:00 GMT

Overview

IndexDocumentSummary is a data structure class that maintains bidirectional mappings between document summaries and their constituent nodes, serving as the internal storage model for the DocumentSummaryIndex.

Description

The IndexDocumentSummary class extends IndexStruct and maintains three dictionary mappings. Together they support lookups from a summary to its content nodes, from a content node back to its summary, and from a source document to its summary:

  • summary_id_to_node_ids -- Maps each summary node ID to the list of content node IDs it summarizes (Dict[str, List[str]]).
  • node_id_to_summary_id -- Reverse mapping from each content node ID back to its summary node ID (Dict[str, str]).
  • doc_id_to_summary_id -- Maps source document IDs (ref_doc_id) to their summary node IDs (Dict[str, str]).
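As a rough illustration, the state of the three mappings for a single document could look like the following. The IDs (doc-1, summary-1, node-1, node-2) are made up for this sketch, and plain dictionaries stand in for the dataclass fields.

```python
from typing import Dict, List

# Hypothetical IDs: one source document, one summary node, two content nodes.
summary_id_to_node_ids: Dict[str, List[str]] = {"summary-1": ["node-1", "node-2"]}
node_id_to_summary_id: Dict[str, str] = {"node-1": "summary-1", "node-2": "summary-1"}
doc_id_to_summary_id: Dict[str, str] = {"doc-1": "summary-1"}

# Forward lookup: which content nodes does the summary of doc-1 cover?
summary_id = doc_id_to_summary_id["doc-1"]
covered = summary_id_to_node_ids[summary_id]

# Reverse lookup: which summary does node-2 belong to?
owner = node_id_to_summary_id["node-2"]

print(covered)
print(owner)
```

Keeping the reverse mapping alongside the forward one means a node's summary can be found with a single dictionary access instead of a scan over every summary's node list.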

Key methods:

  • add_summary_and_nodes(summary_node, nodes) -- Registers a summary node and its associated content nodes. Extracts the ref_doc_id from the summary node (raises ValueError if it is None). Populates all three mapping dictionaries and returns the summary ID.
  • summary_ids (property) -- Returns a list of all summary node IDs.
  • delete(doc_id) -- Deletes a document and all associated mappings. Removes the doc-to-summary mapping, all node-to-summary mappings for the document's nodes, and the summary-to-node mapping.
  • delete_nodes(node_ids) -- Removes specific nodes from their summary's node list and cleans up the node-to-summary mapping, without deleting the entire document.
  • get_type() -- Returns IndexStructType.DOCUMENT_SUMMARY.
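The bookkeeping described above can be sketched with a small stand-in class. This is not the library's source: MappingSketch is a hypothetical name, and it takes plain string IDs where the real methods take BaseNode objects, so that the example runs without llama-index installed. Behavior for edge cases (for example, an empty node list or an unknown doc_id) is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MappingSketch:
    """Hypothetical stand-in for the three mappings; not the library class."""

    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)

    def add_summary_and_nodes(
        self, summary_id: str, ref_doc_id: Optional[str], node_ids: List[str]
    ) -> str:
        # A summary without a source document cannot be registered.
        if ref_doc_id is None:
            raise ValueError("summary node must have a ref_doc_id")
        self.doc_id_to_summary_id[ref_doc_id] = summary_id
        self.summary_id_to_node_ids.setdefault(summary_id, []).extend(node_ids)
        for node_id in node_ids:
            self.node_id_to_summary_id[node_id] = summary_id
        return summary_id

    def delete_nodes(self, node_ids: List[str]) -> None:
        # Drop individual nodes but keep the document and its summary entry.
        for node_id in node_ids:
            summary_id = self.node_id_to_summary_id.pop(node_id)
            self.summary_id_to_node_ids[summary_id].remove(node_id)

    def delete(self, doc_id: str) -> None:
        # Remove the document, its summary entry, and every remaining node entry.
        summary_id = self.doc_id_to_summary_id.pop(doc_id)
        for node_id in self.summary_id_to_node_ids.pop(summary_id):
            del self.node_id_to_summary_id[node_id]


maps = MappingSketch()
returned_id = maps.add_summary_and_nodes("summary-1", "doc-1", ["node-1", "node-2"])

maps.delete_nodes(["node-1"])
remaining = list(maps.summary_id_to_node_ids["summary-1"])

maps.delete("doc-1")
print(returned_id, remaining, maps)
```

The point of the sketch is the invariant: after every operation, the three dictionaries agree with each other, so no lookup can return an ID that another mapping has already dropped.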

Usage

This data structure is used internally by DocumentSummaryIndex to track which nodes belong to which document summaries. It is persisted as part of index storage and is not typically accessed directly by end users.
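Because all three fields are dictionaries of strings, the structure lends itself to straightforward serialization. The sketch below is an assumption-laden illustration only: MappingState is a hypothetical container mirroring the three fields, and the JSON round trip uses the standard library rather than the storage path the library itself uses, which this page does not describe.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class MappingState:
    """Hypothetical container mirroring the three fields; not the library class."""

    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)


state = MappingState(
    summary_id_to_node_ids={"summary-1": ["node-1", "node-2"]},
    node_id_to_summary_id={"node-1": "summary-1", "node-2": "summary-1"},
    doc_id_to_summary_id={"doc-1": "summary-1"},
)

# Serialize to JSON text and rebuild an equal object from it.
payload = json.dumps(asdict(state))
restored = MappingState(**json.loads(payload))
print(restored == state)
```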

Code Reference

Source Location

  • Repository: Run_llama_Llama_index
  • File: llama-index-core/llama_index/core/data_structs/document_summary.py
  • Lines: 1-74

Signature

@dataclass
class IndexDocumentSummary(IndexStruct):
    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)

    def add_summary_and_nodes(
        self,
        summary_node: BaseNode,
        nodes: List[BaseNode],
    ) -> str: ...

    @property
    def summary_ids(self) -> List[str]: ...

    def delete(self, doc_id: str) -> None: ...

    def delete_nodes(self, node_ids: List[str]) -> None: ...

    @classmethod
    def get_type(cls) -> IndexStructType: ...

Import

from llama_index.core.data_structs.document_summary import IndexDocumentSummary

I/O Contract

Inputs

Name | Type | Required | Description
summary_node | BaseNode | Yes (for add_summary_and_nodes) | The summary node representing a document summary. Must have a non-None ref_doc_id.
nodes | List[BaseNode] | Yes (for add_summary_and_nodes) | The content nodes that the summary covers.
doc_id | str | Yes (for delete) | The document ID (ref_doc_id) of the document to remove.
node_ids | List[str] | Yes (for delete_nodes) | Specific node IDs to remove from the index structure.

Outputs

Name | Type | Description
return (add_summary_and_nodes) | str | The summary node ID that was registered.
summary_ids | List[str] | All summary node IDs in the index structure.
get_type() | IndexStructType | Returns IndexStructType.DOCUMENT_SUMMARY.

Usage Examples

Basic Usage

from llama_index.core.data_structs.document_summary import IndexDocumentSummary
from llama_index.core.schema import NodeRelationship, RelatedNodeInfo, TextNode

# Create the index structure
doc_summary = IndexDocumentSummary()

# Create a summary node with a reference to its source document
summary_node = TextNode(
    text="This document discusses LlamaIndex architecture.",
    id_="summary-1",
)
summary_node.relationships = {
    # ref_doc_id is read from the SOURCE relationship, which must be a RelatedNodeInfo
    NodeRelationship.SOURCE: RelatedNodeInfo(node_id="doc-1"),
}

# Create content nodes
content_nodes = [
    TextNode(text="LlamaIndex provides data connectors.", id_="node-1"),
    TextNode(text="LlamaIndex supports multiple index types.", id_="node-2"),
]

# Add summary and nodes
summary_id = doc_summary.add_summary_and_nodes(summary_node, content_nodes)
print(f"Summary ID: {summary_id}")
print(f"All summaries: {doc_summary.summary_ids}")

# Delete a document
doc_summary.delete("doc-1")
