Implementation:Run llama Llama index IndexDocumentSummary
| Knowledge Sources | |
|---|---|
| Domains | LLM Framework, Data Structures, Document Summary |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
IndexDocumentSummary is a data structure class that maintains bidirectional mappings between document summaries and their constituent nodes, serving as the internal storage model for the DocumentSummaryIndex.
Description
The IndexDocumentSummary class extends IndexStruct and maintains three dictionaries. Together they support lookup from a summary to its content nodes, from a content node back to its summary, and from a source document to its summary:
- summary_id_to_node_ids -- Maps each summary node ID to the list of content node IDs it summarizes (Dict[str, List[str]]).
- node_id_to_summary_id -- Reverse mapping from each content node ID back to its summary node ID (Dict[str, str]).
- doc_id_to_summary_id -- Maps source document IDs (ref_doc_id) to their summary node IDs (Dict[str, str]).
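To make the three mappings concrete, the sketch below shows the state they would hold after one summary is registered for a two-node document. It uses plain dictionaries and hypothetical IDs (summary-1, node-1, node-2, doc-1) rather than the library class, so it only illustrates the shape of the data.

```python
from typing import Dict, List

# Hypothetical IDs, chosen for illustration only.
summary_id_to_node_ids: Dict[str, List[str]] = {"summary-1": ["node-1", "node-2"]}
node_id_to_summary_id: Dict[str, str] = {"node-1": "summary-1", "node-2": "summary-1"}
doc_id_to_summary_id: Dict[str, str] = {"doc-1": "summary-1"}

# Forward lookup: source document -> summary -> content nodes.
summary_id = doc_id_to_summary_id["doc-1"]
node_ids = summary_id_to_node_ids[summary_id]

# Reverse lookup: content node -> summary.
owner = node_id_to_summary_id["node-2"]

print(node_ids)
print(owner)
```

Each lookup is a single dictionary access, which is why the reverse mapping is stored explicitly instead of being derived by scanning the forward one.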
Key methods:
- add_summary_and_nodes(summary_node, nodes) -- Registers a summary node and its associated content nodes. Extracts the ref_doc_id from the summary node (raises ValueError if it is None). Populates all three mapping dictionaries and returns the summary ID.
- summary_ids (property) -- Returns a list of all summary node IDs.
- delete(doc_id) -- Deletes a document and all associated mappings. Removes the doc-to-summary mapping, all node-to-summary mappings for the document's nodes, and the summary-to-node mapping.
- delete_nodes(node_ids) -- Removes specific nodes from their summary's node list and cleans up the node-to-summary mapping, without deleting the entire document.
- get_type() -- Returns IndexStructType.DOCUMENT_SUMMARY.
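The bookkeeping described by these methods can be sketched with a simplified stand-in. The class name SummaryMappingSketch and its add method (which takes plain ID strings instead of BaseNode objects) are assumptions made for illustration; this is not the library's implementation, only a model of the behavior described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SummaryMappingSketch:
    """Simplified stand-in for illustration; not the library class."""

    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)

    def add(self, summary_id: str, ref_doc_id: Optional[str], node_ids: List[str]) -> str:
        # Mirrors the documented rule: a summary must reference a source document.
        if ref_doc_id is None:
            raise ValueError("ref_doc_id cannot be None")
        self.doc_id_to_summary_id[ref_doc_id] = summary_id
        for node_id in node_ids:
            self.summary_id_to_node_ids.setdefault(summary_id, []).append(node_id)
            self.node_id_to_summary_id[node_id] = summary_id
        return summary_id

    @property
    def summary_ids(self) -> List[str]:
        return list(self.summary_id_to_node_ids.keys())

    def delete(self, doc_id: str) -> None:
        # Remove the document entry, then every mapping that hangs off its summary.
        summary_id = self.doc_id_to_summary_id.pop(doc_id)
        for node_id in self.summary_id_to_node_ids.pop(summary_id):
            del self.node_id_to_summary_id[node_id]

    def delete_nodes(self, node_ids: List[str]) -> None:
        # Remove individual nodes; the summary and document entries stay.
        for node_id in node_ids:
            summary_id = self.node_id_to_summary_id.pop(node_id)
            self.summary_id_to_node_ids[summary_id].remove(node_id)


sketch = SummaryMappingSketch()
sketch.add("summary-1", "doc-1", ["node-1", "node-2"])

sketch.delete_nodes(["node-1"])
remaining = list(sketch.summary_id_to_node_ids["summary-1"])

sketch.delete("doc-1")
print(remaining, sketch.summary_ids)
```

In this sketch, delete_nodes leaves the document-to-summary entry in place, while delete clears all three dictionaries for that document.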
Usage
This data structure is used internally by DocumentSummaryIndex to track which nodes belong to which document summaries. It is persisted as part of index storage and is not typically accessed directly by end users.
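Because all three fields are dictionaries of strings and string lists, the structure maps directly onto JSON. The sketch below illustrates that with a hypothetical stand-in dataclass (SummaryMappingState) and the standard library only; it does not show the serialization path the library itself uses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class SummaryMappingState:
    """Hypothetical stand-in holding the same three fields; not the library class."""

    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)


state = SummaryMappingState(
    summary_id_to_node_ids={"summary-1": ["node-1", "node-2"]},
    node_id_to_summary_id={"node-1": "summary-1", "node-2": "summary-1"},
    doc_id_to_summary_id={"doc-1": "summary-1"},
)

# Round-trip through JSON text and rebuild an equal object.
payload = json.dumps(asdict(state))
restored = SummaryMappingState(**json.loads(payload))
print(restored == state)
```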
Code Reference
Source Location
- Repository: Run_llama_Llama_index
- File: llama-index-core/llama_index/core/data_structs/document_summary.py
- Lines: 1-74
Signature
@dataclass
class IndexDocumentSummary(IndexStruct):
    summary_id_to_node_ids: Dict[str, List[str]] = field(default_factory=dict)
    node_id_to_summary_id: Dict[str, str] = field(default_factory=dict)
    doc_id_to_summary_id: Dict[str, str] = field(default_factory=dict)

    def add_summary_and_nodes(
        self,
        summary_node: BaseNode,
        nodes: List[BaseNode],
    ) -> str: ...

    @property
    def summary_ids(self) -> List[str]: ...

    def delete(self, doc_id: str) -> None: ...

    def delete_nodes(self, node_ids: List[str]) -> None: ...

    @classmethod
    def get_type(cls) -> IndexStructType: ...
Import
from llama_index.core.data_structs.document_summary import IndexDocumentSummary
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| summary_node | BaseNode | Yes (for add_summary_and_nodes) | The summary node representing a document summary. Must have a non-None ref_doc_id. |
| nodes | List[BaseNode] | Yes (for add_summary_and_nodes) | The content nodes that the summary covers. |
| doc_id | str | Yes (for delete) | The document ID (ref_doc_id) of the document to remove. |
| node_ids | List[str] | Yes (for delete_nodes) | Specific node IDs to remove from the index structure. |
Outputs
| Name | Type | Description |
|---|---|---|
| return (add_summary_and_nodes) | str | The summary node ID that was registered. |
| summary_ids | List[str] | All summary node IDs in the index structure. |
| get_type() | IndexStructType | Returns IndexStructType.DOCUMENT_SUMMARY. |
Usage Examples
Basic Usage
from llama_index.core.data_structs.document_summary import IndexDocumentSummary
from llama_index.core.schema import NodeRelationship, RelatedNodeInfo, TextNode

# Create the index structure
doc_summary = IndexDocumentSummary()

# Create a summary node that points at its source document.
# ref_doc_id is derived from the SOURCE relationship.
summary_node = TextNode(
    text="This document discusses LlamaIndex architecture.",
    id_="summary-1",
    relationships={NodeRelationship.SOURCE: RelatedNodeInfo(node_id="doc-1")},
)

# Create content nodes
content_nodes = [
    TextNode(text="LlamaIndex provides data connectors.", id_="node-1"),
    TextNode(text="LlamaIndex supports multiple index types.", id_="node-2"),
]

# Add summary and nodes
summary_id = doc_summary.add_summary_and_nodes(summary_node, content_nodes)
print(f"Summary ID: {summary_id}")
print(f"All summaries: {doc_summary.summary_ids}")

# Delete the document and all of its mappings
doc_summary.delete("doc-1")