Implementation:Run llama Llama index Index Data Structs
| Knowledge Sources | |
|---|---|
| Domains | LLM Framework, Data Structures, Indexing |
| Last Updated | 2026-02-11 19:00 GMT |
Overview
This module defines the core index data structures used internally by LlamaIndex to represent and manage the underlying storage layout of different index types (tree, list, keyword table, vector store, knowledge graph, etc.).
Description
All data structures in this module are Python dataclasses. The base IndexStruct class inherits from DataClassJsonMixin (for JSON serialization), and every other struct derives from IndexStruct. They represent the internal state of an index, not the user-facing index API.
Base class:
- IndexStruct -- Abstract base with an auto-generated index_id (UUID) and an optional summary field. Declares an abstract get_type() class method returning an IndexStructType enum value.
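The base-class contract can be illustrated with the standard library alone. This is a minimal sketch, not the library's code: `IndexStructSketch` and `EmptyStructSketch` are hypothetical stand-ins, `DataClassJsonMixin` is replaced by `ABC`, and a plain string stands in for the `IndexStructType` enum.

```python
import uuid
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IndexStructSketch(ABC):
    """Hypothetical stand-in for IndexStruct (no JSON mixin)."""

    # Each instance gets its own UUID string unless one is passed in.
    index_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    summary: Optional[str] = None

    def get_summary(self) -> str:
        # An unset summary is treated as an error rather than returning None.
        if self.summary is None:
            raise ValueError("summary is not set")
        return self.summary

    @classmethod
    @abstractmethod
    def get_type(cls) -> str:
        """Subclasses return an identifier for their index type."""


@dataclass
class EmptyStructSketch(IndexStructSketch):
    @classmethod
    def get_type(cls) -> str:
        # Assumed label; the real struct returns IndexStructType.EMPTY.
        return "empty"


a = EmptyStructSketch()
b = EmptyStructSketch(summary="two documents about llamas")
```

Because `get_type` is abstract, the base sketch cannot be instantiated directly; only concrete subclasses such as `EmptyStructSketch` can.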
Tree index:
- IndexGraph -- Represents a tree-structured index with three dictionaries:
  - all_nodes -- Maps integer positions to node document IDs.
  - root_nodes -- Maps integer positions to root node IDs.
  - node_id_to_children_ids -- Maps each node ID to its list of child node IDs.
  - Provides insert, insert_under_parent, get_children, and get_index methods, plus a size property, for tree manipulation.
  - Returns IndexStructType.TREE.
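How the three dictionaries cooperate can be sketched as follows. `TreeSketch` is a hypothetical stand-in that takes plain node-ID strings where the real methods take node objects, and it assumes each new node is placed at the next free integer position.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TreeSketch:
    """Hypothetical stand-in for IndexGraph, keyed by plain node-ID strings."""

    all_nodes: Dict[int, str] = field(default_factory=dict)
    root_nodes: Dict[int, str] = field(default_factory=dict)
    node_id_to_children_ids: Dict[str, List[str]] = field(default_factory=dict)

    @property
    def size(self) -> int:
        return len(self.all_nodes)

    def insert_under_parent(self, node_id: str, parent_id: Optional[str]) -> None:
        position = self.size  # assumed: next free integer position
        if parent_id is None:
            # No parent: register the node as a root with no children yet.
            self.root_nodes[position] = node_id
            self.node_id_to_children_ids[node_id] = []
        else:
            self.node_id_to_children_ids.setdefault(parent_id, []).append(node_id)
        self.all_nodes[position] = node_id

    def get_children(self, parent_id: Optional[str]) -> Dict[int, str]:
        if parent_id is None:
            return self.root_nodes
        # Invert all_nodes to recover each child's integer position.
        position_of = {nid: pos for pos, nid in self.all_nodes.items()}
        return {position_of[c]: c for c in self.node_id_to_children_ids[parent_id]}


tree = TreeSketch()
tree.insert_under_parent("root", None)
tree.insert_under_parent("child-1", "root")
tree.insert_under_parent("child-2", "root")
```

In this sketch, passing `None` as the parent to `get_children` returns the root mapping, so a traversal can start from `get_children(None)` and descend level by level.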
Keyword table:
- KeywordTable -- Maps keywords (strings) to sets of node IDs. Provides an add_node method and the node_ids, keywords, and size properties. Returns IndexStructType.KEYWORD_TABLE.
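The keyword-to-node-ID mapping can be sketched like this. `KeywordTableSketch` and its `table` field name are assumptions made for illustration, node IDs are passed as plain strings, and `size` is assumed here to count distinct keywords rather than nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class KeywordTableSketch:
    """Hypothetical stand-in for KeywordTable."""

    table: Dict[str, Set[str]] = field(default_factory=dict)

    def add_node(self, keywords: List[str], node_id: str) -> None:
        # Register the node under every keyword, creating sets on demand.
        for keyword in keywords:
            self.table.setdefault(keyword, set()).add(node_id)

    @property
    def keywords(self) -> Set[str]:
        return set(self.table)

    @property
    def node_ids(self) -> Set[str]:
        # Union of all per-keyword sets; empty set if the table is empty.
        return set().union(*self.table.values())

    @property
    def size(self) -> int:
        return len(self.table)  # assumed: number of distinct keywords


kw = KeywordTableSketch()
kw.add_node(["llama", "framework"], "n1")
kw.add_node(["llama"], "n2")
```

Using sets as values means adding the same node under the same keyword twice has no effect.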
List index:
- IndexList -- A simple ordered list of node IDs. Provides add_node to append. Returns IndexStructType.LIST.
LPG (Labeled Property Graph) index:
- IndexLPG -- A minimal struct for the labeled property graph index. The add_node method is a no-op since the LPG index stores data externally. Returns IndexStructType.SIMPLE_LPG.
Vector store indices:
- IndexDict -- Maps vector store IDs to node document IDs. Provides add_node (returns the vector ID) and delete methods. Also maintains legacy doc_id_dict and embeddings_dict fields (deprecated). Returns IndexStructType.VECTOR_STORE.
- MultiModelIndexDict -- Extends IndexDict for multimodal vector stores. Returns IndexStructType.MULTIMODAL_VECTOR_STORE.
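The vector-ID bookkeeping described for IndexDict can be sketched as below. `VectorDictSketch` and the `nodes_dict` field name are assumptions for illustration; the deprecated fields are left out, and node IDs are plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VectorDictSketch:
    """Hypothetical stand-in for IndexDict without the deprecated fields."""

    # vector store ID -> node ID
    nodes_dict: Dict[str, str] = field(default_factory=dict)

    def add_node(self, node_id: str, text_id: Optional[str] = None) -> str:
        # A caller-supplied text_id wins; otherwise the node ID doubles as the vector ID.
        vector_id = text_id if text_id is not None else node_id
        self.nodes_dict[vector_id] = node_id
        return vector_id

    def delete(self, vector_id: str) -> None:
        del self.nodes_dict[vector_id]


store = VectorDictSketch()
default_id = store.add_node("node-a")
custom_id = store.add_node("node-b", text_id="vec-42")
store.delete("node-a")
```

Returning the vector ID from `add_node` lets the caller use the same key when writing the embedding to the external vector store.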
Knowledge graph:
- KG -- Stores a keyword-to-node-ID table, a relation map (rel_map, legacy), and an embedding dictionary for triplets. Provides add_node, add_to_embedding_dict, and search_node_by_keyword methods and a node_ids property. Returns IndexStructType.KG.
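A rough sketch of the keyword lookup and triplet-embedding storage follows. `KGSketch`, its field names, and the triplet string format are assumptions for illustration; the legacy rel_map is omitted, and the sketch assumes an unknown keyword yields an empty list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class KGSketch:
    """Hypothetical stand-in for KG; the legacy rel_map is left out."""

    table: Dict[str, Set[str]] = field(default_factory=dict)
    embedding_dict: Dict[str, List[float]] = field(default_factory=dict)

    def add_node(self, keywords: List[str], node_id: str) -> None:
        for keyword in keywords:
            self.table.setdefault(keyword, set()).add(node_id)

    def add_to_embedding_dict(self, triplet_str: str, embedding: List[float]) -> None:
        # Embeddings are keyed by a string form of the triplet.
        self.embedding_dict[triplet_str] = embedding

    def search_node_by_keyword(self, keyword: str) -> List[str]:
        # Sorted only to make the sketch's output deterministic.
        return sorted(self.table.get(keyword, set()))


kg = KGSketch()
kg.add_node(["llama", "andes"], "n1")
kg.add_node(["llama"], "n2")
kg.add_to_embedding_dict("(llama, lives_in, andes)", [0.1, 0.2, 0.3])
```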
Empty index:
- EmptyIndexStruct -- A placeholder struct with no data. Returns IndexStructType.EMPTY.
Legacy alias:
- Node is aliased to TextNode for backward compatibility.
Usage
These data structures are used internally by the index implementations (e.g., TreeIndex, VectorStoreIndex, KeywordTableIndex) to track index state. They are serialized as part of index persistence. Users generally do not interact with these directly unless building custom index implementations.
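The persistence round trip can be illustrated with the standard library. This sketch swaps in `dataclasses.asdict` and `json` for `DataClassJsonMixin`, and `ListStructSketch` (with its `nodes` field name) is a hypothetical stand-in for a list-style struct, so it shows the idea rather than the library's exact serialization path.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ListStructSketch:
    """Hypothetical stand-in for a list-style index struct."""

    index_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    summary: Optional[str] = None
    nodes: List[str] = field(default_factory=list)


original = ListStructSketch(nodes=["n1", "n2"])

# Serialize to a JSON string, as would happen when an index is persisted.
payload = json.dumps(asdict(original))

# Rebuild the struct from the stored JSON; index_id survives the round trip.
restored = ListStructSketch(**json.loads(payload))
```

Only node IDs are stored in the struct, so the serialized payload stays small; the node contents themselves live elsewhere.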
Code Reference
Source Location
- Repository: Run_llama_Llama_index
- File: llama-index-core/llama_index/core/data_structs/data_structs.py
- Lines: 1-279
Signature
@dataclass
class IndexStruct(DataClassJsonMixin):
    index_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    summary: Optional[str] = None

    def get_summary(self) -> str: ...

    @classmethod
    @abstractmethod
    def get_type(cls) -> IndexStructType: ...

@dataclass
class IndexGraph(IndexStruct): ...

@dataclass
class KeywordTable(IndexStruct): ...

@dataclass
class IndexList(IndexStruct): ...

@dataclass
class IndexLPG(IndexStruct): ...

@dataclass
class IndexDict(IndexStruct): ...

@dataclass
class MultiModelIndexDict(IndexDict): ...

@dataclass
class KG(IndexStruct): ...

@dataclass
class EmptyIndexStruct(IndexStruct): ...
Import
from llama_index.core.data_structs.data_structs import (
    IndexStruct,
    IndexGraph,
    KeywordTable,
    IndexList,
    IndexLPG,
    IndexDict,
    MultiModelIndexDict,
    KG,
    EmptyIndexStruct,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| index_id | str | No | Unique identifier for the index struct. Auto-generated UUID if not provided. |
| summary | Optional[str] | No | Optional text summary of the index contents. |
| node | BaseNode | Yes (for add_node methods) | The node to add to the index structure. |
| keywords | List[str] | Yes (for KeywordTable/KG add_node) | Keywords associated with the node being added. |
| text_id | Optional[str] | No (for IndexDict.add_node) | Custom vector store ID; defaults to node's node_id. |
Outputs
| Name | Type | Description |
|---|---|---|
| get_type() | IndexStructType | The enum value identifying the type of index structure. |
| get_summary() | str | The summary text of the index. Raises ValueError if summary is not set. |
| IndexDict.add_node() | str | Returns the vector ID assigned to the added node. |
| size | int | The number of entries in the data structure (available on IndexGraph, KeywordTable). |
Usage Examples
Basic Usage
from llama_index.core.data_structs.data_structs import IndexDict, KeywordTable
from llama_index.core.schema import TextNode
# Vector store index structure
index_dict = IndexDict()
node = TextNode(text="LlamaIndex is a data framework.")
vector_id = index_dict.add_node(node)
print(f"Vector ID: {vector_id}")
print(f"Type: {IndexDict.get_type()}") # IndexStructType.VECTOR_STORE
# Keyword table index structure
kw_table = KeywordTable()
kw_table.add_node(["llama", "framework"], node)
print(f"Keywords: {kw_table.keywords}")
print(f"Size: {kw_table.size}")
Tree Index Structure
from llama_index.core.data_structs.data_structs import IndexGraph
from llama_index.core.schema import TextNode
graph = IndexGraph()
root = TextNode(text="Root summary")
child1 = TextNode(text="Child 1 content")
child2 = TextNode(text="Child 2 content")
graph.insert_under_parent(root, parent_node=None)
graph.insert_under_parent(child1, parent_node=root)
graph.insert_under_parent(child2, parent_node=root)
print(f"Graph size: {graph.size}")
children = graph.get_children(root)
print(f"Root has {len(children)} children")