Implementation:Run llama Llama index Index Data Structs

From Leeroopedia
Knowledge Sources
Domains LLM Framework, Data Structures, Indexing
Last Updated 2026-02-11 19:00 GMT

Overview

This module defines the core index data structures used internally by LlamaIndex to represent and manage the underlying storage layout of different index types (tree, list, keyword table, vector store, knowledge graph, etc.).

Description

All data structures in this module are Python dataclasses. The base IndexStruct class inherits from DataClassJsonMixin (for JSON serialization), and every other struct inherits from IndexStruct, either directly or through another struct. They represent the internal state of an index, not the user-facing index API.

Base class:

  • IndexStruct -- Abstract base with an auto-generated index_id (UUID) and an optional summary field. Declares an abstract get_type() class method returning an IndexStructType enum value.
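
The base-class pattern can be illustrated with a standard-library stand-in. This is a minimal sketch, not LlamaIndex code: SketchStruct is a hypothetical name, and only the two fields and the get_summary() behaviour described on this page are modelled.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchStruct:
    """Hypothetical stand-in for the base struct; not LlamaIndex code."""

    # A fresh UUID string is generated for every instance that omits index_id.
    index_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    summary: Optional[str] = None

    def get_summary(self) -> str:
        # Fail loudly rather than return None when no summary was stored.
        if self.summary is None:
            raise ValueError("summary is not set")
        return self.summary


a = SketchStruct()
b = SketchStruct(summary="Notes on indexing")

print(a.index_id != b.index_id)  # True: each instance gets its own ID
print(b.get_summary())           # Notes on indexing
```

Using default_factory rather than a plain default matters here: a plain default would be evaluated once and shared by every instance.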

Tree index:

  • IndexGraph -- Represents a tree-structured index with three dictionaries:
    • all_nodes -- Maps integer positions to node document IDs.
    • root_nodes -- Maps integer positions to root node IDs.
    • node_id_to_children_ids -- Maps each node ID to its list of child node IDs.
    • Provides insert, insert_under_parent, get_children, and get_index methods, plus a size property, for tree manipulation.
    • Returns IndexStructType.TREE.
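
To make the layout of the three dictionaries concrete, the sketch below rebuilds them with plain Python dicts. It is a stand-in under stated assumptions, not the library's implementation: the helper is a simplified illustration, the node IDs are hypothetical strings, and positions are assumed to be assigned in insertion order starting at 0.

```python
from typing import Dict, List, Optional

# Stand-in state mirroring the three dictionaries described above.
all_nodes: Dict[int, str] = {}
root_nodes: Dict[int, str] = {}
node_id_to_children_ids: Dict[str, List[str]] = {}


def insert_under_parent(node_id: str, parent_id: Optional[str]) -> None:
    """Simplified illustration: record node_id at the next free position."""
    position = len(all_nodes)  # assumption: next position equals current size
    if parent_id is None:
        # A node without a parent is a root and starts with no children.
        root_nodes[position] = node_id
        node_id_to_children_ids[node_id] = []
    else:
        node_id_to_children_ids.setdefault(parent_id, []).append(node_id)
    all_nodes[position] = node_id


insert_under_parent("root", None)
insert_under_parent("child-1", "root")
insert_under_parent("child-2", "root")

print(all_nodes)                # {0: 'root', 1: 'child-1', 2: 'child-2'}
print(root_nodes)               # {0: 'root'}
print(node_id_to_children_ids)  # {'root': ['child-1', 'child-2']}
```

Every node appears in all_nodes, only parentless nodes appear in root_nodes, and the parent-child edges live solely in node_id_to_children_ids.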

Keyword table:

  • KeywordTable -- Maps keywords (strings) to sets of node IDs. Provides an add_node method and node_ids, keywords, and size properties. Returns IndexStructType.KEYWORD_TABLE.
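
The keyword-to-node mapping can be pictured as a dictionary of sets. The following is a standard-library sketch rather than the library's code; the node IDs are hypothetical, and it assumes that size counts distinct keywords rather than nodes.

```python
from typing import Dict, List, Set

# keyword -> set of node IDs that mention it
table: Dict[str, Set[str]] = {}


def add_node(keywords: List[str], node_id: str) -> None:
    """Register node_id under every keyword, creating buckets on demand."""
    for keyword in keywords:
        table.setdefault(keyword, set()).add(node_id)


add_node(["llama", "framework"], "node-1")
add_node(["llama", "index"], "node-2")

keywords = set(table)                    # all distinct keywords
node_ids = set().union(*table.values())  # every node ID in any bucket
size = len(table)                        # assumption: number of keywords

print(sorted(keywords))  # ['framework', 'index', 'llama']
print(sorted(node_ids))  # ['node-1', 'node-2']
print(size)              # 3
```

Because the buckets are sets, adding the same node under the same keyword twice leaves the table unchanged.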

List index:

  • IndexList -- A simple ordered list of node IDs. Provides add_node to append. Returns IndexStructType.LIST.

LPG (Labeled Property Graph) index:

  • IndexLPG -- A minimal struct for the labeled property graph index. The add_node method is a no-op since the LPG index stores data externally. Returns IndexStructType.SIMPLE_LPG.

Vector store indices:

  • IndexDict -- Maps vector store IDs to node document IDs. Provides add_node (returns the vector ID) and delete methods. Also maintains legacy doc_id_dict and embeddings_dict fields (deprecated). Returns IndexStructType.VECTOR_STORE.
  • MultiModelIndexDict -- Extends IndexDict for multimodal vector stores. Returns IndexStructType.MULTIMODAL_VECTOR_STORE.
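
The way a vector ID is chosen when a node is added can be sketched as follows. This is a simplified stand-in, not LlamaIndex code: the dictionary name nodes_dict and the string IDs are assumptions for illustration, and the fallback to the node's own ID follows the text_id description in the I/O contract on this page.

```python
from typing import Dict, Optional

# vector store ID -> node ID (the name nodes_dict is an assumption)
nodes_dict: Dict[str, str] = {}


def add_node(node_id: str, text_id: Optional[str] = None) -> str:
    """Store the node and return the vector ID it was filed under."""
    # An explicit text_id wins; otherwise the node's own ID is reused.
    vector_id = text_id if text_id is not None else node_id
    nodes_dict[vector_id] = node_id
    return vector_id


def delete(vector_id: str) -> None:
    """Remove one entry; raises KeyError if the ID is unknown."""
    del nodes_dict[vector_id]


first = add_node("node-1")
second = add_node("node-2", text_id="vec-42")
delete(first)

print(first)       # node-1
print(second)      # vec-42
print(nodes_dict)  # {'vec-42': 'node-2'}
```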

Knowledge graph:

  • KG -- Stores a keyword-to-node-ID table, a relation map (rel_map, legacy), and an embedding dictionary for triplets. Provides add_node, add_to_embedding_dict, and search_node_by_keyword methods and a node_ids property. Returns IndexStructType.KG.
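
A rough standard-library sketch of the keyword lookup and the embedding dictionary is shown below. It is not the library's implementation: the triplet string format, the node IDs, and the behaviour of returning an empty list for an unknown keyword are assumptions made for illustration.

```python
from typing import Dict, List, Set

table: Dict[str, Set[str]] = {}              # keyword -> node IDs
embedding_dict: Dict[str, List[float]] = {}  # triplet string -> embedding


def add_node(keywords: List[str], node_id: str) -> None:
    """File node_id under each keyword."""
    for keyword in keywords:
        table.setdefault(keyword, set()).add(node_id)


def search_node_by_keyword(keyword: str) -> List[str]:
    """Return matching node IDs; assumed to be empty for unknown keywords."""
    return list(table.get(keyword, set()))


add_node(["Alice", "Bob"], "node-1")

# Hypothetical triplet key; the real string format is not specified here.
embedding_dict[str(("Alice", "knows", "Bob"))] = [0.1, 0.2, 0.3]

print(search_node_by_keyword("Alice"))  # ['node-1']
print(search_node_by_keyword("Carol"))  # []
```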

Empty index:

  • EmptyIndexStruct -- A placeholder struct with no data. Returns IndexStructType.EMPTY.

Legacy alias:

  • Node is aliased to TextNode for backward compatibility.

Usage

These data structures are used internally by the index implementations (e.g., TreeIndex, VectorStoreIndex, KeywordTableIndex) to track index state. They are serialized as part of index persistence. Users generally do not interact with these directly unless building custom index implementations.
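
The persistence round trip can be sketched with the standard library alone. In the real classes the conversion comes from DataClassJsonMixin; the stand-in below uses dataclasses.asdict and json instead, and the class name, field names, and IDs are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class SketchIndexDict:
    """Hypothetical stand-in for a vector store struct; not LlamaIndex code."""

    index_id: str
    nodes_dict: Dict[str, str] = field(default_factory=dict)


original = SketchIndexDict(index_id="idx-1", nodes_dict={"vec-42": "node-2"})

# Persist: dataclass -> plain dict -> JSON text.
payload = json.dumps(asdict(original))

# Restore: JSON text -> plain dict -> dataclass.
restored = SketchIndexDict(**json.loads(payload))

print(restored == original)  # True
```

Only the struct's own state (IDs and mappings) travels through this round trip; node contents are stored elsewhere.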

Code Reference

Source Location

  • Repository: Run_llama_Llama_index
  • File: llama-index-core/llama_index/core/data_structs/data_structs.py
  • Lines: 1-279

Signature

@dataclass
class IndexStruct(DataClassJsonMixin):
    index_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    summary: Optional[str] = None
    def get_summary(self) -> str: ...
    @classmethod
    @abstractmethod
    def get_type(cls) -> IndexStructType: ...

@dataclass
class IndexGraph(IndexStruct): ...

@dataclass
class KeywordTable(IndexStruct): ...

@dataclass
class IndexList(IndexStruct): ...

@dataclass
class IndexLPG(IndexStruct): ...

@dataclass
class IndexDict(IndexStruct): ...

@dataclass
class MultiModelIndexDict(IndexDict): ...

@dataclass
class KG(IndexStruct): ...

@dataclass
class EmptyIndexStruct(IndexStruct): ...

Import

from llama_index.core.data_structs.data_structs import (
    IndexStruct,
    IndexGraph,
    KeywordTable,
    IndexList,
    IndexLPG,
    IndexDict,
    MultiModelIndexDict,
    KG,
    EmptyIndexStruct,
)

I/O Contract

Inputs

Name | Type | Required | Description
index_id | str | No | Unique identifier for the index struct. Auto-generated UUID if not provided.
summary | Optional[str] | No | Optional text summary of the index contents.
node | BaseNode | Yes (for add_node methods) | The node to add to the index structure.
keywords | List[str] | Yes (for KeywordTable/KG add_node) | Keywords associated with the node being added.
text_id | Optional[str] | No (for IndexDict.add_node) | Custom vector store ID; defaults to the node's node_id.

Outputs

Name | Type | Description
get_type() | IndexStructType | The enum value identifying the type of index structure.
get_summary() | str | The summary text of the index. Raises ValueError if no summary is set.
IndexDict.add_node() | str | The vector ID assigned to the added node.
size | int | Property giving the number of entries in the data structure (available on IndexGraph and KeywordTable).

Usage Examples

Basic Usage

from llama_index.core.data_structs.data_structs import IndexDict, KeywordTable
from llama_index.core.schema import TextNode

# Vector store index structure
index_dict = IndexDict()
node = TextNode(text="LlamaIndex is a data framework.")
vector_id = index_dict.add_node(node)
print(f"Vector ID: {vector_id}")
print(f"Type: {IndexDict.get_type()}")  # the IndexStructType.VECTOR_STORE member; its printed form depends on the Python version

# Keyword table index structure
kw_table = KeywordTable()
kw_table.add_node(["llama", "framework"], node)
print(f"Keywords: {kw_table.keywords}")
print(f"Size: {kw_table.size}")

Tree Index Structure

from llama_index.core.data_structs.data_structs import IndexGraph
from llama_index.core.schema import TextNode

graph = IndexGraph()
root = TextNode(text="Root summary")
child1 = TextNode(text="Child 1 content")
child2 = TextNode(text="Child 2 content")

graph.insert_under_parent(root, parent_node=None)
graph.insert_under_parent(child1, parent_node=root)
graph.insert_under_parent(child2, parent_node=root)

print(f"Graph size: {graph.size}")
children = graph.get_children(root)
print(f"Root has {len(children)} children")
