Principle:Run llama Llama index Index Persistence
| Knowledge Sources | |
|---|---|
| Domains | RAG, Data_Management |
| Last Updated | 2026-02-11 00:00 GMT |
Overview
A persistence mechanism that serializes and deserializes index state (document stores, vector stores, index metadata, and graph stores) so that embeddings do not have to be recomputed and indices do not have to be rebuilt from scratch.
Description
Index persistence addresses a critical operational concern in RAG systems: embedding computation is expensive (both in time and API cost), and rebuilding an index from raw documents on every application restart is wasteful. The StorageContext serves as the central coordinator for all storage backends, managing the lifecycle of four core stores (a property graph store, listed in the file table under Theoretical Basis, is persisted alongside them):
- Document Store (docstore): Stores the original document nodes and their metadata, enabling document-level operations like deletion and updates
- Index Store (index_store): Stores the structural metadata of indices (e.g., which nodes belong to which index, index type, configuration)
- Vector Store(s) (vector_stores): Stores embedding vectors and supports similarity search; can be in-memory or backed by external databases (Pinecone, Chroma, Weaviate, etc.)
- Graph Store (graph_store): Stores knowledge graph triples for graph-based index types
The persistence model follows a pluggable backend architecture. Each store has a default in-memory implementation that can be swapped for a persistent backend. StorageContext.persist() serializes the in-memory stores to disk, while StorageContext.from_defaults(persist_dir=...) reloads them. For external vector stores (e.g., Pinecone), the external service handles persistence, and the StorageContext simply maintains the connection.
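The persist-and-reload cycle described above can be sketched with the standard library alone. This is a toy model, not the LlamaIndex implementation: `InMemoryStore`, `ToyStorageContext`, and `round_trip` are hypothetical names chosen for illustration, and the one-JSON-file-per-store layout is assumed from the description in this page.

```python
import json
import tempfile
from pathlib import Path


class InMemoryStore:
    """Toy key-value store standing in for one backend (hypothetical)."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def persist(self, path: Path) -> None:
        # Serialize the whole store to a single JSON file.
        path.write_text(json.dumps(self.data))

    @classmethod
    def from_persist_path(cls, path: Path) -> "InMemoryStore":
        return cls(json.loads(path.read_text()))


class ToyStorageContext:
    """Coordinates several named stores; NOT the LlamaIndex StorageContext."""

    def __init__(self, stores):
        self.stores = stores  # maps file name -> store

    def persist(self, persist_dir: str) -> None:
        root = Path(persist_dir)
        root.mkdir(parents=True, exist_ok=True)
        for fname, store in self.stores.items():
            store.persist(root / fname)

    @classmethod
    def from_persist_dir(cls, persist_dir: str) -> "ToyStorageContext":
        root = Path(persist_dir)
        files = sorted(root.glob("*.json"))
        return cls({p.name: InMemoryStore.from_persist_path(p) for p in files})


def round_trip() -> bool:
    """Persist two stores, reload them, and check that nothing was lost."""
    ctx = ToyStorageContext({
        "docstore.json": InMemoryStore({"node-1": {"text": "hello"}}),
        "vector_store.json": InMemoryStore({"node-1": [0.1, 0.2]}),
    })
    with tempfile.TemporaryDirectory() as tmp:
        ctx.persist(tmp)
        restored = ToyStorageContext.from_persist_dir(tmp)
    return all(
        restored.stores[name].data == store.data
        for name, store in ctx.stores.items()
    )
```

Because each store owns its own `persist` method, swapping one backend for another only requires a class with the same two methods; the coordinating context does not change.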
Usage
Use this principle whenever you need to:
- Save an index to disk after building it, so subsequent runs skip the embedding step
- Load a pre-built index at application startup for immediate query readiness
- Share indices between environments or team members by transferring the storage directory
- Use external vector stores by configuring the StorageContext with a remote backend at construction time
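The first two use cases combine into a common load-or-build pattern: reuse a saved index when one exists, otherwise build and save it. The sketch below illustrates the idea with stdlib code only. `fake_embed`, `load_or_build`, and the `toy_index.json` file name are hypothetical stand-ins, not LlamaIndex APIs; in a real application the build branch would create the index and the load branch would reload it from the persist directory.

```python
import json
import tempfile
from pathlib import Path


def fake_embed(text: str) -> list:
    """Stand-in for an expensive embedding call (hypothetical)."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def load_or_build(persist_dir: str, documents: list) -> tuple:
    """Return (vectors, embed_calls); embed only when no saved index exists."""
    index_file = Path(persist_dir) / "toy_index.json"
    if index_file.exists():
        # Saved state found: skip the embedding step entirely.
        return json.loads(index_file.read_text()), 0
    vectors = {doc: fake_embed(doc) for doc in documents}
    index_file.parent.mkdir(parents=True, exist_ok=True)
    index_file.write_text(json.dumps(vectors))
    return vectors, len(documents)


def demo_load_or_build() -> tuple:
    docs = ["alpha", "beta"]
    with tempfile.TemporaryDirectory() as tmp:
        first, first_calls = load_or_build(tmp, docs)
        second, second_calls = load_or_build(tmp, docs)
    # Same vectors both times, but only the first run pays for embedding.
    return first == second, first_calls, second_calls
```

The second call returns the same vectors with zero embedding calls, which is the saving that persistence is meant to deliver at application startup.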
Theoretical Basis
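Index persistence follows the Repository Pattern applied to RAG storage: callers save and load through a single interface (the StorageContext) regardless of which backend holds the data. The save and load flow is: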
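# Abstract algorithm (a sketch: assumes an already-built `index` and the llama_index.core package layout)
from llama_index.core import StorageContext, load_index_from_storage
# --- Saving ---
storage_context = index.storage_context
storage_context.persist(persist_dir="./storage")
# Writes: docstore.json, index_store.json, default__vector_store.json, etc.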
# --- Loading ---
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
# Index is fully reconstructed without recomputing embeddings
The serialization format for in-memory stores is JSON-based, with each store writing to a separate file within the persist directory:
| Store | Default File | Contents |
|---|---|---|
| Document Store | docstore.json | Node text, metadata, relationships, and hashes |
| Index Store | index_store.json | Index type, configuration, and node-to-index mappings |
| Vector Store | default__vector_store.json | Embedding vectors and associated node references |
| Graph Store | graph_store.json | Knowledge graph triples (subject, predicate, object) |
| Property Graph Store | property_graph_store.json | Property graph nodes and edges with attributes |
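Since each store writes its own file, a persist directory can be checked before loading. The helper below is a minimal sketch under the assumption that the four core file names match the table above; `missing_store_files` and `EXPECTED_FILES` are hypothetical names, not part of any library.

```python
import tempfile
from pathlib import Path

# Default file names for the four core stores, as listed in the table above.
EXPECTED_FILES = (
    "docstore.json",
    "index_store.json",
    "default__vector_store.json",
    "graph_store.json",
)


def missing_store_files(persist_dir: str, expected=EXPECTED_FILES) -> list:
    """Return the expected store files that are absent from persist_dir."""
    root = Path(persist_dir)
    return [name for name in expected if not (root / name).is_file()]


def demo_missing() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate a directory where only two stores were written.
        for name in ("docstore.json", "index_store.json"):
            (Path(tmp) / name).write_text("{}")
        return missing_store_files(tmp)
```

An empty result means every expected file is present; a non-empty one tells you which store would need to be rebuilt or supplied by an external backend.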
The separation of stores enables selective persistence: you might use an external vector store (Pinecone) for embeddings while keeping the document store and index store on local disk. The StorageContext holds references to both kinds of backend, so calling code works the same way regardless of where each store lives.
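Selective persistence can be illustrated with a toy model in which local stores write a file and a remote store writes nothing, because the external service owns durability. `LocalStore`, `RemoteStore`, `persist_all`, and the endpoint string are hypothetical; this is a sketch of the idea, not how LlamaIndex or any vector database client is implemented.

```python
import json
import tempfile
from pathlib import Path


class LocalStore:
    """In-memory store that is written to a JSON file on persist."""

    def __init__(self, data):
        self.data = data

    def persist(self, path: Path) -> bool:
        path.write_text(json.dumps(self.data))
        return True


class RemoteStore:
    """Stand-in for an externally hosted store (hypothetical endpoint)."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint

    def persist(self, path: Path) -> bool:
        # The remote service already holds the data; nothing to write locally.
        return False


def persist_all(stores: dict, persist_dir: str) -> list:
    """Persist every store and return the file names actually written."""
    root = Path(persist_dir)
    root.mkdir(parents=True, exist_ok=True)
    written = [name for name, store in stores.items() if store.persist(root / name)]
    return sorted(written)


def demo_selective() -> tuple:
    stores = {
        "docstore.json": LocalStore({"node-1": "text"}),
        "index_store.json": LocalStore({"index-1": ["node-1"]}),
        "default__vector_store.json": RemoteStore("https://vectors.example.invalid"),
    }
    with tempfile.TemporaryDirectory() as tmp:
        written = persist_all(stores, tmp)
        on_disk = sorted(p.name for p in Path(tmp).iterdir())
    return written, on_disk
```

Only the two local stores produce files, so the directory that is shared or backed up contains document and index metadata while the embeddings stay in the external service.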