Principle:Run llama Llama index Index Persistence

Knowledge Sources	LlamaIndex Storage LlamaIndex
Domains	RAG, Data_Management
Last Updated	2026-02-11 00:00 GMT

Overview

A persistence mechanism that serializes and deserializes index state -- including document stores, vector stores, index metadata, and graph stores -- to avoid recomputing embeddings and rebuilding indices from scratch.

Description

Index persistence addresses a critical operational concern in RAG systems: embedding computation is expensive (both in time and API cost), and rebuilding an index from raw documents on every application restart is wasteful. The StorageContext serves as the central coordinator for all storage backends, managing the lifecycle of four distinct stores:

Document Store (docstore): Stores the original document nodes and their metadata, enabling document-level operations like deletion and updates
Index Store (index_store): Stores the structural metadata of indices (e.g., which nodes belong to which index, index type, configuration)
Vector Store(s) (vector_stores): Stores embedding vectors and supports similarity search; can be in-memory or backed by external databases (Pinecone, Chroma, Weaviate, etc.)
Graph Store (graph_store): Stores knowledge graph triples for graph-based index types

The persistence model follows a pluggable backend architecture. Each store has a default in-memory implementation that can be swapped for a persistent backend. StorageContext.persist() serializes the in-memory stores to disk, and StorageContext.from_defaults(persist_dir=...) reloads them. For external vector stores (e.g., Pinecone), the vectors live in the external service, so the StorageContext only holds the connection to it rather than writing those vectors to local files.
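The pluggable-backend idea can be sketched without LlamaIndex itself. The block below is a minimal, standard-library-only model and not the library's implementation: InMemoryKVStore, MiniStorageContext, from_persist_dir, and round_trip_demo are hypothetical names chosen for illustration. It shows a coordinator that writes one JSON file per store and rebuilds the stores from those files.

```python
import json
import tempfile
from pathlib import Path


class InMemoryKVStore:
    """Hypothetical stand-in for a default in-memory store."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def persist(self, path: Path) -> None:
        # Serialize the in-memory state to a single JSON file.
        path.write_text(json.dumps(self.data))

    @classmethod
    def from_persist_path(cls, path: Path) -> "InMemoryKVStore":
        # Rebuild the store from a previously written JSON file.
        return cls(json.loads(path.read_text()))


class MiniStorageContext:
    """Toy coordinator: one file per store inside a persist directory."""

    def __init__(self, stores: dict):
        self.stores = stores  # maps store name -> store object

    def persist(self, persist_dir: str) -> None:
        root = Path(persist_dir)
        root.mkdir(parents=True, exist_ok=True)
        for name, store in self.stores.items():
            store.persist(root / f"{name}.json")

    @classmethod
    def from_persist_dir(cls, persist_dir: str, names: list) -> "MiniStorageContext":
        root = Path(persist_dir)
        return cls(
            {n: InMemoryKVStore.from_persist_path(root / f"{n}.json") for n in names}
        )


def round_trip_demo():
    """Persist two toy stores, reload them, and report what was written."""
    ctx = MiniStorageContext({
        "docstore": InMemoryKVStore({"node-1": {"text": "hello"}}),
        "index_store": InMemoryKVStore({"index-1": {"nodes": ["node-1"]}}),
    })
    with tempfile.TemporaryDirectory() as tmp:
        ctx.persist(tmp)
        files = sorted(p.name for p in Path(tmp).iterdir())
        restored = MiniStorageContext.from_persist_dir(tmp, ["docstore", "index_store"])
    return files, restored.stores["docstore"].data
```

Because every store exposes the same persist interface, the coordinator does not need to know which backend sits behind a given name. That is the property the pluggable architecture relies on.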

Usage

Use this principle whenever you need to:

Save an index to disk after building it, so subsequent runs skip the embedding step
Load a pre-built index at application startup for immediate query readiness
Share indices between environments or team members by transferring the storage directory
Use external vector stores by configuring the StorageContext with a remote backend at construction time
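The first two use cases combine into a "load if present, otherwise build and save" pattern. The sketch below illustrates it with the standard library only. fake_embed is a placeholder for an expensive embedding call, not a real model, and build_or_load, toy_index.json, and cached_build_demo are hypothetical names that are not part of LlamaIndex.

```python
import json
import tempfile
from pathlib import Path

embed_calls = 0  # counts how often the (fake) embedding step runs


def fake_embed(text: str) -> list:
    """Placeholder for an expensive embedding call; NOT a real model."""
    global embed_calls
    embed_calls += 1
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def build_or_load(docs: list, persist_dir: str) -> dict:
    """Load the saved toy index if it exists; otherwise build and save it."""
    path = Path(persist_dir) / "toy_index.json"  # hypothetical file name
    if path.exists():
        # Fast path: reuse the stored vectors, no embedding calls.
        return json.loads(path.read_text())
    index = {doc: fake_embed(doc) for doc in docs}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(index))
    return index


def cached_build_demo():
    """Run build_or_load twice and report how many embedding calls happened."""
    global embed_calls
    embed_calls = 0
    docs = ["alpha", "beta"]
    with tempfile.TemporaryDirectory() as tmp:
        first = build_or_load(docs, tmp)   # builds: embeds both documents
        second = build_or_load(docs, tmp)  # loads: no further embedding calls
    return embed_calls, first == second
```

In this sketch the second call returns the same data as the first while the embedding counter stays at two, one call per document from the initial build.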

Theoretical Basis

Index persistence follows the Repository Pattern applied to RAG storage:

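# Sketch of the save/load round trip. Assumes llama-index-core is installed
# and `index` has already been built (e.g. via VectorStoreIndex.from_documents).
from llama_index.core import StorageContext, load_index_from_storage

# --- Saving ---
storage_context = index.storage_context
storage_context.persist(persist_dir="./storage")
# Writes: docstore.json, index_store.json, default__vector_store.json, etc.

# --- Loading ---
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
# The index is reconstructed from the stored data without recomputing embeddings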

The serialization format for in-memory stores is JSON-based, with each store writing to a separate file within the persist directory:

Store	Default File	Contents
Document Store	docstore.json	Node text, metadata, relationships, and hashes
Index Store	index_store.json	Index type, configuration, and node-to-index mappings
Vector Store	default__vector_store.json	Embedding vectors and associated node references
Graph Store	graph_store.json	Knowledge graph triples (subject, predicate, object)
Property Graph Store	property_graph_store.json	Property graph nodes and edges with attributes

The separation of stores enables selective persistence: you might use an external vector store (Pinecone) for embeddings while keeping the document store and index store on local disk. The StorageContext transparently manages this heterogeneous storage topology.
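A mixed topology of this kind can be modeled in a few lines. The sketch below uses only the standard library and hypothetical classes (LocalJSONStore, RemoteVectorStore) together with the helpers persist_all and mixed_topology_demo. It assumes the remote store's persist step is a no-op because its data already lives in the external service. That is an assumption of this sketch, not a statement about any particular integration.

```python
import json
import tempfile
from pathlib import Path


class LocalJSONStore:
    """Hypothetical store held in process memory and saved as JSON."""

    def __init__(self):
        self.data = {}

    def persist(self, path: Path) -> None:
        path.write_text(json.dumps(self.data))


class RemoteVectorStore:
    """Hypothetical stand-in for a hosted vector database client.

    The dict simulates state held by the remote service, so persist()
    has nothing to write locally (an assumption of this sketch).
    """

    def __init__(self):
        self.remote = {}

    def persist(self, path: Path) -> None:
        pass  # data already lives in the external service


def persist_all(stores: dict, persist_dir: str) -> list:
    """Ask every store to persist itself; return the files that now exist."""
    root = Path(persist_dir)
    root.mkdir(parents=True, exist_ok=True)
    for name, store in stores.items():
        store.persist(root / f"{name}.json")
    return sorted(p.name for p in root.iterdir())


def mixed_topology_demo() -> list:
    stores = {
        "docstore": LocalJSONStore(),
        "index_store": LocalJSONStore(),
        "vector_store": RemoteVectorStore(),
    }
    stores["docstore"].data["node-1"] = {"text": "hello"}
    stores["vector_store"].remote["node-1"] = [0.1, 0.2]
    with tempfile.TemporaryDirectory() as tmp:
        return persist_all(stores, tmp)
```

Only the two local stores produce files here. The vector data never touches the persist directory, so reloading such a setup requires reconnecting to the remote backend in addition to reading the local files.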

Related Pages

Implemented By

Implementation:Run_llama_Llama_index_StorageContext_Persist
