Principle:Run llama Llama index Text Chunking

Knowledge Sources	LlamaIndex, LlamaIndex Node Parsers
Domains	Data_Preprocessing, RAG, NLP
Last Updated	2026-02-11 00:00 GMT

Overview

Text chunking (also called text splitting) is the process of dividing large documents into smaller, semantically coherent pieces for embedding and retrieval in RAG systems.

Description

Raw documents often exceed the input limits of embedding models and the context windows of LLMs. Text chunking addresses this by splitting documents into nodes (LlamaIndex's term for document chunks) that:

Fit within embedding model token limits
Preserve semantic coherence by splitting at natural boundaries (sentences, paragraphs)
Maintain optional overlap between consecutive chunks to prevent information loss at boundaries

LlamaIndex provides multiple splitting strategies, each with different tradeoffs:

Sentence-aware splitting: Splits at sentence boundaries using NLP tokenizers, preserving complete thoughts. This is the recommended default approach.
Fixed-size splitting: Splits at exact token or character counts regardless of content boundaries. Simpler but may break mid-sentence.
Semantic splitting: Groups sentences by embedding similarity. Higher quality but more expensive.
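
To make the semantic strategy concrete, here is a toy sketch of grouping adjacent sentences by similarity. It is not LlamaIndex's implementation: the embed function is a stand-in that uses word counts instead of a real embedding model, and the names embed, cosine, semantic_groups, and the 0.2 threshold are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def embed(sentence: str) -> Counter:
    """Stand-in embedding: lowercase word counts (a real system would call a model)."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_groups(sentences: List[str], threshold: float = 0.2) -> List[List[str]]:
    """Start a new group whenever adjacent sentences are less similar than threshold."""
    groups: List[List[str]] = []
    for sent in sentences:
        if groups and cosine(embed(groups[-1][-1]), embed(sent)) >= threshold:
            groups[-1].append(sent)  # similar enough: extend the current group
        else:
            groups.append([sent])  # topic shift (or first sentence): new group
    return groups


sentences = [
    "The cat sat on the mat.",
    "The cat chased the mouse.",
    "Stock prices fell sharply.",
    "Stock markets closed lower.",
]
groups = semantic_groups(sentences)
```

With these four sentences the two cat sentences end up in one group and the two stock sentences in another. The extra cost mentioned above comes from the embedding call per sentence, which in a real system is a model inference rather than a word count.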

Usage

Choose a splitting strategy based on your content type and quality requirements. For most use cases, sentence-aware splitting (SentenceSplitter) provides the best balance of quality and performance.

Theoretical Basis

Chunk Size Tradeoffs

The chunk_size parameter controls the maximum size of each chunk. Choosing it involves a tradeoff between retrieval precision and the amount of context each chunk carries:

Smaller chunks (128-256 tokens): More precise retrieval but may lose surrounding context. Better for fact-based QA.
Larger chunks (512-1024 tokens): More context per chunk but less precise retrieval. Better for summarization tasks.
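
As a rough illustration of how chunk size changes retrieval granularity, the sketch below counts the non-overlapping chunks needed to cover a document. The 10,000-token document length and the helper name chunks_needed are hypothetical, and overlap is ignored here.

```python
import math


def chunks_needed(n_tokens: int, chunk_size: int) -> int:
    """Number of non-overlapping chunks needed to cover n_tokens tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return math.ceil(n_tokens / chunk_size)


doc_tokens = 10_000  # hypothetical document length
for size in (128, 256, 512, 1024):
    print(f"chunk_size={size:>4}: {chunks_needed(doc_tokens, size)} chunks")
```

For this document length the counts are 79, 40, 20, and 10 chunks respectively: smaller chunks give the retriever many more, narrower candidates to choose from.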

Chunk Overlap

The chunk_overlap parameter controls how many tokens are shared between consecutive chunks:

# Conceptual illustration of overlap
# chunk_size=100, chunk_overlap=20

# Chunk 1: tokens[0:100]
# Chunk 2: tokens[80:180]   <- overlaps with chunk 1 by 20 tokens
# Chunk 3: tokens[160:260]  <- overlaps with chunk 2 by 20 tokens

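Overlap reduces the risk that information straddling a chunk boundary is cut apart, since text near the end of one chunk is repeated at the start of the next. The cost is that overlapping tokens are stored and embedded twice. A typical overlap is 10-20% of chunk size.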
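
The index arithmetic in the illustration can be sketched as a sliding window whose stride is chunk_size minus chunk_overlap. This is a minimal fixed-size chunker over an already tokenized sequence, not LlamaIndex code; the function names window_bounds and chunk_tokens are assumptions.

```python
from typing import List, Sequence, Tuple


def window_bounds(n_tokens: int, chunk_size: int, chunk_overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs for fixed-size windows with overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    stride = chunk_size - chunk_overlap  # how far each new window advances
    bounds: List[Tuple[int, int]] = []
    start = 0
    while start < n_tokens:
        end = min(start + chunk_size, n_tokens)
        bounds.append((start, end))
        if end == n_tokens:
            break  # the last window already reaches the end of the text
        start += stride
    return bounds


def chunk_tokens(tokens: Sequence, chunk_size: int, chunk_overlap: int) -> List[list]:
    """Slice a token sequence into overlapping fixed-size chunks."""
    return [list(tokens[s:e]) for s, e in window_bounds(len(tokens), chunk_size, chunk_overlap)]


bounds = window_bounds(260, chunk_size=100, chunk_overlap=20)
```

For 260 tokens with chunk_size=100 and chunk_overlap=20, bounds is [(0, 100), (80, 180), (160, 260)], matching the three chunks listed above. Requiring chunk_overlap to be smaller than chunk_size keeps the stride positive so the loop always terminates.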

Sentence-Aware Splitting Algorithm

Sentence-aware splitters follow a hierarchical approach:

Split text into sentences using an NLP tokenizer
Combine consecutive sentences into chunks up to the chunk_size limit
If a single sentence exceeds chunk_size, fall back to finer-grained secondary splitting of that sentence (e.g., by a regex on clause punctuation, then by word separators)
Apply overlap by including trailing sentences from the previous chunk
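
The steps above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the SentenceSplitter implementation: sentences are found with a naive regex instead of an NLP tokenizer, tokens are approximated by whitespace-separated words, oversized sentences fall back to word windows, and overlap is measured in sentences via the hypothetical overlap_sentences parameter.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def n_tokens(text: str) -> int:
    """Toy token count: whitespace-separated words (an assumption)."""
    return len(text.split())


def split_long_sentence(sentence: str, chunk_size: int) -> List[str]:
    """Fallback for an oversized sentence: cut it into word windows."""
    words = sentence.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def sentence_chunks(text: str, chunk_size: int, overlap_sentences: int = 1) -> List[str]:
    # Steps 1 and 3: split into sentences, breaking up any that are too long.
    units: List[str] = []
    for sent in split_sentences(text):
        if n_tokens(sent) > chunk_size:
            units.extend(split_long_sentence(sent, chunk_size))
        else:
            units.append(sent)

    chunks: List[str] = []
    current: List[str] = []
    for unit in units:
        # Step 2: close the chunk when the next unit would exceed the limit.
        if current and n_tokens(" ".join(current + [unit])) > chunk_size:
            chunks.append(" ".join(current))
            # Step 4: carry trailing sentences over, trimmed so `unit` still fits.
            tail = current[-overlap_sentences:] if overlap_sentences > 0 else []
            while tail and n_tokens(" ".join(tail + [unit])) > chunk_size:
                tail = tail[1:]
            current = tail
        current.append(unit)
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "Cats sleep a lot. Dogs bark loudly. Birds sing at dawn. Fish swim."
chunks = sentence_chunks(text, chunk_size=8, overlap_sentences=1)
```

With chunk_size=8 the example text yields three chunks, each sharing one sentence with its neighbor, and no chunk exceeds the limit because the carried-over tail is trimmed whenever it would not leave room for the next sentence.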

Related Pages

Implemented By

Implementation:Run_llama_Llama_index_SentenceSplitter_Configuration

Uses Heuristic

Heuristic:Run_llama_Llama_index_Chunk_Size_Optimization
