Principle:PacktPublishing LLM Engineers Handbook Query Expansion

Field	Value
Concept	Generating multiple reformulations of a query for improved retrieval recall
Category	Retrieval / Query Optimization
Workflow	RAG_Inference
Repository	PacktPublishing/LLM-Engineers-Handbook
Implemented by	Implementation:PacktPublishing_LLM_Engineers_Handbook_QueryExpansion_Generate

Overview

Query Expansion is a technique that uses an LLM to generate multiple reformulations of a user query that preserve its meaning. Each reformulation phrases the original intent differently or emphasizes a different aspect of it, which improves recall by matching a wider set of relevant documents in vector search. For a target of N queries in total, the original query and N-1 LLM-generated expansions are all searched in parallel.

Theory

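A fundamental challenge in information retrieval is the vocabulary mismatch problem: users often express their information needs using different terms than those used in the relevant documents. For example, a query about "car repair" may rank poorly against a document that only mentions "automobile maintenance". Semantic embeddings reduce this gap but do not remove it, and a single query may still not capture all facets of the user's intent.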

Query Expansion addresses this by generating multiple reformulations that:

Use synonyms and paraphrases of key terms
Emphasize different aspects of the original question
Vary the level of specificity (broader or narrower formulations)
Rephrase using domain-specific terminology that may appear in target documents
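
The reformulation goals above can be written directly into the expansion prompt. The following is a minimal sketch, not the repository's actual template: the separator string, the function names, and the prompt wording are all hypothetical and chosen only for illustration. It builds a prompt requesting N-1 rewrites and parses the LLM's reply back into a list of N queries.

```python
SEPARATOR = "#next-query#"  # hypothetical delimiter, not taken from the repository


def build_expansion_prompt(query: str, expand_to_n: int) -> str:
    """Build a prompt asking for expand_to_n - 1 reformulations of `query`."""
    if expand_to_n < 1:
        raise ValueError("expand_to_n must be at least 1")
    return (
        f"Rewrite the question below in {expand_to_n - 1} different ways. "
        "Use synonyms, vary the level of specificity, and prefer terminology "
        "likely to appear in the target documents, but keep the original intent. "
        f"Separate the rewrites with '{SEPARATOR}'.\n"
        f"Question: {query}"
    )


def parse_expansions(original: str, llm_response: str, expand_to_n: int) -> list[str]:
    """Return the original query followed by up to expand_to_n - 1 distinct rewrites."""
    queries = [original]
    seen = {original.strip().lower()}
    for candidate in llm_response.split(SEPARATOR):
        text = candidate.strip()
        key = text.lower()
        # Skip empty fragments and rewrites that merely repeat an earlier query.
        if text and key not in seen:
            seen.add(key)
            queries.append(text)
    return queries[:expand_to_n]
```

Keeping the original query at the front of the list means retrieval never does worse than the unexpanded baseline if the LLM returns unusable rewrites.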

The expansion process works as follows:

The original query is sent to an LLM with a prompt requesting N-1 alternative formulations
The LLM generates diverse reformulations that preserve the original intent
All N queries (original + expansions) are embedded and searched in parallel
Results from all queries are aggregated and deduplicated before reranking

This approach increases recall at the cost of an extra LLM call to generate the expansions and additional compute for embedding and searching N queries instead of one.
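
The search and aggregation steps can be sketched as follows. This is an illustration under stated assumptions rather than the repository's implementation: `search_expanded`, the `Hit` shape of (document id, score), and the rule of keeping the highest score per document are all hypothetical. The `search` callable is assumed to embed one query and return its nearest neighbours from the vector store; reranking is left to a later stage.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# Hypothetical result shape: (document_id, similarity_score).
Hit = tuple[str, float]
SearchFn = Callable[[str], Sequence[Hit]]


def search_expanded(
    queries: Sequence[str], search: SearchFn, max_workers: int = 4
) -> list[Hit]:
    """Run `search` for every query concurrently, then merge and deduplicate hits.

    When several queries return the same document, the highest score is kept.
    """
    if not queries:
        return []
    # Threads suit this step because each search is typically I/O bound
    # (an embedding request followed by a vector-store request).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_query_hits = list(pool.map(search, queries))

    best: dict[str, float] = {}
    for hits in per_query_hits:
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Sort by score (descending); ties are broken by id for deterministic output.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

Deduplicating by document id before reranking keeps the reranker from scoring the same document several times when multiple reformulations retrieve it.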

When to Use

When a single query might miss relevant documents due to vocabulary mismatch
When the document collection uses varied terminology for similar concepts
When the user's query is ambiguous or could be interpreted in multiple ways
When recall is more important than minimizing retrieval latency

Related Concepts

Multi-query retrieval - running multiple queries against the same index
Query reformulation - rewriting queries for better retrieval performance
Pseudo-relevance feedback - using initial results to refine queries
HyDE (Hypothetical Document Embeddings) - generating a hypothetical answer document and using its embedding as the search query
