Principle:PacktPublishing LLM Engineers Handbook Query Expansion
| Field | Value |
|---|---|
| Concept | Generating multiple reformulations of a query for improved retrieval recall |
| Category | Retrieval / Query Optimization |
| Workflow | RAG_Inference |
| Repository | PacktPublishing/LLM-Engineers-Handbook |
| Implemented by | Implementation:PacktPublishing_LLM_Engineers_Handbook_QueryExpansion_Generate |
Overview
Query Expansion is a technique that uses an LLM to generate multiple reformulations of a user query while preserving its intent. Each reformulation captures a different aspect or phrasing of that intent, which improves recall by matching a wider set of relevant documents in vector search. For a target of N queries in total, the original query and N-1 generated expansions are all searched in parallel.
Theory
A fundamental challenge in information retrieval is the vocabulary mismatch problem: users often express their information needs using different terms than those used in the relevant documents. Even with semantic embeddings, a single query may not capture all facets of the user's intent.
Query Expansion addresses this by generating multiple reformulations that:
- Use synonyms and paraphrases of key terms
- Emphasize different aspects of the original question
- Vary the level of specificity (broader or narrower formulations)
- Rephrase using domain-specific terminology that may appear in target documents
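As a rough illustration of how such reformulations might be requested, the sketch below builds an expansion prompt and parses the model's reply. The prompt wording, the `SEPARATOR` token, and the helper names `build_expansion_prompt` and `parse_expansions` are assumptions made for this example; they are not taken from the repository's actual template.

```python
SEPARATOR = "#next-question#"  # hypothetical delimiter, chosen for this sketch


def build_expansion_prompt(question: str, expand_to_n: int) -> str:
    """Ask for expand_to_n - 1 alternatives; the original query fills the last slot."""
    if expand_to_n < 1:
        raise ValueError("expand_to_n must be at least 1")
    return (
        f"Generate {expand_to_n - 1} different versions of the user question below "
        "to help retrieve relevant documents from a vector database. "
        "Use synonyms, vary the specificity, and prefer domain terminology, "
        "but keep the original intent. "
        f"Separate the alternatives with '{SEPARATOR}'.\n"
        f"Original question: {question}"
    )


def parse_expansions(question: str, llm_reply: str, expand_to_n: int) -> list[str]:
    """Return the original query followed by up to expand_to_n - 1 distinct expansions."""
    queries = [question]
    seen = {question.strip().lower()}
    for candidate in llm_reply.split(SEPARATOR):
        text = candidate.strip()
        key = text.lower()
        # Skip empty fragments and case-insensitive repeats of earlier queries.
        if text and key not in seen:
            seen.add(key)
            queries.append(text)
    return queries[:expand_to_n]
```

The parser keeps the original query first and silently drops empty or repeated fragments, so a reply with fewer usable alternatives than requested still yields a valid, shorter query list.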
The expansion process works as follows:
- The original query is sent to an LLM with a prompt requesting N-1 alternative formulations
- The LLM generates diverse reformulations that preserve the original intent
- All N queries (original + expansions) are embedded and searched in parallel
- Results from all queries are aggregated and deduplicated before reranking
This approach increases recall at the cost of one extra LLM call plus embedding and search work that grows with the number of queries N.
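The search and aggregation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `toy_search` stands in for embedding plus vector search by scoring word overlap against a small in-memory corpus, and every name and document here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus standing in for a vector index (hypothetical data).
CORPUS = {
    "doc1": "fine-tuning large language models with LoRA",
    "doc2": "adapting an LLM to a new domain using parameter-efficient training",
    "doc3": "deploying inference endpoints on cloud infrastructure",
}


def toy_search(query: str, k: int = 2) -> list[tuple[str, float]]:
    """Stand-in for embed + vector search: score documents by word overlap."""
    words = set(query.lower().split())
    scored = []
    for doc_id, text in CORPUS.items():
        overlap = len(words & set(text.lower().split()))
        if overlap:
            scored.append((doc_id, float(overlap)))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


def search_expanded(queries: list[str], k: int = 2) -> list[tuple[str, float]]:
    """Run every query in parallel, then merge hits, keeping each document's best score."""
    if not queries:
        return []
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        per_query_hits = list(pool.map(lambda q: toy_search(q, k), queries))

    best: dict[str, float] = {}
    for hits in per_query_hits:
        for doc_id, score in hits:
            best[doc_id] = max(score, best.get(doc_id, 0.0))
    # Deduplicated candidates, ready to be passed to a reranker.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

In this toy setup the query "LoRA fine-tuning" alone reaches only `doc1`, while adding the reformulation "parameter-efficient training for an LLM" also surfaces `doc2`. Keeping the best score per document is one simple deduplication choice; a real pipeline would typically hand the merged candidates to a reranker rather than trust these scores.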
When to Use
- When a single query might miss relevant documents due to vocabulary mismatch
- When the document collection uses varied terminology for similar concepts
- When the user's query is ambiguous or could be interpreted in multiple ways
- When recall is more important than minimizing retrieval latency
Related Concepts
- Multi-query retrieval - running multiple queries against the same index
- Query reformulation - rewriting queries for better retrieval performance
- Pseudo-relevance feedback - using initial results to refine queries
- HyDE (Hypothetical Document Embeddings) - generating a hypothetical answer document and using its embedding as the search query