Principle:PacktPublishing LLM Engineers Handbook Query Expansion

From Leeroopedia


Field | Value
Concept | Generating multiple reformulations of a query for improved retrieval recall
Category | Retrieval / Query Optimization
Workflow | RAG_Inference
Repository | PacktPublishing/LLM-Engineers-Handbook
Implemented by | Implementation:PacktPublishing_LLM_Engineers_Handbook_QueryExpansion_Generate

Overview

Query Expansion is a technique that uses an LLM to generate multiple reformulations of a user query that preserve its intent. Each reformulation captures a different aspect or phrasing of the original question, which improves recall by matching a wider set of relevant documents in vector search. For a total of N queries, the original query and its N-1 expansions are all searched in parallel.

Theory

A fundamental challenge in information retrieval is the vocabulary mismatch problem: users often express their information needs using different terms than those used in the relevant documents. Even with semantic embeddings, a single query may not capture all facets of the user's intent.

Query Expansion addresses this by generating multiple reformulations that:

  • Use synonyms and paraphrases of key terms
  • Emphasize different aspects of the original question
  • Vary the level of specificity (broader or narrower formulations)
  • Rephrase using domain-specific terminology that may appear in target documents
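
The reformulation styles listed above are typically requested through the prompt itself. The following is a minimal sketch of how such a prompt could be built and how the LLM's reply could be parsed. The helper names `build_expansion_prompt` and `parse_expansions`, the prompt wording, and the separator token are all assumptions made for illustration; they are not taken from the repository's implementation.

```python
SEPARATOR = "#next-question#"  # assumed delimiter; any token unlikely to occur in a query works


def build_expansion_prompt(query: str, expand_to_n: int) -> str:
    """Build a prompt asking for expand_to_n - 1 reformulations of `query`."""
    if expand_to_n < 1:
        raise ValueError("expand_to_n must be at least 1")
    n_new = expand_to_n - 1
    return (
        f"Generate {n_new} alternative versions of the user question below "
        "to retrieve relevant documents from a vector database.\n"
        "Use synonyms, emphasize different aspects, vary the specificity, "
        "and use domain terminology where appropriate, "
        "while preserving the original intent.\n"
        f"Separate the versions with '{SEPARATOR}'.\n"
        f"Original question: {query}"
    )


def parse_expansions(query: str, llm_output: str, expand_to_n: int) -> list[str]:
    """Return the original query followed by up to expand_to_n - 1 expansions."""
    candidates = [part.strip() for part in llm_output.split(SEPARATOR)]
    queries = [query]
    for candidate in candidates:
        # Skip empty fragments and case-insensitive duplicates.
        if candidate and candidate.lower() not in {q.lower() for q in queries}:
            queries.append(candidate)
    return queries[:expand_to_n]
```

The parser always keeps the original query first, so retrieval still works if the LLM returns fewer usable reformulations than requested.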

The expansion process works as follows:

  1. The original query is sent to an LLM with a prompt requesting N-1 alternative formulations
  2. The LLM generates diverse reformulations that preserve the original intent
  3. All N queries (original + expansions) are embedded and searched in parallel
  4. Results from all queries are aggregated and deduplicated before reranking
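
Steps 3 and 4 above can be sketched as follows. This is an illustrative outline, not the repository's code: `search_fn` is a hypothetical callable that embeds one query and returns `(document_id, score)` pairs, `FAKE_INDEX` is a toy stand-in for a vector database, and keeping the highest score per document is one assumed deduplication policy among several.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A hit is (document_id, similarity_score); higher scores are assumed to be better.
Hit = tuple[str, float]


def search_expanded(
    queries: list[str],
    search_fn: Callable[[str], list[Hit]],
) -> list[Hit]:
    """Run search_fn for every query in parallel, then merge and deduplicate."""
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        per_query_hits = list(pool.map(search_fn, queries))

    # Deduplicate by document id, keeping the best score seen for each document.
    best: dict[str, float] = {}
    for hits in per_query_hits:
        for doc_id, score in hits:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy in-memory "index" standing in for embedding plus vector search (hypothetical data).
FAKE_INDEX: dict[str, list[Hit]] = {
    "a": [("doc1", 0.8), ("doc2", 0.6)],
    "b": [("doc2", 0.9), ("doc3", 0.5)],
}


def fake_search(query: str) -> list[Hit]:
    """Look up precomputed hits for a query; unknown queries return nothing."""
    return FAKE_INDEX.get(query, [])
```

Calling `search_expanded(["a", "b"], fake_search)` merges the two result lists into one candidate set in which `doc2` appears only once. That merged set is what would be handed to the reranker.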

This approach increases recall at the cost of one additional LLM call to generate the expansions, plus the compute needed to embed and search N queries instead of one.

When to Use

  • When a single query might miss relevant documents due to vocabulary mismatch
  • When the document collection uses varied terminology for similar concepts
  • When the user's query is ambiguous or could be interpreted in multiple ways
  • When recall is more important than minimizing retrieval latency

Related Concepts

  • Multi-query retrieval - running multiple queries against the same index
  • Query reformulation - rewriting queries for better retrieval performance
  • Pseudo-relevance feedback - using initial results to refine queries
  • HyDE (Hypothetical Document Embeddings) - generating a hypothetical answer to use as a query
