Principle:PacktPublishing LLM Engineers Handbook Self Query Metadata Extraction
| Field | Value |
|---|---|
| Concept | Extracting structured metadata from natural language queries |
| Category | Retrieval / Query Understanding |
| Workflow | RAG_Inference |
| Repository | PacktPublishing/LLM-Engineers-Handbook |
| Implemented by | Implementation:PacktPublishing_LLM_Engineers_Handbook_SelfQuery_Generate |
Overview
Self-Query is a technique that uses an LLM to extract structured metadata (such as an author name) from a user's natural-language query. The extracted metadata is then applied as a filter on vector search, improving precision by restricting the search to the relevant subset of documents. Self-Query is a form of query understanding and intent extraction that bridges unstructured queries and structured database filters.
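The extraction step can be sketched as follows. This is a minimal illustration and not the repository's implementation: the prompt wording, the `ParsedQuery` type, `extract_metadata`, and the `fake_llm` stand-in are all hypothetical names introduced here, and the model call is assumed to be any function that maps a prompt string to a reply string.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical prompt; the wording is an assumption for illustration only.
EXTRACTION_PROMPT = (
    "Extract the author's full name from the question below. "
    "Reply with the name only, or with 'none' if no author is mentioned.\n"
    "Question: {question}"
)


@dataclass
class ParsedQuery:
    text: str               # original query, kept intact for embedding
    author: Optional[str]   # extracted metadata, or None when absent


def extract_metadata(question: str, llm: Callable[[str], str]) -> ParsedQuery:
    """Ask the LLM for the author name and normalise its reply."""
    reply = llm(EXTRACTION_PROMPT.format(question=question)).strip()
    # Treat an empty reply or the sentinel 'none' as "no constraint found".
    author = None if reply.lower() in {"", "none"} else reply
    return ParsedQuery(text=question, author=author)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "Paul Graham" if "Paul Graham" in prompt else "none"


parsed = extract_metadata("What did Paul Graham write about startups?", fake_llm)
print(parsed.author)  # Paul Graham
```

Returning `None` when no author is found lets the caller fall back to an unfiltered search instead of filtering on a guessed value.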
Theory
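In a typical RAG system, vector similarity search alone may return documents that are semantically similar to the query but violate a constraint the query implies. For example, a query like "What did Paul Graham write about startups?" contains an implicit metadata constraint: the author must be Paul Graham. Similarity search by itself can still rank startup essays by other authors highly, because nothing enforces that constraint.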
Self-Query addresses this by:
- Parsing the query with an LLM to identify structured fields (e.g., author name, date range, topic category)
- Constructing filters from the extracted metadata that constrain the subsequent vector search
- Preserving the semantic query for embedding and similarity matching
Filtering on metadata while still ranking by embedding similarity combines the strengths of two search styles:
- Structured search (exact metadata matching for precision)
- Semantic search (vector similarity for recall)
The LLM acts as a semantic parser, translating natural language constraints into structured filter predicates without requiring the user to use explicit query syntax.
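How the extracted metadata and the semantic query work together can be shown with a toy in-memory search. The sketch below is an assumption-laden illustration: the `DOCS` corpus, the three-dimensional hand-written vectors, and the `search` helper are hypothetical, and a real system would use an embedding model and a vector database rather than a Python list.

```python
import math
from typing import Optional

# Toy corpus: each document carries metadata plus a hand-made embedding.
DOCS = [
    {"id": 1, "author": "Paul Graham", "text": "How to start a startup", "vec": [0.9, 0.1, 0.0]},
    {"id": 2, "author": "Paul Graham", "text": "Notes on Lisp", "vec": [0.1, 0.9, 0.0]},
    {"id": 3, "author": "Sam Altman", "text": "Startup playbook", "vec": [0.95, 0.05, 0.0]},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, author: Optional[str] = None, k: int = 2):
    # Structured step: an exact metadata match narrows the candidate set.
    candidates = [d for d in DOCS if author is None or d["author"] == author]
    # Semantic step: rank the remaining candidates by vector similarity.
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return ranked[:k]


query_vec = [1.0, 0.0, 0.0]  # pretend embedding of a question about startups
unfiltered = [d["id"] for d in search(query_vec)]
filtered = [d["id"] for d in search(query_vec, author="Paul Graham")]
print(unfiltered)  # [3, 1]
print(filtered)    # [1, 2]
```

Without the filter, the most similar document is by a different author. With the filter, only Paul Graham's documents are ranked. Many vector databases accept such a filter as part of the search request, so the restriction is applied inside the store rather than in application code as it is here.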
When to Use
- When user queries contain implicit metadata that should filter vector search results
- When the document collection has structured metadata fields (author, date, category) alongside vector embeddings
- When precision matters and unfiltered vector search returns too many irrelevant results
- When building a RAG system over multi-author or multi-source content
Related Concepts
- Semantic parsing - translating natural language into structured representations
- Query understanding - extracting intent and entities from user queries
- Structured search - filtering results using metadata predicates
- Hybrid search - combining keyword, metadata, and vector search strategies