Overview
Concrete method on BaseIndex that composes a retriever and response synthesizer into a RetrieverQueryEngine for querying indexed data.
Description
The as_query_engine method is the primary entry point for creating a query engine from any LlamaIndex index. It calls the index's own as_retriever method to obtain a retriever, then passes that retriever to the RetrieverQueryEngine.from_args classmethod. from_args wires together the retriever, an LLM, a response synthesizer (selected by response_mode unless a pre-built one is supplied), optional node postprocessors, and prompt templates into a fully configured query pipeline.
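The delegation described above can be sketched with stand-in classes. This is a minimal, standard-library-only model and not the LlamaIndex source: `ToyIndex`, `ToyRetriever`, and `ToyQueryEngine` are hypothetical names, the LLM is represented by a plain string, and the real methods accept more arguments. It only illustrates that the same keyword arguments travel to both the retriever and the from_args-style constructor.

```python
from typing import Any, Optional


class ToyRetriever:
    """Stand-in retriever (hypothetical) that keeps only the option it understands."""

    def __init__(self, similarity_top_k: int = 2, **kwargs: Any) -> None:
        self.similarity_top_k = similarity_top_k
        self.ignored = sorted(kwargs)  # options that were meant for synthesis


class ToyQueryEngine:
    """Stand-in (hypothetical) for the engine assembled by a from_args-style classmethod."""

    def __init__(self, retriever: ToyRetriever, llm: str, response_mode: str) -> None:
        self.retriever = retriever
        self.llm = llm
        self.response_mode = response_mode

    @classmethod
    def from_args(
        cls,
        retriever: ToyRetriever,
        llm: Optional[str] = None,
        response_mode: str = "compact",
        **kwargs: Any,  # retrieval options arrive here too and go unused
    ) -> "ToyQueryEngine":
        return cls(retriever, llm or "default-llm", response_mode)


class ToyIndex:
    def as_retriever(self, **kwargs: Any) -> ToyRetriever:
        return ToyRetriever(**kwargs)

    def as_query_engine(self, llm: Optional[str] = None, **kwargs: Any) -> ToyQueryEngine:
        retriever = self.as_retriever(**kwargs)  # step 1: build the retriever
        return ToyQueryEngine.from_args(retriever, llm=llm, **kwargs)  # step 2: assemble


engine = ToyIndex().as_query_engine(similarity_top_k=5, response_mode="refine")
print(engine.retriever.similarity_top_k, engine.response_mode, engine.llm)
```

In this toy, each side silently ignores options intended for the other, which is why a single call can mix a retrieval option such as similarity_top_k with a synthesis option such as response_mode.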
Usage
Call as_query_engine() on any index instance (e.g., VectorStoreIndex, SummaryIndex, KeywordTableIndex) after the index has been built or loaded from storage. Pass configuration kwargs to control retrieval behavior and response synthesis strategy.
Code Reference
Source Location
- Repository: run-llama/llama_index
- File: llama-index-core/llama_index/core/indices/base.py
- Lines: L491-516 (as_query_engine method)
- File: llama-index-core/llama_index/core/query_engine/retriever_query_engine.py
- Lines: L37-128 (RetrieverQueryEngine class and from_args)
Signature
# BaseIndex.as_query_engine
class BaseIndex(Generic[IS]):
    def as_query_engine(
        self,
        llm: Optional[LLMType] = None,
        **kwargs,
    ) -> BaseQueryEngine:
        # Resolves retriever via self.as_retriever(**kwargs)
        # Delegates to RetrieverQueryEngine.from_args(retriever, llm=llm, **kwargs)
        ...

# RetrieverQueryEngine.from_args
class RetrieverQueryEngine(BaseQueryEngine):
    @classmethod
    def from_args(
        cls,
        retriever: BaseRetriever,
        llm: Optional[LLM] = None,
        response_synthesizer: Optional[BaseSynthesizer] = None,
        node_postprocessors: Optional[List[BaseNodePostprocessor]] = None,
        response_mode: ResponseMode = ResponseMode.COMPACT,
        text_qa_template: Optional[BasePromptTemplate] = None,
        refine_template: Optional[BasePromptTemplate] = None,
        summary_template: Optional[BasePromptTemplate] = None,
        output_cls: Optional[BaseModel] = None,
        use_async: bool = False,
        streaming: bool = False,
        verbose: bool = False,
        **kwargs,
    ) -> "RetrieverQueryEngine":
        ...
Import
# as_query_engine is a method on any index instance
from llama_index.core import VectorStoreIndex

# `documents` is a list of Document objects loaded beforehand
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
I/O Contract
Inputs (as_query_engine)
| Name | Type | Required | Description |
| llm | LLMType or None | No | LLM to use for response synthesis; defaults to Settings.llm |
| **kwargs | dict | No | Forwarded to both as_retriever() and RetrieverQueryEngine.from_args() |
Inputs (RetrieverQueryEngine.from_args)
| Name | Type | Required | Description |
| retriever | BaseRetriever | Yes | Retriever that fetches relevant nodes from the index |
| llm | LLM or None | No | Language model for synthesis; resolved from Settings if omitted |
| response_synthesizer | BaseSynthesizer or None | No | Pre-built synthesizer; if provided, overrides response_mode |
| node_postprocessors | List[BaseNodePostprocessor] or None | No | Post-retrieval processors (re-rankers, filters, etc.) |
| response_mode | ResponseMode | No (default: COMPACT) | Strategy for synthesizing responses from retrieved nodes |
| text_qa_template | BasePromptTemplate or None | No | Custom prompt template for QA synthesis |
| refine_template | BasePromptTemplate or None | No | Custom prompt template for refine iterations |
| summary_template | BasePromptTemplate or None | No | Custom prompt template for tree summarization |
| output_cls | BaseModel or None | No | Pydantic model for structured output parsing |
| use_async | bool | No (default: False) | Whether to use async LLM calls during synthesis |
| streaming | bool | No (default: False) | Whether to enable streaming token output |
| verbose | bool | No (default: False) | Whether to print intermediate synthesis steps |
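The precedence between response_synthesizer and response_mode noted in the table can be illustrated with a small sketch. It assumes a simple "use the pre-built object if given, otherwise build one from the mode" rule; `ToySynthesizer` and `resolve_synthesizer` are hypothetical names used only for illustration and are not part of the library.

```python
from typing import Optional


class ToySynthesizer:
    """Hypothetical stand-in that only remembers which mode it was built with."""

    def __init__(self, mode: str) -> None:
        self.mode = mode


def resolve_synthesizer(
    response_synthesizer: Optional[ToySynthesizer] = None,
    response_mode: str = "compact",
) -> ToySynthesizer:
    # A pre-built synthesizer wins; response_mode is only consulted
    # when a new synthesizer has to be constructed.
    return response_synthesizer or ToySynthesizer(response_mode)


built = resolve_synthesizer(response_mode="refine")
prebuilt = resolve_synthesizer(ToySynthesizer("tree_summarize"), response_mode="refine")
print(built.mode, prebuilt.mode)
```

Under this rule, passing both arguments means the mode is silently unused, so it is usually clearer to supply one or the other.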
Outputs
| Name | Type | Description |
| query_engine | BaseQueryEngine | Fully configured query engine ready for .query() calls |
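What a .query() call does with these components can be pictured as three stages: retrieve, postprocess, synthesize. The following is a simplified sketch using only the standard library. `ScoredNode`, `toy_query`, the two-sentence corpus, and the scores are all made up for illustration, and the callables stand in for the retriever, node postprocessors, and response synthesizer rather than reproducing their real interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScoredNode:
    """Hypothetical stand-in for a retrieved node with a similarity score."""

    text: str
    score: float


def toy_query(
    question: str,
    retrieve: Callable[[str], List[ScoredNode]],
    postprocessors: List[Callable[[List[ScoredNode]], List[ScoredNode]]],
    synthesize: Callable[[str, List[ScoredNode]], str],
) -> str:
    nodes = retrieve(question)            # 1. fetch candidate nodes
    for postprocess in postprocessors:    # 2. filter or re-rank them, in order
        nodes = postprocess(nodes)
    return synthesize(question, nodes)    # 3. build the answer from what is left


# Made-up corpus and scores, purely for demonstration.
corpus = [
    ScoredNode("Paris is the capital of France.", 0.9),
    ScoredNode("Bananas are yellow.", 0.2),
]

answer = toy_query(
    "What is the capital of France?",
    retrieve=lambda q: list(corpus),
    postprocessors=[lambda nodes: [n for n in nodes if n.score >= 0.7]],
    synthesize=lambda q, nodes: " ".join(n.text for n in nodes),
)
print(answer)
```

The 0.7 threshold plays the role of a similarity cutoff: the low-scoring node is dropped before synthesis, so only the relevant sentence reaches the final answer.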
Usage Examples
Basic Query Engine
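from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Create query engine with default settings (compact mode)
query_engine = index.as_query_engine()

# Ask a question; the question text is only a placeholder
response = query_engine.query("What is this document about?")
print(response)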
Custom Response Mode and LLM
from llama_index.core import VectorStoreIndex
from llama_index.llms.openai import OpenAI

# `documents` is loaded as in the basic example above
index = VectorStoreIndex.from_documents(documents)

# Use refine mode for thorough multi-pass synthesis
query_engine = index.as_query_engine(
    llm=OpenAI(model="gpt-4", temperature=0),
    response_mode="refine",
    similarity_top_k=5,
    streaming=True,
)
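To give a feel for why refine is the more thorough and more expensive choice compared with the default compact mode, here is a rough cost model. It assumes that refine-style synthesis makes one LLM call per retrieved chunk, while compact-style synthesis first packs chunks into as few prompts as fit a size budget. The functions `refine_calls` and `compact_calls` are hypothetical, and the budget is measured in characters here for simplicity, whereas a real implementation would count tokens and could split oversized chunks.

```python
from typing import List


def refine_calls(chunks: List[str]) -> int:
    # Refine-style: one LLM call per retrieved chunk.
    return len(chunks)


def compact_calls(chunks: List[str], budget: int) -> int:
    # Compact-style: greedily pack chunks into prompts of at most `budget`
    # characters, then make one call per packed prompt.
    prompts, current = 0, 0
    for chunk in chunks:
        if current and current + len(chunk) > budget:
            prompts += 1   # close the prompt that is already full
            current = 0
        current += len(chunk)
    return prompts + (1 if current else 0)


# Five made-up chunks of 40 characters each.
chunks = ["a" * 40, "b" * 40, "c" * 40, "d" * 40, "e" * 40]
print(refine_calls(chunks), compact_calls(chunks, budget=100))
```

With these toy numbers, five chunks need five calls in the refine-style model but only three in the compact-style model, since two 40-character chunks fit in each 100-character prompt.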
With Node Postprocessors
from llama_index.core import VectorStoreIndex
from llama_index.core.postprocessor import SimilarityPostprocessor

# `documents` is loaded as in the basic example above
index = VectorStoreIndex.from_documents(documents)

# Filter out low-similarity nodes before synthesis
query_engine = index.as_query_engine(
    response_mode="tree_summarize",
    node_postprocessors=[
        SimilarityPostprocessor(similarity_cutoff=0.7),
    ],
)