Implementation:Infiniflow Ragflow Dealer Retrieval
| Knowledge Sources | |
|---|---|
| Domains | RAG, Information_Retrieval |
| Last Updated | 2026-02-12 06:00 GMT |
Overview
A concrete tool for hybrid (vector + keyword) retrieval with optional reranking, provided by RAGFlow's Dealer class.
Description
The Dealer.retrieval method orchestrates hybrid search across Elasticsearch/Infinity document stores. It embeds the query using the tenant's configured embedding model, performs combined vector and keyword search, applies similarity thresholds, optionally reranks results with a cross-encoder, paginates, and returns chunks with document aggregations.
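The score blending and threshold step described above can be illustrated with a minimal sketch. It assumes a simple linear blend in which the keyword weight is `1 - vector_similarity_weight`; the exact formula inside Dealer may differ. `ScoredChunk`, `hybrid_score`, and `filter_and_rank` are hypothetical names introduced only for illustration and are not part of RAGFlow.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    """Hypothetical container; both scores are assumed to be normalized to [0, 1]."""
    chunk_id: str
    keyword_score: float
    vector_score: float


def hybrid_score(chunk: ScoredChunk, vector_similarity_weight: float = 0.3) -> float:
    """Assumed linear blend: the keyword and vector weights sum to 1."""
    keyword_weight = 1.0 - vector_similarity_weight
    return (keyword_weight * chunk.keyword_score
            + vector_similarity_weight * chunk.vector_score)


def filter_and_rank(
    chunks: list[ScoredChunk],
    similarity_threshold: float = 0.2,
    vector_similarity_weight: float = 0.3,
) -> list[tuple[float, ScoredChunk]]:
    """Drop chunks below the threshold, then sort by blended score, best first."""
    scored = [(hybrid_score(c, vector_similarity_weight), c) for c in chunks]
    kept = [(score, c) for score, c in scored if score >= similarity_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


if __name__ == "__main__":
    candidates = [
        ScoredChunk("a", keyword_score=0.9, vector_score=0.1),
        ScoredChunk("b", keyword_score=0.1, vector_score=0.4),
        ScoredChunk("c", keyword_score=0.2, vector_score=0.9),
    ]
    for score, chunk in filter_and_rank(candidates):
        print(f"{chunk.chunk_id}: {score:.2f}")
```

With the default weight of 0.3, keyword evidence dominates the blended score; raising `vector_similarity_weight` toward 1.0 shifts the ranking toward semantic similarity.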
Usage
Call this for any retrieval operation: chat-time RAG, search application queries, or the retrieval testing interface.
Code Reference
Source Location
- Repository: ragflow
- File: rag/nlp/search.py
- Lines: L362-512
Signature
class Dealer:
    async def retrieval(
        self,
        question: str,
        embd_mdl,
        tenant_ids: str | list[str],
        kb_ids: list[str],
        page: int,
        page_size: int,
        similarity_threshold: float = 0.2,
        vector_similarity_weight: float = 0.3,
        top: int = 1024,
        doc_ids: list[str] | None = None,
        aggs: bool = True,
        rerank_mdl=None,
        highlight: bool = False,
        rank_feature: dict | None = {PAGERANK_FLD: 10},
    ) -> dict:
        """Hybrid retrieval with vector + keyword search.

        Args:
            question: Query text.
            embd_mdl: Embedding model (LLMBundle).
            tenant_ids: Tenant ID(s) for index scoping.
            kb_ids: Knowledge base IDs to search.
            page: Page number (1-indexed).
            page_size: Results per page.
            similarity_threshold: Minimum similarity score (default 0.2).
            vector_similarity_weight: Vector vs keyword weight (default 0.3).
            top: Initial retrieval pool size (default 1024).
            doc_ids: Optional document ID filter.
            aggs: Include document aggregations (default True).
            rerank_mdl: Optional reranking model.
            highlight: Include text highlights (default False).
            rank_feature: Ranking features with weights.

        Returns:
            dict with keys: total (int), chunks (list[dict]), doc_aggs (list[dict])
        """
Import
from rag.nlp.search import Dealer, index_name
# Or access via settings.retriever (pre-initialized Dealer instance)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| question | str | Yes | Query text |
| embd_mdl | LLMBundle | Yes | Embedding model for dense search |
| tenant_ids | str or list[str] | Yes | Tenant ID(s) |
| kb_ids | list[str] | Yes | Knowledge base IDs |
| page | int | Yes | Page number (1-indexed) |
| page_size | int | Yes | Results per page |
| similarity_threshold | float | No | Minimum similarity (default 0.2) |
| vector_similarity_weight | float | No | Vector weight (default 0.3) |
| top | int | No | Pool size (default 1024) |
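| doc_ids | list[str] or None | No | Optional document ID filter (default None) |
| aggs | bool | No | Include document aggregations (default True) |
| rerank_mdl | RerankModel or None | No | Optional cross-encoder reranker |
| highlight | bool | No | Include text highlights (default False) |
| rank_feature | dict or None | No | Ranking features with weights (default {PAGERANK_FLD: 10}) |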
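Because `page` is 1-indexed, the relationship between `page`, `page_size`, and the ranked candidate pool can be sketched as a plain slice. This is an illustrative helper under that assumption, not the pagination code Dealer uses; `page_slice` is a hypothetical name.

```python
def page_slice(ranked: list, page: int, page_size: int) -> list:
    """Return the items for a 1-indexed page of an already ranked list."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must both be >= 1")
    start = (page - 1) * page_size  # page 1 starts at offset 0
    return ranked[start:start + page_size]


if __name__ == "__main__":
    pool = list(range(25))  # stand-in for 25 ranked chunks
    print(page_slice(pool, page=1, page_size=10))
    print(page_slice(pool, page=3, page_size=10))
```

A page past the end of the pool yields an empty list rather than an error, so a caller can stop paging when it receives no items.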
Outputs
| Name | Type | Description |
|---|---|---|
| total | int | Total matching chunks |
| chunks | list[dict] | Retrieved chunks with chunk_id, content_ltks, doc_id, docnm_kwd, similarity |
| doc_aggs | list[dict] | Document aggregations with doc_name, doc_id, count |
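To show how `doc_aggs` relates to `chunks`, the following sketch groups chunk dicts by `doc_id` and counts them, using only the field names listed in the table above. It is an assumption about the shape of the aggregation, not the store-side aggregation that Dealer performs; `aggregate_by_document` is a hypothetical helper.

```python
from collections import Counter


def aggregate_by_document(chunks: list[dict]) -> list[dict]:
    """Count chunks per document, most frequent document first."""
    counts: Counter = Counter()
    names: dict[str, str] = {}
    for chunk in chunks:
        counts[chunk["doc_id"]] += 1
        # Remember the first document name seen for each doc_id.
        names.setdefault(chunk["doc_id"], chunk["docnm_kwd"])
    return [
        {"doc_name": names[doc_id], "doc_id": doc_id, "count": count}
        for doc_id, count in counts.most_common()
    ]


if __name__ == "__main__":
    sample_chunks = [
        {"chunk_id": "c1", "doc_id": "d1", "docnm_kwd": "policy.pdf"},
        {"chunk_id": "c2", "doc_id": "d2", "docnm_kwd": "faq.md"},
        {"chunk_id": "c3", "doc_id": "d1", "docnm_kwd": "policy.pdf"},
    ]
    for agg in aggregate_by_document(sample_chunks):
        print(agg)
```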
Usage Examples
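from common import settings
from api.db.services.llm_service import LLMBundle
from common.constants import LLMType


# Illustrative wrapper: retrieval is a coroutine, so it must be awaited inside an async function.
# tenant_id and embd_id (the embedding model name) are supplied by the caller.
async def search_refund_policy(tenant_id: str, embd_id: str) -> None:
    # Initialize embedding model
    embd_mdl = LLMBundle(tenant_id, LLMType.EMBEDDING, llm_name=embd_id)

    # Perform hybrid retrieval
    results = await settings.retriever.retrieval(
        question="What is the refund policy?",
        embd_mdl=embd_mdl,
        tenant_ids=tenant_id,
        kb_ids=["kb-uuid-1", "kb-uuid-2"],
        page=1,
        page_size=10,
        similarity_threshold=0.2,
        vector_similarity_weight=0.3,
        top=1024,
    )

    # Print each chunk's score and the first 100 characters of its content
    for chunk in results["chunks"]:
        print(f"Score: {chunk['similarity']:.3f} - {chunk['content_ltks'][:100]}")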