Implementation:Infiniflow Ragflow Dealer Retrieval

Knowledge Sources	RAGFlow
Domains	RAG, Information_Retrieval
Last Updated	2026-02-12 06:00 GMT

Overview

A concrete tool, provided by RAGFlow's Dealer class, for hybrid vector and keyword retrieval with optional reranking.

Description

The Dealer.retrieval method orchestrates hybrid search across Elasticsearch/Infinity document stores. It embeds the query using the tenant's configured embedding model, performs combined vector and keyword search, applies similarity thresholds, optionally reranks results with a cross-encoder, paginates, and returns chunks with document aggregations.
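
The scoring, thresholding, and pagination steps described above can be illustrated with a minimal sketch. It assumes the hybrid score is a weighted sum in which vector_similarity_weight applies to the vector score and the remainder to the keyword score; the Candidate type and the hybrid_rank function are hypothetical names for illustration, not part of RAGFlow, and the embedding, store query, and reranking steps are omitted.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate chunk with precomputed similarities in [0, 1]."""
    chunk_id: str
    term_sim: float    # keyword (term) similarity
    vector_sim: float  # dense vector similarity


def hybrid_rank(candidates, vector_similarity_weight=0.3,
                similarity_threshold=0.2, page=1, page_size=10):
    """Fuse scores, drop weak matches, sort, and return one page.

    Assumption: the fused score is a simple weighted sum; the real
    Dealer.retrieval may compute it differently.
    """
    term_weight = 1.0 - vector_similarity_weight
    scored = [
        {
            "chunk_id": c.chunk_id,
            "similarity": vector_similarity_weight * c.vector_sim
            + term_weight * c.term_sim,
        }
        for c in candidates
    ]
    # Apply the similarity threshold, then order best-first.
    kept = [s for s in scored if s["similarity"] >= similarity_threshold]
    kept.sort(key=lambda s: s["similarity"], reverse=True)
    # Pages are 1-indexed, matching the page parameter of retrieval.
    start = (page - 1) * page_size
    return {"total": len(kept), "chunks": kept[start:start + page_size]}
```

With the default weight of 0.3, a chunk that matches strongly on keywords outranks one that matches strongly only on vectors, which is one way to read the default as favoring lexical matches.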

Usage

Call this for any retrieval operation: chat-time RAG, search application queries, or the retrieval testing interface.

Code Reference

Source Location

Repository: ragflow
File: rag/nlp/search.py
Lines: L362-512

Signature

class Dealer:
    async def retrieval(
        self,
        question: str,
        embd_mdl,
        tenant_ids: str | list[str],
        kb_ids: list[str],
        page: int,
        page_size: int,
        similarity_threshold: float = 0.2,
        vector_similarity_weight: float = 0.3,
        top: int = 1024,
        doc_ids: list[str] | None = None,
        aggs: bool = True,
        rerank_mdl=None,
        highlight: bool = False,
        rank_feature: dict | None = {PAGERANK_FLD: 10},
    ) -> dict:
        """Hybrid retrieval with vector + keyword search.

        Args:
            question: Query text.
            embd_mdl: Embedding model (LLMBundle).
            tenant_ids: Tenant ID(s) for index scoping.
            kb_ids: Knowledge base IDs to search.
            page: Page number (1-indexed).
            page_size: Results per page.
            similarity_threshold: Minimum similarity score (default 0.2).
            vector_similarity_weight: Weight of vector similarity relative to keyword similarity (default 0.3).
            top: Initial retrieval pool size (default 1024).
            doc_ids: Optional document ID filter.
            aggs: Include document aggregations (default True).
            rerank_mdl: Optional reranking model.
            highlight: Include text highlights (default False).
            rank_feature: Ranking features with weights.

        Returns:
            dict with keys: total (int), chunks (list[dict]), doc_aggs (list[dict])
        """

Import

from rag.nlp.search import Dealer, index_name
# Or access via settings.retriever (pre-initialized Dealer instance)

I/O Contract

Inputs

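Name	Type	Required	Description
question	str	Yes	Query text
embd_mdl	LLMBundle	Yes	Embedding model for dense search
tenant_ids	str or list[str]	Yes	Tenant ID(s) for index scoping
kb_ids	list[str]	Yes	Knowledge base IDs to search
page	int	Yes	Page number (1-indexed)
page_size	int	Yes	Results per page
similarity_threshold	float	No	Minimum similarity score (default 0.2)
vector_similarity_weight	float	No	Weight of vector similarity relative to keyword similarity (default 0.3)
top	int	No	Initial retrieval pool size (default 1024)
doc_ids	list[str] or None	No	Optional document ID filter (default None)
aggs	bool	No	Include document aggregations (default True)
rerank_mdl	RerankModel or None	No	Optional cross-encoder reranker (default None)
highlight	bool	No	Include text highlights (default False)
rank_feature	dict or None	No	Ranking features with weights (default {PAGERANK_FLD: 10})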

Outputs

Name	Type	Description
total	int	Total matching chunks
chunks	list[dict]	Retrieved chunks with chunk_id, content_ltks, doc_id, docnm_kwd, similarity
doc_aggs	list[dict]	Document aggregations with doc_name, doc_id, count
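
To make the relationship between chunks and doc_aggs concrete, the sketch below derives per-document counts from a list of chunk dicts. It assumes only the field names listed in the tables above (doc_id, docnm_kwd, doc_name, count); aggregate_docs is a hypothetical helper, and RAGFlow may compute its aggregations differently, for example in the document store itself.

```python
from collections import Counter


def aggregate_docs(chunks):
    """Count chunks per document, most frequent document first.

    Hypothetical helper: mirrors the documented doc_aggs shape
    (doc_name, doc_id, count), not RAGFlow's actual implementation.
    """
    counts = Counter()
    names = {}
    for chunk in chunks:
        doc_id = chunk["doc_id"]
        counts[doc_id] += 1
        # Remember the first document name seen for each doc_id.
        names.setdefault(doc_id, chunk["docnm_kwd"])
    return [
        {"doc_name": names[doc_id], "doc_id": doc_id, "count": n}
        for doc_id, n in counts.most_common()
    ]
```

A caller could use such counts to show which source documents contributed most to an answer.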

Usage Examples

from common import settings
from api.db.services.llm_service import LLMBundle
from common.constants import LLMType


async def search_refund_policy(tenant_id: str, embd_id: str) -> dict:
    # tenant_id and embd_id (the embedding model name) are supplied by the caller.
    # Initialize the tenant's embedding model.
    embd_mdl = LLMBundle(tenant_id, LLMType.EMBEDDING, llm_name=embd_id)

    # Perform hybrid retrieval; await requires an async context.
    results = await settings.retriever.retrieval(
        question="What is the refund policy?",
        embd_mdl=embd_mdl,
        tenant_ids=tenant_id,
        kb_ids=["kb-uuid-1", "kb-uuid-2"],
        page=1,
        page_size=10,
        similarity_threshold=0.2,
        vector_similarity_weight=0.3,
        top=1024,
    )

    # Print each chunk's score and the first 100 characters of its content.
    for chunk in results["chunks"]:
        print(f"Score: {chunk['similarity']:.3f} - {chunk['content_ltks'][:100]}")
    return results
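
Because results are paginated through page and page_size, collecting every match takes repeated calls. The following sketch shows one way to do that against any coroutine that returns the documented total and chunks keys. Both fetch_all and fake_retrieve are hypothetical: fake_retrieve is a stand-in so the example runs without a RAGFlow deployment, and a real caller would pass a wrapper around the retrieval method instead.

```python
import asyncio


async def fetch_all(retrieve, page_size=10, max_pages=100):
    """Collect chunks page by page until the reported total is reached.

    retrieve is any coroutine function accepting page and page_size and
    returning a dict with "total" and "chunks" keys.
    """
    chunks = []
    page = 1
    while page <= max_pages:
        result = await retrieve(page=page, page_size=page_size)
        chunks.extend(result["chunks"])
        # Stop on an empty page or once everything has been collected.
        if not result["chunks"] or len(chunks) >= result["total"]:
            break
        page += 1
    return chunks


async def fake_retrieve(page, page_size):
    """Stand-in retriever over 25 dummy chunks (illustration only)."""
    data = [{"chunk_id": f"c{i}"} for i in range(25)]
    start = (page - 1) * page_size
    return {"total": len(data), "chunks": data[start:start + page_size]}


if __name__ == "__main__":
    all_chunks = asyncio.run(fetch_all(fake_retrieve, page_size=10))
    print(len(all_chunks))
```

The max_pages cap is a defensive choice so that an inconsistent total cannot cause an unbounded loop.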

Related Pages

Implements Principle
