

Principle:Langgenius Dify Retrieval Configuration

From Leeroopedia


Knowledge Sources
Domains: RAG, Information Retrieval, Search Configuration
Last Updated: 2026-02-08 00:00 GMT

Overview

Retrieval configuration defines the strategy for finding and ranking relevant document chunks at query time in a RAG system, encompassing search method selection, reranking, top-k limits, score thresholds, and weighted scoring.

Description

A knowledge base is only as useful as its ability to surface the right content when queried. Retrieval configuration is the set of parameters that controls how the system searches the index and which results are returned to the language model's context window.

The configuration surface includes:

  • Search method -- semantic, full-text, or hybrid search, determining the underlying retrieval algorithm.
  • Reranking -- an optional second-pass model that re-scores initial retrieval results for higher precision.
  • Top-k -- the maximum number of segments returned to the caller.
  • Score threshold -- a minimum relevance score below which results are discarded.
  • Weighted scoring -- when using hybrid search, the relative weights assigned to semantic and keyword scores.
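Grouped together, this configuration surface can be pictured as a single settings object. The sketch below is illustrative only: the key names are modeled on Dify's dataset retrieval settings but may differ across versions, and the provider/model names are placeholders.

```python
# Illustrative retrieval settings object; key and value names are
# assumptions modeled on Dify's dataset API, not a verbatim schema.
retrieval_model = {
    "search_method": "hybrid_search",   # semantic_search | full_text_search | hybrid_search
    "reranking_enable": True,           # optional second-pass re-scoring
    "reranking_model": {
        "reranking_provider_name": "cohere",            # placeholder provider
        "reranking_model_name": "rerank-english-v3.0",  # placeholder model
    },
    "top_k": 5,                         # max segments returned to the caller
    "score_threshold_enabled": True,
    "score_threshold": 0.5,             # discard results scoring below this
}
```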

These parameters interact with each other. For example, enabling reranking changes the effective ranking order, which changes which results fall above or below the score threshold. Operators must understand these interactions to tune retrieval effectively.

Usage

Configure retrieval settings when:

  • Setting up a new knowledge base -- default retrieval configuration is applied during creation but can be customized.
  • Tuning retrieval quality -- after observing poor results in hit testing, an operator adjusts top-k, thresholds, or reranking settings.
  • Switching search methods -- moving from semantic to hybrid search (or vice versa) to better match the query patterns of the application.
  • Integrating a reranking model -- enabling a cross-encoder reranker to improve precision at the cost of additional latency.

Theoretical Basis

Search Methods in Depth

Semantic Search

Semantic search relies entirely on vector similarity:

query_vector = embed(query_text)
results = vector_db.search(query_vector, top_k=k)

Strengths: captures meaning, handles paraphrases. Weaknesses: can miss exact keyword matches; sensitive to embedding model quality.
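The two pseudocode lines above can be made concrete with a toy in-memory store ranked by cosine similarity. Here the pre-computed vectors stand in for both the embedding model and the vector database; nothing below is a real vector-DB API.

```python
import math

def cosine_top_k(query_vec, doc_vecs, k):
    # Rank documents by cosine similarity to the query vector.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b)

    sims = [(i, cos(query_vec, d)) for i, d in enumerate(doc_vecs)]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]  # (doc_index, similarity) pairs, best first

docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]  # toy "embeddings"
query = [1.0, 0.2]
print(cosine_top_k(query, docs, k=2))
```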

Full-Text Search

Full-text search uses lexical matching with BM25 or similar algorithms:

results = inverted_index.search(tokenize(query_text), top_k=k)

Strengths: fast, deterministic, excellent for exact terms. Weaknesses: no semantic understanding; vocabulary mismatch causes recall loss.
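A minimal inverted-index lookup illustrates the lexical path. For brevity this scores by raw term frequency rather than full BM25 (no IDF or length normalization), but the index structure and the vocabulary-mismatch weakness ("errors" does not match "error") are the same.

```python
from collections import Counter, defaultdict

def build_index(docs):
    # Map each token to {doc_id: term_frequency}.
    index = defaultdict(Counter)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index[token][doc_id] += 1
    return index

def search(index, query, top_k):
    # Score = summed term frequency over the query tokens present in a doc.
    scores = Counter()
    for token in query.lower().split():
        for doc_id, tf in index[token].items():
            scores[doc_id] += tf
    return scores.most_common(top_k)

docs = ["error code E42 in module parser",
        "parser handles syntax errors",      # "errors" != "error": missed
        "network timeout error E42"]
index = build_index(docs)
print(search(index, "E42 error", top_k=2))
```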

Hybrid Search

Hybrid search combines both signals. The combination can use several fusion strategies:

semantic_results = vector_db.search(query_vector, top_k=k)
keyword_results  = inverted_index.search(tokens, top_k=k)

# Weighted score fusion
for each result in union(semantic_results, keyword_results):
    score = w_semantic * semantic_score + w_keyword * keyword_score

# Or reciprocal rank fusion (RRF)
for each result:
    rrf_score = sum(1 / (rank_constant + rank_in_list) for each list containing result)
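The RRF branch of the sketch above, made runnable. The rank constant of 60 is the value commonly used in the literature; note that RRF needs only each list's ordering, not comparable scores.

```python
def rrf(ranked_lists, rank_constant=60):
    # Each input list holds doc ids ordered best-first (rank 1, 2, ...).
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Fused order: highest accumulated reciprocal-rank score first.
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]   # semantic retriever's ranking
keyword  = ["d1", "d4", "d3"]   # keyword retriever's ranking
print(rrf([semantic, keyword]))
```

Documents that appear near the top of both lists (here "d1" and "d3") dominate documents that rank highly in only one.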

Reranking

Reranking is a two-stage retrieval pattern:

Stage 1: Retrieve top-N candidates (N >> k) using fast first-stage retrieval
Stage 2: Score each candidate using a cross-encoder reranking model
Stage 3: Sort by reranker score, return top-k

Cross-encoder rerankers are more accurate than bi-encoder similarity because they attend to the query and document jointly, but they are also much slower (O(N) model calls vs. one embedding call + index lookup). This is why reranking is applied to a small candidate set rather than the entire corpus.
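The three stages can be sketched as a small pipeline. Both stand-ins here are toys: `first_stage` returns a fixed candidate list, and word overlap plays the role of the cross-encoder; a real deployment would call an index and a reranking model at those two points.

```python
def two_stage_retrieve(query, first_stage, cross_encoder_score, candidate_n, top_k):
    # Stage 1: cheap retrieval of a wide candidate set (candidate_n >> top_k).
    candidates = first_stage(query, candidate_n)
    # Stage 2: expensive joint scoring of (query, doc) pairs -- O(candidate_n) model calls.
    scored = [(doc, cross_encoder_score(query, doc)) for doc in candidates]
    # Stage 3: sort by reranker score and keep only the best top_k.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-ins for the real retriever and reranker:
first_stage = lambda q, n: ["the cat sat", "dogs bark loudly", "a cat and a dog"][:n]
overlap = lambda q, d: len(set(q.split()) & set(d.split()))

print(two_stage_retrieve("cat dog", first_stage, overlap, candidate_n=3, top_k=1))
```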

Dify supports two reranking modes:

  • Reranking Model -- a dedicated cross-encoder model (e.g., Cohere Rerank, BGE Reranker) re-scores candidates. Use when maximum precision is needed and the latency budget allows.
  • Weighted Score -- adjusts the blend of semantic and keyword scores without an external model. Use when a reranking model is unavailable or latency is critical.

Weighted Score Presets

For hybrid search with weighted scoring, Dify provides preset weight configurations:

  • Semantic First -- semantic weight 0.7, keyword weight 0.3. Use when natural-language queries predominate.
  • Keyword First -- semantic weight 0.3, keyword weight 0.7. Use when technical or ID-based queries predominate.
  • Customized -- user-defined weights for domain-specific tuning.
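Weighted fusion with these presets reduces to a per-document blend of the two score lists. The preset keys below are illustrative labels for the presets above, not Dify identifiers; a document missing from one retriever's results contributes zero for that signal.

```python
# Illustrative preset names mapping to (semantic_weight, keyword_weight).
PRESETS = {
    "semantic_first": (0.7, 0.3),
    "keyword_first": (0.3, 0.7),
}

def weighted_fuse(semantic_scores, keyword_scores, preset="semantic_first"):
    # Blend per-document scores; missing docs score 0 for that signal.
    w_sem, w_kw = PRESETS[preset]
    doc_ids = set(semantic_scores) | set(keyword_scores)
    fused = {d: w_sem * semantic_scores.get(d, 0.0) + w_kw * keyword_scores.get(d, 0.0)
             for d in doc_ids}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

sem = {"d1": 0.9, "d2": 0.4}   # semantic retriever scores
kw  = {"d2": 0.8, "d3": 0.6}   # keyword retriever scores
print(weighted_fuse(sem, kw, preset="semantic_first"))
```

Switching the preset to "keyword_first" on the same inputs promotes "d2", showing how the weights steer the final ranking without any extra model call.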

Top-k and Score Threshold

  • Top-k controls the maximum number of segments returned. Lower values reduce noise but risk missing relevant content. Typical range: 1--10.
  • Score threshold acts as a quality gate. Results below the threshold are discarded even if fewer than k results remain. This prevents low-quality segments from polluting the context window.
candidates = retrieve(query, top_k=k)
if score_threshold_enabled:
    candidates = [c for c in candidates if c.score >= score_threshold]
return candidates
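A runnable version of the gate above (`apply_limits` is an illustrative helper, not a Dify API) makes the key interaction visible: the threshold can leave fewer than top-k results.

```python
def apply_limits(candidates, top_k, score_threshold=None):
    # candidates: (doc_id, score) pairs already ranked best-first.
    kept = candidates[:top_k]
    if score_threshold is not None:
        # Quality gate: may return fewer than top_k results.
        kept = [(d, s) for d, s in kept if s >= score_threshold]
    return kept

ranked = [("d1", 0.91), ("d2", 0.55), ("d3", 0.31)]
print(apply_limits(ranked, top_k=3, score_threshold=0.5))
```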

Configuration Interaction Matrix

  • search_method -- which index types are queried
  • reranking_enable -- whether a second pass re-scores results
  • reranking_model -- which model performs reranking (if enabled)
  • top_k -- upper bound on result count
  • score_threshold -- lower bound on result quality
  • weighted_score -- balance between semantic and keyword signals (hybrid only)
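How these settings combine at query time can be sketched as one dispatch function. This is a schematic of the control flow only, with the retrievers and reranker passed in as plain callables; it is not Dify's implementation.

```python
def retrieve(query, cfg, semantic_fn, keyword_fn, rerank_fn=None):
    # search_method decides which index types are queried.
    if cfg["search_method"] == "semantic_search":
        results = semantic_fn(query)
    elif cfg["search_method"] == "full_text_search":
        results = keyword_fn(query)
    else:  # hybrid: query both, merge by doc id keeping the higher score
        merged = {}
        for doc_id, score in semantic_fn(query) + keyword_fn(query):
            merged[doc_id] = max(score, merged.get(doc_id, 0.0))
        results = list(merged.items())
    if cfg.get("reranking_enable") and rerank_fn is not None:
        results = rerank_fn(query, results)   # second pass re-scores results
    results.sort(key=lambda r: r[1], reverse=True)
    threshold = cfg.get("score_threshold")
    if threshold is not None:
        results = [r for r in results if r[1] >= threshold]  # quality floor
    return results[:cfg["top_k"]]             # count ceiling

cfg = {"search_method": "hybrid_search", "top_k": 2, "score_threshold": 0.3}
sem = lambda q: [("d1", 0.9), ("d2", 0.2)]
kw  = lambda q: [("d2", 0.7), ("d3", 0.1)]
print(retrieve("q", cfg, sem, kw))
```

The ordering matters: reranking runs before the threshold, so enabling it changes which results survive the quality gate, exactly the interaction described earlier.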

Related Pages

Implemented By
