Heuristic:Infiniflow Ragflow Citation Threshold Decay
| Knowledge Sources | |
|---|---|
| Domains | Retrieval, Optimization |
| Last Updated | 2026-02-12 06:00 GMT |
Overview
Citation matching uses exponential threshold decay from 0.63 to 0.3 (multiplied by 0.8 each iteration) to dynamically find the best similarity cutoff for inserting chunk citations into LLM answers.
Description
When RAGFlow inserts citations into LLM-generated answers, it needs to match answer sentences to source chunks. The system starts with a strict similarity threshold of 0.63 and progressively relaxes it by multiplying by 0.8 each iteration. The loop stops as soon as at least one answer piece has been matched, or when the threshold falls to 0.3 or below. This favors high-quality citations when possible while still providing citations for answers that paraphrase source material heavily. Each matched answer piece gets up to 4 source citations.
Usage
This heuristic is applied automatically during citation insertion in the `Dealer.insert_citations()` method. Understanding it helps when debugging why certain answers have no citations (the similarity may be below 0.3 even after decay) or why citations seem imprecise (the threshold decayed to a low value).
The Insight (Rule of Thumb)
- Action: Start citation similarity threshold at 0.63, decay by 0.8x per iteration, stop at 0.3 floor.
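- Value: Decay sequence: 0.63 → 0.504 → 0.403 → 0.323 → stop (the next value, about 0.258, is below 0.3).
- Trade-off: A higher initial threshold favors precise citations. The decay makes it more likely that at least some citations appear, but it does not guarantee them: if no piece matches at any threshold above 0.3, the answer gets no citations. Below 0.3, citations would be too imprecise to be useful.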
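The decay sequence can be reproduced with a few lines of Python. This is a minimal sketch: `threshold_schedule` is a hypothetical helper name used for illustration, not a RAGFlow function, and it models only the `thr > 0.3` part of the loop condition.

```python
def threshold_schedule(start=0.63, factor=0.8, floor=0.3):
    """Return every threshold the decay loop would try, in order."""
    schedule = []
    thr = start
    while thr > floor:          # same comparison as the `thr > 0.3` loop guard
        schedule.append(thr)
        thr *= factor           # relax the cutoff for the next pass
    return schedule


schedule = threshold_schedule()
print([round(t, 3) for t in schedule])
```

The list has four entries, so the matching loop runs at most four passes: one at the initial threshold and three at relaxed values.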
Reasoning
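The 0.63 starting threshold was empirically chosen as the point where hybrid similarity (combining token and vector scores) reliably indicates a source-answer match. With the 0.8 decay factor, the loop tries four thresholds (the initial 0.63 plus three relaxed values) before the next value, about 0.258, falls below the 0.3 floor. The 0.99 multiplier on `np.max(sim)` sets the selection cutoff slightly below the best score, so the strict `sim[ii] > mx` comparison keeps the top-scoring chunk together with any chunk scoring within 1% of it. The same scaled value is compared against the threshold, so a piece needs a raw best score of roughly `thr / 0.99` to pass. The limit of 4 citations per answer piece prevents citation overload.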
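The effect of the 0.99 multiplier and the cap of 4 is easier to see on a single answer piece. The sketch below assumes the similarity scores are already computed and positive; `select_chunks` and `max_cites` are hypothetical names for illustration. It keeps the selected indices in index order, whereas the quoted RAGFlow code passes them through `set`, whose iteration order is not guaranteed.

```python
import numpy as np


def select_chunks(sim, thr, max_cites=4):
    """Pick chunk indices for one answer piece at a given threshold."""
    sim = np.asarray(sim, dtype=float)
    mx = np.max(sim) * 0.99      # cutoff sits 1% below the best score
    if mx < thr:                 # best match too weak at this threshold
        return []
    # Strict '>' against the scaled max keeps the top chunk and near-ties.
    picked = [str(ii) for ii in range(len(sim)) if sim[ii] > mx]
    return picked[:max_cites]


near_tie = select_chunks([0.70, 0.695, 0.40, 0.10], thr=0.63)
borderline = select_chunks([0.63, 0.20], thr=0.63)
print(near_tie, borderline)
```

In the second call the raw best score equals the threshold, yet nothing is selected, because the comparison uses the scaled value 0.63 * 0.99 = 0.6237.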
Code Evidence from `rag/nlp/search.py:232-247`:
cites = {}
thr = 0.63
while thr > 0.3 and len(cites.keys()) == 0 and pieces_ and chunks_tks:
    for i, a in enumerate(pieces_):
        sim, tksim, vtsim = self.qryr.hybrid_similarity(
            ans_v[i], chunk_v,
            rag_tokenizer.tokenize(self.qryr.rmWWW(pieces_[i])).split(),
            chunks_tks, tkweight, vtweight)
        mx = np.max(sim) * 0.99
        if mx < thr:
            continue
        cites[idx[i]] = list(
            set([str(ii) for ii in range(len(chunk_v)) if sim[ii] > mx]))[:4]
    thr *= 0.8
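To experiment with this behavior outside RAGFlow, the loop can be approximated over a precomputed similarity matrix. This is a simplified sketch, not the library's implementation: `match_citations` is a hypothetical name, `sims[i][j]` stands in for the output of `hybrid_similarity` for answer piece `i` and chunk `j`, and pieces are keyed by their own index rather than through `idx`.

```python
import numpy as np


def match_citations(sims, start=0.63, factor=0.8, floor=0.3, max_cites=4):
    """Return (cites, threshold_used) for a piece-by-chunk similarity matrix."""
    sims = np.asarray(sims, dtype=float)
    cites = {}
    thr = start
    used_thr = None
    # Keep relaxing only while nothing has been cited yet.
    while thr > floor and not cites and sims.size:
        for i, sim in enumerate(sims):
            mx = np.max(sim) * 0.99
            if mx < thr:
                continue
            cites[i] = [str(j) for j in range(len(sim)) if sim[j] > mx][:max_cites]
        if cites:
            used_thr = thr       # threshold of the pass that produced matches
        thr *= factor
    return cites, used_thr


# Piece 0 matches on the first pass, so the loop never relaxes far enough
# for piece 1, even though piece 1 would pass at the third threshold.
strong_first, thr_a = match_citations([[0.70, 0.20, 0.10],
                                       [0.45, 0.44, 0.05]])
# No strong piece: the threshold decays until piece 0 passes.
paraphrased, thr_b = match_citations([[0.45, 0.449, 0.05],
                                      [0.10, 0.15, 0.12]])
# Nothing clears the last threshold above the floor: no citations at all.
uncited, thr_c = match_citations([[0.20, 0.25]])
print(strong_first, paraphrased, uncited)
```

The first case is worth keeping in mind when debugging: because the loop exits as soon as any piece is cited, weaker pieces in the same answer can remain uncited even though a lower threshold would have matched them.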