Principle:AnswerDotAI RAGatouille Document Reranking

Knowledge Sources	ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT; Passage Re-ranking with BERT; RAGatouille
Domains	NLP, Information_Retrieval, Reranking
Last Updated	2026-02-12 12:00 GMT

Overview

The second stage of a two-stage retrieval pipeline: it re-scores and reorders a set of candidate documents using ColBERT's token-level late interaction, typically producing a more accurate relevance ranking than the initial retrieval stage.

Description

Document Reranking applies ColBERT's fine-grained MaxSim scoring to a pre-selected set of candidate documents. Unlike index-based search, which uses approximate scoring via PLAID, reranking encodes the query and all candidate documents on the fly and then computes exact MaxSim scores. This makes it well suited as a second-stage ranker in a retrieve-then-rerank pipeline.

The reranking process:

1. The query and all candidate documents are encoded into token-level embeddings on the fly.
2. Exact MaxSim scores are computed between the query and each document.
3. Documents are sorted by score, and the top-k results are returned.

Two safeguards apply along the way: the maximum document token length is adjusted automatically based on the 90th percentile of document lengths, and a warning is issued when the collection exceeds 1000 documents or contains duplicates.
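
The length adjustment and the warnings described above can be sketched roughly as follows. This is a minimal illustration, not RAGatouille's actual code: the function names `percentile_90` and `preflight` are hypothetical, whitespace splitting stands in for the model tokenizer, and the nearest-rank percentile method is an assumption, since the source does not say how the percentile is computed.

```python
import warnings


def percentile_90(lengths):
    """Nearest-rank 90th percentile of a non-empty list of integer lengths."""
    ordered = sorted(lengths)
    rank = (9 * len(ordered) + 9) // 10  # integer ceil(0.9 * n), 1-based rank
    return ordered[rank - 1]


def preflight(documents, warn_above=1000):
    """Warn about risky candidate sets and suggest a max document length.

    Hypothetical helper: the 1000-document threshold and the duplicate check
    come from the text above; everything else is an illustrative assumption.
    """
    if len(documents) > warn_above:
        warnings.warn(
            f"Reranking {len(documents)} documents; each one must be encoded."
        )
    if len(set(documents)) != len(documents):
        warnings.warn("The candidate set contains duplicate documents.")
    # Assumption: whitespace tokens approximate the tokenizer's token count.
    lengths = [len(doc.split()) for doc in documents]
    return percentile_90(lengths)


candidates = [
    "short passage",
    "a somewhat longer candidate passage",
    "mid length text here",
]
suggested_max_length = preflight(candidates)
```

Basing the limit on a high percentile rather than the maximum keeps a single very long document from inflating the padded length for the whole batch.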

Usage

Use this principle when you have a set of candidate documents from a first-stage retriever (BM25, dense retrieval, etc.) and want to improve ranking quality using ColBERT's late interaction. Common scenarios:

Re-ranking BM25 results with ColBERT
Re-ranking dense retriever outputs for higher precision
Scoring a small set of candidate passages for a RAG pipeline

Reranking cost grows with the number of candidates, since every document must be encoded at query time, so keep the candidate set small.

Theoretical Basis

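Reranking computes the exact ColBERT MaxSim score, where $E_{q_i}$ is the embedding of the $i$-th query token and $E_{d_j}$ is the embedding of the $j$-th document token: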

$S(q, d) = \sum_{i=1}^{|q|} \max_{j=1}^{|d|} E_{q_i} \cdot E_{d_j}^{T}$
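
As a worked instance of the formula, the sketch below computes MaxSim in plain Python over hand-written two-dimensional embeddings. The vectors are made up for illustration; a real ColBERT model would supply contextual embeddings, and the helper names `dot` and `maxsim` are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim(query_emb, doc_emb):
    """For each query token, take its best match among document tokens, then sum."""
    return sum(max(dot(q, d) for d in doc_emb) for q in query_emb)


# Toy embeddings (illustrative values only): one row per token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has an exact match for each query token
doc_b = [[0.5, 0.5]]                          # only a partial match for each query token

score_a = maxsim(query, doc_a)  # 1.0 + 1.0 = 2.0
score_b = maxsim(query, doc_b)  # 0.5 + 0.5 = 1.0
```

Each query token contributes only its single best-matching document token, which is why a document with one strong match per query token outscores one with a uniformly weak match.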

In a two-stage pipeline:

Stage 1 (Retrieval): A fast first-stage retriever (BM25, dense bi-encoder, or PLAID) generates a candidate set of N documents.

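Stage 2 (Reranking): ColBERT re-scores all N candidates using full late interaction, producing a more accurate final ranking. With |q| query tokens and |d| document tokens, scoring costs O(N × |q| × |d|) similarity computations on top of encoding all N documents, but the resulting scores are typically more accurate than first-stage scores.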
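
The two stages can be sketched end to end on a toy corpus. Everything here is a stand-in chosen so the example runs without a model: `first_stage` counts query-term occurrences in place of BM25, `toy_encode` produces one-hot token vectors in place of ColBERT's contextual embeddings, and all function names are hypothetical rather than part of RAGatouille.

```python
def first_stage(query, corpus, n):
    """Toy lexical retriever (stand-in for BM25): count query-term occurrences."""
    terms = set(query.lower().split())
    scored = [
        (sum(tok in terms for tok in doc.lower().split()), idx)
        for idx, doc in enumerate(corpus)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:n]]


def toy_encode(text, vocab):
    """One-hot vector per token; a real encoder returns contextual embeddings."""
    return [
        [1.0 if vocab[tok] == k else 0.0 for k in range(len(vocab))]
        for tok in text.lower().split()
    ]


def maxsim(q_emb, d_emb):
    """Sum over query tokens of the best dot product against any document token."""
    return sum(
        max(sum(a * b for a, b in zip(q, d)) for d in d_emb) for q in q_emb
    )


def rerank(query, corpus, candidate_ids, k):
    """Stage 2: encode query and candidates, score with MaxSim, return top-k."""
    texts = [query] + [corpus[i] for i in candidate_ids]
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    vocab = {tok: idx for idx, tok in enumerate(tokens)}
    q_emb = toy_encode(query, vocab)
    scored = [(maxsim(q_emb, toy_encode(corpus[i], vocab)), i) for i in candidate_ids]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


corpus = [
    "colbert colbert colbert colbert",
    "colbert uses late interaction scoring",
    "bm25 is a lexical ranking function",
    "late interaction compares token embeddings",
]
query = "colbert late interaction"

candidates = first_stage(query, corpus, n=3)  # stage 1: cheap candidate generation
top = rerank(query, corpus, candidates, k=2)  # stage 2: (score, corpus index) pairs
```

In this toy setup the first stage ranks the keyword-stuffed document 0 highest because it rewards repetition, while the MaxSim stage moves it to the bottom of the candidate set, since each query token counts at most once.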

Related Pages

Implemented By

Implementation:AnswerDotAI_RAGatouille_RAGPretrainedModel_Rerank
