Implementation:AnswerDotAI RAGatouille RAGPretrainedModel Rerank
| Knowledge Sources | |
|---|---|
| Domains | NLP, Information_Retrieval, Reranking |
| Last Updated | 2026-02-12 12:00 GMT |
Overview
Concrete tool for reranking candidate documents using ColBERT late-interaction scoring provided by the RAGatouille library.
Description
The RAGPretrainedModel.rerank() method scores and reorders a set of candidate documents against a query. It delegates to ColBERT.rank(), which sets the inference maximum token length and then calls ColBERT._index_free_retrieve(). That method encodes both the query and the documents on the fly, computes exact MaxSim scores via ColBERT._index_free_search(), and returns results sorted by score.
The delegation chain:
- RAGPretrainedModel.rerank() → delegates to model.rank()
- ColBERT.rank() → sets max tokens, delegates to _index_free_retrieve()
- ColBERT._index_free_retrieve() → encodes query and docs, calls _index_free_search()
- ColBERT._index_free_search() → computes MaxSim scores, returns top-k
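The MaxSim scoring at the end of this chain can be illustrated with a small sketch. This is an illustrative reimplementation of late-interaction scoring, not the library's internal code: for each query token, take the maximum cosine similarity over all document tokens, then sum over query tokens.

```python
import numpy as np

def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Illustrative MaxSim: per query token, max cosine similarity
    over all document tokens; scores are summed over query tokens."""
    # Normalize rows so dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)
    sim = q @ d.T                        # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())  # max over doc tokens, sum over query

# Toy embeddings: 4 query tokens and 12 doc tokens, dimension 8.
rng = np.random.default_rng(0)
query = rng.normal(size=(4, 8))
doc = rng.normal(size=(12, 8))
print(maxsim_score(query, doc))
```

Ranking candidates by this score is what makes the method index-free: nothing is precomputed, so both sides are encoded and compared at call time.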
Usage
Use this method to rerank candidate documents returned by any first-stage retriever (e.g. BM25). No index is required; documents are encoded on the fly.
Code Reference
Source Location
- Repository: RAGatouille
- File: ragatouille/RAGPretrainedModel.py
- Lines: L325-357
Signature
def rerank(
    self,
    query: Union[str, list[str]],
    documents: list[str],
    k: int = 10,
    zero_index_ranks: bool = False,
    bsize: Union[Literal["auto"], int] = "auto",
) -> Union[list[dict], list[list[dict]]]:
    """Encode documents and rerank them in-memory.

    Parameters:
        query: Query string or list of queries.
        documents: Documents to rerank.
        k: Number of results per query (default 10).
        zero_index_ranks: Zero-based ranking (default False).
        bsize: Batch size for encoding ("auto" or int).

    Returns:
        list[dict] or list[list[dict]]: Reranked results with keys:
            content, score, rank, result_index.
    """
Import
from ragatouille import RAGPretrainedModel
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| query | Union[str, list[str]] | Yes | Query string or list of queries |
| documents | list[str] | Yes | Candidate documents to rerank |
| k | int | No | Number of top results to return (default 10) |
| zero_index_ranks | bool | No | Use zero-based ranking (default False, rank 1 = highest) |
| bsize | Union[Literal["auto"], int] | No | Batch size for encoding (default "auto") |
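The effect of zero_index_ranks on rank numbering can be sketched with a hypothetical helper (not RAGatouille's implementation): rank 1 is the highest-scoring result by default, rank 0 when the flag is set.

```python
def assign_ranks(scores: list[float], zero_index_ranks: bool = False) -> list[dict]:
    """Order candidate indices by descending score and attach ranks,
    mirroring the documented zero_index_ranks behavior (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    offset = 0 if zero_index_ranks else 1
    return [
        {"result_index": i, "rank": pos + offset, "score": scores[i]}
        for pos, i in enumerate(order)
    ]

# rank 1 = highest score by default; rank 0 with zero_index_ranks=True
print(assign_ranks([0.2, 0.9, 0.5]))
print(assign_ranks([0.2, 0.9, 0.5], zero_index_ranks=True))
```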
Outputs
| Name | Type | Description |
|---|---|---|
| return (single query) | list[dict] | Results with keys: content (str), score (float), rank (int), result_index (int) |
| return (batch queries) | list[list[dict]] | List of result lists, one per query |
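Because the return shape depends on whether query is a single string or a list, callers that handle both cases may want to normalize it. A minimal sketch, using a hypothetical helper that is not part of RAGatouille:

```python
def per_query_results(results):
    """Always return list[list[dict]]: wrap single-query output
    (a flat list of result dicts) in an outer list; pass
    batch output (already a list of lists) through unchanged."""
    if results and isinstance(results[0], dict):
        return [results]  # single query -> one result list
    return results        # batch of queries -> already per-query lists

single = [{"content": "doc a", "score": 0.9, "rank": 1, "result_index": 0}]
print(per_query_results(single))            # wrapped: one inner list
print(per_query_results([single, single]))  # unchanged: two inner lists
```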
Usage Examples
Basic Reranking
from ragatouille import RAGPretrainedModel
RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
# Candidate documents from a first-stage retriever
candidates = [
"ColBERT uses contextualized late interaction.",
"BM25 is a bag-of-words retrieval model.",
"Late interaction computes token-level similarities.",
"TF-IDF weights terms by frequency.",
]
results = RAG.rerank(
query="How does ColBERT work?",
documents=candidates,
k=3,
)
for r in results:
print(f"[{r['rank']}] (score: {r['score']:.4f}) {r['content']}")
Reranking BM25 Results
# After getting candidates from a first-stage retriever
# (get_bm25_results is a placeholder for your own BM25 retrieval function)
bm25_candidates = get_bm25_results(query, k=100)
# Rerank with ColBERT for better precision
reranked = RAG.rerank(
    query="What is late interaction?",
    documents=bm25_candidates,
    k=10,
)
Related Pages
Implements Principle
Requires Environment
- Environment:AnswerDotAI_RAGatouille_Python_ColBERT_Dependencies
- Environment:AnswerDotAI_RAGatouille_GPU_CUDA_Runtime