Implementation:Neuml Txtai Reranker
| Knowledge Sources | |
|---|---|
| Domains | Retrieval, Reranking |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Two-stage retrieval pipeline that combines fast embeddings search with precise similarity-based reranking.
Description
The Reranker class implements a classic retrieve-then-rerank strategy. In the first stage, it runs a broad embeddings search to retrieve a large candidate set. In the second stage, it rescores those candidates using a Similarity pipeline (which may use zero-shot classification, cross-encoding, or late interaction) and returns only the top results.
The factor parameter controls the size of the candidate pool: the initial embeddings search requests limit * factor candidates (30 with the defaults limit=3 and factor=10), giving the similarity pipeline a wide pool to rescore. This two-stage approach balances the speed of approximate nearest neighbor (ANN) search against the precision of a cross-encoder or other fine-grained similarity model.
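As a quick check of the arithmetic, the first-stage request size can be computed directly. This is a minimal sketch; candidate_count is a hypothetical helper used only for illustration and is not part of txtai.

```python
def candidate_count(limit=3, factor=10):
    """Number of candidates requested from the first-stage search (hypothetical helper)."""
    return limit * factor


default_pool = candidate_count()                  # defaults: limit=3, factor=10
wider_pool = candidate_count(limit=5, factor=20)  # wider pool for a longer final list

print(default_pool, wider_pool)
```

Every candidate in the pool is rescored by the similarity pipeline, so a larger factor widens the pool at the cost of more second-stage work.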
The pipeline requires content storage to be enabled on the embeddings instance, because the reranker must read the original text of each retrieved document and pass it to the similarity pipeline for rescoring. With content enabled, each retrieved result is a dictionary whose "text" field is extracted and passed to self.similarity().
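The dependency on stored content can be illustrated with plain Python data. The sketch below assumes that a content-enabled search returns dictionaries and that an index without content returns (id, score) tuples; the sample rows and the extract_texts helper are made up for illustration.

```python
# Assumed result shape with content enabled: one dictionary per match
with_content = [
    {"id": 3, "text": "Natural language processing with transformers", "score": 0.62},
    {"id": 1, "text": "Deep neural network architectures", "score": 0.41},
]

# Assumed result shape without content: bare (id, score) tuples, no text available
without_content = [(3, 0.62), (1, 0.41)]


def extract_texts(rows):
    """Pull the "text" field from each result, as the reranker does before rescoring."""
    return [row["text"] for row in rows]


texts = extract_texts(with_content)

try:
    extract_texts(without_content)
    failure = None
except TypeError as error:
    # Tuples cannot be indexed by a string key, so text extraction fails
    failure = type(error).__name__

print(texts, failure)
```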
Usage
Use this pipeline when retrieval quality from a single-stage embeddings search is insufficient and a more precise scoring model is available. This is particularly effective when combining a fast bi-encoder embeddings index with a slower but more accurate cross-encoder similarity model. The embeddings instance must have content=True enabled.
Code Reference
Source Location
- Repository: txtai
- File: src/python/txtai/pipeline/text/reranker.py
- Lines: L1-57
Class Definition
class Reranker(Pipeline):
    """
    Runs embeddings queries and re-ranks them using a similarity pipeline. Note that content must be enabled with the
    embeddings instance for this to work properly.
    """
Constructor Signature
def __init__(self, embeddings, similarity):
The constructor accepts two pre-configured instances: an Embeddings instance (with content storage enabled) and a Similarity pipeline instance. These are stored as self.embeddings and self.similarity.
Call Signature
def __call__(self, query, limit=3, factor=10, **kwargs):
Import
from txtai.pipeline import Reranker
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| query | str or list | Yes | Query text or list of query texts. A single string is internally wrapped in a list for batch processing. |
| limit | int | No | Maximum number of final reranked results to return. Defaults to 3. |
| factor | int | No | Multiplication factor for the initial retrieval stage. The embeddings search retrieves limit * factor candidates for reranking. Defaults to 10. |
| kwargs | dict | No | Additional keyword arguments passed through to embeddings.batchsearch(). |
Outputs
| Name | Type | Description |
|---|---|---|
| results | list of dict | If query is a string, returns a flat list of result dictionaries (containing "id", "text", "score", etc.) sorted by the reranked score in descending order, limited to limit entries. If query is a list, returns a 2D list with one row of results per query. |
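The input and output shapes described in the tables can be sketched with two small helpers. The names normalize and unwrap are hypothetical, and the result rows are made-up stand-ins rather than real search output.

```python
def normalize(query):
    """Wrap a single query in a list so both input forms share one code path."""
    return query if isinstance(query, list) else [query]


def unwrap(query, ranked):
    """Return a flat list for a string query, otherwise one row per query."""
    return ranked[0] if isinstance(query, str) else ranked


# Stand-in for reranked output: one row of result dictionaries per query
fake_ranked = [
    [{"id": 0, "text": "a", "score": 0.9}],
    [{"id": 1, "text": "b", "score": 0.8}],
]

single = unwrap("query one", fake_ranked[:1])            # flat list of dictionaries
batch = unwrap(["query one", "query two"], fake_ranked)  # 2D list, one row per query

print(single)
print(len(batch))
```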
Internal Flow
The __call__ method proceeds through these steps:
- Normalize input: Wrap a single query string into a list.
- Broad retrieval: Call self.embeddings.batchsearch(queries, limit * factor, **kwargs) to get a large candidate set.
- Extract texts: For each result set, extract the "text" field from each result dictionary.
- Rescore: Pass the query and extracted texts to self.similarity(queries[x], texts), which returns (uid, score) tuples.
- Merge scores: Update each result dictionary's "score" field with the new similarity score.
- Sort and truncate: Sort results by the new score in descending order and take the top limit results.
# Core reranking logic (from source)
for x, result in enumerate(results):
    texts = [row["text"] for row in result]
    for uid, score in self.similarity(queries[x], texts):
        result[uid]["score"] = score
    ranked.append(sorted(result, key=lambda row: row["score"], reverse=True)[:limit])
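To make the rescoring loop concrete without loading any models, the sketch below runs the same logic over made-up first-stage results. The overlap_similarity scorer and the rerank function are hypothetical stand-ins; the scorer is assumed to mimic the similarity pipeline only in its return shape, a list of (index, score) tuples where the index points into the texts list.

```python
def overlap_similarity(query, texts):
    """Stand-in scorer: (index, score) tuples, sorted by word overlap with the query."""
    terms = set(query.lower().split())
    scores = [
        (i, len(terms & set(text.lower().split())) / len(terms))
        for i, text in enumerate(texts)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def rerank(queries, results, similarity, limit=3):
    """Rescore each row of first-stage results and keep the top `limit` entries."""
    ranked = []
    for x, result in enumerate(results):
        texts = [row["text"] for row in result]
        for uid, score in similarity(queries[x], texts):
            # uid is the position of the text in `texts`, so it indexes `result` directly
            result[uid]["score"] = score
        ranked.append(sorted(result, key=lambda row: row["score"], reverse=True)[:limit])
    return ranked


# Made-up first-stage results for one query, ordered by their original score
candidates = [[
    {"id": 1, "text": "Deep neural network architectures", "score": 0.61},
    {"id": 3, "text": "Natural language processing with transformer models", "score": 0.58},
    {"id": 0, "text": "Machine learning models for classification", "score": 0.40},
]]

top = rerank(["transformer models"], candidates, overlap_similarity, limit=2)[0]
print([row["id"] for row in top])
```

The loop overwrites the first-stage "score" in place, so the final ordering depends only on the second-stage scores.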
Inheritance Chain
Reranker -> Pipeline
The Pipeline base class defines the __call__ interface contract and a batch() helper method.
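A base class of this kind can be sketched as follows. SketchPipeline and Upper are hypothetical names, and the chunking behavior of batch() is an assumption for illustration; the page does not show the actual base class source.

```python
class SketchPipeline:
    """Hypothetical stand-in for a pipeline base class."""

    def batch(self, data, size):
        """Split data into consecutive chunks of at most `size` items (assumed behavior)."""
        return [data[x : x + size] for x in range(0, len(data), size)]

    def __call__(self, *args, **kwargs):
        # Subclasses are expected to provide the actual pipeline logic
        raise NotImplementedError


class Upper(SketchPipeline):
    """Toy subclass that fulfils the __call__ contract."""

    def __call__(self, texts):
        return [text.upper() for chunk in self.batch(texts, 2) for text in chunk]


chunks = SketchPipeline().batch([1, 2, 3, 4, 5], 2)
shouted = Upper()(["a", "b", "c"])

print(chunks, shouted)
```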
Usage Examples
Basic Reranking with Cross-Encoder
from txtai.embeddings import Embeddings
from txtai.pipeline import Similarity, Reranker
# Create embeddings with content storage
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "content": True
})
# Index documents
embeddings.index([
    (0, "Machine learning algorithms for classification", None),
    (1, "Deep neural network architectures", None),
    (2, "Statistical methods in data analysis", None),
    (3, "Natural language processing with transformers", None),
    (4, "Computer vision object detection systems", None)
])
# Create a cross-encoder similarity pipeline
similarity = Similarity("cross-encoder/ms-marco-MiniLM-L-6-v2", crossencode=True)
# Create reranker combining fast retrieval with precise scoring
reranker = Reranker(embeddings, similarity)
# Search with reranking: requests up to 30 candidates (3 * 10; only 5 documents
# are indexed here), rescores them, and returns the top 3
results = reranker("transformer models", limit=3)
for result in results:
    print(f"ID: {result['id']}, Score: {result['score']:.4f}, Text: {result['text']}")
Batch Queries with Custom Factor
from txtai.embeddings import Embeddings
from txtai.pipeline import Similarity, Reranker
embeddings = Embeddings({
    "path": "sentence-transformers/all-MiniLM-L6-v2",
    "content": True
})
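# Placeholder corpus for illustration; replace with a real, larger list of strings
documents = ["first document text", "second document text", "third document text"]
# Index the dataset
embeddings.index([(i, text, None) for i, text in enumerate(documents)])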
similarity = Similarity("cross-encoder/ms-marco-MiniLM-L-6-v2", crossencode=True)
reranker = Reranker(embeddings, similarity)
# Batch queries with a wider candidate pool (factor=20)
results = reranker(
    ["query one", "query two"],
    limit=5,
    factor=20
)
for i, row in enumerate(results):
    print(f"Query {i}: {len(row)} results")