Implementation:Deepset_ai_Haystack_TransformersSimilarityRanker
Metadata
| Field | Value |
|---|---|
| Implementation Name | TransformersSimilarityRanker |
| Implementing Principle | Deepset_ai_Haystack_Cross_Encoder_Reranking |
| Class | TransformersSimilarityRanker |
| Module | haystack.components.rankers.transformers_similarity |
| Source Reference | haystack/components/rankers/transformers_similarity.py:L24-328 |
| Repository | Deepset_ai_Haystack |
| Dependencies | transformers, torch, accelerate |
Overview
TransformersSimilarityRanker is a Haystack component that ranks documents by their semantic similarity to a query using a cross-encoder transformer model. It jointly encodes each (query, document) pair and produces a relevance score via a classification head, then returns the documents sorted by descending relevance. This component is designed as a second-stage reranker in retrieval pipelines.
Description
The component loads a cross-encoder model (by default `cross-encoder/ms-marco-MiniLM-L-6-v2`) using the Hugging Face `transformers` library. For each query, it constructs (query, document) pairs, tokenizes them, and performs batch inference to produce raw logit scores. These logits can optionally be scaled through a sigmoid function with a configurable calibration factor.
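The core scoring step can be approximated with `transformers` directly. The following is a minimal sketch of the idea only; the actual component adds batching, deduplication, prefixes, and threshold filtering on top of this:

```python
# Sketch: score (query, document) pairs with a cross-encoder and apply sigmoid calibration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

query = "City in Germany"
texts = ["Paris", "Berlin"]

# Tokenize each (query, document) pair jointly -- the defining trait of a cross-encoder.
features = tokenizer([query] * len(texts), texts, padding=True, truncation=True, return_tensors="pt")
with torch.inference_mode():
    logits = model(**features).logits.squeeze(-1)  # one relevance logit per pair

calibration_factor = 1.0
scores = torch.sigmoid(logits * calibration_factor)  # scale_score=True behavior
print(sorted(zip(texts, scores.tolist()), key=lambda x: x[1], reverse=True))
```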
Key behaviors:
- Lazy initialization: The model and tokenizer are loaded on the first call to `warm_up()` or automatically on the first `run()`.
- Deduplication: Before ranking, input documents are deduplicated by their `id` field. If duplicates exist, the one with the highest pre-existing score is retained.
- Meta field embedding: Metadata fields specified in `meta_fields_to_embed` are concatenated with the document content (separated by `embedding_separator`) before forming the (query, document) pair (see the sketch after this list).
- Query and document prefixes: Configurable `query_prefix` and `document_prefix` strings are prepended to the query and document text, respectively, supporting models like BGE that require instruction prefixes.
- Score calibration: When `scale_score=True`, raw logits are passed through `sigmoid(logit * calibration_factor)` to produce scores in the [0, 1] range.
- Score threshold filtering: Documents below a configurable `score_threshold` are excluded from the output.
- Batch inference: Documents are processed in batches of configurable size using a PyTorch `DataLoader`, with inference performed under `torch.inference_mode()` for efficiency.
- Device map support: Uses the `accelerate` library for Hugging Face device map resolution.
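How the meta fields, separator, and prefixes combine into the text that is scored can be illustrated as follows. This is a hedged sketch that mirrors the parameters described above; the component's actual internals may differ in detail:

```python
# Sketch: assemble the (query, document) text pair from meta fields, separator, and prefixes.
from haystack import Document

doc = Document(content="Berlin is the capital of Germany", meta={"title": "Geography"})
meta_fields_to_embed = ["title"]
embedding_separator = "\n"
query_prefix = ""
document_prefix = ""
query = "What is the capital of Germany?"

meta_values = [str(doc.meta[key]) for key in meta_fields_to_embed if doc.meta.get(key) is not None]
document_text = document_prefix + embedding_separator.join(meta_values + [doc.content or ""])
pair = (query_prefix + query, document_text)
print(pair)
# ('What is the capital of Germany?', 'Geography\nBerlin is the capital of Germany')
```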
Note: This component is considered legacy by the Haystack maintainers. `SentenceTransformersSimilarityRanker` is recommended as the replacement, providing the same functionality with additional features.
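A minimal migration sketch, assuming a Haystack release that exports the replacement from the same rankers module (import path and constructor parity are assumptions, not verified here):

```python
# Swap in the recommended replacement; the surrounding pipeline code stays the same.
from haystack.components.rankers import SentenceTransformersSimilarityRanker

ranker = SentenceTransformersSimilarityRanker()  # parameters largely mirror TransformersSimilarityRanker
ranker.warm_up()
```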
Code Reference
Import
```python
from haystack.components.rankers import TransformersSimilarityRanker
```
Constructor Signature
```python
TransformersSimilarityRanker(
    model: str | Path = "cross-encoder/ms-marco-MiniLM-L-6-v2",
    device: ComponentDevice | None = None,
    token: Secret | None = Secret.from_env_var(["HF_API_TOKEN", "HF_TOKEN"], strict=False),
    top_k: int = 10,
    query_prefix: str = "",
    document_prefix: str = "",
    meta_fields_to_embed: list[str] | None = None,
    embedding_separator: str = "\n",
    scale_score: bool = True,
    calibration_factor: float | None = 1.0,
    score_threshold: float | None = None,
    model_kwargs: dict[str, Any] | None = None,
    tokenizer_kwargs: dict[str, Any] | None = None,
    batch_size: int = 16,
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str \| Path` | `"cross-encoder/ms-marco-MiniLM-L-6-v2"` | Hugging Face model ID or local path for the cross-encoder model. |
| `device` | `ComponentDevice \| None` | `None` | Device for model loading. Resolved via accelerate device map. |
| `token` | `Secret \| None` | `HF_API_TOKEN` / `HF_TOKEN` env vars | API token for private Hugging Face models. |
| `top_k` | `int` | `10` | Maximum number of documents to return. |
| `query_prefix` | `str` | `""` | String prepended to the query before forming pairs. |
| `document_prefix` | `str` | `""` | String prepended to each document text before forming pairs. |
| `meta_fields_to_embed` | `list[str] \| None` | `None` | Metadata fields to concatenate with document content. |
| `embedding_separator` | `str` | `"\n"` | Separator between metadata fields and document content. |
| `scale_score` | `bool` | `True` | If True, apply sigmoid calibration to raw logits. |
| `calibration_factor` | `float \| None` | `1.0` | Factor for sigmoid calibration: `sigmoid(logit * factor)`. Required when `scale_score=True`. |
| `score_threshold` | `float \| None` | `None` | Minimum score for a document to be included in the output. |
| `model_kwargs` | `dict[str, Any] \| None` | `None` | Additional kwargs for `AutoModelForSequenceClassification.from_pretrained`. |
| `tokenizer_kwargs` | `dict[str, Any] \| None` | `None` | Additional kwargs for `AutoTokenizer.from_pretrained`. |
| `batch_size` | `int` | `16` | Batch size for inference. Reduce if encountering memory issues. |
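A hedged example of some of the less common constructor options; the device string and kwargs values below are illustrative assumptions, not recommended settings:

```python
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.utils import ComponentDevice

ranker = TransformersSimilarityRanker(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    device=ComponentDevice.from_str("cuda:0"),   # pin the model to a specific GPU
    tokenizer_kwargs={"model_max_length": 512},  # cap (query, document) pair length
    batch_size=8,                                # smaller batches if memory is tight
)
ranker.warm_up()
```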
I/O Contract
Input
| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | `str` | Yes | The query text to compare documents against. |
| `documents` | `list[Document]` | Yes | The candidate documents to rank. |
| `top_k` | `int \| None` | No | Override the default maximum number of documents to return. |
| `scale_score` | `bool \| None` | No | Override the default score scaling behavior. |
| `calibration_factor` | `float \| None` | No | Override the default calibration factor. |
| `score_threshold` | `float \| None` | No | Override the default score threshold. |
Output
| Key | Type | Description |
|---|---|---|
| `documents` | `list[Document]` | Documents sorted by cross-encoder relevance score, from most to least relevant. |
The output dictionary has the structure `{"documents": list[Document]}`.
Each returned `Document` has its `score` field populated with the cross-encoder relevance score. When `scale_score=True`, scores are in the [0, 1] range; when `scale_score=False`, scores are raw logits.
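The run-time overrides listed in the Input table can be passed per call. A small self-contained example (the documents and query are illustrative):

```python
from haystack import Document
from haystack.components.rankers import TransformersSimilarityRanker

ranker = TransformersSimilarityRanker()
ranker.warm_up()
docs = [Document(content="Berlin"), Document(content="Munich")]

# Per-call overrides take precedence over the constructor defaults.
result = ranker.run(
    query="Capital of Germany",
    documents=docs,
    top_k=1,            # return at most one document for this call
    scale_score=False,  # raw logits instead of sigmoid-calibrated scores
)
print(result["documents"])  # output dict has the single key "documents"
```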
Usage Examples
Basic Reranking
```python
from haystack import Document
from haystack.components.rankers import TransformersSimilarityRanker

ranker = TransformersSimilarityRanker()
ranker.warm_up()

docs = [Document(content="Paris"), Document(content="Berlin")]
result = ranker.run(query="City in Germany", documents=docs)
for doc in result["documents"]:
    print(f"{doc.content}: {doc.score:.4f}")
# Berlin: 0.9997
# Paris: 0.0012
```
Reranking after BM25 Retrieval
```python
from haystack import Document, Pipeline
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.document_stores.in_memory import InMemoryDocumentStore

doc_store = InMemoryDocumentStore()
doc_store.write_documents([
    Document(content="Berlin is the capital of Germany"),
    Document(content="Paris is known for the Eiffel Tower"),
    Document(content="Germany is a country in central Europe"),
    Document(content="The capital of France is Paris"),
])

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=doc_store, top_k=10))
pipeline.add_component("ranker", TransformersSimilarityRanker(top_k=3, scale_score=True))
pipeline.connect("retriever.documents", "ranker.documents")

result = pipeline.run({
    "retriever": {"query": "What is the capital of Germany?"},
    "ranker": {"query": "What is the capital of Germany?"},
})
for doc in result["ranker"]["documents"]:
    print(f"{doc.content} (score: {doc.score:.4f})")
```
Reranking with Score Threshold
```python
from haystack import Document
from haystack.components.rankers import TransformersSimilarityRanker

ranker = TransformersSimilarityRanker(
    model="cross-encoder/ms-marco-MiniLM-L-6-v2",
    top_k=10,
    scale_score=True,
    calibration_factor=1.0,
    score_threshold=0.5,
    batch_size=32,
)
ranker.warm_up()

docs = [
    Document(content="Haystack is an open-source NLP framework"),
    Document(content="The weather is sunny today"),
    Document(content="Building search pipelines with Haystack"),
]
result = ranker.run(query="How to build NLP applications?", documents=docs)

# Only documents with score >= 0.5 are returned
for doc in result["documents"]:
    print(f"{doc.content} (score: {doc.score:.4f})")
```
Related Pages
- Principle: Deepset_ai_Haystack_Cross_Encoder_Reranking -- The principle that this component implements.
- Related Implementation: Deepset_ai_Haystack_InMemoryBM25Retriever -- BM25 retriever often used as the first stage before reranking.
- Related Implementation: Deepset_ai_Haystack_InMemoryEmbeddingRetriever -- Embedding retriever often used as the first stage before reranking.
Implements Principle
- Principle:Deepset_ai_Haystack_Cross_Encoder_Reranking
Requires Environment
- Environment:Deepset_ai_Haystack_HuggingFace_Model_Environment
- Environment:Deepset_ai_Haystack_GPU_Device_Environment