Implementation:FlagOpen FlagEmbedding FlagAutoReranker From Finetuned

Field	Value
Type	API Doc
Source	`FlagEmbedding/inference/auto_reranker.py:L23-81`
Import	`from FlagEmbedding import FlagAutoReranker`

Signature

@classmethod
def from_finetuned(
    cls,
    model_name_or_path: str,
    model_class: Optional[Union[str, RerankerModelClass]] = None,
    use_fp16: bool = False,
    trust_remote_code: Optional[bool] = None,
    **kwargs,
) -> AbsReranker:

Parameters

Parameter	Type	Default	Description
model_name_or_path	`str`	(required)	Path to a local model directory or a Hugging Face Hub model name. If the path ends in a `checkpoint-*` directory, the name of its parent directory is used for auto-detection.
model_class	`Optional[Union[str, RerankerModelClass]]`	`None`	Explicitly specify the reranker type. One of: `"encoder-only-base"`, `"decoder-only-base"`, `"decoder-only-layerwise"`, `"decoder-only-lightweight"`. If `None`, auto-detected from model name via `AUTO_RERANKER_MAPPING`.
use_fp16	`bool`	`False`	If true, use half-precision floating point to speed up computation, at the cost of a slight degradation in scoring quality.
trust_remote_code	`Optional[bool]`	`None`	Whether to trust remote code when loading Hugging Face models. If `None` and `model_class` is specified, defaults to `False`. If `None` and the model class is auto-detected, uses the value from the model config.
**kwargs			Additional keyword arguments passed to the reranker constructor. See below.
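
The name-resolution and `trust_remote_code` rules described in the table can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: the helper names `detection_name` and `resolve_trust_remote_code` and the `config_default` argument are hypothetical, and the sketch assumes that the final path component is the name matched during auto-detection.

```python
import os
from typing import Optional


def detection_name(model_name_or_path: str) -> str:
    """Return the name used for auto-detection (hypothetical helper)."""
    # Drop trailing separators so ".../checkpoint-500/" behaves like ".../checkpoint-500".
    path = model_name_or_path.rstrip("/\\")
    name = os.path.basename(path)
    if name.startswith("checkpoint-"):
        # A checkpoint directory carries no model name, so use its parent directory.
        name = os.path.basename(os.path.dirname(path))
    return name


def resolve_trust_remote_code(
    trust_remote_code: Optional[bool],
    model_class: Optional[str],
    config_default: bool,
) -> bool:
    """Apply the defaulting rules from the parameter table (hypothetical helper)."""
    if trust_remote_code is not None:
        # An explicit value from the caller always wins.
        return trust_remote_code
    if model_class is not None:
        # Explicit model_class with no explicit flag: default to False.
        return False
    # Auto-detected model: fall back to the value stored in the model config.
    return config_default


print(detection_name("output/bge-reranker-v2-m3/checkpoint-500"))
print(resolve_trust_remote_code(None, "decoder-only-base", True))
```

Under these assumptions, a fine-tuning output such as `output/bge-reranker-v2-m3/checkpoint-500` resolves to the same name as the base model, which is why checkpoints can be loaded without passing `model_class`.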

Common kwargs

kwarg	Type	Description
peft_path	`str`	Path to a PEFT adapter for the model.
use_bf16	`bool`	Use bfloat16 precision instead of fp16.
query_instruction_for_rerank	`str`	Instruction prepended to query text.
passage_instruction_for_rerank	`str`	Instruction prepended to passage text.
prompt	`str`	Prompt template for LLM-based rerankers.
cutoff_layers	`List[int]`	Layer indices for layerwise reranker scoring.
compress_layers	`List[int]`	Layer indices for lightweight reranker compression.
compress_ratio	`int`	Token compression ratio for lightweight reranker.
batch_size	`int`	Batch size for inference.
max_length	`int`	Maximum sequence length for tokenization.
normalize	`bool`	Whether to normalize output scores.
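
To make the effect of score normalization concrete, the sketch below maps raw reranker scores into the range 0 to 1 with a sigmoid. That `normalize` applies a sigmoid is an assumption made here for illustration, and `sigmoid_normalize` is a hypothetical helper, not a FlagEmbedding function.

```python
import math
from typing import List


def _sigmoid(x: float) -> float:
    # Numerically stable form: never exponentiate a large positive number.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_normalize(raw_scores: List[float]) -> List[float]:
    """Map unbounded raw scores to values between 0 and 1, preserving their order."""
    return [_sigmoid(s) for s in raw_scores]


# Illustrative raw scores only; these are not outputs of a real model.
print(sigmoid_normalize([4.0, 0.0, -4.0]))
```

Because the sigmoid is monotonic, normalization changes the scale of the scores but not the ranking they induce.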

Returns

Type	Description
AbsReranker	An instance of the appropriate reranker subclass: `FlagReranker`, `FlagLLMReranker`, `LayerWiseFlagLLMReranker`, or `LightWeightFlagLLMReranker`.
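
The relationship between `model_class` strings and the returned subclass can be sketched as a lookup table. This is an illustration only: `ModelClassSketch`, `SUBCLASS_NAME`, and `subclass_name_for` are hypothetical names, and the pairing assumes that the four `model_class` values and the four subclasses listed on this page correspond in order (the encoder-only and layerwise pairings match the comments in the examples below).

```python
from enum import Enum
from typing import Dict, Union


class ModelClassSketch(Enum):
    """Hypothetical stand-in for the accepted model_class values."""
    ENCODER_ONLY_BASE = "encoder-only-base"
    DECODER_ONLY_BASE = "decoder-only-base"
    DECODER_ONLY_LAYERWISE = "decoder-only-layerwise"
    DECODER_ONLY_LIGHTWEIGHT = "decoder-only-lightweight"


# Assumed pairing of model_class value to reranker subclass name.
SUBCLASS_NAME: Dict[ModelClassSketch, str] = {
    ModelClassSketch.ENCODER_ONLY_BASE: "FlagReranker",
    ModelClassSketch.DECODER_ONLY_BASE: "FlagLLMReranker",
    ModelClassSketch.DECODER_ONLY_LAYERWISE: "LayerWiseFlagLLMReranker",
    ModelClassSketch.DECODER_ONLY_LIGHTWEIGHT: "LightWeightFlagLLMReranker",
}


def subclass_name_for(model_class: Union[str, ModelClassSketch]) -> str:
    """Accept either a string or an enum member, as the signature allows."""
    # Enum lookup by value raises ValueError for an unknown string.
    key = ModelClassSketch(model_class)
    return SUBCLASS_NAME[key]


print(subclass_name_for("decoder-only-layerwise"))
```

Converting the string to an enum member first means a misspelled `model_class` fails immediately with a `ValueError` instead of silently selecting the wrong reranker.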

I/O Summary

Input: A model name or path string, optional model class specification, precision flags, and model-specific keyword arguments.

Output: A fully initialized `AbsReranker` subclass instance, ready for inference via `.compute_score()`.
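
A common way to consume the returned object is to score one query against several passages and sort by score. The sketch below assumes only that the reranker exposes `compute_score` taking a list of (query, passage) pairs and returning one float per pair. `rank_passages` and the `WordOverlapReranker` stand-in are hypothetical and exist so the example runs without downloading a model.

```python
from typing import List, Protocol, Sequence, Tuple


class SupportsComputeScore(Protocol):
    def compute_score(self, sentence_pairs: List[Tuple[str, str]]) -> List[float]: ...


def rank_passages(
    reranker: SupportsComputeScore, query: str, passages: Sequence[str]
) -> List[Tuple[str, float]]:
    """Return (passage, score) tuples sorted from most to least relevant."""
    pairs = [(query, passage) for passage in passages]
    scores = reranker.compute_score(pairs)
    return sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)


class WordOverlapReranker:
    """Stand-in scorer (NOT a FlagEmbedding class): counts shared lowercase words."""

    def compute_score(self, sentence_pairs: List[Tuple[str, str]]) -> List[float]:
        return [
            float(len(set(q.lower().split()) & set(p.lower().split())))
            for q, p in sentence_pairs
        ]


ranked = rank_passages(
    WordOverlapReranker(),
    "capital of France",
    ["Berlin is in Germany.", "Paris is the capital of France."],
)
print(ranked[0][0])
```

In real use, the stand-in would be replaced by the object returned from `FlagAutoReranker.from_finetuned(...)`.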

Examples

Example 1: Load an Encoder-Only Reranker (Auto-Detected)

from FlagEmbedding import FlagAutoReranker

# Auto-detects as FlagReranker (encoder-only-base) from model name
reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-m3",
    use_fp16=True,
)

# Score query-passage pairs
scores = reranker.compute_score([
    ("What is the capital of France?", "Paris is the capital of France."),
    ("What is the capital of France?", "Berlin is in Germany."),
])
print(scores)  # one score per pair; the first pair should score higher than the second

Example 2: Load an LLM Reranker with Explicit model_class

from FlagEmbedding import FlagAutoReranker

# Explicitly specify decoder-only-base for an LLM reranker
reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-gemma",
    model_class="decoder-only-base",
    use_fp16=True,
    batch_size=16,
    max_length=512,
)

scores = reranker.compute_score(
    [("query text", "passage text")]
)
print(scores)

Example 3: Load a LayerWise Reranker

from FlagEmbedding import FlagAutoReranker

# Auto-detects as LayerWiseFlagLLMReranker from model name
reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-minicpm-layerwise",
    use_fp16=True,
    cutoff_layers=[28],
)

scores = reranker.compute_score(
    [("What is deep learning?", "Deep learning is a subset of machine learning.")]
)
print(scores)

Example 4: Load a Lightweight Reranker

from FlagEmbedding import FlagAutoReranker

reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2.5-gemma2-lightweight",
    model_class="decoder-only-lightweight",
    use_fp16=True,
    compress_ratio=4,
    compress_layers=[24, 40],
)

scores = reranker.compute_score(
    [("search query", "relevant document passage")]
)
print(scores)
