Principle:FlagOpen FlagEmbedding Auto Reranker Loading
| Field | Value |
|---|---|
| Sources | Paper: BGE Reranker, Paper: Layerwise Reranker |
| Domains | NLP, Information_Retrieval |
Overview
A factory-based pattern that automatically detects and instantiates the correct reranker model class based on model name or explicit specification.
Description
Rerankers score each query-passage pair jointly, so the model can attend across both texts at once. FlagEmbedding supports four reranker types:
- BaseReranker (FlagReranker): Encoder-only cross-encoder model. Maps to encoder-only-base. Used by models such as bge-reranker-base, bge-reranker-large, and bge-reranker-v2-m3.
- BaseLLMReranker (FlagLLMReranker): Decoder-only LLM used as a reranker. Maps to decoder-only-base. Used by models such as bge-reranker-v2-gemma.
- LayerWiseLLMReranker (LayerWiseFlagLLMReranker): Per-layer scoring from a decoder-only LLM. Maps to decoder-only-layerwise. Used by models such as bge-reranker-v2-minicpm-layerwise.
- LightweightLLMReranker (LightWeightFlagLLMReranker): Compressed-token approach for efficient LLM reranking. Maps to decoder-only-lightweight. Used by models such as bge-reranker-v2.5-gemma2-lightweight.
FlagAutoReranker.from_finetuned() uses AUTO_RERANKER_MAPPING to auto-detect model type from the model name. If the model name is not found in the mapping, the user can explicitly specify the model_class parameter. The mapping is defined in FlagEmbedding/inference/reranker/model_mapping.py.
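The name-based detection described above can be illustrated with a minimal, self-contained sketch. It does not import FlagEmbedding. NAME_TO_TYPE and detect_model_type are hypothetical names introduced here, the table lists only the models named on this page, and the use of the last path component as the lookup key is an assumption rather than confirmed library behavior.

```python
import os

# Hypothetical stand-in for AUTO_RERANKER_MAPPING, reduced to
# "model name -> model type string". The real mapping may hold more entries.
NAME_TO_TYPE = {
    "bge-reranker-base": "encoder-only-base",
    "bge-reranker-large": "encoder-only-base",
    "bge-reranker-v2-m3": "encoder-only-base",
    "bge-reranker-v2-gemma": "decoder-only-base",
    "bge-reranker-v2-minicpm-layerwise": "decoder-only-layerwise",
    "bge-reranker-v2.5-gemma2-lightweight": "decoder-only-lightweight",
}


def detect_model_type(model_name_or_path, model_class=None):
    """Return a model type string, preferring an explicit model_class."""
    if model_class is not None:
        # An explicit specification bypasses name-based detection.
        return model_class
    # Assumption: the lookup key is the last path component,
    # e.g. "BAAI/bge-reranker-large" -> "bge-reranker-large".
    name = os.path.basename(model_name_or_path.rstrip("/"))
    if name not in NAME_TO_TYPE:
        raise ValueError(
            f"Unknown model name {name!r}; pass model_class explicitly."
        )
    return NAME_TO_TYPE[name]


print(detect_model_type("BAAI/bge-reranker-large"))
print(detect_model_type("/ckpt/my-finetune", model_class="decoder-only-base"))
```

A locally fine-tuned checkpoint with a custom directory name would take the explicit model_class path in this sketch, which mirrors the fallback the text describes.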
Usage
Use this pattern when loading any BGE reranker model for inference or evaluation. The factory removes the need to know which specific reranker class to import and instantiate.
Theoretical Basis
The design combines the factory method pattern with a model registry. The FlagAutoReranker class cannot be instantiated directly (its __init__ raises EnvironmentError). Instead, users call the from_finetuned() classmethod, which resolves the correct subclass through one of two lookup paths:
- If model_class is provided, look up directly in RERANKER_CLASS_MAPPING.
- Otherwise, look up the model name in AUTO_RERANKER_MAPPING to get a RerankerConfig containing the model class and trust_remote_code setting.
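The two lookup paths can be sketched as a small, self-contained factory. This is an illustration of the pattern, not the library's source. The classes below are stubs that reuse the names from this page, the registries hold only two entries each, and both the trust_remote_code values and the rule that a caller-supplied value overrides the registry are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Type


class AbsReranker:
    """Stub base class; the real one loads a model and scores pairs."""

    def __init__(self, model_name_or_path, trust_remote_code=False, **kwargs):
        self.model_name_or_path = model_name_or_path
        self.trust_remote_code = trust_remote_code


class FlagReranker(AbsReranker):
    pass


class FlagLLMReranker(AbsReranker):
    pass


@dataclass
class RerankerConfig:
    model_class: Type[AbsReranker]
    trust_remote_code: bool = False  # illustrative default


# Path 1 registry: explicit model type string -> class.
RERANKER_CLASS_MAPPING = {
    "encoder-only-base": FlagReranker,
    "decoder-only-base": FlagLLMReranker,
}

# Path 2 registry: model name -> config (entries are illustrative).
AUTO_RERANKER_MAPPING = {
    "bge-reranker-base": RerankerConfig(FlagReranker),
    "bge-reranker-v2-gemma": RerankerConfig(FlagLLMReranker),
}


class FlagAutoReranker:
    def __init__(self):
        # Direct construction is blocked; callers must use the factory.
        raise EnvironmentError("Use FlagAutoReranker.from_finetuned(...) instead.")

    @classmethod
    def from_finetuned(cls, model_name_or_path, model_class: Optional[str] = None,
                       trust_remote_code: Optional[bool] = None, **kwargs):
        if model_class is not None:
            # Path 1: explicit type string.
            reranker_cls = RERANKER_CLASS_MAPPING[model_class]
            trust = bool(trust_remote_code)
        else:
            # Path 2: detect from the model name (last path component, assumed).
            name = model_name_or_path.rstrip("/").split("/")[-1]
            if name not in AUTO_RERANKER_MAPPING:
                raise ValueError(f"Unknown model {name!r}; pass model_class.")
            config = AUTO_RERANKER_MAPPING[name]
            reranker_cls = config.model_class
            trust = config.trust_remote_code if trust_remote_code is None else trust_remote_code
        return reranker_cls(model_name_or_path, trust_remote_code=trust, **kwargs)


reranker = FlagAutoReranker.from_finetuned("BAAI/bge-reranker-base")
print(type(reranker).__name__)
```

Because the factory returns an instance of a concrete subclass rather than of FlagAutoReranker itself, calling code can depend on the shared AbsReranker interface only.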
Cross-encoder reranking concatenates query and passage as a single input, enabling token-level interaction through the model's attention mechanism. This is typically more accurate than bi-encoder approaches (which encode query and passage independently) but computationally more expensive, since each query-passage pair requires a full forward pass.
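The cost difference can be made concrete by counting model forward passes. The sketch below is plain arithmetic under simplifying assumptions: one forward pass per encoded input, no batching effects, and passage embeddings that a bi-encoder computes once and reuses across queries. The function names are hypothetical.

```python
def cross_encoder_passes(num_queries, candidates_per_query):
    """Each (query, passage) pair needs its own forward pass."""
    return num_queries * candidates_per_query


def bi_encoder_passes(num_queries, num_unique_passages):
    """Queries and passages are encoded independently, once each."""
    return num_queries + num_unique_passages


# 10 queries scored against the same 100 candidate passages.
print(cross_encoder_passes(10, 100))
print(bi_encoder_passes(10, 100))
```

This gap is one reason a reranker is usually applied only to a short candidate list produced by a cheaper first-stage retriever.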
The class hierarchy is:
AbsReranker (abstract base)
+-- FlagReranker (encoder-only-base)
+-- FlagLLMReranker (decoder-only-base)
+-- LayerWiseFlagLLMReranker (decoder-only-layerwise)
+-- LightWeightFlagLLMReranker (decoder-only-lightweight)