Principle:FlagOpen FlagEmbedding Distributed Reranker Training
| Sources | Domains |
|---|---|
| Paper: BGE Reranker, Paper: Layerwise Reranker | NLP, Distributed_Training |
Overview
A distributed training pipeline that fine-tunes BGE reranker models using cross-encoder contrastive learning with DeepSpeed, supporting encoder-only and decoder-only architectures.
Description
Reranker fine-tuning is launched with torchrun for multi-GPU training. Three training modules are supported:
- encoder-only base -- cross-encoder with classification head
- decoder-only base -- LLM reranker with LoRA
- decoder-only layerwise -- per-layer scoring heads
Reranker training processes each (query, passage) pair jointly, in contrast to embedder training, which encodes the query and the passage separately. Data collation concatenates the query and the passage into a single input sequence joined by separator tokens.
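The joint-input collation described above can be sketched in plain Python. This is a minimal illustration, not the FlagEmbedding collator: the `[CLS]` and `[SEP]` strings, whitespace tokenization, the `query`/`pos`/`neg` record layout, and the helper names `build_pair` and `collate_group` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical special tokens; a real tokenizer supplies its own.
CLS, SEP = "[CLS]", "[SEP]"


def build_pair(query: str, passage: str, max_len: int = 16) -> Dict[str, List]:
    """Join one query and one passage into a single cross-encoder input."""
    # Whitespace splitting stands in for a real subword tokenizer.
    q = query.split()[: max_len - 3]      # leave room for CLS, SEP, SEP
    budget = max_len - 3 - len(q)         # remaining slots for the passage
    p = passage.split()[: max(budget, 0)]
    tokens = [CLS] + q + [SEP] + p + [SEP]
    # Segment 0 covers CLS + query + first SEP; segment 1 covers the rest.
    segment_ids = [0] * (len(q) + 2) + [1] * (len(p) + 1)
    return {"tokens": tokens, "segment_ids": segment_ids}


def collate_group(example: Dict) -> List[Dict[str, List]]:
    """Pair the query with one positive and all negatives (assumed layout)."""
    passages = example["pos"][:1] + example["neg"]
    return [build_pair(example["query"], p) for p in passages]


example = {
    "query": "what is bge",
    "pos": ["bge is an embedding model family"],
    "neg": ["cats sleep a lot"],
}
batch = collate_group(example)
for item in batch:
    print(" ".join(item["tokens"]))
```

Because the model sees both texts in one sequence, every candidate passage needs its own forward pass with the query, which is why the group is expanded into separate pairs here.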
Usage
Use this pipeline when fine-tuning a BGE reranker on custom data.
Theoretical Basis
The cross-encoder loss depends on the architecture: encoder-only models use binary cross-entropy on the classification-head score, while decoder-only models use a language-model loss on the relevance token. Layerwise training adds classification heads at specified layers.
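The losses named above can be written out in a few lines of standard-library Python. This is a sketch under stated assumptions, not the library's training code: the function names are hypothetical, the relevance token is represented by an arbitrary vocabulary index, and averaging the per-layer losses in `layerwise_loss` is an assumed aggregation that the text does not specify.

```python
import math
from typing import List


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw score, in a numerically stable form."""
    # Equivalent to -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))].
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def relevance_token_loss(vocab_logits: List[float], target_id: int) -> float:
    """Negative log-likelihood of one target token (language-model loss)."""
    m = max(vocab_logits)  # subtract the max before exponentiating
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in vocab_logits))
    return log_sum_exp - vocab_logits[target_id]


def layerwise_loss(layer_logits: List[float], label: float) -> float:
    """ASSUMPTION: mean of the per-layer binary cross-entropy losses."""
    return sum(bce_with_logits(z, label) for z in layer_logits) / len(layer_logits)


# A confident correct score is penalised less than a confident wrong one.
print(bce_with_logits(4.0, 1.0) < bce_with_logits(-4.0, 1.0))
```

In the layerwise case each selected layer contributes its own score for the same (query, passage) pair, so every head receives a training signal rather than only the final one.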