Principle:FlagOpen FlagEmbedding Distributed Reranker Training

Sources: Paper: BGE Reranker, Paper: Layerwise Reranker
Domains: NLP, Distributed_Training

Overview

A distributed training pipeline that fine-tunes BGE reranker models using cross-encoder contrastive learning with DeepSpeed, supporting encoder-only and decoder-only architectures.

Description

Reranker fine-tuning uses torchrun for multi-GPU training. Three modules are supported:

encoder-only base -- cross-encoder with classification head
decoder-only base -- LLM reranker with LoRA
decoder-only layerwise -- per-layer scoring heads
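
As a rough illustration of how one of the three modules might be launched with torchrun, the sketch below assembles a launch command in Python. The module paths, the training flags (--model_name_or_path, --train_data, --output_dir, --deepspeed), and all file paths are assumptions based on the package layout of recent FlagEmbedding releases; check them against the installed version before use. The helper build_launch_command is hypothetical and not part of FlagEmbedding.

```python
import shlex
from typing import Dict, List

# Assumed mapping from the three variants to their training entry points.
# These module paths are NOT confirmed by this page; verify them locally.
MODULES: Dict[str, str] = {
    "encoder-only base": "FlagEmbedding.finetune.reranker.encoder_only.base",
    "decoder-only base": "FlagEmbedding.finetune.reranker.decoder_only.base",
    "decoder-only layerwise": "FlagEmbedding.finetune.reranker.decoder_only.layerwise",
}


def build_launch_command(
    variant: str,
    num_gpus: int,
    model_name_or_path: str,
    train_data: str,
    output_dir: str,
    deepspeed_config: str,
) -> List[str]:
    """Hypothetical helper: build a torchrun argv list for one variant."""
    return [
        "torchrun",
        "--nproc_per_node", str(num_gpus),  # one worker process per GPU
        "-m", MODULES[variant],             # run the training module
        "--model_name_or_path", model_name_or_path,
        "--train_data", train_data,
        "--output_dir", output_dir,
        "--deepspeed", deepspeed_config,    # path to a DeepSpeed JSON config
    ]


# Placeholder paths for illustration only.
cmd = build_launch_command(
    variant="encoder-only base",
    num_gpus=2,
    model_name_or_path="BAAI/bge-reranker-base",
    train_data="./train.jsonl",
    output_dir="./reranker_out",
    deepspeed_config="./ds_config.json",
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be passed to subprocess.run without shell quoting issues.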

Reranker training processes each (query, passage) pair jointly in a single forward pass, in contrast to embedder training, which encodes the query and the passage separately. Data collation concatenates the query and passage with separator tokens.
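
The joint encoding described above can be sketched with plain token-ID lists. This is a minimal illustration, not the FlagEmbedding collator: the special-token IDs, the [CLS] query [SEP] passage [SEP] layout, the length limits, and the names encode_pair and collate are all assumptions chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical special-token IDs; a real tokenizer defines its own.
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0


def encode_pair(
    query_ids: List[int],
    passage_ids: List[int],
    query_max_len: int,
    passage_max_len: int,
) -> List[int]:
    """Join one query and one passage into a single cross-encoder input."""
    q = query_ids[:query_max_len]      # truncate each side independently
    p = passage_ids[:passage_max_len]
    return [CLS_ID] + q + [SEP_ID] + p + [SEP_ID]


def collate(
    pairs: List[Tuple[List[int], List[int]]],
    query_max_len: int = 32,
    passage_max_len: int = 128,
) -> Dict[str, List[List[int]]]:
    """Pad a batch of joined pairs to the longest sequence in the batch."""
    seqs = [encode_pair(q, p, query_max_len, passage_max_len) for q, p in pairs]
    width = max(len(s) for s in seqs)
    return {
        "input_ids": [s + [PAD_ID] * (width - len(s)) for s in seqs],
        "attention_mask": [[1] * len(s) + [0] * (width - len(s)) for s in seqs],
    }


batch = collate([([7, 8], [9]), ([7], [9, 10, 11])])
print(batch["input_ids"])
```

Because both texts share one input sequence, the model can attend across query and passage tokens, which is what distinguishes the reranker from an embedder that sees each text alone.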

Usage

Use this pipeline when fine-tuning a BGE reranker on custom data.

Theoretical Basis

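The training objective is a cross-encoder loss: binary cross-entropy for the encoder-only model, and a language-model loss on the relevance token for the decoder-only models. Layerwise training adds classification heads at specified layers, so that each of those layers produces its own relevance score.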
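
The losses named above can be written out in a few lines of plain Python. This is a sketch under stated assumptions, not the library's implementation: the function names are hypothetical, the relevance token is represented only by an index into a toy vocabulary, and averaging the per-layer losses in the layerwise case is an assumption about how the heads are combined.

```python
import math
from typing import List


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw score, in a numerically stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def relevance_token_loss(vocab_logits: List[float], target_id: int) -> float:
    """Language-model loss at one position: negative log-softmax of the target token."""
    m = max(vocab_logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in vocab_logits))
    return log_z - vocab_logits[target_id]


def layerwise_loss(per_layer_logits: List[float], label: float) -> float:
    """Assumed combination rule: mean of the per-layer head losses."""
    losses = [bce_with_logits(s, label) for s in per_layer_logits]
    return sum(losses) / len(losses)


# A positive pair with an uninformative score of 0 costs log(2) in each case.
print(bce_with_logits(0.0, 1.0))
print(relevance_token_loss([0.0, 0.0], target_id=0))
print(layerwise_loss([0.0, 0.0, 0.0], label=1.0))
```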
