Principle:FlagOpen FlagEmbedding Cross Encoder Reranking

From Leeroopedia


Overview

Cross-encoder reranking is a technique that applies attention jointly across query and passage tokens to compute fine-grained relevance scores, which typically gives higher accuracy than bi-encoder similarity.

Description

Cross-encoder reranking concatenates the query and the passage into a single input sequence, so every query token can attend to every passage token and vice versa. This captures subtle semantic relationships that bi-encoders, which encode the two texts independently, can miss. FlagEmbedding supports four reranker architectures:

  1. Encoder-only: scores each pair with a sequence classification head
  2. Decoder-only (LLM-based): scores each pair via next-token prediction
  3. Layerwise: extracts scores from multiple transformer layers
  4. Lightweight: applies token compression to reduce the sequence length

Batch scoring can be distributed across multiple GPUs by means of process pools.
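The idea of spreading batch scoring over several workers can be sketched as follows. This is a minimal illustration, not FlagEmbedding's implementation: `shard_pairs`, `score_sharded`, and `toy_score` are hypothetical names, the scorer is a word-overlap stand-in for a model forward pass, and the shards are processed sequentially here where a real setup would hand each shard to a worker process bound to its own GPU.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (query, passage)


def shard_pairs(pairs: Sequence[Pair], num_workers: int) -> List[List[Pair]]:
    """Split pairs into at most num_workers contiguous, near-equal shards."""
    if num_workers < 1:
        raise ValueError("num_workers must be >= 1")
    base, extra = divmod(len(pairs), num_workers)
    shards: List[List[Pair]] = []
    start = 0
    for i in range(num_workers):
        size = base + (1 if i < extra else 0)
        if size:
            shards.append(list(pairs[start:start + size]))
        start += size
    return shards


def toy_score(pair: Pair) -> float:
    """Hypothetical stand-in for a reranker forward pass: word-overlap ratio."""
    query_words = set(pair[0].lower().split())
    passage_words = set(pair[1].lower().split())
    return len(query_words & passage_words) / max(len(query_words), 1)


def score_sharded(
    pairs: Sequence[Pair],
    num_workers: int,
    score_fn: Callable[[Pair], float] = toy_score,
) -> List[float]:
    """Score every shard and concatenate results in the original pair order."""
    scores: List[float] = []
    for shard in shard_pairs(pairs, num_workers):
        # Assumption: in a multi-GPU setup this loop body would run inside
        # a worker process; it runs inline here to keep the sketch simple.
        scores.extend(score_fn(p) for p in shard)
    return scores


pairs = [
    ("what is a panda", "The giant panda is a bear native to China."),
    ("what is a panda", "Paris is the capital of France."),
    ("capital of france", "Paris is the capital of France."),
]
sharded_scores = score_sharded(pairs, num_workers=2)
```

Because the shards are contiguous and merged in order, the output list lines up with the input pairs regardless of how many workers are used.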

Usage

Use cross-encoder reranking to re-score the candidate passages retrieved by a first-stage bi-encoder and so improve the final ranking quality.
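A retrieve-then-rerank pipeline of this kind might look like the sketch below. Everything in it is a toy assumption made for illustration: `embed` and `cosine` stand in for a bi-encoder that embeds each text on its own as word counts, and `toy_pair_score` stands in for a cross-encoder by looking at query and passage together and counting shared adjacent word pairs. None of these names come from FlagEmbedding.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def embed(text: str) -> Counter:
    """Toy bi-encoder stand-in: each text is embedded independently."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, corpus: List[str], top_k: int) -> List[str]:
    """First stage: rank the corpus by embedding similarity, keep top_k."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]


def toy_pair_score(query: str, passage: str) -> float:
    """Toy cross-encoder stand-in: scores the pair jointly via shared bigrams."""
    def bigrams(text: str) -> set:
        words = text.lower().split()
        return set(zip(words, words[1:]))
    return float(len(bigrams(query) & bigrams(passage)))


def rerank(
    query: str,
    candidates: List[str],
    pair_score: Callable[[str, str], float] = toy_pair_score,
) -> List[Tuple[str, float]]:
    """Second stage: re-score each candidate together with the query."""
    scored = [(p, pair_score(query, p)) for p in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


corpus = [
    "dog bites man in the park",
    "man bites dog in the park",
    "stock prices fall sharply",
]
query = "man bites dog"
candidates = retrieve(query, corpus, top_k=2)
reranked = rerank(query, candidates)
```

In this toy corpus the first two passages contain the same words, so the word-count embedding cannot tell them apart, while the pairwise scorer prefers the passage whose word order matches the query. A real reranker would replace `toy_pair_score` with a model call over the same candidate list.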

Theoretical Basis

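For a query of n tokens and a passage of m tokens, joint attention over the concatenated sequence allows on the order of n*m query-passage token interactions in each layer. A bi-encoder instead processes the n+m tokens in two separate passes with no token-level interaction between the texts and compares only the two resulting embeddings. The score is computed as: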

  • Encoder-only: sigmoid(cls_logit)
  • Decoder-only: P("Yes"|[query, passage, prompt])
  • Layerwise: mean(layer_scores[cutoff_layers])
  • Lightweight: score(compressed_passage_tokens)
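The first three scoring rules above can be written out as small functions. This is a hedged sketch rather than the library's code: the function names are hypothetical, the decoder-only variant assumes the probability of "Yes" is a softmax over a small illustrative set of next-token logits, and the layerwise variant assumes `cutoff_layers` holds indices into a list of per-layer scores. The lightweight rule is omitted because the compression step is not specified here.

```python
import math
from typing import Dict, Sequence


def encoder_only_score(cls_logit: float) -> float:
    """sigmoid(cls_logit): map a classification logit to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-cls_logit))


def decoder_only_score(next_token_logits: Dict[str, float],
                       yes_token: str = "Yes") -> float:
    """P("Yes" | query, passage, prompt) as a softmax over next-token logits.

    Assumption: the logits dict is a toy vocabulary, not a real tokenizer's.
    """
    max_logit = max(next_token_logits.values())  # subtract max for stability
    exps = {tok: math.exp(v - max_logit) for tok, v in next_token_logits.items()}
    return exps[yes_token] / sum(exps.values())


def layerwise_score(layer_scores: Sequence[float],
                    cutoff_layers: Sequence[int]) -> float:
    """mean(layer_scores[cutoff_layers]): average the selected layers' scores."""
    selected = [layer_scores[i] for i in cutoff_layers]
    return sum(selected) / len(selected)


encoder_score = encoder_only_score(0.0)
decoder_score = decoder_only_score({"Yes": 2.0, "No": 0.0})
layer_score = layerwise_score([0.1, 0.4, 0.6, 0.8], cutoff_layers=[1, 3])
```

With only two tokens in the toy vocabulary, the softmax in `decoder_only_score` reduces to a sigmoid of the logit difference, which is why the encoder-only and decoder-only rules produce scores on the same (0, 1) scale.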

Related Pages

Implementation:FlagOpen_FlagEmbedding_AbsReranker_Compute_Score
