Principle:FlagOpen FlagEmbedding Query Passage Pair Formatting

Field	Value
Sources	Repo source: `FlagEmbedding/abc/inference/AbsReranker.py`
Domains	Information_Retrieval, NLP

Overview

A data formatting pattern for structuring query-passage pairs as input to cross-encoder reranking models.

Description

Rerankers require paired inputs of (query, passage) for cross-attention scoring. The format is either a single tuple (query, passage) for single-pair scoring, or a list of tuples for batch scoring.

The supported input formats are:

Single pair: A tuple or list of two strings: (query_str, passage_str) or [query_str, passage_str].
Batch of pairs: A list of tuples or lists: [(query1, passage1), (query2, passage2), ...].
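
As a minimal sketch of the two shapes listed above, the snippet below builds a single pair and a batch of pairs for one query scored against several candidate passages. The helper name `build_pairs` and the sample texts are illustrative assumptions, not part of FlagEmbedding.

```python
from typing import List, Tuple


def build_pairs(query: str, passages: List[str]) -> List[Tuple[str, str]]:
    """Pair one query with each candidate passage (query first, passage second)."""
    return [(query, passage) for passage in passages]


# Single pair: two strings in a tuple (a two-element list works the same way).
single_pair = ("what is a panda?", "The giant panda is a bear native to China.")

# Batch of pairs: one (query, passage) entry per candidate passage.
batch_pairs = build_pairs(
    "what is a panda?",
    [
        "The giant panda is a bear native to China.",
        "Paris is the capital of France.",
    ],
)
```

Either `single_pair` or `batch_pairs` could then be passed to reranker.compute_score() as-is.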

The reranker automatically applies query and passage instructions when they are configured via the query_instruction_for_rerank and passage_instruction_for_rerank parameters. When instructions are set, the get_detailed_inputs() method prepends them to the respective texts before scoring.
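
The effect of the instruction step can be sketched roughly as follows. The function `apply_instructions` is a hypothetical stand-in written for illustration, not the library's get_detailed_inputs(). It assumes plain string concatenation, whereas the real method may combine instruction and text differently.

```python
from typing import List, Optional, Sequence, Tuple


def apply_instructions(
    pairs: Sequence[Sequence[str]],
    query_instruction: Optional[str] = None,
    passage_instruction: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Hypothetical helper: prepend each configured instruction to its side of the pair."""
    detailed = []
    for query, passage in pairs:
        if query_instruction is not None:
            query = query_instruction + query        # assumption: simple prefixing
        if passage_instruction is not None:
            passage = passage_instruction + passage  # assumption: simple prefixing
        detailed.append((query, passage))
    return detailed


pairs = [("what is a panda?", "The giant panda is a bear native to China.")]

# Both instructions set: each text gets its own prefix.
with_instructions = apply_instructions(
    pairs, query_instruction="Query: ", passage_instruction="Passage: "
)

# No instructions set: the pairs pass through unchanged.
unchanged = apply_instructions(pairs)
```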

The compute_score() method in AbsReranker detects whether a single pair or batch was provided by checking if the first element is a string (single pair) or a sequence (batch). Single pairs are automatically wrapped in a list for uniform processing.
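
The detection rule described above can be illustrated with a small standalone function. `normalize_pairs` is a hypothetical name used only for this sketch. It mirrors the described check (is the first element a string?) and makes no claim about the rest of compute_score().

```python
from typing import List, Sequence, Union

Pair = Sequence[str]


def normalize_pairs(sentence_pairs: Union[Pair, Sequence[Pair]]) -> List[Pair]:
    """Wrap a single pair in a list so downstream code always sees a batch."""
    if isinstance(sentence_pairs[0], str):
        # First element is a string, so the input is one (query, passage) pair.
        return [sentence_pairs]
    # First element is itself a sequence, so the input is already a batch.
    return list(sentence_pairs)


single = normalize_pairs(("q", "p"))
batch = normalize_pairs([("q1", "p1"), ["q2", "p2"]])
```

Because the check indexes the first element, an empty input would raise an IndexError in this sketch. Callers are assumed to pass at least one pair.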

Usage

Apply this format when preparing inputs, before calling reranker.compute_score(). All reranker types (encoder-only, decoder-only, layerwise, lightweight) accept the same pair format.

Theoretical Basis

Cross-encoder models process the query and passage jointly as a single concatenated input. This enables full token-level interaction through the model's self-attention mechanism, in contrast to bi-encoders, which encode the query and passage independently.

Input pairs must be ordered (query first, passage second) for correct attention patterns. The model internally tokenizes the concatenated pair with appropriate separator tokens (e.g., [SEP] for encoder-only models, or instruction-formatted prompts for LLM-based models).
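
To make the concatenated layout concrete, here is a toy sketch of a BERT-style pair encoding. Whitespace splitting stands in for a real subword tokenizer, and the `[CLS]`/`[SEP]` tokens and segment ids are assumptions that hold for BERT-like encoders only. Other model families use different special tokens or prompt templates.

```python
from typing import List, Tuple


def concat_pair(
    query_tokens: List[str], passage_tokens: List[str]
) -> Tuple[List[str], List[int]]:
    """Toy BERT-style layout: [CLS] query [SEP] passage [SEP], plus segment ids."""
    tokens = ["[CLS]"] + query_tokens + ["[SEP]"] + passage_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the query, and the first [SEP];
    # segment 1 covers the passage and the final [SEP].
    segment_ids = [0] * (len(query_tokens) + 2) + [1] * (len(passage_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = concat_pair(
    "what is a panda".split(),         # query goes first
    "a bear native to china".split(),  # passage goes second
)
```

Swapping the two arguments would place the passage in the query position, which is why the pair order matters.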

The joint encoding allows the model to directly compare query tokens against passage tokens, producing more accurate relevance judgments at the cost of requiring a separate forward pass for each query-passage pair.
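
The cost trade-off can be made concrete with a small count. The function names and workload figures below are illustrative assumptions. The sketch counts how many inputs each architecture must encode, ignoring batching and sequence length.

```python
def cross_encoder_encodings(num_queries: int, passages_per_query: int) -> int:
    """A cross-encoder encodes every (query, passage) pair jointly."""
    return num_queries * passages_per_query


def bi_encoder_encodings(num_queries: int, num_passages: int) -> int:
    """A bi-encoder encodes each query and each passage once, independently."""
    return num_queries + num_passages


# Hypothetical workload: 10 queries, each scored against the same 100 passages.
cross_cost = cross_encoder_encodings(10, 100)  # 10 * 100 pair encodings
bi_cost = bi_encoder_encodings(10, 100)        # 10 + 100 single-text encodings
```

This gap is why rerankers are typically applied to a short candidate list rather than to a whole corpus.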

Related Pages

Implementation:FlagOpen_FlagEmbedding_Sentence_Pair_Construction
