Implementation:FlagOpen FlagEmbedding Sentence Pair Construction

From Leeroopedia


Field | Value
Type | Pattern Doc (user-side data formatting patterns)
Source | FlagEmbedding/abc/inference/AbsReranker.py:L200-222 (compute_score method)

Interface

Sentence pairs are constructed as Python tuples or lists and passed to reranker.compute_score().

Single Pair

# Using a tuple
sentence_pair = (query_str, passage_str)

# Using a list
sentence_pair = [query_str, passage_str]

Batch of Pairs

# Using a list of tuples
sentence_pairs = [(query1, passage1), (query2, passage2), ...]

# Using a list of lists
sentence_pairs = [[query1, passage1], [query2, passage2], ...]
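
Because a single pair and a batch are both plain tuples or lists, it can help to normalize and validate them before scoring. The following is a minimal sketch, not part of FlagEmbedding: the helper names `is_single_pair` and `to_batch` are hypothetical, and the rule "two strings means one pair" is an assumption made for illustration.

```python
from typing import List, Tuple


def is_single_pair(obj) -> bool:
    """Return True if obj looks like one (query, passage) pair of strings."""
    return (
        isinstance(obj, (tuple, list))
        and len(obj) == 2
        and all(isinstance(item, str) for item in obj)
    )


def to_batch(pairs) -> List[Tuple[str, str]]:
    """Normalize a single pair or a batch of pairs into a list of tuples."""
    if is_single_pair(pairs):
        return [(pairs[0], pairs[1])]
    batch = []
    for index, pair in enumerate(pairs):
        if not is_single_pair(pair):
            raise ValueError(
                f"Item {index} is not a (query, passage) pair of strings: {pair!r}"
            )
        batch.append((pair[0], pair[1]))
    return batch


print(to_batch(("q", "p")))
# [('q', 'p')]
print(to_batch([["q1", "p1"], ("q2", "p2")]))
# [('q1', 'p1'), ('q2', 'p2')]
```

Note that a two-element list of strings is always read as one pair, never as a batch of two items, so a malformed batch such as ["only a query", "another query"] would pass this check unnoticed.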

I/O

Direction | Type | Description
Input | str | Raw query and passage strings. No preprocessing is required; the reranker handles tokenization and instruction prepending internally.
Output (single pair) | Tuple[str, str] or List[str] (two elements) | A single (query, passage) pair of strings.
Output (batch) | List[Tuple[str, str]] or List[List[str]] | A list of (query, passage) string pairs.

When passed to compute_score(), the return value is:

Input Format | Return Type | Description
Single pair | float | A single relevance score.
Batch of pairs | List[float] | A list of relevance scores, one per pair.
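
Since the return type depends on the input format, downstream code that always wants a list may prefer to wrap the result. The sketch below assumes only that the result is either a real number or an iterable of numbers; `as_score_list` is a hypothetical helper, not a FlagEmbedding function.

```python
from numbers import Real
from typing import List


def as_score_list(result) -> List[float]:
    """Return scores as a list of floats, whether result is a scalar or a sequence."""
    if isinstance(result, Real):
        # Single-pair case: wrap the lone score in a list.
        return [float(result)]
    # Batch case: copy into a plain list of Python floats.
    return [float(score) for score in result]


print(as_score_list(0.5))
# [0.5]
print(as_score_list([0.25, 0.75]))
# [0.25, 0.75]
```

With this wrapper, calling code can iterate over the scores without branching on whether one pair or many were scored.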

Examples

Example 1: Single Pair Scoring

from FlagEmbedding import FlagAutoReranker

reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-m3",
    use_fp16=True,
)

# Construct a single sentence pair
query = "What causes earthquakes?"
passage = "Earthquakes are caused by the movement of tectonic plates beneath the Earth's surface."

sentence_pair = (query, passage)

# Compute score for the single pair
score = reranker.compute_score(sentence_pair)
print(f"Relevance score: {score}")
# Example output: Relevance score: 0.9876
# (illustrative value; actual scores depend on the model and on whether scores are normalized)

Example 2: Batch Scoring

from FlagEmbedding import FlagAutoReranker

reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-m3",
    use_fp16=True,
)

# Construct a batch of sentence pairs
query = "What is machine learning?"
passages = [
    "Machine learning is a branch of artificial intelligence.",
    "The weather forecast predicts rain tomorrow.",
    "Deep learning uses neural networks with many layers.",
]

sentence_pairs = [(query, passage) for passage in passages]
# Result: [
#   ("What is machine learning?", "Machine learning is a branch of ..."),
#   ("What is machine learning?", "The weather forecast predicts ..."),
#   ("What is machine learning?", "Deep learning uses neural networks ..."),
# ]

scores = reranker.compute_score(sentence_pairs)
print(f"Scores: {scores}")
# Example output: Scores: [0.95, 0.02, 0.78]
# (illustrative values; actual scores depend on the model and on whether scores are normalized)

# Rank passages by relevance
ranked = sorted(zip(passages, scores), key=lambda x: x[1], reverse=True)
for passage, score in ranked:
    print(f"  {score:.4f}: {passage[:60]}...")
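
The ranking step above can be factored into a small reusable function. This is a sketch that runs without a model: `top_k_passages` is a hypothetical helper, and the scores are the illustrative values from the example, not real model output.

```python
from typing import List, Sequence, Tuple


def top_k_passages(
    passages: Sequence[str], scores: Sequence[float], k: int
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (passage, score) tuples, best first."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return ranked[:k]


passages = [
    "Machine learning is a branch of artificial intelligence.",
    "The weather forecast predicts rain tomorrow.",
    "Deep learning uses neural networks with many layers.",
]
example_scores = [0.95, 0.02, 0.78]  # placeholder values, not model output

for passage, score in top_k_passages(passages, example_scores, k=2):
    print(f"{score:.2f}: {passage}")
# 0.95: Machine learning is a branch of artificial intelligence.
# 0.78: Deep learning uses neural networks with many layers.
```

The length check guards against silently truncated results, since zip() stops at the shorter input.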

Example 3: Multiple Queries Against Multiple Passages

from FlagEmbedding import FlagAutoReranker

reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-m3",
    use_fp16=True,
)

queries = [
    "What is Python?",
    "How to sort a list?",
]
passages = [
    "Python is a high-level programming language.",
    "You can sort a list using the sorted() function.",
    "Java is an object-oriented language.",
]

# Build all query-passage combinations
sentence_pairs = [
    [query, passage]
    for query in queries
    for passage in passages
]

scores = reranker.compute_score(sentence_pairs)
print(f"Total pairs scored: {len(scores)}")
# Output: Total pairs scored: 6

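# Reshape the flat score list into a (num_queries, num_passages) matrix.
# Row i holds the scores for queries[i] because the comprehension above
# iterates over queries in the outer loop.
import numpy as np
score_matrix = np.array(scores).reshape(len(queries), len(passages))
print(score_matrix)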
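
Once the scores are grouped per query, a common next step is to pick the best passage for each query. The sketch below does this in plain Python so it runs without a model or NumPy; `group_scores` and `best_passage_per_query` are hypothetical helpers, and the flat score list is made up for illustration.

```python
from typing import List, Sequence


def group_scores(
    scores: Sequence[float], num_queries: int, num_passages: int
) -> List[List[float]]:
    """Split a flat, query-major score list into one row per query."""
    if len(scores) != num_queries * num_passages:
        raise ValueError("score count does not match num_queries * num_passages")
    return [
        list(scores[i * num_passages:(i + 1) * num_passages])
        for i in range(num_queries)
    ]


def best_passage_per_query(rows: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each row, the index of the highest-scoring passage."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


# Made-up scores for 2 queries x 3 passages, in the same order as the pairs.
flat_scores = [0.9, 0.1, 0.3, 0.2, 0.8, 0.1]
rows = group_scores(flat_scores, num_queries=2, num_passages=3)
print(rows)
# [[0.9, 0.1, 0.3], [0.2, 0.8, 0.1]]
print(best_passage_per_query(rows))
# [0, 1]
```

The grouping is only valid when the pairs were built with queries in the outer loop, as in the example above; swapping the loop order would require transposing the result.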

Example 4: With Query and Passage Instructions

from FlagEmbedding import FlagAutoReranker

# Some LLM rerankers support query/passage instructions
reranker = FlagAutoReranker.from_finetuned(
    "BAAI/bge-reranker-v2-gemma",
    model_class="decoder-only-base",
    use_fp16=True,
    query_instruction_for_rerank="A: ",
    passage_instruction_for_rerank="B: ",
)

# Pairs are constructed the same way; instructions are applied internally
sentence_pairs = [
    ("What is gravity?", "Gravity is a fundamental force of nature."),
    ("What is gravity?", "Python is a programming language."),
]

scores = reranker.compute_score(sentence_pairs)
print(scores)
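
To make "applied internally" more concrete, the following sketch shows one way instruction prefixes could be combined with a pair. It assumes plain string concatenation purely for illustration; `apply_instructions` is a hypothetical helper, and the library's actual formatting may differ.

```python
from typing import List, Optional, Sequence, Tuple


def apply_instructions(
    pairs: Sequence[Sequence[str]],
    query_instruction: Optional[str] = None,
    passage_instruction: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Prefix each query and passage with its instruction, if one is given."""
    prepared = []
    for query, passage in pairs:
        if query_instruction is not None:
            query = f"{query_instruction}{query}"  # assumed format: prefix + text
        if passage_instruction is not None:
            passage = f"{passage_instruction}{passage}"
        prepared.append((query, passage))
    return prepared


pairs = [("What is gravity?", "Gravity is a fundamental force of nature.")]
print(apply_instructions(pairs, query_instruction="A: ", passage_instruction="B: "))
# [('A: What is gravity?', 'B: Gravity is a fundamental force of nature.')]
```

In normal use there is no need to do this by hand: pass the raw strings and let the reranker apply whatever instructions it was configured with.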
