Implementation:FlagOpen FlagEmbedding Embedding Similarity Scoring

From Leeroopedia


| Field | Value |
|---|---|
| Type | Pattern Doc (user-side computation patterns) |
| Source | User-side numpy/torch + FlagEmbedding/inference/embedder/encoder_only/m3.py:L129-177 for sparse and ColBERT matching |

Interface

Three scoring methods are available after encoding queries and passages with an M3 embedder:

1. Dense Similarity

Direct matrix multiplication of query and passage embeddings:

scores = embeddings_q @ embeddings_p.T

When embeddings are L2-normalized (as returned by the default encode methods), this produces cosine similarity scores.

| Parameter | Type | Description |
|---|---|---|
| embeddings_q | np.ndarray (shape: [num_queries, dim]) | Dense query embeddings: the "dense_vecs" entry returned by model.encode_queries() or model.encode() |
| embeddings_p | np.ndarray (shape: [num_passages, dim]) | Dense passage embeddings: the "dense_vecs" entry returned by model.encode_corpus() or model.encode() |
| Returns | np.ndarray (shape: [num_queries, num_passages]) | Similarity matrix (cosine similarity when the embeddings are L2-normalized) |
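
The claim that a plain matrix product yields cosine similarity for L2-normalized embeddings can be checked with a small NumPy sketch. The random vectors below are toy stand-ins for encoder output, and l2_normalize is a hypothetical helper, not part of FlagEmbedding.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit L2 norm (hypothetical helper)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Toy stand-ins for encoder output: 2 queries and 3 passages, dim 4.
rng = np.random.default_rng(0)
embeddings_q = l2_normalize(rng.normal(size=(2, 4)))
embeddings_p = l2_normalize(rng.normal(size=(3, 4)))

# Dense similarity as described above: a single matrix product.
scores = embeddings_q @ embeddings_p.T

# Explicit cosine similarity, computed pair by pair for comparison.
cosine = np.array([
    [np.dot(q, p) / (np.linalg.norm(q) * np.linalg.norm(p)) for p in embeddings_p]
    for q in embeddings_q
])

print(scores.shape)                 # (2, 3)
print(np.allclose(scores, cosine))  # True
```

If the embeddings are not normalized, the same product is an unnormalized dot product, so scores are no longer bounded by 1 in absolute value.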

2. Sparse Lexical Matching

M3Embedder.compute_lexical_matching_score(lexical_weights_1, lexical_weights_2)

| Parameter | Type | Description |
|---|---|---|
| lexical_weights_1 | Union[Dict[str, float], List[Dict[str, float]]] | Lexical weights for queries. Each dict maps tokens to learned weights. |
| lexical_weights_2 | Union[Dict[str, float], List[Dict[str, float]]] | Lexical weights for passages. Each dict maps tokens to learned weights. |
| Returns | Union[float, np.ndarray] | Single float for dict-dict input; 2D array (shape: [num_queries, num_passages]) for list-list input. |
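
To make the scoring rule concrete, here is a minimal pure-Python sketch of lexical matching as it is commonly defined: the sum, over tokens present in both dicts, of the product of their weights. The helper name lexical_matching_score and the toy weights are hypothetical. This illustrates the idea under that assumption and is not the library's code.

```python
from typing import Dict, List


def lexical_matching_score(weights_1: Dict[str, float], weights_2: Dict[str, float]) -> float:
    """Sum of weight products over tokens shared by both dicts (hypothetical helper)."""
    score = 0.0
    for token, weight in weights_1.items():
        if token in weights_2:
            score += weight * weights_2[token]
    return score


# Toy weights; real keys and values come from the encoder's "lexical_weights" output.
q = {"capital": 0.3, "france": 0.4, "what": 0.1}
p = {"paris": 0.35, "capital": 0.25, "france": 0.3}

# dict-dict input: one score. Shared tokens are "capital" and "france".
single = lexical_matching_score(q, p)  # 0.3 * 0.25 + 0.4 * 0.3

# list-list input: one row per query, one column per passage.
queries: List[Dict[str, float]] = [q, {"photosynthesis": 0.5}]
passages: List[Dict[str, float]] = [p, {"photosynthesis": 0.4, "light": 0.2}]
matrix = [[lexical_matching_score(a, b) for b in passages] for a in queries]

print(single)
print(matrix)
```

Pairs with no tokens in common score exactly zero, which is what makes this signal complementary to dense similarity.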

3. ColBERT Token-Level Interaction

M3Embedder.colbert_score(q_reps, p_reps)

| Parameter | Type | Description |
|---|---|---|
| q_reps | np.ndarray | Multi-vector (token-level) embeddings for a single query. Shape: [num_query_tokens, dim]. |
| p_reps | np.ndarray | Multi-vector (token-level) embeddings for a single passage. Shape: [num_passage_tokens, dim]. |
| Returns | torch.Tensor | Scalar ColBERT score: average of per-query-token maximum similarities. |
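
The "average of per-query-token maximum similarities" can be sketched in NumPy as follows. The function name colbert_maxsim and the toy vectors are hypothetical. The sketch returns a Python float rather than a torch.Tensor and is meant only to illustrate the computation.

```python
import numpy as np


def colbert_maxsim(q_reps: np.ndarray, p_reps: np.ndarray) -> float:
    """Average, over query tokens, of the best-matching passage token similarity."""
    # token_scores[i, j] = similarity of query token i and passage token j
    token_scores = q_reps @ p_reps.T
    # For each query token, keep only its best match in the passage.
    max_per_query_token = token_scores.max(axis=1)
    return float(max_per_query_token.mean())


# Toy unit vectors: 2 query tokens and 3 passage tokens, dim 2.
q_reps = np.array([[1.0, 0.0], [0.0, 1.0]])
p_reps = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, -1.0]])

# Query token 0 best matches passage token 0 (1.0);
# query token 1 best matches passage token 1 (0.8); the mean is 0.9.
score = colbert_maxsim(q_reps, p_reps)
print(score)
```

Averaging over query tokens keeps the score comparable across queries of different lengths.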

I/O

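Input: Embeddings produced by AbsEmbedder.encode(), encode_queries(), or encode_corpus(). For M3 models, the encode methods return dictionaries with keys "dense_vecs", "lexical_weights", and "colbert_vecs". The examples below request the sparse and ColBERT outputs by passing return_sparse=True and return_colbert_vecs=True.

Output: Similarity scores. Dense scoring returns an np.ndarray, sparse lexical matching returns a float (single pair) or an np.ndarray (batch), and colbert_score returns a scalar torch.Tensor. Higher scores indicate greater relevance.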
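
Because higher scores indicate greater relevance, a score matrix can be turned into a per-query ranking by sorting each row in descending order. The sketch below uses made-up scores and assumes the [num_queries, num_passages] layout described above.

```python
import numpy as np

# Made-up score matrix: 2 queries (rows) scored against 3 passages (columns).
scores = np.array([
    [0.62, 0.18, 0.47],
    [0.15, 0.71, 0.09],
])

# Negate so that argsort, which sorts ascending, yields descending relevance.
ranking = np.argsort(-scores, axis=1)

# Index of the best passage for each query.
top1 = ranking[:, 0]

print(ranking.tolist())  # [[0, 2, 1], [1, 0, 2]]
print(top1.tolist())     # [0, 1]
```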

Examples

Example 1: Dense Scoring

import numpy as np
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

queries = ["What is the capital of France?", "How does photosynthesis work?"]
passages = [
    "Paris is the capital and largest city of France.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "The Eiffel Tower is located in Paris.",
]

# Encode with dense output
q_embeddings = model.encode(queries)["dense_vecs"]
p_embeddings = model.encode(passages)["dense_vecs"]

# Compute dense similarity (cosine similarity for normalized embeddings)
scores = q_embeddings @ p_embeddings.T
print(scores)
# Output shape: (2, 3) - each query scored against each passage

Example 2: Sparse Lexical Matching

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

queries = ["What is the capital of France?"]
passages = ["Paris is the capital and largest city of France."]

# Encode with sparse output
q_output = model.encode(queries, return_sparse=True)
p_output = model.encode(passages, return_sparse=True)

q_lexical_weights = q_output["lexical_weights"]
p_lexical_weights = p_output["lexical_weights"]

# Compute sparse lexical matching score
sparse_score = model.compute_lexical_matching_score(
    q_lexical_weights[0], p_lexical_weights[0]
)
print(f"Sparse score: {sparse_score}")
# Returns a single float for dict-dict input

# Batch scoring: pass lists of dicts
sparse_scores = model.compute_lexical_matching_score(
    q_lexical_weights, p_lexical_weights
)
print(f"Sparse scores shape: {sparse_scores.shape}")
# Returns np.ndarray of shape (num_queries, num_passages)

Example 3: ColBERT Scoring

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

query = "What is the capital of France?"
passage = "Paris is the capital and largest city of France."

# Encode with ColBERT output
q_output = model.encode([query], return_colbert_vecs=True)
p_output = model.encode([passage], return_colbert_vecs=True)

q_colbert_vecs = q_output["colbert_vecs"][0]  # single query token embeddings
p_colbert_vecs = p_output["colbert_vecs"][0]  # single passage token embeddings

# Compute ColBERT score via MaxSim
colbert_score = model.colbert_score(q_colbert_vecs, p_colbert_vecs)
print(f"ColBERT score: {colbert_score.item()}")

Example 4: Combined Multi-Method Scoring

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

sentence_pairs = [
    ["What is the capital of France?", "Paris is the capital of France."],
    ["How does photosynthesis work?", "Plants convert light to energy."],
]

# compute_score returns all scoring methods combined
scores = model.compute_score(
    sentence_pairs,
    weights_for_different_modes=[1.0, 1.0, 1.0]  # [dense, sparse, colbert]
)
print(scores)
# Returns dict with keys: 'colbert', 'sparse', 'dense', 'sparse+dense', 'colbert+sparse+dense'
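
One way to read weights_for_different_modes is as the coefficients of a weighted average over the three per-method scores. The sketch below assumes that interpretation, including the division by the sum of the weights. The function name weighted_combination and the input scores are hypothetical, and the exact normalization used by compute_score is not stated on this page.

```python
from typing import Sequence


def weighted_combination(
    dense: float,
    sparse: float,
    colbert: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),  # [dense, sparse, colbert]
) -> float:
    """Weighted average of three scores (assumed combination rule)."""
    w_dense, w_sparse, w_colbert = weights
    total = w_dense + w_sparse + w_colbert
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return (w_dense * dense + w_sparse * sparse + w_colbert * colbert) / total


# Made-up per-method scores for a single query-passage pair.
equal = weighted_combination(0.6, 0.2, 0.7)
# Down-weighting the sparse signal shifts the result toward dense and ColBERT.
skewed = weighted_combination(0.6, 0.2, 0.7, weights=(0.4, 0.2, 0.4))

print(equal)
print(skewed)
```

With equal weights the result is the plain mean of the three scores. Lowering one weight reduces that method's influence on the combined score.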
