Implementation:FlagOpen FlagEmbedding Embedding Similarity Scoring

Field	Value
Type	Pattern Doc (user-side computation patterns)
Source	User-side numpy/torch + `FlagEmbedding/inference/embedder/encoder_only/m3.py:L129-177` for sparse and ColBERT matching

Interface

Three scoring methods are available after encoding queries and passages with an M3 embedder:

1. Dense Similarity

Direct matrix multiplication of query and passage embeddings:

scores = embeddings_q @ embeddings_p.T

When the embeddings are L2-normalized (the default behavior of the encode methods), each dot product equals the cosine similarity of the corresponding query and passage.

Parameter	Type	Description
embeddings_q	`np.ndarray` (shape: [num_queries, dim])	Dense query embeddings from `model.encode_queries()` or `model.encode()`
embeddings_p	`np.ndarray` (shape: [num_passages, dim])	Dense passage embeddings from `model.encode_corpus()` or `model.encode()`
Returns	`np.ndarray` (shape: [num_queries, num_passages])	Cosine similarity matrix
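
The relationship between the dot product and cosine similarity can be checked without loading a model. The following is a minimal sketch using random stand-in arrays; `cosine_similarity_matrix` is a hypothetical helper name, not part of FlagEmbedding, and the explicit normalization is only needed if the embeddings were produced without normalization.

```python
import numpy as np


def cosine_similarity_matrix(q: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Normalize each row to unit L2 norm, then take all pairwise dot products."""
    q_unit = q / np.linalg.norm(q, axis=1, keepdims=True)
    p_unit = p / np.linalg.norm(p, axis=1, keepdims=True)
    return q_unit @ p_unit.T


rng = np.random.default_rng(0)
q = rng.normal(size=(2, 8))  # stand-in for 2 query embeddings (hypothetical data)
p = rng.normal(size=(3, 8))  # stand-in for 3 passage embeddings (hypothetical data)

scores = cosine_similarity_matrix(q, p)
print(scores.shape)  # (2, 3)
```

If the inputs are already unit-normalized, the helper reduces to the plain `q @ p.T` shown above.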

2. Sparse Lexical Matching

M3Embedder.compute_lexical_matching_score(lexical_weights_1, lexical_weights_2)

Parameter	Type	Description
lexical_weights_1	`Union[Dict[str, float], List[Dict[str, float]]]`	Lexical weights for queries. Each dict maps tokens to learned weights.
lexical_weights_2	`Union[Dict[str, float], List[Dict[str, float]]]`	Lexical weights for passages. Each dict maps tokens to learned weights.
Returns	`Union[float, np.ndarray]`	Single float for dict-dict input; 2D array (shape: [num_queries, num_passages]) for list-list input.
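
For intuition, the dict-dict case can be sketched in plain Python. This assumes the score is the sum, over tokens present in both dictionaries, of the product of their weights; the authoritative logic is in the `m3.py` source referenced above. The function name and the readable token keys below are hypothetical.

```python
from typing import Dict


def lexical_overlap_score(w1: Dict[str, float], w2: Dict[str, float]) -> float:
    """Sum of weight products over tokens shared by both dictionaries (assumed formula)."""
    score = 0.0
    for token, weight in w1.items():
        if token in w2:
            score += weight * w2[token]
    return score


# Hypothetical weights, chosen so the arithmetic is easy to follow.
query_weights = {"capital": 0.5, "france": 0.25}
passage_weights = {"paris": 0.5, "capital": 0.5, "france": 0.5}

score = lexical_overlap_score(query_weights, passage_weights)
print(score)  # 0.375 = 0.5 * 0.5 + 0.25 * 0.5
```

Tokens that appear in only one dictionary ("paris" here) contribute nothing, which is why the score behaves like a weighted exact-match signal.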

3. ColBERT Token-Level Interaction

M3Embedder.colbert_score(q_reps, p_reps)

Parameter	Type	Description
q_reps	`np.ndarray`	Multi-vector (token-level) embeddings for a single query. Shape: [num_query_tokens, dim].
p_reps	`np.ndarray`	Multi-vector (token-level) embeddings for a single passage. Shape: [num_passage_tokens, dim].
Returns	`torch.Tensor`	Scalar ColBERT score: average of per-query-token maximum similarities.
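
The "average of per-query-token maximum similarities" described above can be illustrated with a small NumPy sketch. It is not the library implementation (which returns a `torch.Tensor`); `maxsim_score` is a hypothetical name and the vectors are toy values.

```python
import numpy as np


def maxsim_score(q_reps: np.ndarray, p_reps: np.ndarray) -> float:
    """Mean over query tokens of the best dot product against any passage token."""
    token_scores = q_reps @ p_reps.T      # shape: [num_query_tokens, num_passage_tokens]
    best_per_query_token = token_scores.max(axis=1)
    return float(best_per_query_token.mean())


# Toy token embeddings: 2 query tokens, 3 passage tokens, dim 2.
q_reps = np.array([[1.0, 0.0], [0.0, 1.0]])
p_reps = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 0.5]])

score = maxsim_score(q_reps, p_reps)
print(score)  # 0.75: per-token maxima are 1.0 and 0.5
```

Because the maximum is taken per query token, the score is not symmetric: swapping `q_reps` and `p_reps` generally gives a different value.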

I/O

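Input: Embeddings produced by AbsEmbedder.encode(), encode_queries(), or encode_corpus(). For M3 models, the encode methods return dictionaries with keys "dense_vecs", "lexical_weights", and "colbert_vecs". The sparse and ColBERT outputs must be requested explicitly with return_sparse=True and return_colbert_vecs=True, as in the examples below.

Output: Similarity scores as float (single pair, sparse), np.ndarray (batch, dense or sparse), or a scalar torch.Tensor (ColBERT). Higher scores indicate greater relevance.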

Examples

Example 1: Dense Scoring

import numpy as np
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

queries = ["What is the capital of France?", "How does photosynthesis work?"]
passages = [
    "Paris is the capital and largest city of France.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "The Eiffel Tower is located in Paris.",
]

# Encode with dense output
q_embeddings = model.encode(queries)["dense_vecs"]
p_embeddings = model.encode(passages)["dense_vecs"]

# Compute dense similarity (cosine similarity for normalized embeddings)
scores = q_embeddings @ p_embeddings.T
print(scores)
# Output shape: (2, 3) - each query scored against each passage
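
A score matrix of this shape is usually turned into a per-query ranking. The sketch below uses a hypothetical score matrix in place of real model output and sorts passage indices from most to least similar.

```python
import numpy as np

# Hypothetical scores for 2 queries x 3 passages (not real model output).
scores = np.array([
    [0.9, 0.1, 0.6],
    [0.2, 0.8, 0.3],
])

# Negate so that argsort's ascending order yields highest score first.
ranking = np.argsort(-scores, axis=1)
print(ranking.tolist())  # [[0, 2, 1], [1, 2, 0]]

# Best passage index for each query.
best = ranking[:, 0]
print(best.tolist())  # [0, 1]
```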

Example 2: Sparse Lexical Matching

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

queries = ["What is the capital of France?"]
passages = ["Paris is the capital and largest city of France."]

# Encode with sparse output
q_output = model.encode(queries, return_sparse=True)
p_output = model.encode(passages, return_sparse=True)

q_lexical_weights = q_output["lexical_weights"]
p_lexical_weights = p_output["lexical_weights"]

# Compute sparse lexical matching score
sparse_score = model.compute_lexical_matching_score(
    q_lexical_weights[0], p_lexical_weights[0]
)
print(f"Sparse score: {sparse_score}")
# Returns a single float for dict-dict input

# Batch scoring: pass lists of dicts
sparse_scores = model.compute_lexical_matching_score(
    q_lexical_weights, p_lexical_weights
)
print(f"Sparse scores shape: {sparse_scores.shape}")
# Returns np.ndarray of shape (num_queries, num_passages)

Example 3: ColBERT Scoring

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

query = "What is the capital of France?"
passage = "Paris is the capital and largest city of France."

# Encode with ColBERT output
q_output = model.encode([query], return_colbert_vecs=True)
p_output = model.encode([passage], return_colbert_vecs=True)

q_colbert_vecs = q_output["colbert_vecs"][0]  # single query token embeddings
p_colbert_vecs = p_output["colbert_vecs"][0]  # single passage token embeddings

# Compute ColBERT score via MaxSim
colbert_score = model.colbert_score(q_colbert_vecs, p_colbert_vecs)
print(f"ColBERT score: {colbert_score.item()}")

Example 4: Combined Multi-Method Scoring

from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

sentence_pairs = [
    ["What is the capital of France?", "Paris is the capital of France."],
    ["How does photosynthesis work?", "Plants convert light to energy."],
]

# compute_score returns per-mode scores plus their weighted combinations
scores = model.compute_score(
    sentence_pairs,
    weights_for_different_modes=[1.0, 1.0, 1.0]  # [dense, sparse, colbert]
)
print(scores)
# Returns dict with keys: 'colbert', 'sparse', 'dense', 'sparse+dense', 'colbert+sparse+dense'
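
How the weights might enter the combined score can be sketched as follows. This assumes a weighted average normalized by the sum of the weights, with the weight order [dense, sparse, colbert] noted in the comment above; check the library source for the exact formula. `combine_scores` and the input values are hypothetical.

```python
from typing import Sequence


def combine_scores(dense: float, sparse: float, colbert: float,
                   weights: Sequence[float]) -> float:
    """Weighted average of the three scores (assumed normalization by weight sum)."""
    w_dense, w_sparse, w_colbert = weights
    total = w_dense + w_sparse + w_colbert
    return (w_dense * dense + w_sparse * sparse + w_colbert * colbert) / total


# Hypothetical per-mode scores for one query-passage pair.
combined = combine_scores(dense=0.5, sparse=0.25, colbert=0.75,
                          weights=[1.0, 1.0, 1.0])
print(combined)  # 0.5
```

Setting a weight to 0.0 in this sketch drops that mode from the combination entirely.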

Related Pages

Principle:FlagOpen_FlagEmbedding_Embedding_Similarity_Computation
