Implementation:Vibrantlabsai Ragas SemanticSimilarityV2

Knowledge Sources	Vibrantlabsai_Ragas
Domains	Evaluation, Metrics
Last Updated	2026-02-12 00:00 GMT

Overview

SemanticSimilarity is a class-based v2 metric that evaluates the semantic similarity between reference and response texts by computing the cosine similarity of their embedding vectors.

Description

The SemanticSimilarity metric measures how semantically close a generated response is to a reference text using vector embeddings. It inherits from BaseMetric and requires a BaseRagasEmbedding instance to function.

The algorithm is based on the Semantic Answer Similarity (SAS) approach described in the SAS paper (Risch et al., 2021, "Semantic Answer Similarity for Evaluating Question Answering Models") and works as follows:

1. Both the reference and response texts are embedded using the provided embeddings model via the embed_text() method.
2. The resulting embedding vectors are converted to NumPy arrays.
3. Each embedding is L2-normalized (divided by its Euclidean norm).
4. The cosine similarity is computed as the dot product of the two normalized vectors: embedding_1_normalized @ embedding_2_normalized.T.
5. The resulting similarity value is flattened to a scalar.
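
The normalization and dot-product steps above can be sketched in plain NumPy. This is a minimal illustration, not the library's source: the helper name cosine_similarity and the toy 3-dimensional vectors are assumptions made for the example, and the embedding call itself is left out.

```python
import numpy as np


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity of two 1-D embedding vectors via L2 normalization.

    Hypothetical helper for illustration; not part of the Ragas API.
    """
    a = np.asarray(vec_a, dtype=float)
    b = np.asarray(vec_b, dtype=float)
    # L2-normalize each vector (divide by its Euclidean norm).
    a_normalized = a / np.linalg.norm(a)
    b_normalized = b / np.linalg.norm(b)
    # The dot product of two unit vectors is the cosine of the angle between them.
    similarity = a_normalized @ b_normalized.T
    # Flatten the result and return it as a plain Python float.
    return float(np.asarray(similarity).flatten()[0])


# Toy stand-ins for embeddings; real models return far more dimensions.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
opposite = cosine_similarity([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0])
```

Because both vectors are scaled to unit length first, the result depends only on their direction: scaling an embedding by a positive constant leaves the score unchanged.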

An optional threshold parameter enables binary classification: when set, any similarity score at or above the threshold returns 1.0 (similar), and any score below returns 0.0 (dissimilar). When threshold is None (the default), the raw cosine similarity value is returned.
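
The thresholding rule can be illustrated with a small standalone function. This is a sketch of the behavior described above, assuming a hypothetical helper named apply_threshold; the metric applies the rule internally and does not expose such a function.

```python
from typing import Optional


def apply_threshold(similarity: float, threshold: Optional[float] = None) -> float:
    """Return the raw score, or 1.0/0.0 when a threshold is set.

    Hypothetical helper for illustration; not part of the Ragas API.
    """
    if threshold is None:
        # No threshold: pass the raw cosine similarity through.
        return float(similarity)
    # Scores at or above the threshold count as similar.
    return 1.0 if similarity >= threshold else 0.0


raw = apply_threshold(0.83)
passed = apply_threshold(0.83, threshold=0.8)
boundary = apply_threshold(0.8, threshold=0.8)
failed = apply_threshold(0.79, threshold=0.8)
```

Note that the comparison is inclusive, so a score exactly equal to the threshold is classified as similar.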

Empty or None inputs are replaced with a single space character to prevent embedding errors.
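
A minimal sketch of this input guard, assuming a hypothetical helper named sanitize_text (the metric performs the substitution internally):

```python
from typing import Optional


def sanitize_text(text: Optional[str]) -> str:
    """Replace None or an empty string with a single space.

    Hypothetical helper for illustration; not part of the Ragas API.
    """
    # Both None and "" are falsy, so one check covers the two cases.
    return text if text else " "


from_none = sanitize_text(None)
from_empty = sanitize_text("")
unchanged = sanitize_text("Paris is the capital of France.")
```

In this sketch, strings that already contain only whitespace are passed through unchanged; the page does not say how the library treats them.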

Usage

Use SemanticSimilarity when you need to evaluate whether a generated response captures the same meaning as a reference text, regardless of exact wording. This is useful for evaluating paraphrasing quality, answer correctness in question-answering systems, and general text generation fidelity. Unlike lexical metrics (BLEU, ROUGE), this metric captures semantic equivalence even when different words or phrasings are used. It requires an embeddings model to be configured.

Code Reference

Source Location

Repository: Vibrantlabsai_Ragas
File: src/ragas/metrics/collections/_semantic_similarity.py

Signature

class SemanticSimilarity(BaseMetric):
    embeddings: "BaseRagasEmbedding"

    def __init__(
        self,
        embeddings: "BaseRagasEmbedding",
        name: str = "semantic_similarity",
        threshold: t.Optional[float] = None,
        **kwargs,
    ):

Import

from ragas.metrics.collections import SemanticSimilarity

I/O Contract

Inputs

Name	Type	Required	Description
embeddings	BaseRagasEmbedding	Yes	An embeddings model instance with an `embed_text()` method (validated at initialization)
reference	str	Yes	The reference/ground truth text
response	str	Yes	The response text to evaluate against the reference
threshold	float	No	Optional threshold for binary classification. When set, scores >= threshold return 1.0, otherwise 0.0 (default: None)

Outputs

Name	Type	Description
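result	MetricResult	A MetricResult object whose `value` attribute holds the cosine similarity score. Cosine similarity is mathematically bounded by -1.0 and 1.0; scores for natural-language text from typical embedding models usually fall between 0.0 and 1.0. If threshold is set, `value` is binary (0.0 or 1.0).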

Usage Examples

Basic Usage

from openai import AsyncOpenAI
from ragas.embeddings.base import embedding_factory
from ragas.metrics.collections import SemanticSimilarity

# Setup embeddings
client = AsyncOpenAI()
embeddings = embedding_factory(
    "openai",
    model="text-embedding-ada-002",
    client=client,
    interface="modern"
)

# Create metric instance
metric = SemanticSimilarity(embeddings=embeddings)

# Evaluate semantic similarity
# (top-level `await` needs an async context, e.g. a notebook or an async function)
result = await metric.ascore(
    reference="Paris is the capital of France.",
    response="The capital of France is Paris."
)
print(f"Semantic Similarity: {result.value}")

Binary Classification with Threshold

from ragas.metrics.collections import SemanticSimilarity

# Use threshold for binary pass/fail classification
# (reuses the `embeddings` object created in Basic Usage)
metric = SemanticSimilarity(embeddings=embeddings, threshold=0.8)

result = await metric.ascore(
    reference="The weather is sunny today.",
    response="It is a bright and sunny day."
)
print(f"Similar (>= 0.8): {result.value}")  # 1.0 or 0.0

Batch Evaluation

from ragas.metrics.collections import SemanticSimilarity

# Reuses the `embeddings` object created in Basic Usage
metric = SemanticSimilarity(embeddings=embeddings)

results = await metric.abatch_score([
    {"reference": "Machine learning is a subset of AI.",
     "response": "ML is part of artificial intelligence."},
    {"reference": "The sky is blue.",
     "response": "Water boils at 100 degrees Celsius."},
])

for i, result in enumerate(results):
    print(f"Sample {i}: Similarity = {result.value}")

Related Pages

Environment:Vibrantlabsai_Ragas_Python_3_9_Core_Environment
