Implementation:Explodinggradients Ragas Collections AnswerRelevancy Metric

Field	Value
source	Repo
domains	Metrics, Evaluation
last_updated	2026-02-10 00:00 GMT

Overview

AnswerRelevancy is a v2 class-based metric that evaluates how relevant a response is to the original question using a dual-component design combining LLM-based question generation with embedding cosine similarity.

Description

AnswerRelevancy extends BaseMetric and requires both a modern InstructorBaseRagasLLM and a BaseRagasEmbedding. The evaluation algorithm works as follows:

1. In each of strictness iterations (default 3), the LLM generates a synthetic question from the response, along with a noncommittal flag indicating whether the response is evasive.
2. The original question and all generated questions are embedded using the embeddings model.
3. Cosine similarity is computed between the original question vector and each generated question vector.
4. The final score is the mean cosine similarity, forced to 0.0 if every iteration flagged the response as noncommittal.

This approach measures whether the response actually addresses the question -- a relevant answer should allow reconstruction of the original question. The metric uses structured prompts via AnswerRelevancePrompt with AnswerRelevanceInput / AnswerRelevanceOutput Pydantic models.
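
The scoring steps described above can be sketched in plain Python. This is an illustrative sketch, not the code in metric.py: the function names `cosine_similarity` and `relevancy_score` are hypothetical, vectors are plain lists of floats, and returning 0.0 for empty input or zero-length vectors is an assumption made here to keep the example safe to run.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption for this sketch: a zero vector counts as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def relevancy_score(
    question_vec: Sequence[float],
    generated_vecs: List[Sequence[float]],
    noncommittal_flags: List[bool],
) -> float:
    """Mean cosine similarity, forced to 0.0 when every flag is True."""
    if not generated_vecs:
        # Assumption for this sketch: nothing to compare against scores 0.0.
        return 0.0
    similarities = [cosine_similarity(question_vec, v) for v in generated_vecs]
    mean_similarity = sum(similarities) / len(similarities)
    if all(noncommittal_flags):
        return 0.0
    return mean_similarity


if __name__ == "__main__":
    original = [1.0, 0.0]
    generated = [[1.0, 0.0], [0.0, 1.0]]
    print(relevancy_score(original, generated, [False, False]))
    print(relevancy_score(original, generated, [True, True]))
```

Note that a single committal iteration is enough to keep the mean similarity; only a unanimous noncommittal verdict zeroes the score.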

Usage

Instantiate the metric with the required llm and embeddings arguments and, optionally, strictness (the number of questions to generate). Then await ascore(user_input, response), which returns a MetricResult. The base class validates both components to ensure they are modern implementations.

Code Reference

Property	Value
Source Location	`src/ragas/metrics/collections/answer_relevancy/metric.py` L1--157
Signature	`class AnswerRelevancy(BaseMetric)`
Import	`from ragas.metrics.collections import AnswerRelevancy`

I/O Contract

Inputs

Parameter	Type	Required	Description
`user_input`	`str`	Yes	The original question
`response`	`str`	Yes	The response text to evaluate

Constructor Parameters

Parameter	Type	Default	Description
`llm`	`InstructorBaseRagasLLM`	(required)	Modern instructor-based LLM for question generation
`embeddings`	`BaseRagasEmbedding`	(required)	Modern embeddings model for semantic comparison
`name`	`str`	`"answer_relevancy"`	Metric name
`strictness`	`int`	`3`	Number of questions to generate per evaluation
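
To make the role of strictness concrete, the following sketch runs the question-generation loop offline. Everything in it is a hypothetical stand-in: `GeneratedQuestion` mimics the structured LLM output, `StubQuestionGenerator` replaces the LLM with canned text and a naive keyword check for evasiveness, and `collect_questions` is not a Ragas function. It only illustrates that one question and one noncommittal flag are collected per iteration.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneratedQuestion:
    """Hypothetical stand-in for the structured LLM output."""
    question: str
    noncommittal: bool


class StubQuestionGenerator:
    """Hypothetical offline replacement for the LLM; returns canned output."""

    def __init__(self) -> None:
        self.calls = 0

    async def generate(self, response: str) -> GeneratedQuestion:
        self.calls += 1
        # Naive evasiveness check, used only to exercise the flag.
        evasive = "i don't know" in response.lower()
        return GeneratedQuestion(
            question=f"Question {self.calls} about: {response}",
            noncommittal=evasive,
        )


async def collect_questions(
    generator: StubQuestionGenerator, response: str, strictness: int = 3
) -> Tuple[List[str], List[bool]]:
    """Call the generator `strictness` times, keeping each question and flag."""
    questions: List[str] = []
    flags: List[bool] = []
    for _ in range(strictness):
        output = await generator.generate(response)
        questions.append(output.question)
        flags.append(output.noncommittal)
    return questions, flags


if __name__ == "__main__":
    stub = StubQuestionGenerator()
    qs, fl = asyncio.run(
        collect_questions(stub, "Paris is the capital of France.", strictness=3)
    )
    for q, f in zip(qs, fl):
        print(f, q)
```

A larger strictness means more LLM calls per evaluation in exchange for a mean taken over more generated questions.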

Outputs

Field	Type	Description
`MetricResult.value`	`float`	Answer relevancy score in range 0.0--1.0 (higher is better)

Usage Examples

import asyncio

import openai
from ragas.llms.base import llm_factory
from ragas.embeddings.base import embedding_factory
from ragas.metrics.collections import AnswerRelevancy

# Setup dependencies (the OpenAI client reads OPENAI_API_KEY from the environment)
client = openai.AsyncOpenAI()
llm = llm_factory("gpt-4o-mini", client=client)
embeddings = embedding_factory(
    "openai", model="text-embedding-ada-002",
    client=client, interface="modern"
)

# Create metric
metric = AnswerRelevancy(llm=llm, embeddings=embeddings, strictness=3)

# Higher strictness for more robust evaluation
strict_metric = AnswerRelevancy(llm=llm, embeddings=embeddings, strictness=5)


async def main():
    # Single evaluation
    result = await metric.ascore(
        user_input="What is the capital of France?",
        response="Paris is the capital of France."
    )
    print(f"Answer Relevancy: {result.value}")

    # Same call pattern with the stricter metric
    result = await strict_metric.ascore(
        user_input="Explain quantum computing.",
        response="Quantum computing uses qubits that can exist in superposition."
    )
    print(f"Strict Answer Relevancy: {result.value}")


# ascore is a coroutine, so it must run inside an event loop
asyncio.run(main())

Related Pages

Explodinggradients_Ragas_Collections_BaseMetric_Class -- Base class for all v2 metrics
Explodinggradients_Ragas_Collections_SemanticSimilarity_Metric -- Pure embedding-based similarity metric
Explodinggradients_Ragas_Collections_ContextRelevance_Metric -- Context relevance evaluation
Explodinggradients_Ragas_Collections_ResponseGroundedness_Metric -- Response groundedness evaluation
