Implementation:Explodinggradients Ragas Collections AnswerRelevancy Metric

From Leeroopedia
Revision as of 14:53, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Explodinggradients_Ragas_Collections_AnswerRelevancy_Metric.md)


Field | Value
source | Repo
domains | Metrics, Evaluation
last_updated | 2026-02-10 00:00 GMT

Overview

AnswerRelevancy is a v2 class-based metric that scores how relevant a response is to the original question. It has two components: an LLM generates questions from the response, and an embedding model measures the cosine similarity between those questions and the original one.

Description

AnswerRelevancy extends BaseMetric and requires both a modern InstructorBaseRagasLLM and a BaseRagasEmbedding. The evaluation algorithm works as follows:

  1. In each of strictness iterations (default 3), the LLM generates a synthetic question from the response, along with a noncommittal flag indicating whether the response is evasive.
  2. The original question and all generated questions are embedded using the embeddings model.
  3. Cosine similarity is computed between the original question vector and each generated question vector.
  4. The final score is the mean cosine similarity, or 0.0 if every generation flagged the response as noncommittal.
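Steps 3 and 4 can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the library's code: the helper names cosine_similarity and relevancy_score are hypothetical, vectors are plain lists of floats, and the noncommittal rule follows the description above (the score drops to 0.0 only when every flag is set).

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevancy_score(
    question_vec: Sequence[float],
    generated_vecs: List[Sequence[float]],
    noncommittal_flags: List[bool],
) -> float:
    """Hypothetical helper: mean similarity, zeroed when all flags are set."""
    if not generated_vecs:
        return 0.0
    sims = [cosine_similarity(question_vec, g) for g in generated_vecs]
    mean_sim = sum(sims) / len(sims)
    # Assumption taken from the description: only an all-noncommittal
    # set of generations forces the score to zero.
    if noncommittal_flags and all(noncommittal_flags):
        return 0.0
    return mean_sim
```

With a question vector of [1.0, 0.0] and generated vectors [1.0, 0.0] and [0.0, 1.0], the individual similarities are 1 and 0, so the mean is 0.5 unless both generations are flagged as noncommittal.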

This approach measures whether the response actually addresses the question: a relevant answer should allow the original question to be reconstructed from it. The metric uses structured prompts via AnswerRelevancePrompt, with AnswerRelevanceInput and AnswerRelevanceOutput as its Pydantic input and output models.
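The question-generation step (step 1) might look roughly like the following. This sketch uses a stdlib dataclass as a stand-in for the Pydantic output model so that it runs without dependencies; the GeneratedQuestion type, its field names, the generate_questions helper, and the fake_generate stub are all assumptions made for illustration and are not taken from the Ragas source.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, Tuple


@dataclass
class GeneratedQuestion:
    """Stand-in for the structured output: one question plus an evasiveness flag."""
    question: str
    noncommittal: bool


# Hypothetical type: any async callable mapping a response to one structured output.
QuestionGenerator = Callable[[str], Awaitable[GeneratedQuestion]]


async def generate_questions(
    generate: QuestionGenerator, response: str, strictness: int = 3
) -> Tuple[List[str], List[bool]]:
    """Call the generator `strictness` times and split questions from flags."""
    questions: List[str] = []
    flags: List[bool] = []
    for _ in range(strictness):
        output = await generate(response)
        questions.append(output.question)
        flags.append(output.noncommittal)
    return questions, flags


async def fake_generate(response: str) -> GeneratedQuestion:
    """Stub in place of a real LLM call; flags a response containing "I don't know"."""
    evasive = "i don't know" in response.lower()
    return GeneratedQuestion(question=f"What does this say: {response}", noncommittal=evasive)
```

The returned questions would then be embedded and compared against the original question, while the flags feed the noncommittal check.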

Usage

Instantiate the metric with the required llm and embeddings arguments and, optionally, strictness (the number of questions to generate). Then await ascore(user_input=..., response=...). The base class validates both components to ensure they are modern implementations.

Code Reference

Property | Value
Source Location | src/ragas/metrics/collections/answer_relevancy/metric.py L1-157
Signature | class AnswerRelevancy(BaseMetric)
Import | from ragas.metrics.collections import AnswerRelevancy

I/O Contract

Inputs

Parameter | Type | Required | Description
user_input | str | Yes | The original question
response | str | Yes | The response text to evaluate

Constructor Parameters

Parameter | Type | Default | Description
llm | InstructorBaseRagasLLM | (required) | Modern instructor-based LLM for question generation
embeddings | BaseRagasEmbedding | (required) | Modern embeddings model for semantic comparison
name | str | "answer_relevancy" | Metric name
strictness | int | 3 | Number of questions to generate per evaluation

Outputs

Field | Type | Description
MetricResult.value | float | Answer relevancy score in range 0.0 to 1.0 (higher is better)

Usage Examples

import asyncio

import openai
from ragas.llms.base import llm_factory
from ragas.embeddings.base import embedding_factory
from ragas.metrics.collections import AnswerRelevancy


async def main() -> None:
    # Setup dependencies (AsyncOpenAI reads OPENAI_API_KEY from the environment)
    client = openai.AsyncOpenAI()
    llm = llm_factory("gpt-4o-mini", client=client)
    embeddings = embedding_factory(
        "openai", model="text-embedding-ada-002",
        client=client, interface="modern"
    )

    # Create metric
    metric = AnswerRelevancy(llm=llm, embeddings=embeddings, strictness=3)

    # Single evaluation
    result = await metric.ascore(
        user_input="What is the capital of France?",
        response="Paris is the capital of France."
    )
    print(f"Answer Relevancy: {result.value}")

    # Higher strictness averages over more generated questions
    strict_metric = AnswerRelevancy(llm=llm, embeddings=embeddings, strictness=5)
    result = await strict_metric.ascore(
        user_input="Explain quantum computing.",
        response="Quantum computing uses qubits that can exist in superposition."
    )
    print(f"Strict Answer Relevancy: {result.value}")


# ascore is a coroutine, so it needs an event loop.
# In a notebook, use `await main()` instead.
asyncio.run(main())
