Implementation:Explodinggradients Ragas Collections NoiseSensitivity Metric
| Field | Value |
|---|---|
| source | Repo |
| domains | Metrics, Evaluation |
| last_updated | 2026-02-10 00:00 GMT |
Overview
NoiseSensitivity is a v2 class-based metric that measures how often a system makes errors by providing incorrect responses when utilizing relevant or irrelevant retrieved documents, using statement decomposition and natural language inference (NLI).
Description
NoiseSensitivity extends BaseMetric and requires a modern InstructorBaseRagasLLM. It operates in two configurable modes: "relevant" (default) and "irrelevant".
The evaluation algorithm proceeds in four steps:
- Statement Decomposition -- Both the reference and response texts are decomposed into atomic statements using StatementGeneratorPrompt. The LLM extracts individual factual claims from each text.
- Faithfulness Evaluation -- Each statement (from both reference and response) is evaluated for faithfulness against every retrieved context using NLI via StatementFaithfulnessPrompt, producing binary verdict arrays.
- Matrix Construction -- Three boolean matrices are constructed: retrieved2ground_truth, retrieved2answer, and ground_truth2answer, capturing the faithfulness relationships between statements and contexts.
- Score Computation -- In "relevant" mode, the score is the mean, taken over answer statements, of those that are incorrect and faithful to relevant contexts. In "irrelevant" mode, the score is the mean of incorrect answer statements that are faithful to irrelevant contexts (excluding those also faithful to relevant contexts).
A lower score indicates better performance (less sensitivity to noise).
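The score computation step can be illustrated with a small, dependency-free sketch. This is not the library's code: the function name `noise_sensitivity_score`, the matrix layouts (rows are contexts, columns are statements), and the rule that a context counts as relevant when it supports at least one reference statement are assumptions made for illustration only.

```python
from typing import List, Literal


def noise_sensitivity_score(
    retrieved2ground_truth: List[List[bool]],  # assumed shape: [context][reference statement]
    retrieved2answer: List[List[bool]],        # assumed shape: [context][answer statement]
    ground_truth2answer: List[bool],           # assumed shape: [answer statement]
    mode: Literal["relevant", "irrelevant"] = "relevant",
) -> float:
    """Illustrative scoring over boolean faithfulness matrices (hypothetical helper)."""
    n_answer = len(ground_truth2answer)
    if n_answer == 0:
        raise ValueError("this sketch needs at least one answer statement")

    # Assumption: a context is "relevant" if it supports any reference statement.
    relevant_context = [any(row) for row in retrieved2ground_truth]

    flagged = 0
    for j in range(n_answer):
        # An answer statement is incorrect if the reference does not support it.
        incorrect = not ground_truth2answer[j]
        relevant_faithful = any(
            retrieved2answer[i][j] for i, rel in enumerate(relevant_context) if rel
        )
        irrelevant_faithful = any(
            retrieved2answer[i][j] for i, rel in enumerate(relevant_context) if not rel
        )
        if mode == "relevant":
            hit = relevant_faithful
        else:
            # Exclude statements that are also faithful to a relevant context.
            hit = irrelevant_faithful and not relevant_faithful
        flagged += int(incorrect and hit)

    return flagged / n_answer


if __name__ == "__main__":
    r2g = [[True, False], [False, False], [False, False]]
    r2a = [
        [True, False, False, False],
        [False, True, True, False],
        [False, False, False, False],
    ]
    g2a = [False, False, False, True]
    print(noise_sensitivity_score(r2g, r2a, g2a, mode="relevant"))    # 0.25
    print(noise_sensitivity_score(r2g, r2a, g2a, mode="irrelevant"))  # 0.5
```

In this toy input, one of four answer statements is incorrect and backed by the relevant context, while two are incorrect and backed only by an irrelevant context, so the two modes give different scores.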
Usage
Instantiate with a required llm parameter and optional mode ("relevant" or "irrelevant"). Call ascore(user_input, response, reference, retrieved_contexts).
Code Reference
| Property | Value |
|---|---|
| Source Location | src/ragas/metrics/collections/noise_sensitivity/metric.py L1--235 |
| Signature | class NoiseSensitivity(BaseMetric) |
| Import | from ragas.metrics.collections import NoiseSensitivity |
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| user_input | str | Yes | The original question |
| response | str | Yes | The generated answer to evaluate |
| reference | str | Yes | The ground truth reference answer |
| retrieved_contexts | List[str] | Yes | Retrieved contexts used to generate the response |
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| llm | InstructorBaseRagasLLM | (required) | Modern instructor-based LLM for statement generation and NLI |
| name | str | "noise_sensitivity" | Metric name |
| mode | Literal["relevant", "irrelevant"] | "relevant" | Context sensitivity mode |
Outputs
| Field | Type | Description |
|---|---|---|
| MetricResult.value | float | Noise sensitivity score in range 0.0--1.0 (lower is better) |
Internal Methods
| Method | Description |
|---|---|
| _decompose_answer_into_statements | Breaks text into atomic factual statements via LLM |
| _evaluate_statement_faithfulness | NLI evaluation of statements against a context, returning binary verdicts |
| _compute_score | Computes noise sensitivity from faithfulness boolean matrices |
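How the decomposition and faithfulness helpers could feed the three matrices is sketched below. This is a minimal illustration, not the class's implementation: `build_matrices`, `toy_decompose`, and `toy_verify` are hypothetical names, the toy functions replace the LLM calls with string splitting and substring matching, and the choice to fill ground_truth2answer by checking answer statements against the reference text is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical callable types standing in for the LLM-backed helpers.
Decompose = Callable[[str], Awaitable[List[str]]]
Verify = Callable[[List[str], str], Awaitable[List[bool]]]


async def build_matrices(
    response: str,
    reference: str,
    retrieved_contexts: List[str],
    decompose: Decompose,
    verify: Verify,
) -> Dict[str, list]:
    """Assemble boolean faithfulness matrices (illustrative layout: rows are contexts)."""
    reference_statements = await decompose(reference)
    answer_statements = await decompose(response)

    # One row of verdicts per retrieved context.
    retrieved2ground_truth = [
        await verify(reference_statements, ctx) for ctx in retrieved_contexts
    ]
    retrieved2answer = [
        await verify(answer_statements, ctx) for ctx in retrieved_contexts
    ]
    # Assumption: answer statements are checked against the reference text.
    ground_truth2answer = await verify(answer_statements, reference)

    return {
        "retrieved2ground_truth": retrieved2ground_truth,
        "retrieved2answer": retrieved2answer,
        "ground_truth2answer": ground_truth2answer,
    }


async def toy_decompose(text: str) -> List[str]:
    # Stand-in for LLM decomposition: one statement per sentence.
    return [s.strip() for s in text.split(".") if s.strip()]


async def toy_verify(statements: List[str], context: str) -> List[bool]:
    # Stand-in for NLI: a statement counts as supported if it appears in the context.
    return [s.lower() in context.lower() for s in statements]


if __name__ == "__main__":
    matrices = asyncio.run(
        build_matrices(
            response="A is big. B is small.",
            reference="A is big.",
            retrieved_contexts=["A is big. C is red.", "B is small."],
            decompose=toy_decompose,
            verify=toy_verify,
        )
    )
    print(matrices)
```

Passing the helpers in as callables keeps the sketch testable without an LLM; the real methods are bound to the metric instance and its prompts.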
Usage Examples
```python
from openai import AsyncOpenAI
from ragas.llms.base import llm_factory
from ragas.metrics.collections import NoiseSensitivity

# Setup
client = AsyncOpenAI()
llm = llm_factory("gpt-4o-mini", client=client)

# Note: top-level `await` works in notebooks and async REPLs. In a script,
# place these calls inside an `async def` function and run it with asyncio.run().

# Relevant context noise sensitivity (default)
metric = NoiseSensitivity(llm=llm)
result = await metric.ascore(
    user_input="What is LIC known for?",
    response="LIC is the largest insurance company in India.",
    reference="LIC is known for managing large-scale investments.",
    retrieved_contexts=[
        "LIC was established in 1956 by the Government of India.",
        "LIC manages a large portfolio of investments.",
        "India has many financial institutions.",
    ],
)
print(f"Noise Sensitivity (relevant): {result.value}")

# Irrelevant context noise sensitivity
irr_metric = NoiseSensitivity(llm=llm, mode="irrelevant")
result = await irr_metric.ascore(
    user_input="What is LIC known for?",
    response="LIC is the largest insurance company in India.",
    reference="LIC is known for managing large-scale investments.",
    retrieved_contexts=[
        "LIC was established in 1956 by the Government of India.",
        "The weather in Mumbai is tropical.",
    ],
)
print(f"Noise Sensitivity (irrelevant): {result.value}")
```
Related Pages
- Explodinggradients_Ragas_Collections_BaseMetric_Class -- Base class for all v2 metrics
- Explodinggradients_Ragas_Collections_ResponseGroundedness_Metric -- Evaluates groundedness in retrieved contexts
- Explodinggradients_Ragas_Collections_ContextPrecision_Metric -- Evaluates context usefulness
- Explodinggradients_Ragas_Collections_ContextRelevance_Metric -- Evaluates context pertinence