Implementation:Explodinggradients Ragas Collections NoiseSensitivity Metric

Field	Value
source	Repo
domains	Metrics, Evaluation
last_updated	2026-02-10 00:00 GMT

Overview

NoiseSensitivity is a v2 class-based metric that measures how often a system produces incorrect statements in its response when drawing on relevant or irrelevant retrieved documents. It relies on statement decomposition and natural language inference (NLI).

Description

NoiseSensitivity extends BaseMetric and requires a modern InstructorBaseRagasLLM. It operates in two configurable modes: "relevant" (default) and "irrelevant".

The evaluation algorithm proceeds in four steps:

Statement Decomposition -- Both the reference and response texts are decomposed into atomic statements using StatementGeneratorPrompt. The LLM extracts individual factual claims from each text.
Faithfulness Evaluation -- Each statement (from both reference and response) is evaluated for faithfulness against every retrieved context using NLI via StatementFaithfulnessPrompt, producing binary verdict arrays.
Matrix Construction -- Three boolean matrices are constructed: retrieved2ground_truth, retrieved2answer, and ground_truth2answer, capturing the faithfulness relationships among retrieved contexts, reference statements, and response statements.
Score Computation -- In "relevant" mode, the score is the fraction of response statements that are incorrect yet faithful to a relevant context. In "irrelevant" mode, the score is the fraction of response statements that are incorrect and faithful to an irrelevant context (excluding those also faithful to a relevant context).

A lower score indicates better performance (less sensitivity to noise).
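
The score computation step can be illustrated with a minimal pure-Python sketch. It is not the library's code: the function name noise_sensitivity_score, the matrix orientation (one row per statement, one column per retrieved context), the rule that a context counts as relevant when it supports at least one reference statement, and the 0.0 result for an empty response are all assumptions made for illustration.

```python
from typing import List


def noise_sensitivity_score(
    retrieved2ground_truth: List[List[bool]],  # assumed: reference statements x contexts
    retrieved2answer: List[List[bool]],        # assumed: response statements x contexts
    ground_truth2answer: List[bool],           # assumed: True if the reference supports the response statement
    mode: str = "relevant",
) -> float:
    if mode not in ("relevant", "irrelevant"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not ground_truth2answer:
        return 0.0  # assumption: no response statements means nothing can be wrong

    n_contexts = len(retrieved2answer[0])
    # Assumption: a context is "relevant" if it supports any reference statement.
    relevant = [
        any(row[c] for row in retrieved2ground_truth) for c in range(n_contexts)
    ]

    hits = 0
    for verdicts, correct in zip(retrieved2answer, ground_truth2answer):
        rel_faithful = any(v and r for v, r in zip(verdicts, relevant))
        # Irrelevant-faithful statements exclude those also backed by a relevant context.
        irr_faithful = (
            any(v and not r for v, r in zip(verdicts, relevant)) and not rel_faithful
        )
        faithful = rel_faithful if mode == "relevant" else irr_faithful
        if faithful and not correct:
            hits += 1
    return hits / len(ground_truth2answer)


# One reference statement, three contexts; only the second context supports it.
r2gt = [[False, True, False]]
# Two response statements: the first leans on context 0, the second on context 1.
r2a = [[True, False, False], [False, True, False]]
# Neither response statement is supported by the reference.
gt2a = [False, False]

relevant_score = noise_sensitivity_score(r2gt, r2a, gt2a, mode="relevant")
irrelevant_score = noise_sensitivity_score(r2gt, r2a, gt2a, mode="irrelevant")
```

In this toy input each mode flags exactly one of the two incorrect response statements, which shows how the two modes partition errors by the kind of context that backs them.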

Usage

Instantiate with a required llm parameter and an optional mode ("relevant" or "irrelevant"). Then await ascore(user_input, response, reference, retrieved_contexts), which returns a MetricResult.

Code Reference

Property	Value
Source Location	`src/ragas/metrics/collections/noise_sensitivity/metric.py` L1--235
Signature	`class NoiseSensitivity(BaseMetric)`
Import	`from ragas.metrics.collections import NoiseSensitivity`

I/O Contract

Inputs

Parameter	Type	Required	Description
`user_input`	`str`	Yes	The original question
`response`	`str`	Yes	The generated answer to evaluate
`reference`	`str`	Yes	The ground truth reference answer
`retrieved_contexts`	`List[str]`	Yes	Retrieved contexts used to generate the response

Constructor Parameters

Parameter	Type	Default	Description
`llm`	`InstructorBaseRagasLLM`	(required)	Modern instructor-based LLM for statement generation and NLI
`name`	`str`	`"noise_sensitivity"`	Metric name
`mode`	`Literal["relevant", "irrelevant"]`	`"relevant"`	Context sensitivity mode

Outputs

Field	Type	Description
`MetricResult.value`	`float`	Noise sensitivity score in range 0.0--1.0 (lower is better)

Internal Methods

Method	Description
`_decompose_answer_into_statements`	Breaks text into atomic factual statements via LLM
`_evaluate_statement_faithfulness`	NLI evaluation of statements against a context, returning binary verdicts
`_compute_score`	Computes noise sensitivity from faithfulness boolean matrices
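
To show how per-context binary verdicts might be assembled into a statement-by-context boolean matrix, here is a rough sketch. The helper names build_verdict_matrix and toy_judge are hypothetical, and the word-overlap judge is only a stand-in for the LLM-based NLI call; the real metric's matrix orientation and internals may differ.

```python
import re
from typing import Callable, List


def toy_judge(statements: List[str], context: str) -> List[int]:
    """Stand-in for NLI: verdict 1 if every word of the statement occurs in the context."""
    context_words = set(re.findall(r"\w+", context.lower()))
    return [
        int(set(re.findall(r"\w+", s.lower())) <= context_words) for s in statements
    ]


def build_verdict_matrix(
    statements: List[str],
    contexts: List[str],
    judge: Callable[[List[str], str], List[int]],
) -> List[List[bool]]:
    # One judge call per context yields a contexts x statements grid of 0/1 verdicts.
    per_context = [judge(statements, c) for c in contexts]
    # Transpose to statements x contexts and convert verdicts to booleans.
    return [
        [bool(per_context[c][s]) for c in range(len(contexts))]
        for s in range(len(statements))
    ]


contexts = [
    "LIC was established in 1956.",
    "The weather in Mumbai is tropical.",
]
statements = ["LIC was established in 1956", "Mumbai is tropical"]
matrix = build_verdict_matrix(statements, contexts, toy_judge)
```

Each row of matrix corresponds to one statement and each column to one retrieved context, which is the orientation the scoring step is assumed to consume in this page's sketches.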

Usage Examples

import asyncio

from openai import AsyncOpenAI
from ragas.llms.base import llm_factory
from ragas.metrics.collections import NoiseSensitivity


async def main() -> None:
    # Setup: the client reads its API key from the environment
    client = AsyncOpenAI()
    llm = llm_factory("gpt-4o-mini", client=client)

    # Relevant context noise sensitivity (default)
    metric = NoiseSensitivity(llm=llm)
    result = await metric.ascore(
        user_input="What is LIC known for?",
        response="LIC is the largest insurance company in India.",
        reference="LIC is known for managing large-scale investments.",
        retrieved_contexts=[
            "LIC was established in 1956 by the Government of India.",
            "LIC manages a large portfolio of investments.",
            "India has many financial institutions.",
        ],
    )
    print(f"Noise Sensitivity (relevant): {result.value}")

    # Irrelevant context noise sensitivity
    irr_metric = NoiseSensitivity(llm=llm, mode="irrelevant")
    result = await irr_metric.ascore(
        user_input="What is LIC known for?",
        response="LIC is the largest insurance company in India.",
        reference="LIC is known for managing large-scale investments.",
        retrieved_contexts=[
            "LIC was established in 1956 by the Government of India.",
            "The weather in Mumbai is tropical.",
        ],
    )
    print(f"Noise Sensitivity (irrelevant): {result.value}")


# ascore is a coroutine, so it needs an event loop outside a notebook
asyncio.run(main())

Related Pages

Explodinggradients_Ragas_Collections_BaseMetric_Class -- Base class for all v2 metrics
Explodinggradients_Ragas_Collections_ResponseGroundedness_Metric -- Evaluates groundedness in retrieved contexts
Explodinggradients_Ragas_Collections_ContextPrecision_Metric -- Evaluates context usefulness
Explodinggradients_Ragas_Collections_ContextRelevance_Metric -- Evaluates context pertinence
