Implementation:Explodinggradients Ragas Collections NoiseSensitivity Metric

From Leeroopedia
Revision as of 14:53, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Explodinggradients_Ragas_Collections_NoiseSensitivity_Metric.md)


| Field | Value |
|---|---|
| source | Repo |
| domains | Metrics, Evaluation |
| last_updated | 2026-02-10 00:00 GMT |

Overview

NoiseSensitivity is a v2 class-based metric that measures how often a system produces incorrect claims in its response when it draws on retrieved documents, whether relevant or irrelevant. It relies on statement decomposition and natural language inference (NLI).

Description

NoiseSensitivity extends BaseMetric and requires a modern InstructorBaseRagasLLM. It operates in two configurable modes: "relevant" (default) and "irrelevant".

The evaluation algorithm proceeds in four steps:

  1. Statement Decomposition -- Both the reference and response texts are decomposed into atomic statements using StatementGeneratorPrompt. The LLM extracts individual factual claims from each text.
  2. Faithfulness Evaluation -- Each statement (from both reference and response) is evaluated for faithfulness against every retrieved context using NLI via StatementFaithfulnessPrompt, producing binary verdict arrays.
  3. Matrix Construction -- Three boolean matrices are constructed: retrieved2ground_truth, retrieved2answer, and ground_truth2answer, capturing the faithfulness relationships between statements and contexts, and between the reference and the answer statements.
  4. Score Computation -- In "relevant" mode, the score is the fraction of answer statements that are both incorrect and faithful to at least one relevant context. In "irrelevant" mode, it is the fraction of answer statements that are both incorrect and faithful to at least one irrelevant context, excluding statements that are also faithful to a relevant context.

A lower score indicates better performance (less sensitivity to noise).
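
The score computation in step 4 can be sketched with NumPy as follows. This is a minimal illustration, not the library's code: the function name noise_sensitivity_score, the array shapes (statements along rows, contexts along columns), and the rule that a context counts as relevant when it supports at least one reference statement are all assumptions made for the example.

```python
import numpy as np


def noise_sensitivity_score(
    retrieved2ground_truth,  # assumed shape: (n_reference_statements, n_contexts)
    retrieved2answer,        # assumed shape: (n_answer_statements, n_contexts)
    ground_truth2answer,     # assumed shape: (n_answer_statements,)
    mode="relevant",
):
    """Illustrative scoring from boolean faithfulness arrays (hypothetical helper)."""
    if mode not in ("relevant", "irrelevant"):
        raise ValueError(f"unknown mode: {mode!r}")

    r2gt = np.asarray(retrieved2ground_truth, dtype=bool)
    r2a = np.asarray(retrieved2answer, dtype=bool)
    gt2a = np.asarray(ground_truth2answer, dtype=bool)

    # An answer statement is incorrect when the reference does not support it.
    incorrect = ~gt2a

    # Assumption: a context is relevant if it supports any reference statement.
    relevant_context = r2gt.any(axis=0)

    # Answer statements supported by at least one relevant context.
    relevant_faithful = (r2a & relevant_context).any(axis=1)

    if mode == "relevant":
        return float(np.mean(relevant_faithful & incorrect))

    # Supported by an irrelevant context, but by no relevant one.
    irrelevant_faithful = (r2a & ~relevant_context).any(axis=1)
    irrelevant_faithful &= ~relevant_faithful
    return float(np.mean(irrelevant_faithful & incorrect))


# Toy data: 1 reference statement, 3 contexts, 4 answer statements.
r2gt = [[True, False, False]]          # only context 0 is relevant
r2a = [
    [True, False, False],              # correct, backed by the relevant context
    [True, True, False],               # incorrect, backed by relevant and irrelevant
    [False, True, False],              # incorrect, backed by an irrelevant context
    [False, False, True],              # incorrect, backed by an irrelevant context
]
gt2a = [True, False, False, False]

print(noise_sensitivity_score(r2gt, r2a, gt2a, mode="relevant"))    # 0.25
print(noise_sensitivity_score(r2gt, r2a, gt2a, mode="irrelevant"))  # 0.5
```

In the toy data, the second answer statement is supported by both a relevant and an irrelevant context, so it counts toward the "relevant" score only; this mirrors the exclusion described for "irrelevant" mode.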

Usage

Instantiate the metric with the required llm parameter and an optional mode ("relevant" or "irrelevant"), then await ascore(user_input, response, reference, retrieved_contexts).

Code Reference

| Property | Value |
|---|---|
| Source Location | src/ragas/metrics/collections/noise_sensitivity/metric.py L1--235 |
| Signature | class NoiseSensitivity(BaseMetric) |
| Import | from ragas.metrics.collections import NoiseSensitivity |

I/O Contract

Inputs

| Parameter | Type | Required | Description |
|---|---|---|---|
| user_input | str | Yes | The original question |
| response | str | Yes | The generated answer to evaluate |
| reference | str | Yes | The ground truth reference answer |
| retrieved_contexts | List[str] | Yes | Retrieved contexts used to generate the response |

Constructor Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| llm | InstructorBaseRagasLLM | (required) | Modern instructor-based LLM for statement generation and NLI |
| name | str | "noise_sensitivity" | Metric name |
| mode | Literal["relevant", "irrelevant"] | "relevant" | Context sensitivity mode |

Outputs

| Field | Type | Description |
|---|---|---|
| MetricResult.value | float | Noise sensitivity score in range 0.0--1.0 (lower is better) |

Internal Methods

| Method | Description |
|---|---|
| _decompose_answer_into_statements | Breaks text into atomic factual statements via LLM |
| _evaluate_statement_faithfulness | NLI evaluation of statements against a context, returning binary verdicts |
| _compute_score | Computes noise sensitivity from faithfulness boolean matrices |
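
To illustrate how binary verdicts from _evaluate_statement_faithfulness could be assembled into a statement-by-context matrix, the sketch below stacks one verdict list per context. The keyword_nli judge and the build_matrix helper are hypothetical stand-ins; the real metric obtains its verdicts from the LLM asynchronously, and its exact array layout is not confirmed here.

```python
from typing import Callable, List

import numpy as np

# Hypothetical judge signature: statements plus one context in,
# one boolean verdict per statement out.
Judge = Callable[[List[str], str], List[bool]]


def keyword_nli(statements: List[str], context: str) -> List[bool]:
    """Toy stand-in for NLI: faithful if every word of the statement is in the context."""
    context_words = set(context.lower().split())
    return [set(s.lower().split()) <= context_words for s in statements]


def build_matrix(statements: List[str], contexts: List[str], judge: Judge) -> np.ndarray:
    """Stack per-context verdict lists into a (n_statements, n_contexts) boolean array."""
    per_context = [judge(statements, context) for context in contexts]
    return np.array(per_context, dtype=bool).T


contexts = [
    "lic was established in 1956",
    "mumbai weather is tropical",
]
statements = [
    "lic established 1956",
    "weather is tropical",
    "lic is largest",
]

matrix = build_matrix(statements, contexts, keyword_nli)
print(matrix.shape)  # (3, 2)
print(matrix)
```

Each row corresponds to one statement and each column to one retrieved context, so a row with no True entries marks a statement that no context supports.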

Usage Examples

import asyncio

from openai import AsyncOpenAI
from ragas.llms.base import llm_factory
from ragas.metrics.collections import NoiseSensitivity


async def main() -> None:
    # Setup
    client = AsyncOpenAI()
    llm = llm_factory("gpt-4o-mini", client=client)

    # Relevant context noise sensitivity (default)
    metric = NoiseSensitivity(llm=llm)
    result = await metric.ascore(
        user_input="What is LIC known for?",
        response="LIC is the largest insurance company in India.",
        reference="LIC is known for managing large-scale investments.",
        retrieved_contexts=[
            "LIC was established in 1956 by the Government of India.",
            "LIC manages a large portfolio of investments.",
            "India has many financial institutions.",
        ],
    )
    print(f"Noise Sensitivity (relevant): {result.value}")

    # Irrelevant context noise sensitivity
    irr_metric = NoiseSensitivity(llm=llm, mode="irrelevant")
    result = await irr_metric.ascore(
        user_input="What is LIC known for?",
        response="LIC is the largest insurance company in India.",
        reference="LIC is known for managing large-scale investments.",
        retrieved_contexts=[
            "LIC was established in 1956 by the Government of India.",
            "The weather in Mumbai is tropical.",
        ],
    )
    print(f"Noise Sensitivity (irrelevant): {result.value}")


# ascore is awaited, so it must run inside an event loop.
# In a notebook, use `await main()` instead.
asyncio.run(main())
