Implementation:Explodinggradients Ragas Edited Chain Runs Schema

From Leeroopedia


| Field | Value |
| --- | --- |
| source | Explodinggradients_Ragas (https://github.com/explodinggradients/ragas) |
| domains | Data, Evaluation |
| last_updated | 2026-02-10 00:00 GMT |

Overview

A JSON data fixture containing annotated evaluation samples for the answer_correctness metric, including metric inputs with reference answers, LLM-generated outputs, human-edited outputs, and acceptance flags.

Description

The edited_chain_runs.json file is a static data fixture stored in the documentation assets directory. It is a JSON object whose single key, "answer_correctness", maps to an array of annotated evaluation samples. Unlike the helpfulness fixture, each sample in this file includes a reference field in the metric input, so the LLM response can be compared against a ground-truth answer. The metric uses the single_turn_aspect_critic_prompt to judge whether the response correctly conveys the factual content of the reference. Each sample carries two acceptance flags: a top-level is_accepted flag for the whole sample and a nested is_accepted flag on the prompt trace. When a reviewer modified the LLM's initial verdict or reasoning, the correction is stored in the prompt's edited_output field; otherwise that field is null.
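The nesting described above can be pictured with a small sketch. The field names follow the schema on this page, but every value is a made-up placeholder rather than data copied from the fixture, and using None for retrieved_contexts and reference_contexts is an assumption.

```python
import json

# Hypothetical sample: field names follow this page's schema, values are
# placeholders invented for illustration (not taken from the fixture).
example_sample = {
    "metric_input": {
        "user_input": "<question text>",
        "response": "<LLM response>",
        "reference": "<ground-truth answer>",
    },
    "metric_output": 1,  # binary score: 1 = correct, 0 = incorrect
    "prompts": {
        "single_turn_aspect_critic_prompt": {
            "prompt_input": {
                "user_input": "<question text>",
                "response": "<LLM response>",
                "retrieved_contexts": None,  # assumption: may be null
                "reference_contexts": None,  # assumption: may be null
                "reference": "<ground-truth answer>",
            },
            "prompt_output": {"reason": "<LLM reasoning>", "verdict": 1},
            "is_accepted": True,    # prompt-level acceptance flag
            "edited_output": None,  # or {"reason": ..., "verdict": ...}
        }
    },
    "is_accepted": True,  # top-level acceptance flag
}

# The file itself wraps the samples in a single-key object.
fixture_shape = {"answer_correctness": [example_sample]}
print(json.dumps(fixture_shape, indent=2)[:300])
```

Writing the shape out this way makes it easier to see that the reference appears twice: once in metric_input and once in the prompt's prompt_input.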

Usage

This file is used in the Ragas documentation to demonstrate metric training and alignment workflows, specifically for the answer_correctness metric. It provides ground-truth annotations that can be loaded programmatically for fine-tuning prompt-based metrics.

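import json

# Path is relative to the root of the Ragas repository.
with open("docs/_static/edited_chain_runs.json", "r", encoding="utf-8") as f:
    data = json.load(f)

# Each sample pairs the metric inputs with a binary correctness score.
correctness_samples = data["answer_correctness"]
for sample in correctness_samples:
    # Print the first 60 characters of the question and its score.
    print(sample["metric_input"]["user_input"][:60], "->", sample["metric_output"])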

Code Reference

| Field | Value |
| --- | --- |
| Source Location | docs/_static/edited_chain_runs.json |
| Structure | Top-level JSON object with a single key "answer_correctness" mapping to an array of annotation objects |
| File Size | 490 lines |

Data Schema

| Field | Type | Description |
| --- | --- | --- |
| answer_correctness | Array[Object] | Array of annotation samples for the answer_correctness metric |
| answer_correctness[].metric_input | Object | Contains user_input (string), response (string), and reference (string) |
| answer_correctness[].metric_output | Integer | Binary score: 1 (correct) or 0 (incorrect) |
| answer_correctness[].prompts | Object | Contains single_turn_aspect_critic_prompt with nested fields |
| answer_correctness[].prompts.*.prompt_input | Object | Full context: user_input, response, retrieved_contexts, reference_contexts, reference |
| answer_correctness[].prompts.*.prompt_output | Object | LLM judgment: reason (string) and verdict (integer) |
| answer_correctness[].prompts.*.is_accepted | Boolean | Whether the prompt-level annotation was accepted |
| answer_correctness[].prompts.*.edited_output | Object or null | Human-edited judgment with reason and verdict, or null if the reviewer made no edit |
| answer_correctness[].is_accepted | Boolean | Top-level acceptance flag for the entire sample |
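The schema above can be checked mechanically before the data is used. The following is a minimal sketch of such a check, not part of Ragas: validate_sample and its helpers are hypothetical names, and treating verdict as strictly 0 or 1 is an assumption (the schema only says integer).

```python
from typing import Any

PROMPT_NAME = "single_turn_aspect_critic_prompt"


def _is_binary(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value in (0, 1)


def _check_judgment(judgment: Any, path: str, problems: list) -> None:
    """Check a {reason, verdict} object and record any problems found."""
    if not isinstance(judgment, dict):
        problems.append(f"{path} must be an object")
        return
    if not isinstance(judgment.get("reason"), str):
        problems.append(f"{path}.reason must be a string")
    if not _is_binary(judgment.get("verdict")):  # assumption: verdict is 0 or 1
        problems.append(f"{path}.verdict must be 0 or 1")


def validate_sample(sample: dict) -> list:
    """Return a list of problems; an empty list means the sample matches the schema."""
    problems = []
    metric_input = sample.get("metric_input")
    if not isinstance(metric_input, dict):
        problems.append("metric_input must be an object")
    else:
        for key in ("user_input", "response", "reference"):
            if not isinstance(metric_input.get(key), str):
                problems.append(f"metric_input.{key} must be a string")
    if not _is_binary(sample.get("metric_output")):
        problems.append("metric_output must be 0 or 1")
    if not isinstance(sample.get("is_accepted"), bool):
        problems.append("is_accepted must be a boolean")

    prompts = sample.get("prompts")
    prompt = prompts.get(PROMPT_NAME) if isinstance(prompts, dict) else None
    base = f"prompts.{PROMPT_NAME}"
    if not isinstance(prompt, dict):
        problems.append(f"{base} must be an object")
        return problems
    if not isinstance(prompt.get("prompt_input"), dict):
        problems.append(f"{base}.prompt_input must be an object")
    _check_judgment(prompt.get("prompt_output"), f"{base}.prompt_output", problems)
    if not isinstance(prompt.get("is_accepted"), bool):
        problems.append(f"{base}.is_accepted must be a boolean")
    if prompt.get("edited_output") is not None:  # null means "no human edit"
        _check_judgment(prompt["edited_output"], f"{base}.edited_output", problems)
    return problems
```

Calling validate_sample on each element of data["answer_correctness"] and printing any non-empty result gives a quick sanity check after the fixture is regenerated or edited by hand.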

Usage Examples

import json

# Load edited chain runs data
with open("docs/_static/edited_chain_runs.json", "r") as f:
    data = json.load(f)

samples = data["answer_correctness"]

# Analyze verdict distribution
correct = sum(1 for s in samples if s["metric_output"] == 1)
incorrect = sum(1 for s in samples if s["metric_output"] == 0)
print(f"Correct: {correct}, Incorrect: {incorrect}")

# Find samples where human disagreed with LLM
for s in samples:
    prompt = s["prompts"]["single_turn_aspect_critic_prompt"]
    edited = prompt.get("edited_output")
    if edited and edited["verdict"] != prompt["prompt_output"]["verdict"]:
        print(f"Human override: LLM={prompt['prompt_output']['verdict']} -> Human={edited['verdict']}")
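Building on the override check above, one plausible way to turn the annotations into training examples is to keep only accepted samples and prefer the human-edited judgment when one exists. This is a sketch under that assumption, not a description of how Ragas consumes the file; effective_output and accepted_examples are hypothetical helper names.

```python
PROMPT_NAME = "single_turn_aspect_critic_prompt"


def effective_output(sample):
    """Return the human-edited judgment if present, else the LLM's prompt_output."""
    prompt = sample["prompts"][PROMPT_NAME]
    edited = prompt.get("edited_output")
    return edited if edited is not None else prompt["prompt_output"]


def accepted_examples(samples):
    """Yield (metric_input, judgment) pairs for samples whose top-level flag is set."""
    for sample in samples:
        if sample.get("is_accepted"):
            yield sample["metric_input"], effective_output(sample)
```

With the fixture loaded as in the example above, list(accepted_examples(samples)) would produce pairs whose judgment reflects reviewer corrections wherever they were made.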
