Implementation:Explodinggradients Ragas Edited Chain Runs Schema
| Field | Value |
|---|---|
| source | [Explodinggradients_Ragas](https://github.com/explodinggradients/ragas) |
| domains | Data, Evaluation |
| last_updated | 2026-02-10 00:00 GMT |
Overview
A JSON data fixture containing annotated evaluation samples for the answer_correctness metric, including metric inputs with reference answers, LLM-generated outputs, human-edited outputs, and acceptance flags.
Description
The edited_chain_runs.json file is a static data fixture stored in the documentation assets directory. It contains an array of annotated evaluation samples keyed under "answer_correctness". Unlike the helpfulness fixture, each sample in this file includes a reference field in the metric input, enabling comparison of the LLM response against a ground-truth answer. The metric uses the single_turn_aspect_critic_prompt to judge whether the response correctly conveys the factual content of the reference. Each prompt trace includes a nested is_accepted flag at the prompt level in addition to the top-level acceptance flag. Human-edited outputs capture corrections where the LLM's initial verdict or reasoning was modified by a reviewer.
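The two acceptance levels described above can be combined when selecting samples. The following is a minimal sketch, assuming each prompt trace is a dictionary carrying its own is_accepted boolean; the helper name fully_accepted and the sample values are hypothetical placeholders, not data taken from the file.

```python
from typing import Any


def fully_accepted(sample: dict[str, Any]) -> bool:
    """Return True only if the sample and every prompt trace in it are accepted.

    Hypothetical helper; assumes the layout described in this section.
    """
    traces = sample.get("prompts", {}).values()
    prompts_ok = all(trace.get("is_accepted") is True for trace in traces)
    return sample.get("is_accepted") is True and prompts_ok


# Placeholder sample (invented values) showing only the fields the helper reads.
example = {
    "prompts": {
        "single_turn_aspect_critic_prompt": {"is_accepted": True},
    },
    "is_accepted": True,
}

print(fully_accepted(example))
```

Requiring both flags is a conservative choice; a workflow that trusts only the top-level flag could drop the per-prompt check.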
Usage
This file is used in the Ragas documentation to demonstrate metric training and alignment workflows, specifically for the answer_correctness metric. It provides ground-truth annotations that can be loaded programmatically for fine-tuning prompt-based metrics.
import json

with open("docs/_static/edited_chain_runs.json", "r") as f:
    data = json.load(f)

correctness_samples = data["answer_correctness"]
for sample in correctness_samples:
    print(sample["metric_input"]["user_input"][:60], "->", sample["metric_output"])
Code Reference
| Field | Value |
|---|---|
| Source Location | docs/_static/edited_chain_runs.json |
| Structure | Top-level JSON object with a single key "answer_correctness" mapping to an array of annotation objects |
| File Size | 490 lines |
Data Schema
| Field | Type | Description |
|---|---|---|
| answer_correctness | Array[Object] | Array of annotation samples for the answer_correctness metric |
| answer_correctness[].metric_input | Object | Contains user_input (string), response (string), and reference (string) |
| answer_correctness[].metric_output | Integer | Binary score: 1 (correct) or 0 (incorrect) |
| answer_correctness[].prompts | Object | Contains single_turn_aspect_critic_prompt with nested fields |
| answer_correctness[].prompts.*.prompt_input | Object | Full context: user_input, response, retrieved_contexts, reference_contexts, reference |
| answer_correctness[].prompts.*.prompt_output | Object | LLM judgment: reason (string) and verdict (integer) |
| answer_correctness[].prompts.*.is_accepted | Boolean | Whether the prompt-level annotation was accepted |
| answer_correctness[].prompts.*.edited_output | Object or null | Human-edited judgment with reason and verdict, or null if the output was not edited |
| answer_correctness[].is_accepted | Boolean | Top-level acceptance flag for the entire sample |
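The schema above can be checked mechanically before the data is used. The following is a rough sketch of such a check, not part of Ragas: the function name schema_problems, the error messages, and the example sample are all hypothetical, and the check covers only the fields and types listed in the table.

```python
from typing import Any

REQUIRED_INPUT_KEYS = ("user_input", "response", "reference")


def schema_problems(sample: dict[str, Any]) -> list[str]:
    """Return a list of mismatches between a sample and the documented schema."""
    problems: list[str] = []

    metric_input = sample.get("metric_input")
    if not isinstance(metric_input, dict):
        problems.append("metric_input is not an object")
    else:
        for key in REQUIRED_INPUT_KEYS:
            if not isinstance(metric_input.get(key), str):
                problems.append(f"metric_input.{key} is not a string")

    score = sample.get("metric_output")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or score not in (0, 1):
        problems.append("metric_output is not 0 or 1")

    prompts = sample.get("prompts")
    if not isinstance(prompts, dict):
        problems.append("prompts is not an object")
    else:
        for name, trace in prompts.items():
            if not isinstance(trace.get("prompt_input"), dict):
                problems.append(f"prompts.{name}.prompt_input is not an object")
            if not isinstance(trace.get("prompt_output"), dict):
                problems.append(f"prompts.{name}.prompt_output is not an object")
            if not isinstance(trace.get("is_accepted"), bool):
                problems.append(f"prompts.{name}.is_accepted is not a boolean")
            edited = trace.get("edited_output")
            if edited is not None and not isinstance(edited, dict):
                problems.append(f"prompts.{name}.edited_output is not an object or null")

    if not isinstance(sample.get("is_accepted"), bool):
        problems.append("is_accepted is not a boolean")
    return problems


# Placeholder sample with invented values, shaped like the table above.
example = {
    "metric_input": {"user_input": "q", "response": "r", "reference": "ref"},
    "metric_output": 1,
    "prompts": {
        "single_turn_aspect_critic_prompt": {
            "prompt_input": {"user_input": "q", "response": "r", "reference": "ref"},
            "prompt_output": {"reason": "placeholder", "verdict": 1},
            "is_accepted": True,
            "edited_output": None,
        }
    },
    "is_accepted": True,
}

print(schema_problems(example))
```

Returning a list of messages rather than raising on the first mismatch makes it easier to report every problem in a fixture at once.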
Usage Examples
import json

# Load edited chain runs data
with open("docs/_static/edited_chain_runs.json", "r") as f:
    data = json.load(f)

samples = data["answer_correctness"]

# Analyze verdict distribution
correct = sum(1 for s in samples if s["metric_output"] == 1)
incorrect = sum(1 for s in samples if s["metric_output"] == 0)
print(f"Correct: {correct}, Incorrect: {incorrect}")

# Find samples where human disagreed with LLM
for s in samples:
    prompt = s["prompts"]["single_turn_aspect_critic_prompt"]
    edited = prompt.get("edited_output")
    if edited and edited["verdict"] != prompt["prompt_output"]["verdict"]:
        print(f"Human override: LLM={prompt['prompt_output']['verdict']} -> Human={edited['verdict']}")
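When the annotations are used as training or alignment targets, one plausible convention is to treat the human-edited judgment as authoritative and fall back to the LLM output when no edit exists. The sketch below assumes that convention; the helper name effective_judgment and the two traces are hypothetical, with invented placeholder values rather than data from the file.

```python
from typing import Any


def effective_judgment(trace: dict[str, Any]) -> dict[str, Any]:
    """Prefer the human-edited judgment; fall back to the LLM's prompt_output."""
    edited = trace.get("edited_output")
    return edited if edited is not None else trace["prompt_output"]


# Two placeholder prompt traces: one unedited, one overridden by a reviewer.
traces = [
    {
        "prompt_output": {"reason": "placeholder LLM reason", "verdict": 1},
        "edited_output": None,
    },
    {
        "prompt_output": {"reason": "placeholder LLM reason", "verdict": 1},
        "edited_output": {"reason": "placeholder reviewer reason", "verdict": 0},
    },
]

verdicts = [effective_judgment(t)["verdict"] for t in traces]
print(verdicts)  # [1, 0]
```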
Related Pages
- Explodinggradients_Ragas_Annotated_Data_Schema -- Similar annotation fixture for helpfulness metric
- Explodinggradients_Ragas_Sample_Annotated_Summary_Schema -- Similar annotation fixture for summary_accuracy metric
- Explodinggradients_Ragas_MkDocs_Configuration -- Documentation site configuration that serves this data