Implementation: Ragas EvaluationDataset.from_list
| Field | Value |
|---|---|
| Source Repository | explodinggradients/ragas |
| Source File | src/ragas/dataset_schema.py |
| Domains | NLP, Evaluation |
| Last Updated | 2026-02-12 00:00 GMT |
Overview
EvaluationDataset.from_list() is the factory class method provided by the Ragas library for constructing validated evaluation datasets from raw Python dictionaries. It is the primary entry point for converting unstructured evaluation data into the typed EvaluationDataset object that downstream metrics and the evaluate() function require.
Description
The EvaluationDataset.from_list() class method accepts a list of dictionaries and automatically detects whether the data represents single-turn or multi-turn evaluation samples. Detection checks whether every item contains a user_input key and whether the first item's user_input value is a list (a list indicates multi-turn conversation messages). Based on this detection:
- If multi-turn is detected, each dictionary is validated as a MultiTurnSample (a Pydantic model with fields for conversation messages, references, tool calls, rubrics, and reference topics).
- Otherwise, each dictionary is validated as a SingleTurnSample (a Pydantic model with fields for user_input, retrieved_contexts, reference_contexts, response, reference, rubrics, and more).
After construction, the EvaluationDataset validates that all samples are of the same type, preventing mixed-type datasets.
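The detection rule described above can be sketched in plain Python. This is a simplified mimic for illustration, not the library's actual code, and the helper name looks_multi_turn is ours:

```python
from typing import Any, Dict, List

def looks_multi_turn(data: List[Dict[str, Any]]) -> bool:
    """Mimic of the heuristic: treat the data as multi-turn iff every
    item has a 'user_input' key and the first item's value is a list."""
    return (
        bool(data)
        and all("user_input" in item for item in data)
        and isinstance(data[0]["user_input"], list)
    )

# A string-valued user_input reads as single-turn; a list reads as multi-turn.
single = [{"user_input": "What is the capital of France?"}]
multi = [{"user_input": [{"content": "Hi", "type": "human"}]}]
```

Because only the first item's value is inspected, mixed lists are caught later by the same-type validation, not by this heuristic.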
Usage
Use EvaluationDataset.from_list() when you have raw evaluation data as Python dictionaries -- for example, loaded from a JSON file, returned from a data pipeline, or generated synthetically -- and need to create a typed dataset suitable for passing to evaluate() or individual metric scoring methods.
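For instance, raw samples serialized to a JSON file by a data pipeline are already in the right shape for from_list(). The file name below is illustrative:

```python
import json
import os
import tempfile

samples = [
    {
        "user_input": "Who wrote Hamlet?",
        "response": "William Shakespeare wrote Hamlet.",
        "reference": "Hamlet was written by William Shakespeare.",
    }
]

# Write and reload the raw dicts as JSON, as a pipeline might.
path = os.path.join(tempfile.mkdtemp(), "eval.json")
with open(path, "w") as f:
    json.dump(samples, f)
with open(path) as f:
    raw = json.load(f)

# raw is a list of dicts whose keys match SingleTurnSample fields,
# ready to hand to EvaluationDataset.from_list(raw).
```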
Code Reference
Source location: src/ragas/dataset_schema.py, lines 390-405 in class EvaluationDataset
Import statement:
```python
from ragas.dataset_schema import EvaluationDataset
```
Full method signature:
```python
@classmethod
def from_list(
    cls,
    data: t.List[t.Dict],
    backend: t.Optional[str] = None,
    name: t.Optional[str] = None,
) -> EvaluationDataset:
```
Supporting classes:
```python
class SingleTurnSample(BaseSample):
    user_input: t.Optional[str] = None
    retrieved_contexts: t.Optional[t.List[str]] = None
    reference_contexts: t.Optional[t.List[str]] = None
    retrieved_context_ids: t.Optional[t.List[t.Union[str, int]]] = None
    reference_context_ids: t.Optional[t.List[t.Union[str, int]]] = None
    response: t.Optional[str] = None
    multi_responses: t.Optional[t.List[str]] = None
    reference: t.Optional[str] = None
    rubrics: t.Optional[t.Dict[str, str]] = None
    persona_name: t.Optional[str] = None
    query_style: t.Optional[str] = None
    query_length: t.Optional[str] = None

class MultiTurnSample(BaseSample):
    user_input: t.List[t.Union[HumanMessage, AIMessage, ToolMessage]]
    reference: t.Optional[str] = None
    reference_tool_calls: t.Optional[t.List[ToolCall]] = None
    rubrics: t.Optional[t.Dict[str, str]] = None
    reference_topics: t.Optional[t.List[str]] = None
```
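For reference, a multi-turn entry carries its conversation in user_input as a list of messages. The exact serialized message shape below (content/type keys) is an assumption for illustration, not the documented Ragas wire format:

```python
# Illustrative multi-turn entry: the message dict keys are assumptions.
multi_turn_entry = {
    "user_input": [
        {"content": "Book me a flight to Paris.", "type": "human"},
        {"content": "Which date would you like to travel?", "type": "ai"},
        {"content": "Next Monday.", "type": "human"},
    ],
    "reference": "The agent should confirm the travel date before booking.",
}
```

The list-valued user_input is what flips detection to multi-turn, so each dictionary is then validated as a MultiTurnSample.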
I/O Contract
| Direction | Parameter | Type | Description |
|---|---|---|---|
| Input | data | List[Dict] | List of dictionaries, each representing one evaluation sample. Keys must match the fields of SingleTurnSample or MultiTurnSample. |
| Input | backend | Optional[str] | Optional backend identifier for dataset storage (e.g., "local/csv"). Default is None. |
| Input | name | Optional[str] | Optional human-readable name for the dataset. Default is None. |
| Output | return | EvaluationDataset | A validated dataset object containing typed samples, iterable and indexable, with methods for conversion to Pandas, HuggingFace, CSV, and JSONL formats. |
Usage Examples
Single-turn dataset construction:
```python
from ragas.dataset_schema import EvaluationDataset

data = [
    {
        "user_input": "What is the capital of France?",
        "response": "The capital of France is Paris.",
        "reference": "Paris is the capital of France.",
        "retrieved_contexts": [
            "Paris is the capital and most populous city of France."
        ],
    },
    {
        "user_input": "Who wrote Hamlet?",
        "response": "William Shakespeare wrote Hamlet.",
        "reference": "Hamlet was written by William Shakespeare.",
        "retrieved_contexts": [
            "Hamlet is a tragedy written by William Shakespeare."
        ],
    },
]

dataset = EvaluationDataset.from_list(data)
print(dataset)
# EvaluationDataset(features=['user_input', 'response', 'reference', 'retrieved_contexts'], len=2)

# Convert to Pandas for inspection
df = dataset.to_pandas()

# Iterate over samples
for sample in dataset:
    print(sample.user_input)
```
Constructing from JSONL file via round-trip:
```python
from ragas.dataset_schema import EvaluationDataset

# Save and reload
dataset.to_jsonl("eval_data.jsonl")
reloaded = EvaluationDataset.from_jsonl("eval_data.jsonl")
```
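Conceptually, the JSONL round-trip stores one JSON object per line. A minimal pure-Python sketch of that file format follows; this is not the library's implementation, just an illustration of the layout:

```python
import json
import os
import tempfile

rows = [
    {"user_input": "What is the capital of France?",
     "response": "The capital of France is Paris."},
    {"user_input": "Who wrote Hamlet?",
     "response": "William Shakespeare wrote Hamlet."},
]

path = os.path.join(tempfile.mkdtemp(), "eval_data.jsonl")

# Write: one JSON object per line.
with open(path, "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Read: parse each line back into a dict.
with open(path) as f:
    reloaded = [json.loads(line) for line in f]
```

Because each line is an independent JSON object, JSONL files can be streamed and appended to without re-parsing the whole dataset.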