
Implementation:Vibrantlabsai Ragas EvaluationDataset From List

From Leeroopedia
Source Repository: explodinggradients/ragas
Source File: src/ragas/dataset_schema.py
Domains: NLP, Evaluation
Last Updated: 2026-02-12 00:00 GMT

Overview

EvaluationDataset.from_list() is the concrete factory method provided by the Ragas library for constructing validated evaluation datasets from raw Python dictionaries. It is the primary entry point for converting unstructured evaluation data into the typed EvaluationDataset object that downstream metrics and the evaluate() function require.

Description

The EvaluationDataset.from_list() class method accepts a list of dictionaries and automatically detects whether the data represents single-turn or multi-turn evaluation samples. Detection checks that every item contains a user_input key and that the first item's user_input value is a list (indicating multi-turn conversation messages). Based on this detection:

  • If multi-turn is detected, each dictionary is validated as a MultiTurnSample (Pydantic model with fields for conversation messages, references, tool calls, rubrics, and reference topics).
  • Otherwise, each dictionary is validated as a SingleTurnSample (Pydantic model with fields for user_input, retrieved_contexts, reference_contexts, response, reference, rubrics, and more).

After construction, the EvaluationDataset validates that all samples are of the same type, preventing mixed-type datasets.
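The detection rule described above can be sketched in plain Python. This is a simplified illustration of the documented behavior, not the library's exact code; the function name looks_multi_turn is hypothetical:

```python
from typing import Dict, List


def looks_multi_turn(data: List[Dict]) -> bool:
    """Mirror the documented detection rule: every item has a user_input
    key, and the first item's user_input value is a list of messages."""
    if not data:
        return False
    return (
        all("user_input" in item for item in data)
        and isinstance(data[0]["user_input"], list)
    )


single = [{"user_input": "What is RAG?", "response": "..."}]
multi = [{"user_input": [{"content": "Hi", "type": "human"}]}]

print(looks_multi_turn(single))  # False -> SingleTurnSample validation
print(looks_multi_turn(multi))   # True  -> MultiTurnSample validation
```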

Usage

Use EvaluationDataset.from_list() when you have raw evaluation data as Python dictionaries -- for example, loaded from a JSON file, returned from a data pipeline, or generated synthetically -- and need to create a typed dataset suitable for passing to evaluate() or individual metric scoring methods.
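For instance, a JSON file containing an array of objects deserializes directly into the List[Dict] shape that from_list() expects. The file contents below are illustrative:

```python
import json

# Contents of a hypothetical eval_data.json file.
raw = """
[
  {"user_input": "What is the capital of France?",
   "response": "Paris.",
   "reference": "Paris is the capital of France."}
]
"""

# json.loads on a JSON array yields a List[Dict].
data = json.loads(raw)
print(type(data).__name__, len(data))

# This list can be passed straight to EvaluationDataset.from_list(data).
```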

Code Reference

Source location: src/ragas/dataset_schema.py, lines 390-405 in class EvaluationDataset

Import statement:

from ragas.dataset_schema import EvaluationDataset

Full method signature:

@classmethod
def from_list(
    cls,
    data: t.List[t.Dict],
    backend: t.Optional[str] = None,
    name: t.Optional[str] = None,
) -> EvaluationDataset:

Supporting classes:

class SingleTurnSample(BaseSample):
    user_input: t.Optional[str] = None
    retrieved_contexts: t.Optional[t.List[str]] = None
    reference_contexts: t.Optional[t.List[str]] = None
    retrieved_context_ids: t.Optional[t.List[t.Union[str, int]]] = None
    reference_context_ids: t.Optional[t.List[t.Union[str, int]]] = None
    response: t.Optional[str] = None
    multi_responses: t.Optional[t.List[str]] = None
    reference: t.Optional[str] = None
    rubrics: t.Optional[t.Dict[str, str]] = None
    persona_name: t.Optional[str] = None
    query_style: t.Optional[str] = None
    query_length: t.Optional[str] = None

class MultiTurnSample(BaseSample):
    user_input: t.List[t.Union[HumanMessage, AIMessage, ToolMessage]]
    reference: t.Optional[str] = None
    reference_tool_calls: t.Optional[t.List[ToolCall]] = None
    rubrics: t.Optional[t.Dict[str, str]] = None
    reference_topics: t.Optional[t.List[str]] = None
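Based on the MultiTurnSample fields above, a multi-turn entry passed to from_list() would carry its conversation as a list under user_input. The message dict layout shown here (a content field plus a type discriminator) is an assumption inferred from the HumanMessage/AIMessage Pydantic models, not verified against the library:

```python
# Hypothetical multi-turn entry; the "type"/"content" message keys
# are assumed from the Pydantic message models, not confirmed.
multi_turn_item = {
    "user_input": [
        {"content": "Book a table for two at 7pm.", "type": "human"},
        {"content": "Done, a table for two is booked at 7pm.", "type": "ai"},
    ],
    "reference": "A table for two should be booked at 7pm.",
}

# user_input is a list, so from_list() would route this entry
# to MultiTurnSample validation rather than SingleTurnSample.
print(isinstance(multi_turn_item["user_input"], list))  # True
```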

I/O Contract

  • data (input, List[Dict]): List of dictionaries, each representing one evaluation sample. Keys must match the fields of SingleTurnSample or MultiTurnSample.
  • backend (input, Optional[str]): Optional backend identifier for dataset storage (e.g., "local/csv"). Default is None.
  • name (input, Optional[str]): Optional human-readable name for the dataset. Default is None.
  • Return value (EvaluationDataset): A validated dataset object containing typed samples, iterable and indexable, with methods for conversion to Pandas, HuggingFace, CSV, and JSONL formats.

Usage Examples

Single-turn dataset construction:

from ragas.dataset_schema import EvaluationDataset

data = [
    {
        "user_input": "What is the capital of France?",
        "response": "The capital of France is Paris.",
        "reference": "Paris is the capital of France.",
        "retrieved_contexts": [
            "Paris is the capital and most populous city of France."
        ],
    },
    {
        "user_input": "Who wrote Hamlet?",
        "response": "William Shakespeare wrote Hamlet.",
        "reference": "Hamlet was written by William Shakespeare.",
        "retrieved_contexts": [
            "Hamlet is a tragedy written by William Shakespeare."
        ],
    },
]

dataset = EvaluationDataset.from_list(data)
print(dataset)
# EvaluationDataset(features=['user_input', 'response', 'reference', 'retrieved_contexts'], len=2)

# Convert to Pandas for inspection
df = dataset.to_pandas()

# Iterate over samples
for sample in dataset:
    print(sample.user_input)

Constructing from JSONL file via round-trip:

from ragas.dataset_schema import EvaluationDataset

# Save and reload
dataset.to_jsonl("eval_data.jsonl")
reloaded = EvaluationDataset.from_jsonl("eval_data.jsonl")
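JSONL stores one JSON object per line, which is why it round-trips cleanly. A minimal sketch of that format using only the standard library, independent of the ragas helpers:

```python
import json

samples = [
    {"user_input": "Who wrote Hamlet?", "response": "Shakespeare."},
    {"user_input": "What is the capital of France?", "response": "Paris."},
]

# Write: serialize one JSON object per line.
lines = "\n".join(json.dumps(s) for s in samples)

# Read: parse each non-empty line back into a dict.
reloaded = [json.loads(line) for line in lines.splitlines() if line]

print(reloaded == samples)  # True
```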
