Implementation:Explodinggradients Ragas DataTable To Pandas

Knowledge Sources: explodinggradients/ragas
Domains: Data Analysis, Experiment Management
Last Updated: 2026-02-10

Overview

The DataTable To Pandas method converts DataTable, Dataset, and Experiment instances into pandas DataFrames for tabular analysis, aggregation, and visualization.

Description

The to_pandas() method (lines 316-335 of src/ragas/dataset.py) is defined on the DataTable base class and inherited by both Dataset and Experiment. It iterates over the internal data list, converting Pydantic BaseModel instances to dictionaries via model_dump() and passing plain dictionaries through unchanged; an entry of any other type raises a TypeError. The resulting list of dictionaries is passed to pd.DataFrame() to produce a standard pandas DataFrame. pandas is imported lazily inside the method, so it is only required when to_pandas() is called, and an ImportError is raised if it is not installed. A complementary from_pandas() class method (lines 201-252) performs the reverse conversion, creating a DataTable from a pandas DataFrame.
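
The conversion logic described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ragas source: the function name records_to_dataframe and the Sample model are hypothetical, the error messages are invented for the example, and Pydantic v2 (which provides model_dump()) is assumed.

```python
from typing import Any, Dict, List, Union

from pydantic import BaseModel


def records_to_dataframe(data: List[Union[BaseModel, Dict[str, Any]]]):
    """Hypothetical stand-in for the logic to_pandas() is described as using."""
    # Lazy import: pandas is only needed once the conversion is actually called.
    try:
        import pandas as pd
    except ImportError as exc:
        raise ImportError("pandas is required for this conversion") from exc

    rows: List[Dict[str, Any]] = []
    for entry in data:
        if isinstance(entry, BaseModel):
            rows.append(entry.model_dump())  # Pydantic v2 model -> dict
        elif isinstance(entry, dict):
            rows.append(entry)  # plain dicts pass through unchanged
        else:
            raise TypeError(f"Unsupported entry type: {type(entry).__name__}")
    return pd.DataFrame(rows)


class Sample(BaseModel):  # hypothetical data model, for illustration only
    user_input: str
    score: float


df = records_to_dataframe([
    Sample(user_input="What is AI?", score=0.9),
    {"user_input": "What is ML?", "score": 0.7},
])
```

Because every entry is reduced to a dictionary before pd.DataFrame() is called, a mix of models and dictionaries yields one row per entry, with columns taken from the dictionary keys.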

Usage

Use the DataTable To Pandas method when:

  • Analyzing experiment results with pandas operations (describe, groupby, merge)
  • Visualizing evaluation scores with matplotlib or seaborn
  • Exporting results to CSV or Excel
  • Computing aggregate statistics across evaluation runs
  • Comparing results from multiple experiments by joining DataFrames
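
As a rough sketch of the aggregation use cases in this list, the snippet below stacks several runs into one DataFrame and summarizes them. It uses stand-in DataFrames in place of real to_pandas() output; the column names score and category follow the examples on this page, while the run column and the run labels are assumptions made for illustration.

```python
import pandas as pd

# Stand-in frames; in practice each would come from an Experiment's to_pandas().
runs = {
    "v1": pd.DataFrame({"category": ["math", "math", "code"], "score": [0.5, 0.7, 0.9]}),
    "v2": pd.DataFrame({"category": ["math", "math", "code"], "score": [0.6, 0.8, 0.7]}),
}

# Stack the runs, tagging each row with the run it came from.
combined = pd.concat(
    [frame.assign(run=name) for name, frame in runs.items()],
    ignore_index=True,
)

# Mean score per category, with one column per run.
summary = combined.pivot_table(
    index="category", columns="run", values="score", aggfunc="mean"
)
print(summary)
```

Keeping the runs in a single long-format frame makes it straightforward to add further runs later without changing the aggregation code.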

Code Reference

Source Location: src/ragas/dataset.py, lines 316-335 (to_pandas), lines 201-252 (from_pandas)

Signature:

def to_pandas(self) -> "PandasDataFrame":
    """Convert the dataset to a pandas DataFrame."""

Complementary class method:

@classmethod
def from_pandas(
    cls,
    dataframe: "PandasDataFrame",
    name: str,
    backend: Union[BaseBackend, str],
    data_model: Optional[Type[T]] = None,
    **kwargs,
) -> Self:
    """Create a DataTable from a pandas DataFrame."""

Import:

from ragas import Dataset
# or
from ragas import Experiment
# Both inherit to_pandas() from DataTable

I/O Contract

Inputs (to_pandas):

Parameter | Type | Required | Description
self | DataTable[T] (or Dataset / Experiment) | Yes | The DataTable instance containing evaluation data (Pydantic models or dictionaries)

Outputs (to_pandas):

Output | Type | Description
DataFrame | pandas.DataFrame | Tabular representation where each row is a dataset entry and each column is a data field

Raises:

Exception | Condition
ImportError | pandas is not installed
TypeError | Dataset contains entries that are neither BaseModel instances nor dictionaries

Inputs (from_pandas):

Parameter | Type | Required | Description
dataframe | pandas.DataFrame | Yes | The DataFrame to convert
name | str | Yes | Name for the resulting DataTable
backend | Union[BaseBackend, str] | Yes | Backend for storage
data_model | Optional[Type[T]] | No | Optional Pydantic model for validation
**kwargs | Any | No | Additional backend configuration
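
How from_pandas() applies data_model internally is not shown on this page. As a rough sketch of what validating DataFrame rows against a Pydantic model can look like, the example below assumes Pydantic v2 and uses a hypothetical QASample model that is not part of ragas.

```python
import pandas as pd
from pydantic import BaseModel, ValidationError


class QASample(BaseModel):  # hypothetical model, for illustration only
    user_input: str
    response: str


df = pd.DataFrame({
    "user_input": ["What is AI?", "What is ML?"],
    "response": ["Artificial Intelligence", "Machine Learning"],
})

# One dict per row; each dict is validated against the model on construction.
samples = [QASample(**row) for row in df.to_dict(orient="records")]

# A record with a missing required field fails validation.
try:
    QASample(user_input="What is DL?")
    missing_field_rejected = False
except ValidationError:
    missing_field_rejected = True
```

Validating at this boundary surfaces missing or mistyped columns early, before the data reaches a storage backend.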

Usage Examples

Basic conversion to DataFrame:

from ragas import Dataset

# Load evaluation results
dataset = Dataset.load("eval_results", "local/csv", root_dir="./data")

# Convert to pandas DataFrame
df = dataset.to_pandas()
print(df.head())
print(df.describe())

Analyzing experiment results:

from ragas import Experiment

# Load experiment results (Experiment inherits to_pandas from DataTable)
experiment = Experiment.load("qa-eval-v1", "local/jsonl", root_dir="./experiments")

# Convert and analyze
df = experiment.to_pandas()

# Summary statistics (assumes the experiment rows have "score", "verdict", and "category" fields)
print(f"Mean score: {df['score'].mean():.3f}")
print(f"Pass rate: {(df['verdict'] == 'pass').mean():.1%}")

# Group by category
print(df.groupby("category")["score"].mean())

Comparing two experiments:

from ragas import Experiment

exp_v1 = Experiment.load("eval-v1", "local/csv", root_dir="./experiments")
exp_v2 = Experiment.load("eval-v2", "local/csv", root_dir="./experiments")

df_v1 = exp_v1.to_pandas()
df_v2 = exp_v2.to_pandas()

# Compare mean scores
print(f"V1 mean: {df_v1['score'].mean():.3f}")
print(f"V2 mean: {df_v2['score'].mean():.3f}")

# Merge for side-by-side comparison
comparison = df_v1[["user_input", "score"]].merge(
    df_v2[["user_input", "score"]],
    on="user_input",
    suffixes=("_v1", "_v2"),
)
print(comparison.head())
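
A merge like the one above can be extended to compute per-question score differences. The following is a sketch using stand-in DataFrames rather than real experiment output; the delta column name is an assumption, and validate="one_to_one" is an optional pandas safeguard that raises an error if user_input is duplicated on either side, which would otherwise multiply rows in the result.

```python
import pandas as pd

# Stand-in frames; in practice these would come from exp_v1.to_pandas() and exp_v2.to_pandas().
df_v1 = pd.DataFrame({"user_input": ["q1", "q2", "q3"], "score": [0.5, 0.8, 0.9]})
df_v2 = pd.DataFrame({"user_input": ["q1", "q2", "q4"], "score": [0.7, 0.6, 1.0]})

comparison = df_v1.merge(
    df_v2,
    on="user_input",
    how="inner",              # keep only questions present in both runs
    suffixes=("_v1", "_v2"),
    validate="one_to_one",    # fail loudly on duplicated user_input values
)

# Positive delta means the second run scored higher on that question.
comparison["delta"] = comparison["score_v2"] - comparison["score_v1"]
improved = comparison[comparison["delta"] > 0]["user_input"].tolist()
print(comparison)
```

An inner join silently drops questions that appear in only one run, so it is worth comparing len(comparison) against the sizes of the two inputs before drawing conclusions.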

Round-trip: DataFrame to Dataset and back:

import pandas as pd
from ragas import Dataset

# Create a DataFrame
df = pd.DataFrame({
    "user_input": ["What is AI?", "What is ML?"],
    "response": ["Artificial Intelligence", "Machine Learning"],
    "reference": ["AI is a broad field", "ML is a subset of AI"],
})

# Convert to Dataset
dataset = Dataset.from_pandas(df, "from_df", "local/csv", root_dir="./data")
dataset.save()

# Convert back to DataFrame
df_roundtrip = dataset.to_pandas()
print(df_roundtrip)

Exporting to CSV and Excel:

from ragas import Experiment

experiment = Experiment.load("final-eval", "local/jsonl", root_dir="./experiments")
df = experiment.to_pandas()

# Export for reporting
df.to_csv("evaluation_report.csv", index=False)
# Writing .xlsx files requires an Excel engine such as openpyxl to be installed
df.to_excel("evaluation_report.xlsx", index=False)
