Implementation:Explodinggradients Ragas R2R Integration

From Leeroopedia


Metadata | Value
Source | src/ragas/integrations/r2r.py (Lines 51-127)
Domains | Integration, R2R
Last Updated | 2026-02-10

Overview

Converts R2R (RAG-to-Riches) client responses into a Ragas EvaluationDataset, enabling evaluation of R2R-based RAG pipelines with Ragas metrics.

Description

This module provides two functions:

  • _process_search_results (internal helper) extracts text from R2R aggregate search results. It reads the text field of each entry in chunk_search_results and the snippet field of each entry in web_search_results. It issues a warning for graph_search_results and context_document_results, which are not included in the aggregated retrieved contexts.
  • transform_to_ragas_dataset converts R2R response data into a Ragas EvaluationDataset. For each sample:
    • user_input is taken from the user_inputs list.
    • retrieved_contexts are extracted from R2R responses via _process_search_results, processing the search_results from each response's results attribute.
    • response is taken from the R2R response's generated_answer field.
    • Optional reference_contexts, references, and rubrics are included when provided.
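
The extraction step described for the helper can be sketched roughly as follows. This is an illustrative stand-in, not the Ragas source: the function name extract_contexts_sketch is hypothetical, and it assumes each search result is a plain dict carrying a "text" or "snippet" key.

```python
import warnings
from typing import Any, Dict, List


def extract_contexts_sketch(search_results: Dict[str, List[Any]]) -> List[str]:
    """Hypothetical stand-in for the helper described above (not the Ragas code)."""
    contexts: List[str] = []

    # Result types that are reported but not added to the contexts.
    for key in ("graph_search_results", "context_document_results"):
        if search_results.get(key):
            warnings.warn(f"{key} are not included in the retrieved contexts")

    # Chunk results contribute their "text" field (assumed dict layout).
    for chunk in search_results.get("chunk_search_results") or []:
        text = chunk.get("text")
        if text:
            contexts.append(text)

    # Web results contribute their "snippet" field (assumed dict layout).
    for hit in search_results.get("web_search_results") or []:
        snippet = hit.get("snippet")
        if snippet:
            contexts.append(snippet)

    return contexts


example = {
    "chunk_search_results": [{"text": "RAG combines retrieval with generation."}],
    "web_search_results": [{"snippet": "Retrieval-augmented generation overview."}],
    "graph_search_results": [],
}
contexts = extract_contexts_sketch(example)
```

Chunk texts come first and web snippets second in this sketch; the ordering in the actual helper may differ.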

The function validates that all provided non-None lists have the same length, raising a ValueError on mismatches.
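
A minimal sketch of such a length check, assuming that only arguments left as None are skipped. The name check_lengths_sketch and the error message are hypothetical; the real function may word its error differently or treat empty lists specially.

```python
from typing import Optional, Sequence


def check_lengths_sketch(**lists: Optional[Sequence]) -> int:
    """Return the common length of all non-None lists, or raise ValueError."""
    provided = {name: lst for name, lst in lists.items() if lst is not None}
    if not provided:
        return 0

    expected = max(len(lst) for lst in provided.values())
    for name, lst in provided.items():
        if len(lst) != expected:
            raise ValueError(
                f"Inconsistent length for {name}: expected {expected}, got {len(lst)}"
            )
    return expected


# Two lists of equal length pass; the None argument is ignored.
n_samples = check_lengths_sketch(
    user_inputs=["q1", "q2"],
    references=["a1", "a2"],
    rubrics=None,
)
```

Checking lengths before building any sample means a mismatch surfaces as one clear error rather than an IndexError partway through the conversion.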

Usage

Use this integration when you have responses from the R2R client and want to evaluate the quality of retrieval and generation using Ragas metrics. It handles the data format conversion so you can focus on choosing the right evaluation metrics.

Code Reference

Source Location

Item | Detail
File | src/ragas/integrations/r2r.py
Lines | 51-127
Module | ragas.integrations.r2r

Signature

def transform_to_ragas_dataset(
    user_inputs: Optional[List[str]] = None,
    r2r_responses: Optional[List] = None,
    reference_contexts: Optional[List[str]] = None,
    references: Optional[List[str]] = None,
    rubrics: Optional[List[Dict[str, str]]] = None,
) -> EvaluationDataset

Import

from ragas.integrations.r2r import transform_to_ragas_dataset

I/O Contract

Inputs

Name | Type | Required | Description
user_inputs | Optional[List[str]] | No | List of user queries
r2r_responses | Optional[List] | No | List of R2R client response objects (must have results.search_results and results.generated_answer)
reference_contexts | Optional[List[str]] | No | Ground-truth reference contexts
references | Optional[List[str]] | No | Ground-truth reference answers
rubrics | Optional[List[Dict[str, str]]] | No | Evaluation rubrics per sample

Outputs

Name | Type | Description
(return) | EvaluationDataset | Ragas dataset with samples containing user_input, retrieved_contexts, response, and optional reference fields

Exceptions

Exception | Condition
ValueError | Provided lists have inconsistent lengths

R2R Response Structure Expected

Path | Type | Description
response.results.search_results.as_dict() | Dict[str, List] | Contains chunk_search_results, web_search_results, etc.
response.results.generated_answer | str | The generated text answer from R2R
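
For unit tests that should not depend on a running R2R server, an object exposing just these two paths can be stubbed out. The sketch below is an assumption-laden stand-in: FakeSearchResults and make_fake_response are hypothetical names, and real R2R responses carry more fields than shown here.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


class FakeSearchResults:
    """Hypothetical stub exposing only the as_dict() method listed above."""

    def __init__(self, data: Dict[str, List[Any]]) -> None:
        self._data = data

    def as_dict(self) -> Dict[str, List[Any]]:
        return dict(self._data)


def make_fake_response(answer: str, chunks: List[str]) -> SimpleNamespace:
    """Build an object with results.search_results and results.generated_answer."""
    search_results = FakeSearchResults(
        {
            "chunk_search_results": [{"text": chunk} for chunk in chunks],
            "web_search_results": [],
        }
    )
    return SimpleNamespace(
        results=SimpleNamespace(
            search_results=search_results,
            generated_answer=answer,
        )
    )


fake = make_fake_response(
    answer="RAG pairs a retriever with a generator.",
    chunks=["RAG combines retrieval with generation."],
)
payload = fake.results.search_results.as_dict()
answer = fake.results.generated_answer
```

Whether transform_to_ragas_dataset accepts such a stub depends on it touching only the two paths in the table, which is an assumption here rather than something the source guarantees.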

Usage Examples

Converting R2R Responses to a Ragas Dataset

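from ragas import evaluate
from ragas.metrics import faithfulness
from ragas.integrations.r2r import transform_to_ragas_dataset

queries = ["What is RAG?", "How does retrieval work?"]

# `responses` must hold one R2R client response per query, in the same order.
# For example (not run here; the exact client call depends on your R2R SDK version):
# r2r_client = R2RClient()
# responses = [r2r_client.rag(query) for query in queries]

dataset = transform_to_ragas_dataset(
    user_inputs=queries,
    r2r_responses=responses,
)

# Evaluate with Ragas metrics. faithfulness needs only the question, the
# retrieved contexts, and the response. context_precision also needs a
# reference answer per sample, so pass `references=` above before adding it.
results = evaluate(dataset=dataset, metrics=[faithfulness])
print(results)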

With Reference Answers for Answer Correctness

from ragas.integrations.r2r import transform_to_ragas_dataset

# `r2r_response` is a single response object returned by the R2R client for this query.
dataset = transform_to_ragas_dataset(
    user_inputs=["What is LLM evaluation?"],
    r2r_responses=[r2r_response],
    references=["LLM evaluation is the systematic assessment of language model outputs."],
)
