Implementation:Marker Inc Korea AutoRAG Generate QA Ragas
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, QA_Generation, Evaluation |
| Last Updated | 2026-02-08 06:00 GMT |
Overview
Concrete tool for generating QA evaluation datasets using the RAGAS (Retrieval Augmented Generation Assessment) framework with configurable question type distributions.
Description
⚠️ LEGACY/DEPRECATED: This module is in the legacy/ directory and is superseded by the modern QA schema pipeline. See Heuristic:Marker_Inc_Korea_AutoRAG_Warning_Deprecated_Legacy_QA_Creation.
The generate_qa_ragas function integrates AutoRAG's legacy QA creation pipeline with the RAGAS evaluation framework. It converts corpus data to LangChain documents, initializes a RAGAS TestsetGenerator with configurable generator LLM, critic LLM, and embedding model (defaulting to OpenAI models), then generates questions distributed across simple, multi_context, and reasoning evolution types. The resulting test set is converted to AutoRAG's QA DataFrame format with qid, query, generation_gt, and retrieval_gt columns.
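The first step described above, turning corpus rows into document objects, might look roughly like the sketch below. It is an illustration, not the module's actual code: `Doc` is a hypothetical stand-in for LangChain's Document class so the example runs without dependencies, `corpus_rows_to_docs` is an invented helper name, and carrying `doc_id` in the metadata is an assumption about how generated questions could be traced back to source passages for `retrieval_gt`.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document (hypothetical, for illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def corpus_rows_to_docs(rows):
    """Convert corpus rows (dicts with doc_id, contents, metadata) to Doc objects."""
    docs = []
    for row in rows:
        # Copy the metadata so the original row is not mutated.
        meta = dict(row.get("metadata") or {})
        # Assumption: keep doc_id in metadata so a generated question can be
        # mapped back to the passages it was built from.
        meta["doc_id"] = row["doc_id"]
        docs.append(Doc(page_content=row["contents"], metadata=meta))
    return docs


# Made-up corpus rows, used only to exercise the helper.
rows = [
    {"doc_id": "d1", "contents": "First passage.", "metadata": {"source": "a.md"}},
    {"doc_id": "d2", "contents": "Second passage.", "metadata": None},
]
docs = corpus_rows_to_docs(rows)
```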
Usage
Import this function when you want RAGAS's question evolution types (simple, multi-context, reasoning) in an evaluation dataset. It is the legacy pipeline's integration point with the RAGAS framework. Mixing question types exercises different aspects of RAG pipeline performance, such as single-passage lookup, retrieval across several passages, and multi-step reasoning.
Code Reference
Source Location
- Repository: Marker_Inc_Korea_AutoRAG
- File: autorag/data/legacy/qacreation/ragas.py
- Lines: 1-75
Signature
def generate_qa_ragas(
    corpus_df: pd.DataFrame,
    test_size: int,
    distributions: Optional[dict] = None,
    generator_llm: Optional[BaseChatModel] = None,
    critic_llm: Optional[BaseChatModel] = None,
    embedding_model: Optional[Embeddings] = None,
    **kwargs,
) -> pd.DataFrame:
    """
    QA dataset generation using RAGAS.

    :param corpus_df: Corpus dataframe.
    :param test_size: Number of queries to generate.
    :param distributions: Distribution of question types.
        Default: {simple: 0.5, multi_context: 0.4, reasoning: 0.1}
    :param generator_llm: Generator LLM from Langchain (default gpt-3.5-turbo-16k).
    :param critic_llm: Critic LLM from Langchain (default gpt-4-turbo).
    :param embedding_model: Embedding model from Langchain (default OpenAIEmbeddings).
    :param kwargs: Additional options for generate_with_langchain_docs.
    :return: QA dataset DataFrame.
    """
Import
from autorag.data.legacy.qacreation.ragas import generate_qa_ragas
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| corpus_df | pd.DataFrame | Yes | Corpus DataFrame with doc_id, contents, and metadata columns |
| test_size | int | Yes | Number of QA pairs to generate |
| distributions | dict | No | Mapping of RAGAS evolution types to ratios (must sum to 1.0); default {simple: 0.5, multi_context: 0.4, reasoning: 0.1} |
| generator_llm | BaseChatModel | No | LangChain chat model for generation (default gpt-3.5-turbo-16k) |
| critic_llm | BaseChatModel | No | LangChain chat model for critique (default gpt-4-turbo) |
| embedding_model | Embeddings | No | LangChain embeddings model (default OpenAIEmbeddings) |
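To make the "ratios must sum to 1.0" requirement on `distributions` concrete, here is a minimal sketch of a pre-flight check and of what the ratios mean for a given `test_size`. Both helpers are hypothetical and not part of AutoRAG or RAGAS. String keys stand in for the RAGAS evolution objects, and the largest-remainder rounding is an assumption for illustration; RAGAS may allocate counts differently internally.

```python
import math


def validate_distributions(distributions):
    """Raise ValueError unless ratios are non-negative and sum to 1.0."""
    if not distributions:
        raise ValueError("distributions must not be empty")
    if any(ratio < 0 for ratio in distributions.values()):
        raise ValueError("ratios must be non-negative")
    total = sum(distributions.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"ratios sum to {total}, expected 1.0")


def expected_counts(distributions, test_size):
    """Split test_size across question types using largest-remainder rounding."""
    validate_distributions(distributions)
    exact = {name: ratio * test_size for name, ratio in distributions.items()}
    counts = {name: math.floor(value) for name, value in exact.items()}
    leftover = test_size - sum(counts.values())
    # Hand the remaining slots to the types with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# String keys are placeholders for ragas evolution objects (assumption).
defaults = {"simple": 0.5, "multi_context": 0.4, "reasoning": 0.1}
counts = expected_counts(defaults, 50)
```

With the default ratios and `test_size=50`, this split comes out to 25 simple, 20 multi-context, and 5 reasoning questions.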
Outputs
| Name | Type | Description |
|---|---|---|
| result_df | pd.DataFrame | DataFrame with qid (UUID str), query (str), generation_gt (List[str]), retrieval_gt (List[List[str]]) |
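The output columns listed above can be checked row by row. The sketch below validates a single record, represented as a plain dict, against those types. `is_valid_qa_record` is a hypothetical helper, not an AutoRAG function, and the sample record is made-up data.

```python
import uuid


def is_valid_qa_record(record):
    """Check one QA row against the documented column types."""
    # qid must be a string that parses as a UUID.
    try:
        uuid.UUID(record["qid"])
    except (KeyError, ValueError, TypeError, AttributeError):
        return False
    if not isinstance(record.get("query"), str):
        return False
    # generation_gt: List[str]
    gen_gt = record.get("generation_gt")
    if not (isinstance(gen_gt, list) and all(isinstance(a, str) for a in gen_gt)):
        return False
    # retrieval_gt: List[List[str]], inner strings being corpus doc_id values
    ret_gt = record.get("retrieval_gt")
    if not isinstance(ret_gt, list):
        return False
    return all(
        isinstance(group, list) and all(isinstance(doc_id, str) for doc_id in group)
        for group in ret_gt
    )


# Made-up record in the documented shape.
sample = {
    "qid": str(uuid.uuid4()),
    "query": "What is described in the first passage?",
    "generation_gt": ["An example answer."],
    "retrieval_gt": [["d1", "d2"]],
}
ok = is_valid_qa_record(sample)
```

For a real result, the same check could be applied to each row, for example via `qa_df.to_dict("records")`.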
Usage Examples
Basic RAGAS QA Generation
import pandas as pd
from autorag.data.legacy.qacreation.ragas import generate_qa_ragas
# 1. Load corpus
corpus_df = pd.read_parquet("./corpus.parquet")
# 2. Generate QA dataset with default distributions
# (simple: 0.5, multi_context: 0.4, reasoning: 0.1)
qa_df = generate_qa_ragas(
    corpus_df=corpus_df,
    test_size=50,
)
print(qa_df.columns.tolist())
# ['qid', 'query', 'generation_gt', 'retrieval_gt']
Custom Distribution and Models
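from langchain_openai import ChatOpenAI
from ragas.testset.evolutions import simple, reasoning, multi_context
from autorag.data.legacy.qacreation.ragas import generate_qa_ragas
# corpus_df is the corpus DataFrame loaded in the basic example.
# Ratios are keyed by RAGAS evolution objects and must sum to 1.0.
qa_df = generate_qa_ragas(
    corpus_df=corpus_df,
    test_size=100,
    distributions={simple: 0.3, multi_context: 0.3, reasoning: 0.4},
    generator_llm=ChatOpenAI(model="gpt-4"),
    critic_llm=ChatOpenAI(model="gpt-4"),
)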