Workflow: Confident AI DeepEval Synthetic Dataset Generation
| Metadata | Value |
|---|---|
| Domains | LLM_Evaluation, Data_Engineering, Synthetic_Data |
| Last Updated | 2026-02-14 09:00 GMT |
Overview
End-to-end process for generating synthetic evaluation datasets from documents, contexts, or existing goldens using DeepEval's Synthesizer and ConversationSimulator.
Description
This workflow covers automated generation of evaluation test data for LLM applications. The Synthesizer class generates Golden objects (query-answer pairs with context) from various sources: raw documents, text contexts, or existing goldens. It chunks documents, generates relevant questions, and produces expected answers. For conversational agents, the ConversationSimulator extends goldens into multi-turn conversation scenarios by simulating user interactions. The generated datasets can be saved in JSON, CSV, or JSONL formats and optionally pushed to Confident AI for cloud management.
Usage
Execute this workflow when you need evaluation data but lack manually curated test cases. This applies when bootstrapping a new evaluation suite from existing documentation, generating diverse test scenarios from a knowledge base, augmenting a small hand-curated dataset, or creating multi-turn conversational test scenarios for chatbot evaluation.
Execution Steps
Step 1: Prepare Source Material
Gather the source documents or contexts from which synthetic test data will be generated. Sources can be text files (TXT, Markdown, PDF, DOCX), raw text strings, or existing Golden objects that will be used as seeds for generating variations.
Source types:
- Documents: Text files that will be chunked and used as context for question generation
- Contexts: Pre-extracted text passages provided as lists of strings
- Existing goldens: Golden objects used as seeds for generating similar but varied test cases
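Whatever the source type, the material ends up as text passages that seed question generation. A minimal, self-contained sketch of splitting raw text into fixed-size passages (illustrative only; DeepEval performs its own loading and chunking internally, and the function name here is hypothetical):

```python
def load_contexts_from_text(raw_text: str, passage_length: int = 200) -> list:
    """Split raw text into word-bounded passages usable as generation contexts.

    passage_length is measured in words; the last passage may be shorter.
    """
    words = raw_text.split()
    return [
        " ".join(words[i:i + passage_length])
        for i in range(0, len(words), passage_length)
    ]

# 450 words split into 200-word passages yields 3 contexts (200 + 200 + 50).
contexts = load_contexts_from_text("word " * 450, passage_length=200)
```

Pre-extracted passages like these correspond to the "Contexts" source type; documents would additionally need a file-format-specific extraction step first.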
Step 2: Configure the Synthesizer
Initialize the Synthesizer with configuration options controlling the generation model, chunking strategy, and output format. The synthesizer uses an LLM to generate questions and answers from the provided context.
Configuration options:
- Model selection for generation (defaults to OpenAI)
- Chunk size and overlap for document processing
- Number of goldens to generate per context
- Embedding model for context generation (uses ChromaDB internally)
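The options above can be thought of as one configuration bundle. A sketch of such a bundle as a plain dataclass (the field names here are hypothetical and do not match DeepEval's actual parameter names; consult the DeepEval docs for the real `Synthesizer` signature):

```python
from dataclasses import dataclass

@dataclass
class SynthesizerConfig:
    """Illustrative bundle of the generation knobs described above."""
    model: str = "gpt-4o-mini"        # generation model (assumption: an OpenAI-style name)
    chunk_size: int = 1024            # characters per document chunk
    chunk_overlap: int = 128          # characters shared by adjacent chunks
    max_goldens_per_context: int = 2  # goldens generated per context

    def __post_init__(self):
        # An overlap as large as the chunk itself would make chunking loop forever.
        if self.chunk_overlap >= self.chunk_size:
            raise ValueError("chunk_overlap must be smaller than chunk_size")

cfg = SynthesizerConfig(chunk_size=512, chunk_overlap=64)
```

Validating chunk parameters up front avoids confusing downstream failures when documents are processed in Step 3.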
Step 3: Generate Goldens
Invoke the appropriate generation method based on your source material. The Synthesizer processes the input, creates contextually relevant questions, and generates expected answers. Each generated golden includes the input question, expected output, and the source context.
Generation methods:
- From documents: Chunks documents, generates contexts via embeddings, then creates goldens
- From contexts: Directly uses provided text passages to generate question-answer pairs
- From goldens: Uses existing goldens as templates to generate variations and augmentations
- From scratch: Generates goldens without any source material using the LLM's knowledge
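The "from contexts" path is the simplest to picture: one or more question-answer pairs are generated per passage. A self-contained sketch with a stub in place of the real LLM (the `Golden` dataclass and function below are simplified stand-ins, not DeepEval's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Golden:
    """Simplified stand-in for a generated golden: question, answer, source context."""
    input: str
    expected_output: str
    context: list = field(default_factory=list)

def fake_llm(prompt: str) -> str:
    """Stub for the generation model (illustrative only)."""
    return f"Answer derived from: {prompt[:40]}"

def generate_goldens_from_contexts(contexts, goldens_per_context=1):
    """Mirror of the 'from contexts' path: N question-answer pairs per passage."""
    goldens = []
    for ctx in contexts:
        for i in range(goldens_per_context):
            question = f"Q{i + 1}: what does this passage state?"
            goldens.append(Golden(input=question,
                                  expected_output=fake_llm(ctx),
                                  context=[ctx]))
    return goldens

# 2 contexts x 2 goldens each -> 4 goldens
goldens = generate_goldens_from_contexts(
    ["Refunds take 5 days.", "Plans renew monthly."], goldens_per_context=2
)
```

The other three paths differ only in how the contexts are obtained (chunked from documents, copied from seed goldens, or generated from the model's own knowledge).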
Step 4: Simulate Conversations (Optional)
For conversational agent evaluation, use the ConversationSimulator to extend single-turn goldens into multi-turn conversation scenarios. The simulator takes a chatbot callback function and conversational goldens, then simulates realistic user interactions.
Simulation features:
- Extends existing turns or starts fresh conversations
- Supports early stopping when conversation goals are met
- Configurable maximum number of turns
- Both sync and async chatbot callback support
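At its core, simulation is a loop that alternates simulated user turns with chatbot replies until a turn limit or a stopping condition is reached. A minimal sketch of that loop (illustrative only; the function and parameter names are hypothetical, not the `ConversationSimulator` API):

```python
def simulate_conversation(chatbot_callback, opening_input, max_turns=6,
                          stop_when=lambda reply: False):
    """Alternate simulated-user messages and chatbot replies.

    Stops after max_turns, or earlier if stop_when(reply) is True
    (the 'early stopping when conversation goals are met' feature).
    """
    turns = []
    user_message = opening_input
    for _ in range(max_turns):
        reply = chatbot_callback(user_message)
        turns.append({"user": user_message, "assistant": reply})
        if stop_when(reply):
            break
        # A real simulator would generate the follow-up with an LLM persona.
        user_message = f"Follow-up about: {reply[:30]}"
    return turns

# Stub chatbot that resolves the request on its third reply.
replies = iter(["Hi, how can I help?", "Checking that for you.", "Done! RESOLVED"])
turns = simulate_conversation(lambda m: next(replies), "I need a refund",
                              max_turns=10, stop_when=lambda r: "RESOLVED" in r)
```

Here the goal condition fires on the third reply, so the simulation stops well before the ten-turn limit.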
Step 5: Review and Export Dataset
Review the generated goldens for quality and export the dataset in the desired format. Optionally push the dataset to Confident AI for cloud-based management, annotation, and sharing with team members.
Export options:
- Local formats: JSON, CSV, JSONL via dataset.save_as()
- Cloud: Push to Confident AI via dataset.push()
- Review: Interactive review via dataset.review()
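The local formats are standard serializations of the same golden records. A self-contained sketch of what each looks like (illustrative only, not DeepEval's own save implementation; CSV requires flattening the context list into one column):

```python
import csv
import io
import json

goldens = [
    {"input": "What is the refund window?", "expected_output": "30 days",
     "context": ["Refunds accepted within 30 days."]},
]

# JSON: one array containing every golden object.
json_blob = json.dumps(goldens, indent=2)

# JSONL: one golden object per line, convenient for streaming.
jsonl_blob = "\n".join(json.dumps(g) for g in goldens)

# CSV: tabular, so the context list is joined into a single delimited column.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["input", "expected_output", "context"])
writer.writeheader()
for g in goldens:
    writer.writerow({**g, "context": "|".join(g["context"])})
csv_blob = buf.getvalue()
```

JSON and JSONL round-trip the nested context list losslessly; CSV is easier to inspect in a spreadsheet but needs the delimiter convention documented for re-import.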