Implementation:Unslothai Unsloth SyntheticDataKit

Knowledge Sources	Unsloth vLLM
Domains	Data_Preparation, NLP
Last Updated	2026-02-07 08:40 GMT

Overview

A concrete tool for generating synthetic question-answer (QA) training data from documents, using a locally served vLLM inference backend.

Description

The SyntheticDataKit class launches a vLLM inference server as a subprocess, loads a specified model, and provides methods for chunking input text files and preparing QA generation configurations. It handles subprocess lifecycle management with non-blocking pipe capture via the PipeCapture helper class, including graceful termination of the server process tree.
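The lifecycle pattern described above can be illustrated with a minimal sketch. It is not the code from unsloth/dataprep/synthetic.py: the helper names `start_with_capture` and `stop_gracefully` are hypothetical, a short-lived Python child stands in for the vLLM server, and only a single process is stopped (terminating a whole process tree would need extra handling, such as process groups).

```python
import queue
import subprocess
import sys
import threading


def start_with_capture(cmd):
    """Start cmd and drain its output on a daemon thread into a queue."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same pipe
        text=True,
    )
    lines = queue.Queue()

    def drain():
        # Reading on a separate thread keeps the pipe from filling up,
        # which would otherwise block the child process.
        for line in proc.stdout:
            lines.put(line.rstrip("\n"))
        proc.stdout.close()

    reader = threading.Thread(target=drain, daemon=True)
    reader.start()
    return proc, lines, reader


def stop_gracefully(proc, grace_seconds=5.0):
    """Ask the process to exit, then force-kill it if it does not."""
    if proc.poll() is None:
        proc.terminate()
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
    return proc.returncode


# Demo: the child prints one line and then idles, like a server would.
child_code = "print('ready', flush=True); import time; time.sleep(60)"
proc, lines, reader = start_with_capture([sys.executable, "-c", child_code])
first_line = lines.get(timeout=30)  # waits only until one line arrives
exit_code = stop_gracefully(proc)
reader.join(timeout=5)
```

The caller never reads the pipe directly; it pulls lines from the queue, so a slow or silent child cannot stall the main thread.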

Usage

Import this class when you need to generate synthetic question-answer training data from raw text documents without external API access. It manages a local vLLM server for generation.

Code Reference

Source Location

Repository: Unslothai_Unsloth
File: unsloth/dataprep/synthetic.py
Lines: 1-465

Signature

class SyntheticDataKit:
    def __init__(
        self,
        model_name: str = "unsloth/Llama-3.1-8B-Instruct-unsloth-bnb-4bit",
        max_seq_length: int = 2048,
        gpu_memory_utilization: float = 0.98,
        float8_kv_cache: bool = False,
        conservativeness: float = 1.0,
        token: str = None,
        timeout: int = 1200,
        **kwargs,
    ):
        """Launches vLLM server subprocess for synthetic data generation."""

    @staticmethod
    def from_pretrained(**kwargs) -> "SyntheticDataKit":
        """Factory method matching constructor signature."""

    def chunk_data(self, filename: str = None) -> list:
        """Chunks text file into overlapping token-bounded segments."""

    def prepare_qa_generation(self) -> None:
        """Sets up YAML config for QA generation workflow."""

    def cleanup(self) -> None:
        """Terminates vLLM server process and frees resources."""

Import

from unsloth.dataprep import SyntheticDataKit

I/O Contract

Inputs

Name	Type	Required	Description
model_name	str	No	HuggingFace model identifier (default: unsloth/Llama-3.1-8B-Instruct-unsloth-bnb-4bit)
max_seq_length	int	No	Maximum context length (default: 2048)
gpu_memory_utilization	float	No	Fraction of GPU memory for vLLM (default: 0.98)
float8_kv_cache	bool	No	Use FP8 KV cache for memory savings (default: False)
conservativeness	float	No	Controls chunk overlap ratio (default: 1.0)
token	str	No	HuggingFace Hub authentication token (default: None)
timeout	int	No	Server startup timeout in seconds (default: 1200)
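The timeout parameter bounds how long server startup may take. How the class detects readiness is not stated on this page, so the following is only a sketch of a generic polling loop with a deadline. The names `wait_until_ready` and `is_ready` are hypothetical; in practice the probe might be an HTTP request against the local server.

```python
import time


def wait_until_ready(is_ready, timeout=1200.0, interval=1.0):
    """Poll is_ready() until it returns True or the timeout elapses.

    Returns True if the probe succeeded, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a fake probe that succeeds on its third call
# (an assumption for illustration, not a real server check).
attempts = {"count": 0}


def fake_probe():
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until_ready(fake_probe, timeout=5.0, interval=0.01)
timed_out = wait_until_ready(lambda: False, timeout=0.05, interval=0.01)
```

Using a monotonic clock keeps the deadline stable even if the system time changes while the server is loading.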

Outputs

Name	Type	Description
chunk_data() returns	list	List of chunked text filenames
vLLM server	subprocess	Running inference server on localhost:8000
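To illustrate what "overlapping token-bounded segments" means for chunk_data(), here is a minimal sketch of sliding-window chunking. It makes several assumptions: whitespace-separated words stand in for tokenizer tokens, the function name `chunk_tokens` and its parameters are hypothetical, and the chunks are returned in memory rather than written to files as the Outputs table describes.

```python
def chunk_tokens(tokens, max_tokens, overlap):
    """Split tokens into windows of at most max_tokens items.

    Consecutive windows share `overlap` items, so text cut at a
    boundary still appears with some context in the next chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    step = max_tokens - overlap
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the input
        start += step
    return chunks


# Words stand in for tokens here; a real pipeline would use the
# model's tokenizer to count tokens.
words = "a b c d e f g h i j".split()
chunks = chunk_tokens(words, max_tokens=4, overlap=1)
```

With ten words, a window of four, and an overlap of one, each chunk begins with the last word of the previous chunk.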

Usage Examples

Basic Synthetic Data Generation

from unsloth.dataprep import SyntheticDataKit

# Launch vLLM server with model
kit = SyntheticDataKit(
    model_name="unsloth/Llama-3.1-8B-Instruct-unsloth-bnb-4bit",
    max_seq_length=4096,
    gpu_memory_utilization=0.90,
)

# Ensure the server is stopped even if a step below raises
try:
    # Chunk input text file
    chunks = kit.chunk_data(filename="input_document.txt")

    # Prepare QA generation config
    kit.prepare_qa_generation()
finally:
    # Clean up server
    kit.cleanup()

Using Context Manager

from unsloth.dataprep import SyntheticDataKit

with SyntheticDataKit(
    model_name="unsloth/Llama-3.1-8B-Instruct-unsloth-bnb-4bit",
    max_seq_length=4096,
) as kit:
    chunks = kit.chunk_data(filename="document.txt")
    kit.prepare_qa_generation()
# Server automatically cleaned up on exit
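
The automatic cleanup relies on Python's context manager protocol. As a rough sketch of that mechanism (not the actual SyntheticDataKit code), the hypothetical `ManagedResource` class below calls its cleanup() method from `__exit__`, so cleanup also runs when the body of the with block raises.

```python
class ManagedResource:
    """Hypothetical stand-in for an object that owns a server process."""

    def __init__(self):
        self.cleaned_up = False

    def cleanup(self):
        # A real implementation would terminate the server here.
        self.cleaned_up = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.cleanup()
        return False  # do not suppress exceptions from the with body


resource = ManagedResource()
try:
    with resource:
        raise RuntimeError("generation failed")
except RuntimeError:
    pass  # the error still propagates, but cleanup has already run
```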
