Principle:Volcengine Verl Multi Turn Data Preparation

Knowledge Sources	verl, verl Multi-Turn Documentation
Domains	Data_Engineering, Agentic_AI, Tool_Use
Last Updated	2026-02-07 14:00 GMT

Overview

Multi-turn data preparation is the process of building datasets for multi-turn agentic RL training. Each data row carries a system prompt with tool instructions, along with the keyword arguments used to set up tool calls for that row.

Description

Multi-Turn Data Preparation extends standard RL data preparation with additional fields needed for agentic training where the model interacts with external tools across multiple conversation turns.

Key additions beyond standard RL data:

System prompt: Instructions telling the model about available tools and expected output format
tools_kwargs: Configuration for tool instantiation, including per-row parameters (e.g., ground truth for a calculator tool)
interaction_kwargs: Parameters controlling the multi-turn interaction (e.g., max turns)

The tool configuration is embedded directly in each data row, allowing different rows to have different tool setups.
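As a minimal sketch of what per-row configuration can look like, the two rows below use different tools and different turn limits. The tool names (`calculator`, `search`), the `top_k` parameter, and the `check_multi_turn_row` helper are hypothetical and chosen only for illustration; the field names follow the description above. The rule that `need_tools_kwargs` implies a non-empty `tools_kwargs` is an assumed convention, not confirmed library behavior.

```python
from typing import Any

# Hypothetical rows: tool names and kwargs are illustrative assumptions.
row_calculator = {
    "prompt": [
        {"role": "system", "content": "Use the calculator tool to check arithmetic."},
        {"role": "user", "content": "What is 48 + 24?"},
    ],
    "extra_info": {
        "need_tools_kwargs": True,
        "tools_kwargs": {"calculator": {"create_kwargs": {"ground_truth": "72"}}},
        "interaction_kwargs": {"max_turns": 5},
    },
}

row_search = {
    "prompt": [
        {"role": "system", "content": "Use the search tool when you need facts."},
        {"role": "user", "content": "Who wrote 'The Selfish Gene'?"},
    ],
    "extra_info": {
        "need_tools_kwargs": True,
        "tools_kwargs": {"search": {"create_kwargs": {"top_k": 3}}},
        "interaction_kwargs": {"max_turns": 3},
    },
}


def check_multi_turn_row(row: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one row (empty if the row looks fine)."""
    problems = []
    prompt = row.get("prompt", [])
    if not prompt or prompt[0].get("role") != "system":
        problems.append("prompt should start with a system message")
    extra = row.get("extra_info", {})
    # Assumed convention: the flag promises that tools_kwargs is present.
    if extra.get("need_tools_kwargs") and not extra.get("tools_kwargs"):
        problems.append("need_tools_kwargs is set but tools_kwargs is missing or empty")
    for tool_name, cfg in extra.get("tools_kwargs", {}).items():
        if "create_kwargs" not in cfg:
            problems.append(f"tool {tool_name!r} has no create_kwargs")
    return problems


problems_calculator = check_multi_turn_row(row_calculator)
problems_search = check_multi_turn_row(row_search)
```

Running a check like this over the whole dataset before training can catch rows whose tool setup was dropped during preprocessing.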

Usage

Use multi-turn data preparation when training models for:

Tool-calling capabilities (calculator, code execution, search)
Multi-step reasoning with external feedback
Agentic workflows where the model must decide when to use tools

Theoretical Basis

Multi-turn data extends the standard schema with tool configuration:

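# Abstract multi-turn data preparation (pseudocode).
# dataset, tool_use_instructions, and extract_answer are placeholders.
prepared_rows = []
for row in dataset:
    # Conversation seed: system prompt with tool instructions, then the user question
    prompt = [
        {"role": "system", "content": tool_use_instructions},
        {"role": "user", "content": row["question"]},
    ]
    extra_info = {
        # Flag indicating that this row carries its own tool configuration
        "need_tools_kwargs": True,
        "tools_kwargs": {
            # Keyed by tool name; create_kwargs holds per-row instantiation parameters
            "tool_name": {
                "create_kwargs": {"ground_truth": extract_answer(row)}
            }
        },
        # Parameters controlling the multi-turn interaction
        "interaction_kwargs": {"max_turns": 5},
    }
    prepared_rows.append({"prompt": prompt, "extra_info": extra_info})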
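The abstract loop above can be made concrete with a small in-memory dataset. This is a sketch under stated assumptions, not the library's preprocessing script: the tool name `calculator`, the system prompt text, the `prepare_row` helper, and the GSM8K-style answer format (`#### <answer>` at the end of the solution) are all illustrative choices that the page itself does not specify.

```python
import re

# Assumed system prompt; the real instructions depend on the tools in use.
TOOL_USE_INSTRUCTIONS = (
    "You may call the `calculator` tool to check arithmetic. "
    "Give the final answer after `####`."
)


def extract_answer(row: dict) -> str:
    """Pull the final answer from a solution ending in '#### <answer>' (assumed format)."""
    match = re.search(r"####\s*(-?[\d,.]+)", row["answer"])
    if match is None:
        raise ValueError(f"no final answer found in: {row['answer']!r}")
    # Drop thousands separators so "1,200" and "1200" compare equal.
    return match.group(1).replace(",", "")


def prepare_row(row: dict, max_turns: int = 5) -> dict:
    """Build one multi-turn training row from a question/answer pair."""
    return {
        "prompt": [
            {"role": "system", "content": TOOL_USE_INSTRUCTIONS},
            {"role": "user", "content": row["question"]},
        ],
        "extra_info": {
            "need_tools_kwargs": True,
            "tools_kwargs": {
                # Hypothetical tool name; per-row ground truth goes in create_kwargs.
                "calculator": {"create_kwargs": {"ground_truth": extract_answer(row)}}
            },
            "interaction_kwargs": {"max_turns": max_turns},
        },
    }


# Tiny stand-in dataset with made-up examples.
dataset = [
    {"question": "What is 48 + 24?", "answer": "48 + 24 = 72\n#### 72"},
    {"question": "What is 30 * 40?", "answer": "30 * 40 = 1,200\n#### 1,200"},
]

prepared_rows = [prepare_row(row) for row in dataset]
```

Keeping answer extraction in its own function makes it easy to swap in a different parser when the source dataset uses another answer format.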

Related Pages

Implemented By

Implementation:Volcengine_Verl_Multi_Turn_Data_Preprocessing
