Principle:Volcengine Verl Environment Setup

From Leeroopedia


Knowledge Sources
Domains: Distributed_Systems, Infrastructure, Training_Infrastructure
Last Updated: 2026-02-07 14:00 GMT

Overview

Environment setup is the process of initializing a distributed computing cluster with Ray, allocating GPU resource pools, and configuring the runtime environment for RL or SFT training.

Description

Environment Setup prepares the distributed execution context required by verl's training pipelines. verl uses Ray as its distributed computing framework, with a "single controller" pattern where a driver process orchestrates multiple GPU workers.
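The single-controller pattern described above can be sketched as follows. This is a minimal illustration, not verl's implementation: `Worker`, `drive_locally`, and `drive_with_ray` are hypothetical names, the workers run on CPU, and the "training step" is a placeholder sum. The local driver shows the control flow without Ray; the Ray variant wraps the same class as remote actors.

```python
from __future__ import annotations


class Worker:
    """Holds per-worker state; in a real job each worker would own one GPU."""

    def __init__(self, rank: int) -> None:
        self.rank = rank

    def step(self, batch: list[int]) -> int:
        # Placeholder for a rollout or training step.
        return sum(batch)


def drive_locally(batches: list[list[int]]) -> list[int]:
    """Single controller without Ray: one driver loop calls every worker."""
    workers = [Worker(rank) for rank in range(len(batches))]
    return [w.step(b) for w, b in zip(workers, batches)]


def drive_with_ray(batches: list[list[int]]) -> list[int]:
    """Same control flow, but each worker is a Ray actor in its own process."""
    import ray  # imported lazily so the local path works without Ray installed

    ray.init(ignore_reinit_error=True)
    try:
        remote_worker_cls = ray.remote(Worker)
        workers = [remote_worker_cls.remote(rank) for rank in range(len(batches))]
        # The driver dispatches work and then gathers the results.
        futures = [w.step.remote(b) for w, b in zip(workers, batches)]
        return ray.get(futures)
    finally:
        ray.shutdown()
```

In both variants the driver alone decides what runs and when; the workers only execute the methods they are sent, which is the property that lets one controller coordinate workers with different roles.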

The setup involves:

  • Installing verl and its dependencies (vLLM/SGLang for rollout, PyTorch for training)
  • Initializing a Ray cluster with appropriate resource allocation
  • Partitioning GPUs into resource pools for different roles (actor, critic, rollout, reward model)
  • Configuring Hydra for experiment management

This is the first step in every training workflow and must complete before any training logic begins.
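The GPU partitioning step can be planned with a few lines of plain Python before any cluster is touched. The sketch below is illustrative only: `PoolSpec`, `plan_pools`, and the role names are hypothetical and do not come from verl's API. It assumes every node in a pool contributes the same number of GPUs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSpec:
    """Hypothetical description of one resource pool."""

    name: str
    gpus_per_node: int
    nodes: int

    @property
    def total_gpus(self) -> int:
        return self.gpus_per_node * self.nodes

    def processes_per_node(self) -> list[int]:
        # One worker process per GPU on each node of the pool.
        return [self.gpus_per_node] * self.nodes


def plan_pools(
    gpus_per_node: int, nodes_per_role: dict[str, int], cluster_nodes: int
) -> dict[str, PoolSpec]:
    """Assign whole nodes to roles and fail early if the cluster is too small."""
    requested = sum(nodes_per_role.values())
    if requested > cluster_nodes:
        raise ValueError(
            f"roles request {requested} nodes but the cluster has {cluster_nodes}"
        )
    return {
        role: PoolSpec(role, gpus_per_node, nodes)
        for role, nodes in nodes_per_role.items()
    }


pools = plan_pools(8, {"actor_rollout": 2, "critic": 1}, cluster_nodes=3)
```

Checking the plan up front turns an undersized cluster into an immediate error rather than a job that waits indefinitely for GPUs that do not exist.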

Usage

Environment setup is required before any verl training. The setup method varies by workflow:

  • RL training (GRPO/PPO): Uses ray.init() with GPU resource pools, launched via verl.trainer.main_ppo
  • SFT training: Uses torchrun with torch.distributed.init_process_group, launched via verl.trainer.fsdp_sft_trainer
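The two launch paths above differ mainly in the launcher. The helper below builds each command line as a list of arguments; it is a sketch under assumptions. The module names come from the text above, but the Hydra override keys `trainer.n_gpus_per_node` and `trainer.nnodes` are assumed here and should be checked against the installed verl version. The SFT helper covers the single-node case only, where `torchrun --standalone` needs no rendezvous settings.

```python
from __future__ import annotations

import shlex


def rl_command(n_gpus_per_node: int, nnodes: int, overrides: tuple[str, ...] = ()) -> list[str]:
    """Command for RL training (GRPO/PPO); Ray is initialized inside the entry point."""
    return [
        "python3", "-m", "verl.trainer.main_ppo",
        f"trainer.n_gpus_per_node={n_gpus_per_node}",  # assumed override key
        f"trainer.nnodes={nnodes}",                    # assumed override key
        *overrides,
    ]


def sft_command(nproc_per_node: int, overrides: tuple[str, ...] = ()) -> list[str]:
    """Single-node command for SFT; torchrun starts one process per GPU."""
    return [
        "torchrun", "--standalone", "--nnodes=1",
        f"--nproc_per_node={nproc_per_node}",
        "-m", "verl.trainer.fsdp_sft_trainer",
        *overrides,
    ]


def as_shell(command: list[str]) -> str:
    """Render a command list as a copy-pasteable shell line."""
    return shlex.join(command)
```

Returning a list rather than a string keeps each argument intact when the command is passed to `subprocess.run`, and `as_shell` is only needed for logging or documentation.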

Theoretical Basis

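Distributed training setup combines two patterns. Data parallelism follows the single-program-multiple-data (SPMD) paradigm, in which every process runs the same program on its own shard of the data. Orchestration of heterogeneous workers follows the single-controller pattern, in which one driver decides what each worker group runs.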

Pseudo-code:

# Abstract environment setup
# (pip_install, GPUPool, and spawn are placeholders, not real APIs)

# 1. Install dependencies
pip_install("verl", "vllm", "ray", "torch")

# 2. Initialize cluster (start a local Ray instance or connect to an existing one)
ray.init()

# 3. Allocate GPU pools, sized as GPUs per node times nodes assigned to the role
resource_pool_actor = GPUPool(gpus_per_node * nodes_for_actor)
resource_pool_critic = GPUPool(gpus_per_node * nodes_for_critic)

# 4. Launch one worker group per pool
actor_worker = spawn(ActorRolloutRefWorker, resource_pool_actor)
critic_worker = spawn(CriticWorker, resource_pool_critic)
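The SPMD side of the setup can be illustrated without any GPU. Under `torchrun`, every process runs the same script and learns its identity from the `RANK` and `WORLD_SIZE` environment variables. The sketch below, with hypothetical helper names, shows how each rank could select its own slice of a dataset; a real job would typically rely on a distributed sampler instead.

```python
from __future__ import annotations

import os
from typing import Mapping, Sequence, TypeVar

T = TypeVar("T")


def rank_and_world_size(env: Mapping[str, str] | None = None) -> tuple[int, int]:
    """Read this process's rank and the total process count.

    Falls back to a single-process setup when the variables are unset.
    """
    env = os.environ if env is None else env
    return int(env.get("RANK", "0")), int(env.get("WORLD_SIZE", "1"))


def shard_for_rank(samples: Sequence[T], rank: int, world_size: int) -> list[T]:
    """Give each rank every world_size-th sample, starting at its own rank."""
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside world size {world_size}")
    return list(samples[rank::world_size])
```

Every rank executes identical code, and only the rank value changes which samples it sees. The shards are disjoint and together cover the whole dataset.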

Related Pages

Implemented By
