Implementation:OpenRLHF OpenRLHF Ray init placement group

Knowledge Sources	OpenRLHF Ray Documentation
Domains	Distributed_Computing, Training_Infrastructure
Last Updated	2026-02-07 00:00 GMT

Overview

A concrete pattern, provided by OpenRLHF, for connecting to a Ray cluster and reserving GPU placement groups for multi-model PPO training.

Description

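The Ray initialization code in OpenRLHF's PPO training scripts first connects to a Ray cluster: with address="auto" it attaches to a cluster that is already running, whereas calling ray.init() without an address starts a local Ray instance. It then creates PlacementGroup objects that reserve a specific number of GPUs for each model role. The placement groups are passed to RayActorGroup constructors, which spawn the model workers on the reserved GPUs.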

This is a Wrapper Doc: it documents OpenRLHF's usage of the Ray API.

Usage

Run this initialization at the beginning of PPO/GRPO training scripts, before creating any model actors.

Code Reference

Source Location

Repository: OpenRLHF
File: openrlhf/trainer/ray/launcher.py (and PPO main scripts)

Signature

# Typical Ray initialization pattern in OpenRLHF
import ray
from ray.util.placement_group import placement_group

ray.init(address="auto", namespace="openrlhf")

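# Reserve one bundle (1 GPU + 1 CPU) per actor GPU
actor_pg = placement_group(
    [{"GPU": 1, "CPU": 1}] * num_actor_gpus,
    # "STRICT_PACK" keeps all bundles on one node;
    # "STRICT_SPREAD" puts each bundle on a different node
    strategy="STRICT_PACK",
)
# Block until the reservation is granted
ray.get(actor_pg.ready())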
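The choice of strategy decides whether a reservation can be satisfied at all. The following is a rough sketch of that reasoning for single-GPU bundles, not Ray's actual scheduler: the function name is_feasible and the gpus_per_node list are hypothetical, and CPU requirements are ignored for brevity.

```python
from typing import Sequence


def is_feasible(strategy: str, num_bundles: int, gpus_per_node: Sequence[int]) -> bool:
    """Rough check: can `num_bundles` single-GPU bundles be placed?

    `gpus_per_node` lists the free GPUs on each node (an assumed input;
    Ray tracks this internally). CPU requirements are ignored here.
    """
    if num_bundles <= 0:
        raise ValueError("num_bundles must be positive")
    if strategy == "STRICT_PACK":
        # All bundles must land on one node.
        return any(free >= num_bundles for free in gpus_per_node)
    if strategy == "STRICT_SPREAD":
        # Every bundle needs its own node with at least one free GPU.
        usable_nodes = sum(1 for free in gpus_per_node if free >= 1)
        return usable_nodes >= num_bundles
    if strategy in ("PACK", "SPREAD"):
        # Best-effort strategies only need enough GPUs in total.
        return sum(gpus_per_node) >= num_bundles
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For example, on two 8-GPU nodes a 16-bundle group can never become ready under "STRICT_PACK", so ray.get(pg.ready()) would block; the best-effort "PACK" strategy can still place it across both nodes.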

Import

import ray
from ray.util.placement_group import placement_group

I/O Contract

Inputs

Name	Type	Required	Description
address	str	No	Ray cluster address ("auto" connects to an already running cluster)
num_actor_gpus	int	Yes	GPUs for the actor model
num_critic_gpus	int	Yes	GPUs for the critic model
num_vllm_engines	int	Yes	Number of vLLM generation engines (determines the GPUs reserved for generation)
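As a hedged illustration of how these inputs could translate into bundle lists, the sketch below builds one bundle per GPU for each role. The GpuPlan class and the vllm_tensor_parallel_size field are assumptions made for this example and are not taken from the OpenRLHF source.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GpuPlan:
    """Hypothetical container for the inputs listed in the table above."""

    num_actor_gpus: int
    num_critic_gpus: int
    num_vllm_engines: int
    vllm_tensor_parallel_size: int = 1  # assumption: GPUs used by each engine

    @staticmethod
    def _per_gpu_bundles(num_gpus: int) -> List[Dict[str, int]]:
        # One bundle per GPU, each also reserving one CPU.
        return [{"GPU": 1, "CPU": 1} for _ in range(num_gpus)]

    def bundles(self) -> Dict[str, List[Dict[str, int]]]:
        """Return the bundle list to pass to placement_group() for each role."""
        vllm_gpus = self.num_vllm_engines * self.vllm_tensor_parallel_size
        return {
            "actor": self._per_gpu_bundles(self.num_actor_gpus),
            "critic": self._per_gpu_bundles(self.num_critic_gpus),
            "vllm": self._per_gpu_bundles(vllm_gpus),
        }

    def total_gpus(self) -> int:
        return sum(len(role_bundles) for role_bundles in self.bundles().values())
```

Each list returned by bundles() has the same shape as the first argument of placement_group(), and total_gpus() gives a quick check against the size of the cluster before any reservation is requested.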

Outputs

Name	Type	Description
placement_groups	PlacementGroup[]	GPU reservations for each model role

Usage Examples

import ray
from ray.util.placement_group import placement_group

# Initialize Ray
ray.init(address="auto", namespace="openrlhf")

# Create placement groups
actor_pg = placement_group(
    [{"GPU": 1, "CPU": 1}] * 4,  # 4 GPUs for actor
    strategy="STRICT_PACK"
)
vllm_pg = placement_group(
    [{"GPU": 1, "CPU": 1}] * 2,  # 2 GPUs for vLLM
    strategy="STRICT_PACK"
)
# Block until both reservations are granted
ray.get([actor_pg.ready(), vllm_pg.ready()])

Related Pages

Implements Principle

Principle:OpenRLHF_OpenRLHF_Ray_Cluster_Initialization

Requires Environment

Environment:OpenRLHF_OpenRLHF_Ray_Distributed_Environment
