
Environment: OpenRLHF DeepSpeed Environment

From Leeroopedia


Knowledge Sources

Domains: Infrastructure, Distributed_Training
Last Updated: 2026-02-07 10:00 GMT

Overview

DeepSpeed == 0.18.5 with ZeRO stages 0-3, tensor parallelism, and optional CPU offloading for distributed RLHF training.

Description

This environment provides the DeepSpeed distributed training framework configuration required by all non-Ray OpenRLHF training workflows (SFT, DPO, RM, KD, KTO, PRM). DeepSpeed manages ZeRO optimization stages (0-3), optimizer state partitioning, gradient accumulation, and optional CPU offloading. Tensor parallelism requires DeepSpeed >= 0.16.4, and state offloading on ZeRO stages other than 3 requires DeepSpeed > 0.17.5.
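Both version gates described above can be checked up front, before a job is launched. A minimal sketch using `packaging.version` (the same parser the OpenRLHF source uses for the offloading gate); the function name and signature are illustrative, not OpenRLHF's API:

```python
from packaging import version

def check_deepspeed_constraints(ds_version, tensor_parallel_size=1,
                                zero_stage=3, state_offload=False):
    """Mirror the two DeepSpeed version gates described above.

    Returns a list of human-readable problems; an empty list means the
    configuration is compatible with the given DeepSpeed version.
    """
    problems = []
    v = version.parse(ds_version)
    if tensor_parallel_size > 1 and v < version.parse("0.16.4"):
        problems.append("tensor parallelism requires deepspeed >= 0.16.4")
    if state_offload and zero_stage != 3 and v <= version.parse("0.17.5"):
        problems.append("state offloading off ZeRO-3 requires deepspeed > 0.17.5")
    return problems
```

For example, `check_deepspeed_constraints("0.16.2", tensor_parallel_size=2)` flags the tensor-parallel gate, while the pinned `"0.18.5"` passes both.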

Usage

Use this environment for all OpenRLHF training workflows that use the `DeepspeedStrategy`. This includes SFT, DPO, Reward Model, Knowledge Distillation, KTO, and PRM training. The PPO trainer also uses DeepSpeed for its actor and critic training components within Ray actors.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| GPU | NVIDIA CUDA GPU | Required for FusedAdam optimizer and NCCL backend |
| CPU RAM | Proportional to model size | Required when `--adam_offload` or `--offload` is enabled |
| Disk | SSD recommended | For checkpoint saving with ZeRO-aware serialization |
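As a rule of thumb for sizing host RAM: with mixed-precision Adam, offloading keeps two fp32 optimizer states plus fp32 master weights per parameter, roughly 12 bytes per parameter on the CPU when `--adam_offload` is enabled. A back-of-envelope sketch, not an exact accounting (fragmentation, gradients, and activation buffers add overhead):

```python
def offload_ram_bytes(num_params, adam_offload=True, param_offload=False):
    """Rough CPU-RAM estimate for DeepSpeed offloading.

    Assumes mixed precision: fp32 momentum + fp32 variance + fp32
    master weights (12 bytes/param) for the offloaded optimizer, and
    fp16/bf16 copies (2 bytes/param) if parameters are also offloaded.
    """
    total = 0
    if adam_offload:
        total += num_params * 12  # 4B momentum + 4B variance + 4B master
    if param_offload:
        total += num_params * 2   # fp16/bf16 parameter shards
    return total

# A 7B-parameter model with --adam_offload needs on the order of 78 GiB:
print(offload_ram_bytes(7_000_000_000) / 2**30)
```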

Dependencies

Python Packages

  • `deepspeed` == 0.18.5 (pinned in requirements.txt)
  • `torch` with distributed support
  • `peft` (for PEFT model state dict saving)
  • `torchdata` (for StatefulDataLoader)

Credentials

No additional credentials beyond the base CUDA GPU environment.

Quick Install

pip install deepspeed==0.18.5

Code Evidence

DeepSpeed version requirement for tensor parallelism from `openrlhf/utils/deepspeed/deepspeed.py:72-73`:

if self.ds_tensor_parallel_size > 1:
    assert deepspeed.version >= "0.16.4", "DeepSpeed version must be >= 0.16.4 for tensor parallel training"

State offloading version constraint from `openrlhf/utils/deepspeed/deepspeed_utils.py:151-153`:

if zero_stage != 3 and version.parse(deepspeed.__version__) <= version.parse("0.17.5"):
    raise NotImplementedError(
        "Only Zero stage 3 is currently supported when using DeepSpeed version 0.17.5 or lower"
    )

DeepCompile disabled for inference from `openrlhf/utils/deepspeed/deepspeed_utils.py:77-79`:

# At least for 0.16.6, DeepCompile hasn't support pure inference mode
# https://github.com/deepspeedai/DeepSpeed/pull/7225
deepcompile = False

ZeRO configuration with offloading from `openrlhf/utils/deepspeed/deepspeed_utils.py:20-43`:

device = "cpu" if offload else "none"
zero_opt_dict = {
    "stage": stage,
    "offload_param": {"device": device},
    "offload_optimizer": {
        "device": "cpu" if adam_offload else "none",
        "pin_memory": True,
    },
    ...
}
if stage == 3:
    zero_opt_dict["reduce_scatter"] = True
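Putting the evidence above together, the ZeRO portion of a DeepSpeed config can be assembled as follows. This is a simplified sketch of the pattern: the field names follow the DeepSpeed config schema, but real OpenRLHF configs carry additional keys elided here:

```python
def build_zero_config(stage, offload=False, adam_offload=False):
    """Build the `zero_optimization` section of a DeepSpeed config dict."""
    zero_opt_dict = {
        "stage": stage,
        "offload_param": {"device": "cpu" if offload else "none"},
        "offload_optimizer": {
            "device": "cpu" if adam_offload else "none",
            "pin_memory": True,
        },
    }
    if stage == 3:
        # reduce_scatter is used for gradient communication under ZeRO-3
        zero_opt_dict["reduce_scatter"] = True
    return {"zero_optimization": zero_opt_dict}
```

For example, `build_zero_config(3, adam_offload=True)` yields a stage-3 config with optimizer states pinned in CPU memory.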

Optimizer selection from `openrlhf/utils/deepspeed/deepspeed.py:138`:

AdamOptimizer = DeepSpeedCPUAdam if self.adam_offload else FusedAdam

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `DeepSpeed version must be >= 0.16.4 for tensor parallel training` | Old DeepSpeed with `--ds_tensor_parallel_size > 1` | Upgrade to `deepspeed >= 0.16.4` |
| `Only Zero stage 3 is currently supported when using DeepSpeed version 0.17.5 or lower` | State offloading on a ZeRO stage other than 3 with old DeepSpeed | Upgrade to `deepspeed > 0.17.5` or use `--zero_stage 3` |
| DeepCompile inference error | DeepCompile is not supported for pure inference | Automatically disabled; no action needed |

Compatibility Notes

  • ZeRO Stage 3: Enables `reduce_scatter` and requires `GatheredParameters` context for parameter access.
  • CPU Offloading: `--adam_offload` moves optimizer states to CPU; `--offload` moves parameters to CPU. When adam_offload is active, additional state offloading is skipped automatically.
  • DeepCompile: Disabled for inference mode as of DeepSpeed 0.16.6. Only usable during training.
  • Tensor Parallelism: Requires DeepSpeed >= 0.16.4 and bf16 dtype.
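The ZeRO Stage 3 note above matters in practice: under stage 3, parameters are sharded across ranks, so any code that reads full weights (e.g. checkpoint saving) must wrap the access in `deepspeed.zero.GatheredParameters`. A hedged sketch of the dispatch pattern; the `contextlib.nullcontext` fallback for other stages is an illustration, not OpenRLHF's exact code:

```python
import contextlib

def param_access_context(params, zero_stage):
    """Context manager under which full parameter values are readable."""
    if zero_stage == 3:
        import deepspeed  # imported lazily; only needed for ZeRO-3
        # Gather the sharded parameters onto rank 0 for the duration
        # of the context, then re-partition on exit.
        return deepspeed.zero.GatheredParameters(params, modifier_rank=0)
    # Stages 0-2 keep full parameters on every rank; no gathering needed.
    return contextlib.nullcontext()
```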
