Implementation:OpenRLHF OpenRLHF Create vllm engines
| Knowledge Sources | |
|---|---|
| Domains | Inference, Training_Infrastructure |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A concrete tool, provided by OpenRLHF, for creating vLLM generation engines as Ray actors.
Description
The create_vllm_engines function spawns one or more vLLM LLMEngine instances as Ray actors, each bound to a specific GPU placement group. The engines are configured with model path, tensor parallel size, GPU memory utilization, and generation parameters. They support weight updates from the training actor for on-policy generation.
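The resource arithmetic implied by this description can be sketched as follows: each engine needs tensor_parallel_size GPUs, so a run reserves num_engines * tensor_parallel_size GPUs in total. This is not OpenRLHF's code. `plan_engine_bundles` and `total_gpus` are hypothetical helpers, and the one-CPU-per-GPU bundle shape is an assumption made only for illustration.

```python
from typing import Dict, List


def plan_engine_bundles(
    num_engines: int, tensor_parallel_size: int = 1
) -> List[List[Dict[str, int]]]:
    """Hypothetical helper: one bundle list per engine, one bundle per GPU worker."""
    if num_engines < 1 or tensor_parallel_size < 1:
        raise ValueError("num_engines and tensor_parallel_size must be >= 1")
    # Assumption: each tensor-parallel worker reserves 1 GPU and 1 CPU.
    return [
        [{"GPU": 1, "CPU": 1} for _ in range(tensor_parallel_size)]
        for _ in range(num_engines)
    ]


def total_gpus(bundles: List[List[Dict[str, int]]]) -> int:
    """Sum the GPUs reserved across all engines."""
    return sum(bundle["GPU"] for engine in bundles for bundle in engine)
```

Bundle lists of this shape are what one would typically hand to a Ray placement group, one group per engine. Whether OpenRLHF creates its groups exactly this way is not confirmed by this page.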
Usage
Call this function during PPO/GRPO initialization, after the Ray cluster has been set up. The returned engine references are passed to the PPO trainer, which uses them for sample generation.
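To make the hand-off concrete, here is a minimal sketch of one plausible way a trainer could split a batch of prompts across the returned engines. `shard_prompts` is a hypothetical helper, and round-robin assignment is an assumption; the scheduling OpenRLHF actually uses may differ.

```python
from typing import List, Sequence


def shard_prompts(prompts: Sequence[str], num_engines: int) -> List[List[str]]:
    """Hypothetical helper: assign prompts to engines in round-robin order."""
    if num_engines < 1:
        raise ValueError("num_engines must be >= 1")
    shards: List[List[str]] = [[] for _ in range(num_engines)]
    for i, prompt in enumerate(prompts):
        # Prompt i goes to engine i mod num_engines.
        shards[i % num_engines].append(prompt)
    return shards
```

Each shard would then be sent to the engine handle at the same index, so shard sizes differ by at most one prompt.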
Code Reference
Source Location
- Repository: OpenRLHF
- File: openrlhf/trainer/ray/vllm_engine.py
Signature
def create_vllm_engines(
    num_engines: int,               # Number of vLLM engine instances
    pretrain: str,                  # Model path or HF ID
    max_len: int,                   # Maximum sequence length
    gpu_memory_utilization: float,  # vLLM GPU memory fraction
    tensor_parallel_size: int = 1,  # TP size per engine
    seed: int = 42,
    enable_prefix_caching: bool = False,
    **kwargs,
) -> list:
    """Returns list of Ray actor handles for vLLM engines."""
Import
from openrlhf.trainer.ray.vllm_engine import create_vllm_engines
I/O Contract
Inputs
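| Name | Type | Required | Description |
|---|---|---|---|
| num_engines | int | Yes | Number of vLLM engine instances to create |
| pretrain | str | Yes | Model checkpoint path or Hugging Face model ID |
| max_len | int | Yes | Maximum sequence length |
| gpu_memory_utilization | float | Yes | Fraction of GPU memory vLLM may use (model weights, activations, and KV cache) |
| tensor_parallel_size | int | No | Tensor-parallel size per engine (default 1) |
| seed | int | No | Random seed (default 42) |
| enable_prefix_caching | bool | No | Whether to enable vLLM prefix caching (default False) |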
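A caller may want to sanity-check these inputs before spawning any actors, since a bad value otherwise surfaces only after GPUs have been reserved. The sketch below uses a hypothetical helper, `check_engine_args`, which is not part of OpenRLHF. The bounds it enforces are assumptions, apart from gpu_memory_utilization being a fraction between 0 and 1.

```python
def check_engine_args(
    num_engines: int,
    max_len: int,
    gpu_memory_utilization: float,
    tensor_parallel_size: int = 1,
) -> None:
    """Hypothetical pre-flight check; raises ValueError on an invalid argument."""
    if num_engines < 1:
        raise ValueError("num_engines must be >= 1")
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be >= 1")
    if max_len < 1:
        raise ValueError("max_len must be >= 1")
    # The memory fraction must lie in the half-open interval (0, 1].
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
```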
Outputs
| Name | Type | Description |
|---|---|---|
| engines | list | Ray actor handles for vLLM engine instances |
Usage Examples
from openrlhf.trainer.ray.vllm_engine import create_vllm_engines

vllm_engines = create_vllm_engines(
    num_engines=2,
    pretrain="meta-llama/Llama-2-7b-hf",
    max_len=2048,
    gpu_memory_utilization=0.8,
    tensor_parallel_size=1,
)
Related Pages
Implements Principle
Requires Environment
- Environment:OpenRLHF_OpenRLHF_vLLM_Environment
- Environment:OpenRLHF_OpenRLHF_Ray_Distributed_Environment
Uses Heuristic