Implementation: CarperAI Trlx Sweep
| Knowledge Sources | Details |
|---|---|
| Domains | Hyperparameter_Optimization, Distributed_Training |
| Last Updated | 2026-02-07 16:00 GMT |
Overview
Concrete tool for orchestrating hyperparameter sweeps using Ray Tune with W&B logging and automatic report generation.
Description
The sweep module provides a CLI-driven hyperparameter sweep orchestrator. It parses a YAML config file defining search spaces and Ray Tune settings, translates that config into Ray Tune parameter distributions (uniform, loguniform, choice, grid, etc.), and configures search algorithms (BayesOpt, BOHB, random) and schedulers (HyperBand, FIFO). Trials are launched via AccelerateTrainer, and on completion the module generates a W&B report with parallel-coordinates, parameter-importance, scatter, and metric line plots.
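To make the config-to-distribution translation concrete, here is a minimal sketch of the kind of mapping `get_param_space` performs. The real function dispatches to `ray.tune` constructors (`tune.loguniform`, `tune.choice`, etc.); this stand-in (`build_param_space` is a hypothetical name) returns `(method, args)` tuples instead so it runs without Ray installed.

```python
# Hypothetical sketch of translating a YAML-style search-space dict into
# Ray Tune distribution calls. The real get_param_space in trlx/sweep.py
# calls ray.tune constructors directly; here each entry is mapped to a
# (constructor_name, args) tuple so the sketch needs no Ray install.

def build_param_space(config: dict) -> dict:
    """Map {'method': ..., 'bounds'/'values': ...} entries to call specs."""
    param_space = {}
    for name, spec in config.items():
        method = spec["method"]
        if method in ("uniform", "loguniform"):
            low, high = spec["bounds"]
            param_space[name] = (method, (low, high))
        elif method in ("choice", "grid_search"):
            param_space[name] = (method, tuple(spec["values"]))
        else:
            raise ValueError(f"unknown search method: {method!r}")
    return param_space

space = build_param_space({
    "train.learning_rate_init": {"method": "loguniform", "bounds": [1e-6, 1e-4]},
    "method.target": {"method": "choice", "values": [3.0, 6.0, 12.0]},
})
print(space["train.learning_rate_init"])  # ('loguniform', (1e-06, 0.0001))
```

In the real module, each tuple would instead be the corresponding `ray.tune` distribution object, keyed by the dotted parameter name.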
Usage
Use this module as a CLI entry point to run hyperparameter sweeps over any trlx training script. Requires a YAML sweep configuration file specifying the parameter search space and a Ray Tune configuration block.
Code Reference
Source Location
- Repository: CarperAI_Trlx
- File: trlx/sweep.py
- Lines: 1-348
Signature
def get_param_space(config: dict) -> dict:
"""Convert YAML search space config to Ray Tune parameter distributions."""
def get_search_alg(tune_config: dict):
"""Create a Ray Tune search algorithm from config (bayesopt, bohb, random)."""
def get_scheduler(tune_config: dict):
"""Create a Ray Tune scheduler from config (hyperband, fifo, etc.)."""
def get_tune_config(tune_config: dict) -> ray.tune.TuneConfig:
"""Build Ray TuneConfig with search algorithm, scheduler, and trial count."""
def create_report(
target_metric: str,
column_names: list,
entity_name: str,
project_name: str,
group_name: str,
best_config: dict,
) -> None:
"""Generate a W&B report with parallel coordinates, scatter plots, and line charts."""
Import
# CLI usage:
# python -m trlx.sweep examples/ppo_sentiments.py --config configs/sweeps/ppo_sweep.yml
# Programmatic usage:
from trlx.sweep import get_param_space, get_tune_config, create_report
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| script | str (CLI) | Yes | Path to the training script to sweep |
| --config | str (CLI) | Yes | Path to YAML sweep configuration file |
| --num_gpus | int (CLI) | No | GPUs per trial (default 1) |
| --num_cpus | int (CLI) | No | CPUs per trial (default 8) |
| --accelerate_config | str (CLI) | No | Path to Accelerate config |
| --server_address | str (CLI) | No | Ray cluster address |
Outputs
| Name | Type | Description |
|---|---|---|
| Ray Tune results | ResultGrid | Sweep trial results with metrics |
| W&B report | URL | Generated report with visualization panels |
| Best config | dict | Best hyperparameter configuration found |
Usage Examples
Run a PPO Sweep from CLI
# Run a hyperparameter sweep over PPO sentiments example
# python -m trlx.sweep examples/ppo_sentiments.py \
# --config configs/sweeps/ppo_sweep.yml \
# --num_gpus 1 \
# --num_cpus 8
YAML Sweep Config Format
# configs/sweeps/ppo_sweep.yml
tune_config:
search_alg: bayesopt
scheduler: hyperband
num_samples: 20
metric: reward/mean
mode: max
param_space:
train.learning_rate_init:
method: loguniform
bounds: [1.0e-6, 1.0e-4]
method.init_kl_coef:
method: uniform
bounds: [0.01, 0.5]
method.target:
method: choice
values: [3.0, 6.0, 12.0]
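Note that the `param_space` keys use dotted paths (`train.learning_rate_init`, `method.init_kl_coef`) addressing fields inside the nested trlx training config. A hedged sketch of how such sampled values could be folded back into a nested config before each trial (the helper name `apply_overrides` is hypothetical; trlx's own merging logic may differ):

```python
import copy

# Hypothetical helper: fold dotted sweep keys like
# "train.learning_rate_init" back into a nested config dict.

def apply_overrides(base: dict, overrides: dict) -> dict:
    cfg = copy.deepcopy(base)  # never mutate the shared base config
    for dotted, value in overrides.items():
        node = cfg
        *parents, leaf = dotted.split(".")
        for key in parents:
            node = node.setdefault(key, {})  # create nesting as needed
        node[leaf] = value
    return cfg

base = {"train": {"learning_rate_init": 3e-4}, "method": {"init_kl_coef": 0.2}}
trial = apply_overrides(base, {"train.learning_rate_init": 1e-5, "method.target": 6.0})
print(trial["train"]["learning_rate_init"])  # 1e-05
print(trial["method"]["target"])             # 6.0
```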