Implementation:Hpcaitech ColossalAI Ray Performance Evaluator

From Leeroopedia


Knowledge Sources
Domains: Reinforcement Learning, Performance Profiling, Distributed Training
Last Updated: 2026-02-09 00:00 GMT

Overview

Performance-evaluation callbacks for the Ray-based distributed RLHF pipeline. They measure throughput, TFLOPS, and timing breakdowns for both the experience-making and training phases.

Description

This module provides two callback classes: ExperienceMakerPerformanceEvaluator (a subclass of MakerCallback) and TrainerPerformanceEvaluator (a subclass of TrainerCallback). The experience maker evaluator tracks the make-experience and send durations, and computes FLOP counts for actor generation and for the forward passes of the actor, critic, initial, and reward models. The trainer evaluator tracks the training and update durations, and computes FLOP counts for the actor and critic forward-backward passes, optionally including gradient checkpointing overhead.
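The FLOP accounting described above can be sketched as follows. This is only an illustration, assuming the common approximation of about 2 FLOPs per parameter per token for a forward pass, a backward pass costing roughly twice the forward pass, and one extra forward pass when gradient checkpointing is enabled. The helper names `forward_flop` and `train_flop` are hypothetical, and the module's actual formulas may differ.

```python
def forward_flop(num_params: int, batch_size: int, seq_len: int) -> float:
    """Approximate FLOPs for one forward pass.

    Assumption: about 2 FLOPs per parameter per processed token.
    """
    return 2.0 * num_params * batch_size * seq_len


def train_flop(
    num_params: int,
    batch_size: int,
    seq_len: int,
    enable_grad_checkpoint: bool = False,
) -> float:
    """Approximate FLOPs for one forward-backward pass.

    Assumption: backward costs about 2x forward, so forward + backward is 3x.
    Gradient checkpointing recomputes the forward pass once, giving 4x.
    """
    multiplier = 3 + int(enable_grad_checkpoint)
    return forward_flop(num_params, batch_size, seq_len) * multiplier
```

Under these assumptions, enabling gradient checkpointing raises the estimated training cost by one third, which is why the evaluator needs to know whether it is on in order to report a meaningful TFLOPS figure.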

Both evaluators aggregate metrics across distributed workers using all_reduce_mean and print a formatted performance summary at the end of their respective lifecycles. The module also provides utility functions get_world_size, print_rank_0, all_reduce_mean, and a Timer helper class.
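A timer with the start/end/reset interface listed in the Signature section could look roughly like the sketch below. It is not the module's source: the `duration` attribute, the use of `time.perf_counter`, and the `tflops` helper are assumptions made for illustration, and the distributed averaging done by all_reduce_mean is left out so the example runs in a single process.

```python
import time


class Timer:
    """Accumulates wall-clock time across start()/end() pairs."""

    def __init__(self) -> None:
        self.start_time = None  # set by start(), cleared by end()
        self.duration = 0.0     # total seconds accumulated so far

    def start(self) -> None:
        self.start_time = time.perf_counter()

    def end(self) -> None:
        if self.start_time is None:
            raise RuntimeError("Timer.end() called before Timer.start()")
        self.duration += time.perf_counter() - self.start_time
        self.start_time = None

    def reset(self) -> None:
        self.duration = 0.0


def tflops(flop: float, duration_s: float) -> float:
    """Convert a FLOP count and a duration in seconds to TFLOPS (hypothetical helper)."""
    return flop / 1e12 / (duration_s + 1e-12)  # epsilon avoids division by zero
```

Accumulating into a single `duration` lets one timer cover many steps, so a per-phase total can be divided by the step count or combined with a FLOP count when the summary is printed.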

Usage

Use ExperienceMakerPerformanceEvaluator as a callback for ExperienceMakerHolder when profiling inference performance during experience generation. Use TrainerPerformanceEvaluator as a callback for DetachedTrainer when profiling training performance. Both are typically enabled via an eval_performance flag during initialization.

Code Reference

Source Location

Signature

class ExperienceMakerPerformanceEvaluator(MakerCallback):
    def __init__(
        self,
        actor_num_params: int,
        critic_num_params: int,
        initial_model_num_params: int,
        reward_model_num_params: int,
    ) -> None: ...

class TrainerPerformanceEvaluator(TrainerCallback):
    def __init__(
        self,
        actor_num_params: int,
        critic_num_params: int,
        enable_grad_checkpoint: bool = False,
        ignore_first_episodes: int = 1,
    ) -> None: ...

def get_world_size() -> int: ...
def print_rank_0(*args, **kwargs) -> None: ...
def all_reduce_mean(x: float, world_size: int) -> float: ...

class Timer:
    def start(self) -> None: ...
    def end(self) -> None: ...
    def reset(self) -> None: ...

Import

from coati.ray.callbacks.performance_evaluator import (
    ExperienceMakerPerformanceEvaluator,
    TrainerPerformanceEvaluator,
    Timer,
)

I/O Contract

Inputs (ExperienceMakerPerformanceEvaluator)

| Name | Type | Required | Description |
|---|---|---|---|
| actor_num_params | int | Yes | Number of parameters in the actor model |
| critic_num_params | int | Yes | Number of parameters in the critic model |
| initial_model_num_params | int | Yes | Number of parameters in the initial (reference) model |
| reward_model_num_params | int | Yes | Number of parameters in the reward model |

Inputs (TrainerPerformanceEvaluator)

| Name | Type | Required | Description |
|---|---|---|---|
| actor_num_params | int | Yes | Number of parameters in the actor model |
| critic_num_params | int | Yes | Number of parameters in the critic model |
| enable_grad_checkpoint | bool | No | Whether gradient checkpointing is enabled (default False) |
| ignore_first_episodes | int | No | Number of initial episodes to skip for warmup (default 1) |
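The effect of ignore_first_episodes can be illustrated with a small accumulator that discards measurements from warmup episodes. The class name `WarmupAwareAccumulator` and its methods are hypothetical, and the sketch assumes zero-based episode indices; it shows the general warmup-skipping idea rather than the evaluator's actual bookkeeping.

```python
class WarmupAwareAccumulator:
    """Accumulates a metric only after the warmup episodes have passed."""

    def __init__(self, ignore_first_episodes: int = 1) -> None:
        self.ignore_first_episodes = ignore_first_episodes
        self.total = 0.0
        self.counted_episodes = 0

    def record(self, episode: int, value: float) -> None:
        # Assumption: episode is zero-based; indices below the threshold are warmup.
        if episode < self.ignore_first_episodes:
            return
        self.total += value
        self.counted_episodes += 1

    def mean(self) -> float:
        # Returns 0.0 when nothing has been counted yet.
        return self.total / self.counted_episodes if self.counted_episodes else 0.0


# Usage: the first (slow) episode is excluded from the average.
acc = WarmupAwareAccumulator(ignore_first_episodes=1)
for episode, seconds in enumerate([10.0, 2.0, 4.0]):
    acc.record(episode, seconds)
```

Skipping early episodes keeps one-time costs such as initialization and caching from distorting the reported averages.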

Outputs

| Name | Type | Description |
|---|---|---|
| return | None | Performance summary is printed to stdout on rank 0 |

Usage Examples

from coati.ray.callbacks.performance_evaluator import (
    ExperienceMakerPerformanceEvaluator,
    TrainerPerformanceEvaluator,
)

# For experience maker profiling
maker_evaluator = ExperienceMakerPerformanceEvaluator(
    actor_num_params=7_000_000_000,
    critic_num_params=7_000_000_000,
    initial_model_num_params=7_000_000_000,
    reward_model_num_params=7_000_000_000,
)

# For trainer profiling
trainer_evaluator = TrainerPerformanceEvaluator(
    actor_num_params=7_000_000_000,
    critic_num_params=7_000_000_000,
    enable_grad_checkpoint=True,
    ignore_first_episodes=1,
)
