
Implementation:Alibaba ROLL RewardFLConfig

From Leeroopedia


Knowledge Sources
Domains: Diffusion_Models, Configuration
Last Updated: 2026-02-07 20:00 GMT

Overview

RewardFLConfig is the concrete configuration dataclass for reward flow training of video diffusion models, provided by the Alibaba ROLL library.

Description

The RewardFLConfig class extends BaseConfig with settings for diffusion model training: the training batch size (train_batch_size, default 8), the gradient clipping norm (max_grad_norm, default 1.0), and an actor_train WorkerConfig that uses strategy_name="diffusion_deepspeed_train".
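The fields described above can be sketched as a small, self-contained dataclass. This is only an illustration under stated assumptions: `WorkerConfigSketch`, `BaseConfigSketch`, and `RewardFLConfigSketch` are hypothetical stand-ins, not the ROLL classes, and placing `strategy_name` directly on the worker config is a simplification that may not match where ROLL actually stores it.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerConfigSketch:
    """Hypothetical stand-in for ROLL's WorkerConfig (the real class has more fields)."""
    strategy_name: str = "diffusion_deepspeed_train"


@dataclass
class BaseConfigSketch:
    """Hypothetical stand-in for ROLL's BaseConfig; left empty in this sketch."""


@dataclass
class RewardFLConfigSketch(BaseConfigSketch):
    """Illustrative mirror of the three documented attributes."""
    train_batch_size: int = 8      # batch size
    max_grad_norm: float = 1.0     # gradient clipping threshold
    # Nested configs use default_factory so each instance gets its own object.
    actor_train: WorkerConfigSketch = field(default_factory=WorkerConfigSketch)


default_cfg = RewardFLConfigSketch()
custom_cfg = RewardFLConfigSketch(train_batch_size=16)
print(default_cfg)
print(custom_cfg.train_batch_size)
```

Using `default_factory` for the nested worker config means that changing `actor_train` on one config instance does not affect another.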

Usage

RewardFLConfig is loaded from a YAML file via Hydra and used to configure reward flow pipelines.

Code Reference

Source Location

  • Repository: Alibaba ROLL
  • File: roll/pipeline/diffusion/reward_fl/reward_fl_config.py
  • Lines: L12-48

Signature

@dataclass
class RewardFLConfig(BaseConfig):
    """
    Configuration for reward flow diffusion training.

    Attributes:
        train_batch_size: int = 8 - batch size
        max_grad_norm: float = 1.0 - gradient clipping
        actor_train: WorkerConfig - worker config with diffusion_deepspeed_train strategy
    """

Import

from roll.pipeline.diffusion.reward_fl.reward_fl_config import RewardFLConfig

I/O Contract

Inputs

Name | Type | Required | Description
YAML config file | str | Yes | Hydra-managed YAML with diffusion model paths

Outputs

Name | Type | Description
RewardFLConfig | RewardFLConfig | Config with actor_train WorkerConfig

Usage Examples

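from hydra import compose, initialize
import dacite
from omegaconf import OmegaConf

from roll.pipeline.diffusion.reward_fl.reward_fl_config import RewardFLConfig

# config_path is resolved relative to the file that calls initialize().
initialize(config_path="examples/wan2.2-14B-reward_fl_ds")
cfg = compose(config_name="reward_fl_config")
# Resolve interpolations into a plain dict, then build the typed dataclass.
config = dacite.from_dict(data_class=RewardFLConfig, data=OmegaConf.to_container(cfg, resolve=True))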
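To make the last step of the example above more concrete, the following sketch shows roughly what converting a nested dict into a typed config involves, using only the standard library. It is not dacite's implementation: `build_config` is a hypothetical, simplified helper, and the `*Sketch` dataclasses are stand-ins for the ROLL classes with only the documented fields.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Dict, get_type_hints


@dataclass
class WorkerConfigSketch:
    """Hypothetical stand-in for ROLL's WorkerConfig."""
    strategy_name: str = "diffusion_deepspeed_train"


@dataclass
class RewardFLConfigSketch:
    """Hypothetical stand-in carrying only the documented attributes."""
    train_batch_size: int = 8
    max_grad_norm: float = 1.0
    actor_train: WorkerConfigSketch = field(default_factory=WorkerConfigSketch)


def build_config(data_class: type, data: Dict[str, Any]) -> Any:
    """Recursively turn a nested dict into dataclass instances."""
    hints = get_type_hints(data_class)
    kwargs = {}
    for f in fields(data_class):
        if f.name not in data:
            continue  # key absent: keep the dataclass default
        value = data[f.name]
        target = hints[f.name]
        # Nested dicts become nested dataclasses when the field type is one.
        if is_dataclass(target) and isinstance(value, dict):
            value = build_config(target, value)
        kwargs[f.name] = value
    return data_class(**kwargs)


# A plain dict such as a resolved YAML config would produce.
raw = {
    "train_batch_size": 4,
    "actor_train": {"strategy_name": "diffusion_deepspeed_train"},
}
config = build_config(RewardFLConfigSketch, raw)
print(config)
```

Keys missing from the dict (here `max_grad_norm`) fall back to the dataclass defaults, so a YAML file only needs to state the values it overrides.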

Related Pages

Implements Principle

Requires Environment

Environment Dependencies

This implementation requires the following environment constraints:

Heuristics Applied

No specific heuristics apply to this implementation.
