

Implementation:Volcengine Verl HFModelConfig

From Leeroopedia


Knowledge Sources
Domains Model_Configuration, Training_Infrastructure
Last Updated 2026-02-07 14:00 GMT

Overview

Configuration dataclass that defines model architecture, LoRA, optimization, and loading settings for HuggingFace-based models within the verl training framework.

Description

The HFModelConfig dataclass is the central model configuration for actor, critic, and reference models in verl. It holds the model path, the tokenizer and HuggingFace config paths, LoRA adapter settings (rank, alpha, target modules), memory optimization flags (gradient checkpointing, activation offloading, padding removal, fused kernels, tiled MLP), and the trust_remote_code setting. During __post_init__, it resolves local paths, loads the tokenizer and processor, builds the HuggingFace config through AutoConfig with the override_config values applied, and validates the model architectures.
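The __post_init__ flow described above can be illustrated with a minimal, standard-library-only sketch. ToyModelConfig is a hypothetical stand-in, not verl code: the tokenizer-path fallback, the dictionary used in place of AutoConfig, and the architecture name are all assumptions made so the example runs without downloading a model.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToyModelConfig:
    """Hypothetical stand-in for HFModelConfig; not the verl implementation."""

    path: str
    tokenizer_path: Optional[str] = None
    override_config: dict = field(default_factory=dict)
    # Derived fields, filled in by __post_init__.
    local_path: Optional[str] = None
    hf_config: Any = None
    architectures: Optional[list] = None

    def __post_init__(self):
        # Assumption: fall back to the model path when no tokenizer path is given.
        if self.tokenizer_path is None:
            self.tokenizer_path = self.path
        # Stand-in for path resolution (a real config may download or copy files).
        self.local_path = self.path
        # Stand-in for AutoConfig loading: base values, then user overrides on top.
        base = {"architectures": ["ToyForCausalLM"], "attn_implementation": "eager"}
        self.hf_config = {**base, **self.override_config}
        self.architectures = self.hf_config["architectures"]


cfg = ToyModelConfig(path="/models/toy", override_config={"attn_implementation": "sdpa"})
print(cfg.tokenizer_path, cfg.hf_config["attn_implementation"], cfg.architectures)
```

The point of the pattern is that callers supply only the user-facing fields, and every derived field is populated by the time the constructor returns.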

Usage

This config appears as the model sub-config within actor_rollout_ref, critic, and reward_model sections of the Hydra/OmegaConf configuration. Each worker type (actor, critic, reward model) receives its own HFModelConfig instance to initialize its model.
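As a rough illustration of the per-role layout, the sketch below models the configuration tree with plain dictionaries instead of Hydra/OmegaConf. The helper model_section and all field values are hypothetical; it only shows that each role reads its own model section and that a change made for one role does not leak into another.

```python
import copy

# Illustrative config tree; the values are placeholders, not recommended settings.
run_config = {
    "actor_rollout_ref": {"model": {"path": "Qwen/Qwen2.5-7B", "lora_rank": 64}},
    "critic": {"model": {"path": "Qwen/Qwen2.5-7B"}},
    "reward_model": {"model": {"path": "Qwen/Qwen2.5-7B", "use_remove_padding": False}},
}


def model_section(config, role):
    """Return an independent copy of the `model` sub-config for one role."""
    return copy.deepcopy(config[role]["model"])


per_role = {role: model_section(run_config, role) for role in run_config}

# Changing the critic's copy leaves the original tree and other roles untouched.
per_role["critic"]["lora_rank"] = 8
print(sorted(per_role))
```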

Code Reference

Source Location

  • Repository: verl
  • File: verl/workers/config/model.py
  • Lines: 72-209

Signature

@dataclass
class HFModelConfig(BaseConfig):
    _mutable_fields = {
        "hf_config_path", "tokenizer_path", "hf_config",
        "generation_config", "tokenizer", "processor",
        "local_path", "architectures", "local_hf_config_path",
        "local_tokenizer_path",
    }

    path: str = MISSING
    local_path: Optional[str] = None
    hf_config_path: Optional[str] = None
    local_hf_config_path: Optional[str] = None
    tokenizer_path: Optional[str] = None
    local_tokenizer_path: Optional[str] = None

    load_tokenizer: bool = True
    hf_config: Any = None
    generation_config: Any = None
    tokenizer: Any = None
    processor: Any = None

    use_shm: bool = False
    trust_remote_code: bool = False
    custom_chat_template: Optional[str] = None
    external_lib: Optional[str] = None
    override_config: dict = field(default_factory=dict)

    enable_gradient_checkpointing: bool = True
    enable_activation_offload: bool = False
    use_remove_padding: bool = True

    # LoRA configuration
    lora_rank: int = 0
    lora_alpha: int = 16
    target_modules: Optional[str] = "all-linear"
    target_parameters: Optional[list[str]] = None
    exclude_modules: Optional[str] = None
    lora_adapter_path: Optional[str] = None
    lora: dict[str, Any] = field(default_factory=dict)

    use_liger: bool = False
    use_fused_kernels: bool = False
    fused_kernel_options: dict = field(default_factory=dict)
    tiled_mlp: dict = field(default_factory=lambda: {"enabled": False, "num_shards": 4})

    architectures: Optional[list[str]] = None
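The _mutable_fields set in the signature suggests that most fields are read-only after construction, while the listed ones can be reassigned (for example by __post_init__). The sketch below shows one way such a rule could be enforced. FrozenByDefault and ToyConfig are hypothetical names, and the __setattr__ logic is an assumption about the general pattern rather than a copy of verl's BaseConfig.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Optional


@dataclass
class FrozenByDefault:
    """Hypothetical base class: only names in _mutable_fields can be reassigned."""

    _mutable_fields = set()  # unannotated, so a class attribute and not a field

    def __setattr__(self, name, value):
        # Allow the first assignment (made by the generated __init__) and any
        # later assignment to a whitelisted field; reject everything else.
        if name in self.__dict__ and name not in self._mutable_fields:
            raise FrozenInstanceError(f"field '{name}' is frozen")
        super().__setattr__(name, value)


@dataclass
class ToyConfig(FrozenByDefault):
    _mutable_fields = {"local_path"}

    path: str = ""
    local_path: Optional[str] = None


cfg = ToyConfig(path="Qwen/Qwen2.5-7B")
cfg.local_path = "/cache/qwen"  # allowed: listed in _mutable_fields

try:
    cfg.path = "other/model"  # rejected: not listed
    reassigned = True
except FrozenInstanceError:
    reassigned = False
```

Under this reading, the ten names in HFModelConfig._mutable_fields are exactly the values that get resolved or loaded after the user-supplied fields are set.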

Import

from verl.workers.config.model import HFModelConfig

I/O Contract

Inputs (Key Configuration Fields)

| Name | Type | Required | Description |
|---|---|---|---|
| path | str | Yes | HuggingFace model ID or local path to model weights |
| lora_rank | int | No | LoRA rank; 0 disables LoRA (default: 0) |
| lora_alpha | int | No | LoRA alpha scaling factor (default: 16) |
| target_modules | Optional[str] | No | LoRA target module specification (default: "all-linear") |
| exclude_modules | Optional[str] | No | Modules to exclude from LoRA adaptation |
| enable_gradient_checkpointing | bool | No | Enable gradient checkpointing for memory savings (default: True) |
| enable_activation_offload | bool | No | Offload activations to CPU during checkpointing (default: False) |
| use_remove_padding | bool | No | Remove padding for efficient computation (default: True) |
| use_fused_kernels | bool | No | Use fused CUDA kernels for optimization (default: False) |
| trust_remote_code | bool | No | Trust remote code when loading models (default: False) |
| override_config | dict | No | Dictionary of model config overrides (e.g., attn_implementation) |
| lora_adapter_path | Optional[str] | No | Path to pre-trained LoRA adapter for continued training |
| custom_chat_template | Optional[str] | No | Custom chat template string for the tokenizer |
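To make the lora_rank and lora_alpha fields concrete, the following sketch works through the standard LoRA arithmetic: the adapter update is scaled by alpha / rank, and each adapted linear layer adds rank * (in_features + out_features) trainable parameters. The helper names and the 4096-wide layer are hypothetical, and the sketch assumes the plain LoRA formulation (variants such as rank-stabilized LoRA scale differently).

```python
def lora_enabled(lora_rank: int) -> bool:
    """Per the table above, a rank of 0 disables LoRA."""
    return lora_rank > 0


def lora_scaling(lora_rank: int, lora_alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / rank."""
    if lora_rank <= 0:
        raise ValueError("LoRA is disabled when lora_rank is 0")
    return lora_alpha / lora_rank


def lora_params_per_linear(in_features: int, out_features: int, lora_rank: int) -> int:
    """Trainable parameters added to one linear layer: A is (rank x in), B is (out x rank)."""
    return lora_rank * (in_features + out_features)


# Values from the LoRA usage example: rank 64, alpha 128.
scale = lora_scaling(64, 128)
# Hypothetical 4096 x 4096 projection layer.
added = lora_params_per_linear(4096, 4096, 64)
fraction = added / (4096 * 4096)
print(scale, added, fraction)
```

With the defaults from the signature (lora_rank=0, lora_alpha=16), lora_enabled returns False and no adapter parameters are added.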

Outputs (after __post_init__)

| Name | Type | Description |
|---|---|---|
| tokenizer | Any | Loaded HuggingFace tokenizer instance |
| processor | Any | Loaded HuggingFace processor instance (for multimodal models) |
| hf_config | AutoConfig | Loaded and overridden HuggingFace model configuration |
| generation_config | Any | Generation configuration from the model |
| local_path | str | Resolved local path to model weights |
| architectures | list[str] | Model architecture names extracted from config |
| share_embeddings_and_output_weights | bool | Whether input/output embeddings are tied |

Usage Examples

# Configuration (YAML) - Full fine-tuning
# actor_rollout_ref:
#   model:
#     path: Qwen/Qwen2.5-7B
#     enable_gradient_checkpointing: True
#     use_remove_padding: True
#     trust_remote_code: False
#     override_config:
#       attn_implementation: flash_attention_2

# Configuration (YAML) - LoRA fine-tuning
# actor_rollout_ref:
#   model:
#     path: Qwen/Qwen2.5-32B
#     lora_rank: 64
#     lora_alpha: 128
#     target_modules: all-linear
#     enable_gradient_checkpointing: True
#     use_fused_kernels: True

# Programmatic usage
from verl.workers.config.model import HFModelConfig

config = HFModelConfig(
    path="Qwen/Qwen2.5-7B",
    enable_gradient_checkpointing=True,
    use_remove_padding=True,
    use_fused_kernels=True,
    trust_remote_code=False,
    override_config={"attn_implementation": "flash_attention_2"},
)

# __post_init__ runs inside the constructor, so the following are already populated:
print(config.tokenizer)           # HuggingFace tokenizer
print(config.hf_config)           # AutoConfig with overrides applied
print(config.architectures)       # e.g., ["Qwen2ForCausalLM"]
print(config.local_path)          # Resolved local path

# LoRA configuration for parameter-efficient training
lora_config = HFModelConfig(
    path="Qwen/Qwen2.5-32B",
    lora_rank=64,
    lora_alpha=128,
    target_modules="all-linear",
    exclude_modules="lm_head",
)
