Implementation:Volcengine Verl HFModelConfig

Knowledge Sources	verl
Domains	Model_Configuration, Training_Infrastructure
Last Updated	2026-02-07 14:00 GMT

Overview

Configuration dataclass that defines model architecture, LoRA, optimization, and loading settings for HuggingFace-based models within the verl training framework.

Description

The HFModelConfig dataclass is the central model configuration for all actors, critics, and reference models in verl. It manages the model path, tokenizer loading, HuggingFace config construction, LoRA adapter settings (rank, alpha, target modules), memory optimization flags (gradient checkpointing, activation offloading, remove padding, fused kernels, tiled MLP), and remote code trust settings. During __post_init__, it automatically resolves local paths, loads the tokenizer and processor, constructs the HuggingFace AutoConfig with override parameters, and validates model architectures.
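
The derive-on-construction pattern described above can be illustrated with a small standard-library sketch. ModelConfigSketch is a hypothetical stand-in, not the verl class. The fallback of unset auxiliary paths to path and the dict used in place of a real HuggingFace config are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in for HFModelConfig; not the verl implementation."""

    path: str
    hf_config_path: Optional[str] = None
    tokenizer_path: Optional[str] = None
    override_config: dict = field(default_factory=dict)
    hf_config: Any = None  # derived in __post_init__

    def __post_init__(self):
        # Assumption: unset auxiliary paths fall back to the main model path.
        if self.hf_config_path is None:
            self.hf_config_path = self.path
        if self.tokenizer_path is None:
            self.tokenizer_path = self.path
        # Stand-in for loading a HuggingFace config: a base dict with the
        # user-supplied overrides applied on top.
        base = {"source": self.hf_config_path, "attn_implementation": "eager"}
        self.hf_config = {**base, **self.override_config}


cfg = ModelConfigSketch(
    path="Qwen/Qwen2.5-7B",
    override_config={"attn_implementation": "flash_attention_2"},
)
print(cfg.tokenizer_path)
print(cfg.hf_config["attn_implementation"])
```

Because the derived fields are filled in during construction, callers can read them immediately without calling a separate setup method.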

Usage

This config appears as the model sub-config within the actor_rollout_ref, critic, and reward_model sections of the Hydra/OmegaConf configuration. The workers built from each section receive their own HFModelConfig instance, so model settings can differ between sections.
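
As a rough illustration of that layout, the sketch below uses plain Python dicts in place of OmegaConf. The section names come from this page. The field values and the deep-copy approach are assumptions chosen only to show that each section carries an independent model sub-config.

```python
import copy

# Illustrative base settings; the values are assumptions, not verl defaults.
base_model = {
    "path": "Qwen/Qwen2.5-7B",
    "lora_rank": 0,
    "enable_gradient_checkpointing": True,
}

# Each top-level section gets its own copy of the model sub-config.
config = {
    "actor_rollout_ref": {"model": copy.deepcopy(base_model)},
    "critic": {"model": copy.deepcopy(base_model)},
    "reward_model": {"model": copy.deepcopy(base_model)},
}

# Changing one section's model settings leaves the others untouched.
config["actor_rollout_ref"]["model"]["lora_rank"] = 64

for section, sub in config.items():
    print(section, sub["model"]["lora_rank"])
```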

Code Reference

Source Location

Repository: verl
File: verl/workers/config/model.py
Lines: 72-209

Signature

@dataclass
class HFModelConfig(BaseConfig):
    _mutable_fields = {
        "hf_config_path", "tokenizer_path", "hf_config",
        "generation_config", "tokenizer", "processor",
        "local_path", "architectures", "local_hf_config_path",
        "local_tokenizer_path",
    }

    path: str = MISSING
    local_path: Optional[str] = None
    hf_config_path: Optional[str] = None
    local_hf_config_path: Optional[str] = None
    tokenizer_path: Optional[str] = None
    local_tokenizer_path: Optional[str] = None

    load_tokenizer: bool = True
    hf_config: Any = None
    generation_config: Any = None
    tokenizer: Any = None
    processor: Any = None

    use_shm: bool = False
    trust_remote_code: bool = False
    custom_chat_template: Optional[str] = None
    external_lib: Optional[str] = None
    override_config: dict = field(default_factory=dict)

    enable_gradient_checkpointing: bool = True
    enable_activation_offload: bool = False
    use_remove_padding: bool = True

    # LoRA configuration
    lora_rank: int = 0
    lora_alpha: int = 16
    target_modules: Optional[str] = "all-linear"
    target_parameters: Optional[list[str]] = None
    exclude_modules: Optional[str] = None
    lora_adapter_path: Optional[str] = None
    lora: dict[str, Any] = field(default_factory=dict)

    use_liger: bool = False
    use_fused_kernels: bool = False
    fused_kernel_options: dict = field(default_factory=dict)
    tiled_mlp: dict = field(default_factory=lambda: {"enabled": False, "num_shards": 4})

    architectures: Optional[list[str]] = None

Import

from verl.workers.config.model import HFModelConfig

I/O Contract

Inputs (Key Configuration Fields)

Name	Type	Required	Description
path	str	Yes	HuggingFace model ID or local path to model weights
lora_rank	int	No	LoRA rank; 0 disables LoRA (default: 0)
lora_alpha	int	No	LoRA alpha scaling factor (default: 16)
target_modules	Optional[str]	No	LoRA target module specification (default: "all-linear")
exclude_modules	Optional[str]	No	Modules to exclude from LoRA adaptation
enable_gradient_checkpointing	bool	No	Enable gradient checkpointing for memory savings (default: True)
enable_activation_offload	bool	No	Offload activations to CPU during checkpointing (default: False)
use_remove_padding	bool	No	Remove padding for efficient computation (default: True)
use_fused_kernels	bool	No	Use fused CUDA kernels for optimization (default: False)
trust_remote_code	bool	No	Trust remote code when loading models (default: False)
override_config	dict	No	Dictionary of model config overrides (e.g., attn_implementation)
lora_adapter_path	Optional[str]	No	Path to pre-trained LoRA adapter for continued training
custom_chat_template	Optional[str]	No	Custom chat template string for the tokenizer
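
To make the relationship between lora_rank and lora_alpha concrete, the sketch below computes the scaling factor alpha / r used in the conventional LoRA formulation. The helper lora_scaling is hypothetical and not part of verl. Whether verl applies exactly this scaling (rather than a variant such as rank-stabilized LoRA) is an assumption not confirmed by this page.

```python
from typing import Optional


def lora_scaling(lora_rank: int, lora_alpha: int) -> Optional[float]:
    """Return the conventional LoRA scaling alpha / r, or None when LoRA is off."""
    if lora_rank <= 0:
        # lora_rank == 0 disables LoRA, so there is no adapter update to scale.
        return None
    return lora_alpha / lora_rank


print(lora_scaling(0, 16))    # defaults on this page: LoRA disabled -> None
print(lora_scaling(64, 128))  # values from the LoRA example on this page -> 2.0
```

Under this convention, keeping lora_alpha at twice lora_rank holds the effective adapter scale at 2.0 as the rank changes.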

Outputs (after __post_init__)

Name	Type	Description
tokenizer	Any	Loaded HuggingFace tokenizer instance
processor	Any	Loaded HuggingFace processor instance (for multimodal models)
hf_config	PretrainedConfig	HuggingFace model configuration loaded via AutoConfig, with override_config applied
generation_config	Any	Generation configuration from the model
local_path	str	Resolved local path to model weights
architectures	list[str]	Model architecture names extracted from config
share_embeddings_and_output_weights	bool	Whether input/output embeddings are tied

Usage Examples

# Configuration (YAML) - Full fine-tuning
# actor_rollout_ref:
#   model:
#     path: Qwen/Qwen2.5-7B
#     enable_gradient_checkpointing: True
#     use_remove_padding: True
#     trust_remote_code: False
#     override_config:
#       attn_implementation: flash_attention_2

# Configuration (YAML) - LoRA fine-tuning
# actor_rollout_ref:
#   model:
#     path: Qwen/Qwen2.5-32B
#     lora_rank: 64
#     lora_alpha: 128
#     target_modules: all-linear
#     enable_gradient_checkpointing: True
#     use_fused_kernels: True

# Programmatic usage
from verl.workers.config.model import HFModelConfig

config = HFModelConfig(
    path="Qwen/Qwen2.5-7B",
    enable_gradient_checkpointing=True,
    use_remove_padding=True,
    use_fused_kernels=True,
    trust_remote_code=False,
    override_config={"attn_implementation": "flash_attention_2"},
)

# __post_init__ runs during construction, so these derived fields are already populated:
print(config.tokenizer)           # HuggingFace tokenizer
print(config.hf_config)           # AutoConfig with overrides applied
print(config.architectures)       # e.g., ["Qwen2ForCausalLM"]
print(config.local_path)          # Resolved local path

# LoRA configuration for parameter-efficient training
lora_config = HFModelConfig(
    path="Qwen/Qwen2.5-32B",
    lora_rank=64,
    lora_alpha=128,
    target_modules="all-linear",
    exclude_modules="lm_head",
)

Related Pages

Implements Principle

Principle:Volcengine_Verl_Model_Configuration
