Principle:Volcengine Verl Model Configuration
| Knowledge Sources | |
|---|---|
| Domains | Model_Architecture, Configuration, NLP |
| Last Updated | 2026-02-07 14:00 GMT |
Overview
A configuration schema that specifies model loading parameters, optional LoRA adapters, and compute optimizations for initializing language models in RL and SFT training pipelines.
Description
Model Configuration in verl defines how a pre-trained language model is loaded and prepared for training. The configuration is expressed as a Hydra-compatible dataclass (HFModelConfig) that controls:
- Model identity: HuggingFace model path or local checkpoint directory
- LoRA parameters: Rank, alpha, target modules, and dropout for parameter-efficient training
- Compute optimizations: Padding removal, fused kernels, gradient checkpointing, Liger kernel integration
- Precision: dtype selection (bf16, fp16, fp32) for mixed-precision training
- Architecture tweaks: Enabling/disabling flash attention, sequence parallelism, vision encoder freezing
This configuration is used by both the RL training pipeline (actor, critic, reference models) and the SFT training pipeline.
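The groups of settings above can be pictured as a single dataclass. The following is a minimal sketch and not verl's actual `HFModelConfig`: the class name `ModelConfigSketch`, the defaults, and the `dtype`, `lora_dropout`, and `uses_lora` names are assumptions made for illustration. Only `lora_rank`, `lora_alpha`, `target_modules`, `use_remove_padding`, `enable_gradient_checkpointing`, and `model_path` are taken from this page.

```python
from dataclasses import dataclass
from typing import List, Optional

_ALLOWED_DTYPES = ("bf16", "fp16", "fp32")


@dataclass
class ModelConfigSketch:
    """Illustrative stand-in only; not verl's actual HFModelConfig."""

    # Model identity: HuggingFace model path or local checkpoint directory.
    model_path: str
    # Precision (field name and default are assumptions).
    dtype: str = "bf16"
    # LoRA parameters; a rank of 0 means full fine-tuning.
    lora_rank: int = 0
    lora_alpha: int = 16
    lora_dropout: float = 0.0
    target_modules: Optional[List[str]] = None
    # Compute optimizations.
    use_remove_padding: bool = False
    enable_gradient_checkpointing: bool = False

    def __post_init__(self) -> None:
        # Reject values that the rest of the pipeline could not interpret.
        if self.dtype not in _ALLOWED_DTYPES:
            raise ValueError(f"dtype must be one of {_ALLOWED_DTYPES}")
        if self.lora_rank < 0:
            raise ValueError("lora_rank must be >= 0")

    @property
    def uses_lora(self) -> bool:
        # LoRA is active only when a positive rank is requested.
        return self.lora_rank > 0


cfg = ModelConfigSketch(model_path="path/to/model", lora_rank=8)
print(cfg.uses_lora)
```

Validating in `__post_init__` surfaces a bad value when the configuration is built rather than partway through model loading.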
Usage
Use model configuration whenever initializing models for training. Key decision points:
- Set `lora_rank > 0` for parameter-efficient fine-tuning (saves memory)
- Enable `use_remove_padding=True` for variable-length sequences (significant speedup)
- Enable `enable_gradient_checkpointing=True` for large models (trades compute for memory)
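To see why padding removal matters for variable-length sequences, it helps to count how many token slots in a padded batch hold no real data. The sketch below is plain arithmetic under the assumption that every sequence is padded to the longest one in the batch. The helper `padding_stats` is hypothetical, and the code does not reflect how verl implements the optimization.

```python
from typing import List, Tuple


def padding_stats(lengths: List[int]) -> Tuple[int, int, float]:
    """Return (padded_tokens, real_tokens, wasted_fraction) for one batch.

    Assumes every sequence is padded to the longest sequence in the batch.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    padded_tokens = len(lengths) * max(lengths)
    real_tokens = sum(lengths)
    wasted_fraction = 1.0 - real_tokens / padded_tokens
    return padded_tokens, real_tokens, wasted_fraction


# Example batch with strongly varying sequence lengths.
padded, real, wasted = padding_stats([128, 512, 64, 2048])
print(padded, real, round(wasted, 3))  # 8192 2752 0.664
```

In this example roughly two thirds of the padded slots are padding, which is the work that unpadding avoids. When all sequences have similar lengths the wasted fraction approaches zero and the option matters much less.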
Theoretical Basis
Model configuration bridges pre-trained model selection with training requirements:
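LoRA configuration:
$$W' = W + \frac{\alpha}{r} B A$$
Where $W$ is the frozen pre-trained weight, $r$ is the rank, $\alpha$ is the scaling factor, and $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ are low-rank matrices.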
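The memory saving from LoRA can be made concrete with a parameter count. This sketch assumes the standard LoRA form, in which a frozen weight of shape d x k receives an update built from a d x r matrix and an r x k matrix, scaled by alpha / r. The function names are hypothetical, and the layer size in the example is chosen for illustration rather than taken from any specific model.

```python
def full_param_count(d: int, k: int) -> int:
    """Trainable parameters when the whole d x k weight is fine-tuned."""
    return d * k


def lora_param_count(d: int, k: int, rank: int) -> int:
    """Trainable parameters for LoRA: B is d x rank, A is rank x k."""
    return rank * (d + k)


def lora_scale(alpha: float, rank: int) -> float:
    """Factor applied to the low-rank update before it is added to W."""
    return alpha / rank


d, k, rank, alpha = 4096, 4096, 16, 32
full = full_param_count(d, k)
lora = lora_param_count(d, k, rank)
print(full, lora, lora / full)  # 16777216 131072 0.0078125
print(lora_scale(alpha, rank))  # 2.0
```

For this layer size, a rank of 16 trains under 1% of the parameters that full fine-tuning would update. Doubling the rank doubles the adapter size and, for a fixed alpha, halves the scale.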
Key configuration decisions:
# Pseudo-code for model initialization decisions
if lora_rank > 0:
    model = load_pretrained(model_path)
    model = apply_lora(model, rank=lora_rank, alpha=lora_alpha, targets=target_modules)
else:
    model = load_pretrained(model_path)  # Full fine-tuning

if enable_gradient_checkpointing:
    model.gradient_checkpointing_enable()

if use_remove_padding:
    enable_unpadding_optimization(model)