Principle: OpenRLHF Causal LM Actor Loading
| Knowledge Sources | Details |
|---|---|
| Domains | NLP, Model_Loading |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A pattern for loading pretrained causal language models with optional parameter-efficient fine-tuning (LoRA) and quantization (4-bit) for RLHF training.
Description
Causal LM Actor Loading wraps the process of loading a pretrained autoregressive language model and optionally applying LoRA adapters and 4-bit quantization. The "Actor" abstraction represents a policy model that generates text based on a learned distribution. It computes log-probabilities of actions (tokens) for policy gradient methods.
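The log-probability computation at the heart of the Actor abstraction can be sketched with plain numpy (an illustration of the idea only; OpenRLHF operates on torch tensors, and the function name and shapes here are assumptions):

```python
import numpy as np

def action_log_probs(logits: np.ndarray, actions: np.ndarray) -> np.ndarray:
    """Log-probability of each sampled token (action) under the policy.

    logits:  (batch, seq_len, vocab) raw model outputs
    actions: (batch, seq_len) token ids chosen at each position
    """
    # Numerically stable log-softmax over the vocab dimension
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Gather the log-prob of the action actually taken at each position
    return np.take_along_axis(log_probs, actions[..., None], axis=-1).squeeze(-1)
```

These per-token log-probabilities are exactly what policy gradient objectives (e.g. PPO ratios) consume.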
The loading process handles: (1) model instantiation from HuggingFace checkpoints, (2) LoRA adapter injection, (3) NF4 quantization for memory efficiency, (4) DeepSpeed ZeRO-3 integration, and (5) MoE model configuration.
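Steps (1)-(3) can be sketched as follows, assuming the HuggingFace transformers/peft/bitsandbytes stack; the helper `build_actor_kwargs` is invented for illustration and is not OpenRLHF's actual API:

```python
def build_actor_kwargs(load_in_4bit: bool = False, bf16: bool = True) -> dict:
    """Assemble from_pretrained() keyword arguments for the policy model
    (illustrative sketch, not OpenRLHF code)."""
    kwargs = {
        "torch_dtype": "bfloat16" if bf16 else "auto",
        "trust_remote_code": True,
    }
    if load_in_4bit:
        # NF4 with double quantization, as in QLoRA
        kwargs["quantization_config"] = {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",
            "bnb_4bit_use_double_quant": True,
            "bnb_4bit_compute_dtype": "bfloat16",
        }
    return kwargs

# These kwargs would then feed the usual HF entry point, e.g.:
#   model = AutoModelForCausalLM.from_pretrained(path, **build_actor_kwargs(True))
# and LoRA injection (step 2) would follow via peft:
#   model = get_peft_model(model, LoraConfig(r=lora_rank, ...))
```

DeepSpeed ZeRO-3 and MoE configuration (steps 4 and 5) hook into the same loading path but are driven by the training launcher's config rather than per-call kwargs.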
Usage
Use this principle when loading a policy model for SFT training, DPO training, knowledge distillation (as student or teacher), or as a reference model in preference optimization. For reward/critic models, use Sequence Regression Model Loading instead.
Theoretical Basis
LoRA (Low-Rank Adaptation): Injects trainable rank decomposition matrices into transformer layers: h = W0·x + ΔW·x = W0·x + B·A·x, where B ∈ R^(d×r), A ∈ R^(r×k), and r ≪ min(d, k). Only A and B are trained; the pretrained weight W0 stays frozen.
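The LoRA update h = W0·x + B·A·x can be sketched in numpy as a toy illustration (not OpenRLHF code):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2                       # layer dims and LoRA rank, r << min(d, k)

W0 = rng.standard_normal((d, k))        # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                    # trainable, zero init => delta starts at 0

x = rng.standard_normal(k)
h = W0 @ x + B @ (A @ x)                # h = W0 x + B A x

# With B initialized to zero, the adapted layer reproduces the base model exactly,
# so training starts from the pretrained behavior.
assert np.allclose(h, W0 @ x)
```

Only the d·r + r·k adapter parameters receive gradients, versus d·k for full fine-tuning.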
QLoRA (4-bit): Combines NF4 quantization of the base model with LoRA adapters, reducing memory by ~4x while maintaining full-precision gradient flow through the adapters.
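The ~4x figure follows from the weight bit-width alone; a back-of-the-envelope check for a hypothetical 7B-parameter model (double-quantization constants add a small overhead not counted here):

```python
params = 7e9
fp16_gb = params * 2 / 2**30    # 16-bit weights: 2 bytes per parameter
nf4_gb = params * 0.5 / 2**30   # NF4 weights: 4 bits = 0.5 bytes per parameter
ratio = fp16_gb / nf4_gb        # memory reduction factor for the frozen base model
```

The LoRA adapters themselves stay in 16-bit, so their gradients and optimizer states are unaffected by the base model's quantization.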