Principle: OpenRLHF Causal LM Actor Loading
| Knowledge Sources | Details |
|---|---|
| Domains | NLP, Model_Loading |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A pattern for loading pretrained causal language models with optional parameter-efficient fine-tuning (LoRA) and quantization (4-bit) for RLHF training.
Description
Causal LM Actor Loading wraps the process of loading a pretrained autoregressive language model and optionally applying LoRA adapters and 4-bit quantization. The "Actor" abstraction represents a policy model that generates text based on a learned distribution. It computes log-probabilities of actions (tokens) for policy gradient methods.
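The log-probability computation at the heart of the Actor abstraction can be sketched with plain numpy (an illustration of the idea only; OpenRLHF operates on torch tensors, and the function name and shapes here are assumptions):

```python
import numpy as np

def action_log_probs(logits: np.ndarray, actions: np.ndarray) -> np.ndarray:
    """Log-probability of each sampled token (action) under the policy.

    logits:  (batch, seq_len, vocab) raw model outputs
    actions: (batch, seq_len) token ids chosen at each position
    """
    # Numerically stable log-softmax over the vocab dimension
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Gather the log-prob of the action actually taken at each position
    return np.take_along_axis(log_probs, actions[..., None], axis=-1).squeeze(-1)
```

These per-token log-probabilities are exactly what policy gradient objectives (e.g. PPO ratios) consume.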
The loading process handles: (1) model instantiation from HuggingFace checkpoints, (2) LoRA adapter injection, (3) NF4 quantization for memory efficiency, (4) DeepSpeed ZeRO-3 integration, and (5) MoE model configuration.
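Steps (1)-(3) can be sketched as follows, assuming the HuggingFace transformers/peft/bitsandbytes stack; the helper `build_actor_kwargs` is invented for illustration and is not OpenRLHF's actual API:

```python
def build_actor_kwargs(load_in_4bit: bool = False, bf16: bool = True) -> dict:
    """Assemble from_pretrained() keyword arguments for the policy model
    (illustrative sketch, not OpenRLHF code)."""
    kwargs = {
        "torch_dtype": "bfloat16" if bf16 else "auto",
        "trust_remote_code": True,
    }
    if load_in_4bit:
        # NF4 with double quantization, as in QLoRA
        kwargs["quantization_config"] = {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",
            "bnb_4bit_use_double_quant": True,
            "bnb_4bit_compute_dtype": "bfloat16",
        }
    return kwargs

# These kwargs would then feed the usual HF entry point, e.g.:
#   model = AutoModelForCausalLM.from_pretrained(path, **build_actor_kwargs(True))
# and LoRA injection (step 2) would follow via peft:
#   model = get_peft_model(model, LoraConfig(r=lora_rank, ...))
```

DeepSpeed ZeRO-3 and MoE configuration (steps 4 and 5) hook into the same loading path but are driven by the training launcher's config rather than per-call kwargs.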
Usage
Use this principle when loading a policy model for SFT training, DPO training, knowledge distillation (as student or teacher), or as a reference model in preference optimization. For reward/critic models, use Sequence Regression Model Loading instead.
Theoretical Basis
LoRA (Low-Rank Adaptation): Injects trainable rank decomposition matrices into transformer layers: h = W0·x + ΔW·x = W0·x + B·A·x, where B ∈ R^(d×r), A ∈ R^(r×k), and r ≪ min(d, k). Only A and B are trained; the pretrained weight W0 stays frozen.
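The LoRA update h = W0·x + B·A·x can be sketched in numpy as a toy illustration (not OpenRLHF code):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2                       # layer dims and LoRA rank, r << min(d, k)

W0 = rng.standard_normal((d, k))        # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01  # trainable, small random init
B = np.zeros((d, r))                    # trainable, zero init => delta starts at 0

x = rng.standard_normal(k)
h = W0 @ x + B @ (A @ x)                # h = W0 x + B A x

# With B initialized to zero, the adapted layer reproduces the base model exactly,
# so training starts from the pretrained behavior.
assert np.allclose(h, W0 @ x)
```

Only the d·r + r·k adapter parameters receive gradients, versus d·k for full fine-tuning.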
QLoRA (4-bit): Combines NF4 quantization of the base model with LoRA adapters, reducing memory by ~4x while maintaining full-precision gradient flow through the adapters.
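The ~4x figure follows from the weight bit-width alone; a back-of-the-envelope check for a hypothetical 7B-parameter model (double-quantization constants add a small overhead not counted here):

```python
params = 7e9
fp16_gb = params * 2 / 2**30    # 16-bit weights: 2 bytes per parameter
nf4_gb = params * 0.5 / 2**30   # NF4 weights: 4 bits = 0.5 bytes per parameter
ratio = fp16_gb / nf4_gb        # memory reduction factor for the frozen base model
```

The LoRA adapters themselves stay in 16-bit, so their gradients and optimizer states are unaffected by the base model's quantization.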