Implementation:LLMBook zh LLMBook zh github io AutoModelForCausalLM From Pretrained DPO
| Knowledge Sources | |
|---|---|
| Domains | NLP, Alignment |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A concrete tool, provided by HuggingFace Transformers, for loading the trainable policy model and the frozen reference model used in DPO training.
Description
For DPO, AutoModelForCausalLM.from_pretrained is called twice with the same checkpoint: once for the trainable policy model and once for the reference model. The reference model is then put in eval mode and requires_grad is set to False on every parameter, so it receives no gradient updates during training.
This is a Wrapper Doc that describes how the LLMBook repository uses AutoModelForCausalLM in the DPO context.
Usage
Load both models before creating the DPOTrainer.
Code Reference
Source Location
- Repository: LLMBook-zh
- File: code/8.2 DPO实践.py
- Lines: 58-64
Signature
# Policy model (trainable)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path: str)
# Reference model (frozen)
model_ref = AutoModelForCausalLM.from_pretrained(model_name_or_path: str)
model_ref.eval()
for param in model_ref.parameters():
    param.requires_grad = False
Import
from transformers import AutoModelForCausalLM
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name_or_path | str | Yes | HuggingFace model ID (e.g., "yulan-team/YuLan-Chat-12B-v3") or path to a local checkpoint directory |
Outputs
| Name | Type | Description |
|---|---|---|
| model | PreTrainedModel | Trainable policy model |
| model_ref | PreTrainedModel | Frozen reference model (eval mode, no grad) |
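The output contract above (a trainable policy model and a reference model in eval mode with no gradients) can be checked before training starts. The sketch below is a minimal illustration, not code from the LLMBook repository: `freeze_reference_model` and `count_trainable_parameters` are hypothetical helper names, and they rely only on the `eval()`, `parameters()`, `requires_grad`, and `numel()` members that every `torch.nn.Module` (and therefore every `PreTrainedModel`) exposes.

```python
# Hypothetical helpers (not part of Transformers or the LLMBook repository).
# They work on any torch.nn.Module, including models returned by
# AutoModelForCausalLM.from_pretrained.

def freeze_reference_model(model):
    """Put the model in eval mode and disable gradients on every parameter."""
    model.eval()  # switches off training-time behaviour such as dropout
    for param in model.parameters():
        param.requires_grad = False
    return model


def count_trainable_parameters(model):
    """Return the number of scalar parameters that still require gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```

Applied to the two models on this page, `count_trainable_parameters(model_ref)` should be 0 after `freeze_reference_model(model_ref)`, while `count_trainable_parameters(model)` should remain greater than 0 for the policy model.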
Usage Examples
from transformers import AutoModelForCausalLM
model_name = "yulan-team/YuLan-Chat-12B-v3"
# Load policy model
model = AutoModelForCausalLM.from_pretrained(model_name)
# Load frozen reference model
model_ref = AutoModelForCausalLM.from_pretrained(model_name)
model_ref.eval()
for param in model_ref.parameters():
    param.requires_grad = False
Related Pages
Implements Principle
Requires Environment
- Environment:LLMBook_zh_LLMBook_zh_github_io_PyTorch_CUDA_GPU_Environment
- Environment:LLMBook_zh_LLMBook_zh_github_io_HuggingFace_Transformers_Stack
Uses Heuristic