Implementation:Huggingface Alignment handbook Get Peft Config

From Leeroopedia
Revision as of 15:10, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Huggingface_Alignment_handbook_Get_Peft_Config.md)


Knowledge Sources
Domains: NLP, Deep_Learning, Optimization
Last Updated: 2026-02-07 00:00 GMT

Overview

Concrete tool for creating LoRA adapter configurations from ModelConfig flags, provided by the TRL library and used by all alignment-handbook trainers.

Description

get_peft_config is a TRL utility function that converts ModelConfig PEFT-related fields (use_peft, lora_r, lora_alpha, lora_dropout, lora_target_modules) into a LoraConfig object. When use_peft is False, it returns None, allowing the same training script to support both full fine-tuning and LoRA modes.

The returned config is passed as the peft_config argument to SFTTrainer, DPOTrainer, or ORPOTrainer. The trainer then injects the LoRA adapters into the model itself, so the training script needs no adapter-specific code.
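The mapping described above can be sketched without installing TRL or PEFT. The dataclasses `ModelArgsSketch` and `LoraConfigSketch` and the function `get_peft_config_sketch` are hypothetical stand-ins, not the library's classes or source code. They mirror only the five fields named in this section, whereas the real ModelConfig and LoraConfig carry more options.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelArgsSketch:
    """Hypothetical stand-in for the PEFT-related fields of trl.ModelConfig."""
    use_peft: bool
    lora_r: int
    lora_alpha: int
    lora_dropout: float
    lora_target_modules: Optional[List[str]] = None


@dataclass
class LoraConfigSketch:
    """Hypothetical stand-in for a subset of peft.LoraConfig fields."""
    r: int
    lora_alpha: int
    lora_dropout: float
    target_modules: Optional[List[str]]


def get_peft_config_sketch(model_args: ModelArgsSketch) -> Optional[LoraConfigSketch]:
    # Full fine-tuning: no adapter config is produced.
    if not model_args.use_peft:
        return None
    # LoRA mode: copy the flags into an adapter config object.
    return LoraConfigSketch(
        r=model_args.lora_r,
        lora_alpha=model_args.lora_alpha,
        lora_dropout=model_args.lora_dropout,
        target_modules=model_args.lora_target_modules,
    )


full_ft = get_peft_config_sketch(ModelArgsSketch(False, 16, 16, 0.05))
lora = get_peft_config_sketch(
    ModelArgsSketch(True, 16, 16, 0.05, ["q_proj", "v_proj"])
)
print(full_ft)
print(lora)
```

Because the None case is handled inside the function, the calling script can pass the result straight to a trainer without branching on use_peft.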

Usage

This function is called in every alignment-handbook training script. It returns None for full fine-tuning configs and a LoraConfig for any config that sets use_peft: true, which in the handbook means the QLoRA recipes.

Code Reference

Source Location

  • Repository: alignment-handbook
  • File: scripts/sft.py (line 111), scripts/dpo.py (line 129), scripts/orpo.py (line 128)
  • Definition: External TRL library (trl.trainer.utils)

Signature

def get_peft_config(model_args: ModelConfig) -> Optional[PeftConfig]:
    """Create a PEFT config from ModelConfig fields.

    Args:
        model_args (ModelConfig): Model configuration containing:
            - use_peft (bool): Whether to use PEFT/LoRA
            - lora_r (int): LoRA rank
            - lora_alpha (int): LoRA alpha scaling
            - lora_dropout (float): LoRA dropout rate
            - lora_target_modules (list[str]): Modules to apply LoRA to

    Returns:
        Optional[PeftConfig]: LoraConfig if use_peft=True, None otherwise.
    """

Import

from trl import get_peft_config, ModelConfig

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| model_args | ModelConfig | Yes | Model configuration from TRL |
| model_args.use_peft | bool | Yes | Whether to enable PEFT/LoRA (False returns None) |
| model_args.lora_r | int | Conditional | LoRA rank (required if use_peft=True, e.g., 16 or 128) |
| model_args.lora_alpha | int | Conditional | LoRA alpha scaling factor (required if use_peft=True) |
| model_args.lora_dropout | float | No | LoRA dropout rate (TRL's ModelConfig defaults to 0.05, which is also the value the handbook configs set) |
| model_args.lora_target_modules | list[str] | No | Target modules for LoRA (e.g., [q_proj, k_proj, v_proj, ...]) |
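The conditional requirements in the inputs table can be checked before a run starts. The helper below, `check_lora_fields`, is a hypothetical pre-flight check and not part of TRL or the alignment-handbook. The value ranges it enforces (positive rank and alpha, dropout in [0, 1), non-empty module list) are assumptions of this sketch rather than documented library validation.

```python
from typing import Any, Dict, List


def check_lora_fields(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in LoRA-related config values."""
    problems: List[str] = []

    # When use_peft is false no LoraConfig is built, so nothing is checked.
    if not cfg.get("use_peft", False):
        return problems

    # Rank and alpha must be present and positive integers (bool is rejected).
    for key in ("lora_r", "lora_alpha"):
        value = cfg.get(key)
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            problems.append(f"{key} should be a positive integer, got {value!r}")

    # Dropout is optional; if given it must be a probability below 1.
    dropout = cfg.get("lora_dropout")
    if dropout is not None:
        if not isinstance(dropout, (int, float)) or not 0.0 <= dropout < 1.0:
            problems.append(f"lora_dropout should be in [0, 1), got {dropout!r}")

    # Target modules are optional; if given they must be a non-empty list of names.
    modules = cfg.get("lora_target_modules")
    if modules is not None:
        if not modules or not all(isinstance(m, str) for m in modules):
            problems.append("lora_target_modules should be a non-empty list of strings")

    return problems


good_cfg = {
    "use_peft": True,
    "lora_r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "lora_target_modules": ["q_proj", "k_proj", "v_proj"],
}
bad_cfg = {"use_peft": True, "lora_r": 0, "lora_alpha": 16}

print(check_lora_fields(good_cfg))
print(check_lora_fields(bad_cfg))
```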

Outputs

| Name | Type | Description |
|---|---|---|
| return | Optional[PeftConfig] | LoraConfig when use_peft=True, None when use_peft=False. Passed to the trainer's peft_config parameter. |

Usage Examples

In SFTTrainer Initialization

from trl import SFTTrainer, get_peft_config

# model, training_args, dataset, tokenizer, and model_args are assumed to be
# created earlier in the training script.
# get_peft_config returns a LoraConfig or None based on model_args.use_peft
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    processing_class=tokenizer,
    peft_config=get_peft_config(model_args),  # None for full fine-tuning, LoraConfig when use_peft=True
)

QLoRA YAML Config Values

# SFT QLoRA config (lower rank)
use_peft: true
lora_r: 16
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj

---
# DPO QLoRA config (higher rank for preference learning)
use_peft: true
lora_r: 128
lora_alpha: 128
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - gate_proj
  - up_proj
  - down_proj
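To make the difference between the two ranks concrete, the sketch below computes two quantities from the standard LoRA formulation: the scaling factor lora_alpha / lora_r applied to the adapter update, and the number of trainable parameters a rank-r adapter adds to one linear layer, r * (d_in + d_out). The 4096 x 4096 layer shape is an illustrative assumption and is not taken from the handbook. The scaling formula assumes rank-stabilized LoRA is not enabled.

```python
def lora_scaling(lora_alpha: int, lora_r: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / lora_r


def lora_params_per_layer(lora_r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters of one adapter: A is (r x d_in), B is (d_out x r)."""
    return lora_r * (d_in + d_out)


# Illustrative layer shape (assumption, not from the handbook).
D_IN = D_OUT = 4096

# (name, lora_r, lora_alpha) taken from the two YAML configs above.
recipes = (("SFT", 16, 16), ("DPO", 128, 128))

summary = {
    name: (lora_scaling(alpha, r), lora_params_per_layer(r, D_IN, D_OUT))
    for name, r, alpha in recipes
}

for name, (scale, n_params) in summary.items():
    print(f"{name}: scaling={scale}, params per adapted layer={n_params}")
```

Both recipes keep the scaling at 1.0, so the DPO config's eightfold larger rank changes adapter capacity (eight times as many trainable parameters per adapted layer) rather than the scale of the update.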
