Implementation:PacktPublishing LLM Engineers Handbook FastLanguageModel Get Peft Model

From Leeroopedia


Field Value
Implementation Name FastLanguageModel Get Peft Model
Type Wrapper Doc (Unsloth wraps PEFT)
Source File llm_engineering/model/finetuning/finetune.py:L45-51
Workflow LLM_Finetuning
Repo PacktPublishing/LLM-Engineers-Handbook
Implements Principle:PacktPublishing_LLM_Engineers_Handbook_LoRA_Adapter_Injection

Function Signature

FastLanguageModel.get_peft_model(
    model,
    r: int,
    lora_alpha: int,
    lora_dropout: float,
    target_modules: List[str],
) -> model

Import

from unsloth import FastLanguageModel

Description

FastLanguageModel.get_peft_model() injects LoRA (Low-Rank Adaptation) adapter layers into the specified modules of a pre-trained language model. This method wraps HuggingFace's PEFT library with Unsloth-specific optimizations, producing a model where only the LoRA adapter weights are trainable while the original weights remain frozen.

After this call, the model is ready for parameter-efficient fine-tuning with a dramatically reduced number of trainable parameters.
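The mechanics can be sketched in plain Python (a minimal illustration of the LoRA math, not Unsloth's actual implementation): each targeted linear layer gains a rank-r bypass, y = Wx + (lora_alpha / r) * B(Ax). Because B is zero-initialized in standard LoRA, the adapted layer initially behaves exactly like the frozen base layer.

```python
# Minimal LoRA forward sketch (no dependencies). Shapes: W is (out, in),
# A is (r, in), B is (out, r). Only A and B would be trainable.

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, r, lora_alpha):
    """Frozen base projection plus scaled low-rank update."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))   # B @ (A @ x): the cheap rank-r path
    scale = lora_alpha / r             # 1.0 when lora_alpha == r
    return [b + scale * u for b, u in zip(base, update)]

# Tiny example: 3-dim input/output, rank r=2.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # frozen base weight
A = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]                   # trainable, (r, in)
B = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]                 # trainable, zero-init
x = [1.0, 2.0, 3.0]

y = lora_forward(W, A, B, x, r=2, lora_alpha=2)
# With B zero-initialized, the output equals the base projection W @ x.
```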

Parameters

model (object): The pre-trained model returned by FastLanguageModel.from_pretrained().
r (int, repo value: 32): LoRA rank. Controls the dimensionality of the low-rank decomposition matrices; higher values increase expressiveness at the cost of more trainable parameters.
lora_alpha (int, repo value: 32): LoRA scaling factor. The effective scaling applied to the adapter output is lora_alpha / r; with r=32 and lora_alpha=32, the scaling factor is 1.0.
lora_dropout (float, repo value: 0): Dropout probability applied to LoRA adapter activations during training; 0 disables dropout regularization.
target_modules (List[str], see below): Names of the modules to inject LoRA adapters into.

Target Modules

The repository injects LoRA adapters into all major projection layers:

Module Layer Type Description
q_proj Attention Query projection
k_proj Attention Key projection
v_proj Attention Value projection
o_proj Attention Output projection
up_proj MLP Feed-forward up-projection
down_proj MLP Feed-forward down-projection
gate_proj MLP Gated feed-forward projection
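How a list of bare names like q_proj selects concrete layers can be illustrated with a small sketch: PEFT matches modules whose final path component equals an entry in target_modules. The dotted module paths below imitate a Llama-style transformer block and are illustrative, not read from a real checkpoint.

```python
# Hypothetical module paths for one decoder layer (illustrative only).
TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "up_proj", "down_proj", "gate_proj"]

named_modules = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.self_attn.rotary_emb",  # not targeted
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.up_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",       # not targeted
]

def matches_target(name, targets):
    # Match on the last dotted path component, e.g. "q_proj".
    return name.rsplit(".", 1)[-1] in targets

injected = [n for n in named_modules if matches_target(n, TARGET_MODULES)]
# All 7 projection layers in the block are selected; norm and rotary
# modules are left untouched.
```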

Returns

The same model object, now modified in-place with LoRA adapter layers injected. Only the adapter parameters are marked as trainable.
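The resulting trainability split can be mimicked with plain-Python stand-ins (parameter names below follow PEFT's lora_A/lora_B naming convention but are hypothetical; counts are for a single 4096x4096 projection at r=32):

```python
# Stand-in parameter registry: name -> element count. After injection,
# only parameters whose names contain "lora_" are trainable.
params = {
    "model.layers.0.self_attn.q_proj.weight": 4096 * 4096,          # frozen base
    "model.layers.0.self_attn.q_proj.lora_A.weight": 32 * 4096,     # trainable
    "model.layers.0.self_attn.q_proj.lora_B.weight": 4096 * 32,     # trainable
}

trainable = {n: c for n, c in params.items() if "lora_" in n}
frozen = {n: c for n, c in params.items() if "lora_" not in n}

trainable_count = sum(trainable.values())  # 262,144 for this one module
frozen_count = sum(frozen.values())        # 16,777,216
fraction = trainable_count / (trainable_count + frozen_count)  # ~1.5%
```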

Key Code in Repository

# From llm_engineering/model/finetuning/finetune.py

model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj",
        "up_proj", "down_proj",
        "o_proj", "gate_proj",
    ],
)

Configuration Analysis

  • r=32: A moderate rank that balances parameter count and expressiveness. For a 7B model, this typically results in ~50-100M trainable parameters out of 7B total (~1-1.5%).
  • lora_alpha=32: Equal to r, giving an effective scaling of lora_alpha / r = 1.0. The adapter update is therefore added to the frozen weights at its learned magnitude, with no extra amplification or attenuation.
  • lora_dropout=0: No dropout regularization, which assumes the fine-tuning data is sufficient to avoid overfitting; it also keeps Unsloth on its fastest code path, which is optimized for zero dropout.
  • All 7 target modules: Comprehensive injection into both attention and MLP layers for maximum adaptation capability.
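The ~50-100M figure above can be checked with back-of-envelope arithmetic, assuming a Llama-2-7B-like geometry (hidden size 4096, intermediate size 11008, 32 layers; grouped-query attention ignored for simplicity). Each LoRA adapter on a (d_in -> d_out) projection adds r * (d_in + d_out) parameters (A is r x d_in, B is d_out x r).

```python
r = 32
hidden, intermediate, layers = 4096, 11008, 32

# (d_in, d_out) per targeted projection in one decoder layer.
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, hidden),
    "v_proj": (hidden, hidden),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * layers            # ~80M trainable parameters
pct = total / 7_000_000_000 * 100     # ~1.1% of a 7B model
```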

External Dependencies

Package Purpose
unsloth Optimized LoRA injection with fused kernels
peft Underlying PEFT/LoRA implementation (wrapped by Unsloth)
