Implementation:PacktPublishing LLM Engineers Handbook FastLanguageModel Get Peft Model
| Field | Value |
|---|---|
| Implementation Name | FastLanguageModel Get Peft Model |
| Type | Wrapper Doc (Unsloth wraps PEFT) |
| Source File | llm_engineering/model/finetuning/finetune.py:L45-51 |
| Workflow | LLM_Finetuning |
| Repo | PacktPublishing/LLM-Engineers-Handbook |
| Implements | Principle:PacktPublishing_LLM_Engineers_Handbook_LoRA_Adapter_Injection |
Function Signature
```python
FastLanguageModel.get_peft_model(
    model,
    r: int,
    lora_alpha: int,
    lora_dropout: float,
    target_modules: List[str],
) -> model
```
Import
```python
from unsloth import FastLanguageModel
```
Description
FastLanguageModel.get_peft_model() injects LoRA (Low-Rank Adaptation) adapter layers into the specified modules of a pre-trained language model. This method wraps HuggingFace's PEFT library with Unsloth-specific optimizations, producing a model where only the LoRA adapter weights are trainable while the original weights remain frozen.
After this call, the model is ready for parameter-efficient fine-tuning with a dramatically reduced number of trainable parameters.
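The mechanics can be illustrated with a minimal, framework-free sketch (plain Python; the LoRALinear class here is hypothetical and illustrative, not Unsloth's or PEFT's implementation): the frozen weight W is left untouched, two small matrices A (r × d_in) and B (d_out × r) are the only trainable parameters, and the adapter's contribution is scaled by lora_alpha / r.

```python
# Minimal LoRA sketch in plain Python (illustrative only, not Unsloth's code).
# Forward pass: y = W x + (lora_alpha / r) * B (A x), W frozen, A and B trainable.

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

class LoRALinear:
    def __init__(self, W, r, lora_alpha):
        d_out, d_in = len(W), len(W[0])
        self.W = W                       # frozen pre-trained weight
        self.scale = lora_alpha / r      # effective scaling factor
        # B starts at zero so the adapter is initially a no-op; training
        # then moves A and B away from this starting point.
        self.A = [[0.01] * d_in for _ in range(r)]   # trainable, r x d_in
        self.B = [[0.0] * r for _ in range(d_out)]   # trainable, d_out x r

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

layer = LoRALinear(W=[[1.0, 0.0], [0.0, 1.0]], r=2, lora_alpha=2)
print(layer.forward([3.0, 4.0]))  # B is zero, so output equals W x: [3.0, 4.0]
```

Because B is zero-initialized, the wrapped layer behaves exactly like the original at the start of training, which is the standard LoRA initialization scheme.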
Parameters
| Parameter | Type | Value in Repo | Description |
|---|---|---|---|
| model | Model object | — | The pre-trained model returned by FastLanguageModel.from_pretrained(). |
| r | int | 32 | LoRA rank. Controls the dimensionality of the low-rank decomposition matrices. Higher values increase expressiveness at the cost of more parameters. |
| lora_alpha | int | 32 | LoRA scaling factor. The effective scaling applied to adapter output is lora_alpha / r. With r=32 and lora_alpha=32, the scaling factor is 1.0. |
| lora_dropout | float | 0 | Dropout probability applied to LoRA adapter outputs during training. 0 means no dropout regularization. |
| target_modules | List[str] | See below | List of module names to inject LoRA adapters into. |
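The effective scaling noted in the table is simple arithmetic; a quick check with the repository's values (the 64 in the second line is a hypothetical alternative, not a value from the repo):

```python
# LoRA scaling applied to the adapter output: lora_alpha / r.
r, lora_alpha = 32, 32
print(lora_alpha / r)  # 1.0 — the adapter update is added at full strength

# Raising lora_alpha while keeping r fixed amplifies the adapter's effect:
print(64 / r)          # 2.0
```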
Target Modules
The repository injects LoRA adapters into all major projection layers:
| Module | Layer Type | Description |
|---|---|---|
| q_proj | Attention | Query projection |
| k_proj | Attention | Key projection |
| v_proj | Attention | Value projection |
| o_proj | Attention | Output projection |
| up_proj | MLP | Feed-forward up-projection |
| down_proj | MLP | Feed-forward down-projection |
| gate_proj | MLP | Gated feed-forward projection |
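For comparison, the same configuration expressed directly against HuggingFace PEFT (the library Unsloth wraps) would look roughly like the fragment below. This is a hedged sketch of PEFT's LoraConfig API, not code from the repository; the commented-out get_peft_model call assumes a model is already loaded.

```python
# Hypothetical plain-PEFT equivalent of the repository's Unsloth call.
# Shown as a configuration fragment only; requires the peft package.
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    task_type="CAUSAL_LM",
)
# model = get_peft_model(model, config)  # same adapter-injection step
```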
Returns
The same model object, now modified in-place with LoRA adapter layers injected. Only the adapter parameters are marked as trainable.
Key Code in Repository
```python
# From llm_engineering/model/finetuning/finetune.py
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj",
        "up_proj", "down_proj",
        "o_proj", "gate_proj",
    ],
)
```
Configuration Analysis
- r=32: A moderate rank that balances parameter count and expressiveness. For a 7B model, this typically results in ~50-100M trainable parameters out of 7B total (~1-1.5%).
- lora_alpha=32: Equal to r, giving an effective scaling of 1.0. The adapter update is added at full strength, without additional amplification.
- lora_dropout=0: No dropout, suggesting the training data is considered sufficient to avoid overfitting.
- All 7 target modules: Comprehensive injection into both attention and MLP layers for maximum adaptation capability.
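The "~50-100M trainable parameters" estimate above can be checked with back-of-the-envelope arithmetic. The dimensions below (hidden size 4096, intermediate size 11008, 32 layers, no grouped-query attention) are assumptions matching Llama-2-7B, not values taken from the repository:

```python
# Each LoRA adapter on a d_in -> d_out projection adds r * (d_in + d_out)
# trainable parameters (matrix A: r x d_in, matrix B: d_out x r).
r = 32
hidden, intermediate, layers = 4096, 11008, 32  # assumed Llama-2-7B shapes

per_layer = (
    4 * r * (hidden + hidden)           # q_proj, k_proj, v_proj, o_proj
    + 2 * r * (hidden + intermediate)   # up_proj, gate_proj
    + 1 * r * (intermediate + hidden)   # down_proj
)
total = per_layer * layers
print(f"{total:,} trainable LoRA parameters")  # 79,953,920 (~80M)
print(f"{100 * total / 7e9:.2f}% of 7B")       # ~1.14%
```

The result (~80M parameters, ~1.1% of the base model) falls inside the range quoted above; models using grouped-query attention would have smaller k_proj/v_proj adapters and a lower total.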
External Dependencies
| Package | Purpose |
|---|---|
| unsloth | Optimized LoRA injection with fused kernels |
| peft | Underlying PEFT/LoRA implementation (wrapped by Unsloth) |