Implementation:Intel Ipex llm Get Peft Model QLoRA

Knowledge Sources	IPEX-LLM
Domains	NLP, Parameter_Efficient_Finetuning
Last Updated	2026-02-09 00:00 GMT

Overview

Concrete tools for preparing a 4-bit quantized model and injecting LoRA adapters in QLoRA mode, provided by IPEX-LLM.

Description

Three components work together. prepare_model_for_kbit_training freezes the base model's parameters and readies the quantized model for adapter training, optionally with gradient checkpointing. LoraConfig is a configuration class that defines the adapter settings, with training_mode="qlora". get_peft_model wraps the prepared model with LoRA adapters, which then hold the only trainable parameters. Import all three from ipex_llm.transformers.qlora rather than from peft to ensure IPEX-LLM compatibility.
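
As a rough illustration of what wrapping a layer with a LoRA adapter means, the sketch below uses plain Python lists rather than IPEX-LLM or PyTorch. It assumes the standard LoRA formulation, in which a frozen weight W is augmented by a low-rank product B·A scaled by lora_alpha / r, with B initialized to zero. The helper names matvec and lora_forward are hypothetical and are not part of any library.

```python
# Toy LoRA-wrapped linear layer in plain Python (no IPEX-LLM or PyTorch calls).
# matvec and lora_forward are hypothetical helpers used only for illustration.

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, lora_alpha):
    """Return W x + (lora_alpha / r) * B (A x).

    W is the frozen base weight (d_out x d_in). A (r x d_in) and
    B (d_out x r) are the only matrices that would be trained.
    """
    scaling = lora_alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scaling * u for b, u in zip(base, update)]


W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight, 2 x 2
A = [[0.5, 0.25]]              # rank-1 down-projection, 1 x 2
B_zero = [[0.0], [0.0]]        # assumed zero init: the adapter starts as a no-op
B_trained = [[1.0], [2.0]]     # made-up values standing in for a trained B
x = [1.0, 1.0]

y_base = matvec(W, x)
y_init = lora_forward(W, A, B_zero, x, r=1, lora_alpha=2)
y_trained = lora_forward(W, A, B_trained, x, r=1, lora_alpha=2)
```

Because B starts at zero in this sketch, the wrapped layer initially reproduces the base layer exactly; only A and B would change during training, while W stays frozen.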

Usage

Use these after loading the base model in 4-bit with a BitsAndBytesConfig for QLoRA training. Call them in order: prepare the model, build the LoraConfig, then wrap the prepared model (prepare → configure → wrap).

Code Reference

Source Location

Repository: IPEX-LLM
File: python/llm/example/GPU/LLM-Finetuning/QLoRA/alpaca-qlora/alpaca_qlora_finetuning.py
Lines: 209-222

Signature

def prepare_model_for_kbit_training(
    model: PreTrainedModel,
    use_gradient_checkpointing: bool = False
) -> PreTrainedModel:
    """Freeze base model and prepare for k-bit adapter training."""

class LoraConfig:
    def __init__(
        self,
        r: int = 8,
        lora_alpha: int = 16,
        target_modules: List[str] = None,
        lora_dropout: float = 0.05,
        bias: str = "none",
        task_type: str = "CAUSAL_LM",
        training_mode: str = "qlora",
    ):
        """Configure LoRA adapter parameters for QLoRA training."""

def get_peft_model(
    model: PreTrainedModel,
    peft_config: LoraConfig
) -> PeftModel:
    """Wrap model with LoRA adapters according to config."""

Import

from ipex_llm.transformers.qlora import (
    get_peft_model,
    prepare_model_for_kbit_training,
    LoraConfig
)

I/O Contract

Inputs

Name	Type	Required	Description
model	PreTrainedModel	Yes	4-bit quantized base model from AutoModelForCausalLM.from_pretrained
r	int	No	LoRA rank (default 8)
lora_alpha	int	No	LoRA alpha scaling factor (default 16)
target_modules	List[str]	No	Linear layers to inject adapters into (signature default None; the usage example passes q_proj, v_proj, k_proj, o_proj, up_proj, down_proj, gate_proj)
lora_dropout	float	No	Dropout rate for LoRA layers (default 0.05)
training_mode	str	No	Adapter training mode (default "qlora"); must be "qlora" for QLoRA training
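
To give a feel for how r and target_modules affect adapter size, the following sketch counts the parameters LoRA adds, assuming each adapted linear layer with d_in inputs and d_out outputs gains r * (d_in + d_out) trainable weights. The layer shapes (hidden size 4096, MLP size 11008, 32 decoder blocks) are illustrative assumptions, not values read from IPEX-LLM, and lora_param_count is a hypothetical helper.

```python
# Back-of-the-envelope count of LoRA adapter parameters.
# Shapes below are illustrative assumptions, not taken from IPEX-LLM.

def lora_param_count(d_in, d_out, r):
    """Parameters added by one adapter: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


HIDDEN, MLP, BLOCKS, RANK = 4096, 11008, 32, 8

# (d_in, d_out) for each targeted projection in one decoder block.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "up_proj": (HIDDEN, MLP),
    "gate_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

per_block = sum(lora_param_count(d_in, d_out, RANK) for d_in, d_out in shapes.values())
total = per_block * BLOCKS
print(f"per block: {per_block:,} | total: {total:,}")
```

Under these assumptions, doubling r doubles the adapter size, and dropping modules from target_modules shrinks it proportionally to the shapes removed.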

Outputs

Name	Type	Description
model	PeftModel	Model with LoRA adapters injected, trainable parameters isolated

Usage Examples

from ipex_llm.transformers.qlora import (
    get_peft_model, prepare_model_for_kbit_training, LoraConfig
)

# `model` is assumed to be the 4-bit quantized base model loaded beforehand (see Usage).
# 1. Prepare model for k-bit training
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=False)

# 2. Configure LoRA adapters
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                     "up_proj", "down_proj", "gate_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    training_mode="qlora",
)

# 3. Inject adapters
model = get_peft_model(model, config)
model.print_trainable_parameters()
# Output: trainable params: X || all params: Y || trainable%: Z
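
The summary line can be read as the ratio of parameters with gradients enabled to all parameters. The sketch below mimics that bookkeeping in plain Python. The Param record and trainable_summary function are hypothetical stand-ins, the counts are made up for illustration, and the exact number formatting printed by the real method may differ.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a model parameter."""
    name: str
    numel: int
    requires_grad: bool


def trainable_summary(params):
    """Build a summary string in the same shape as the output shown above."""
    trainable = sum(p.numel for p in params if p.requires_grad)
    total = sum(p.numel for p in params)
    pct = 100 * trainable / total
    return f"trainable params: {trainable} || all params: {total} || trainable%: {pct}"


# Frozen base weight plus two small trainable adapter matrices (made-up sizes).
params = [
    Param("q_proj.weight", 96, False),
    Param("q_proj.lora_A.weight", 2, True),
    Param("q_proj.lora_B.weight", 2, True),
]
summary = trainable_summary(params)
print(summary)
```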
