Implementation:Intel Ipex llm Get Peft Model QLoRA


Knowledge Sources
Domains: NLP, Parameter_Efficient_Finetuning
Last Updated: 2026-02-09 00:00 GMT

Overview

Concrete tools for preparing a 4-bit quantized model and injecting LoRA adapters in QLoRA mode, provided by IPEX-LLM.

Description

Three components work together. prepare_model_for_kbit_training freezes the base model parameters and prepares the quantized model for adapter training. LoraConfig is a configuration class that defines the adapter settings, with training_mode="qlora". get_peft_model wraps the prepared model with LoRA adapters, leaving the adapter weights as the trainable parameters. All three are imported from ipex_llm.transformers.qlora to ensure IPEX-LLM compatibility.

Usage

Use these after loading a 4-bit quantized model (for example, with a BitsAndBytesConfig) that you intend to fine-tune with QLoRA. The three steps must run in sequence: prepare → configure → wrap.
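The loading step that precedes this sequence is not shown on this page. A minimal sketch of it follows, assuming the IPEX-LLM drop-in AutoModelForCausalLM and the Hugging Face BitsAndBytesConfig. The helper name load_4bit_base_model, the NF4 settings, the bfloat16 compute dtype, and the "xpu" device are illustrative assumptions, not values confirmed by this page.

```python
def load_4bit_base_model(base_model: str, device: str = "xpu"):
    """Hypothetical helper: load `base_model` as a 4-bit quantized model.

    Imports are deferred so that defining this function does not require
    the GPU software stack to be installed.
    """
    import torch
    from transformers import BitsAndBytesConfig
    from ipex_llm.transformers import AutoModelForCausalLM

    # Assumed quantization settings (4-bit NF4, bfloat16 compute).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=False,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_model, quantization_config=bnb_config
    )
    # Assumed target device: an Intel GPU exposed to PyTorch as "xpu".
    return model.to(device)
```

The returned model is what would then be passed to prepare_model_for_kbit_training as the first step of the prepare → configure → wrap sequence.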

Code Reference

Source Location

  • Repository: IPEX-LLM
  • File: python/llm/example/GPU/LLM-Finetuning/QLoRA/alpaca-qlora/alpaca_qlora_finetuning.py
  • Lines: 209-222

Signature

def prepare_model_for_kbit_training(
    model: PreTrainedModel,
    use_gradient_checkpointing: bool = False
) -> PreTrainedModel:
    """Freeze base model and prepare for k-bit adapter training."""

class LoraConfig:
    def __init__(
        self,
        r: int = 8,
        lora_alpha: int = 16,
        target_modules: Optional[List[str]] = None,
        lora_dropout: float = 0.05,
        bias: str = "none",
        task_type: str = "CAUSAL_LM",
        training_mode: str = "qlora",
    ):
        """Configure LoRA adapter parameters for QLoRA training."""

def get_peft_model(
    model: PreTrainedModel,
    peft_config: LoraConfig
) -> PeftModel:
    """Wrap model with LoRA adapters according to config."""

Import

from ipex_llm.transformers.qlora import (
    get_peft_model,
    prepare_model_for_kbit_training,
    LoraConfig
)

I/O Contract

Inputs

Name | Type | Required | Description
model | PreTrainedModel | Yes | 4-bit quantized base model from AutoModelForCausalLM.from_pretrained
r | int | No | LoRA rank (default 8)
lora_alpha | int | No | LoRA alpha scaling factor (default 16)
target_modules | List[str] | No | Linear layers to inject adapters into (the example targets q_proj, v_proj, k_proj, o_proj, up_proj, down_proj, gate_proj)
lora_dropout | float | No | Dropout rate for LoRA layers (default 0.05)
training_mode | str | No | Training mode (default "qlora"); must be "qlora" for QLoRA training
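To make the roles of r and lora_alpha concrete, here is a minimal pure-Python sketch of the standard LoRA forward pass, y = W x + (lora_alpha / r) · B (A x). It is not IPEX-LLM code. The helper names matvec and lora_forward and the toy dimensions are made up for illustration, and the zero initialization of B follows the usual LoRA convention rather than anything stated on this page.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, r, lora_alpha):
    """Return W x + (lora_alpha / r) * B (A x)."""
    scaling = lora_alpha / r
    base = matvec(W, x)                # frozen base projection
    update = matvec(B, matvec(A, x))   # low-rank path: d_in -> r -> d_out
    return [b + scaling * u for b, u in zip(base, update)]


random.seed(0)
d_in, d_out, r, lora_alpha = 6, 4, 2, 4
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]  # r x d_in
B = [[0.0] * r for _ in range(d_out)]                              # d_out x r, zero init
x = [random.gauss(0, 1) for _ in range(d_in)]

y_base = matvec(W, x)
y_lora = lora_forward(W, A, B, x, r, lora_alpha)
print(y_lora == y_base)  # the adapter contributes nothing while B is all zeros
```

Because B starts at zero in this sketch, the wrapped layer initially reproduces the base layer, and training only has to learn the low-rank correction. A larger r gives the correction more capacity, while lora_alpha / r sets how strongly it is applied.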

Outputs

Name | Type | Description
model | PeftModel | Model with LoRA adapters injected, trainable parameters isolated

Usage Examples

from ipex_llm.transformers.qlora import (
    get_peft_model, prepare_model_for_kbit_training, LoraConfig
)

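# 1. Prepare model for k-bit training
# (`model` is assumed to be a 4-bit quantized model already loaded via
# AutoModelForCausalLM.from_pretrained, as described under Inputs)
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=False)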

# 2. Configure LoRA adapters
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj",
                     "up_proj", "down_proj", "gate_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    training_mode="qlora",
)

# 3. Inject adapters
model = get_peft_model(model, config)
model.print_trainable_parameters()
# Output: trainable params: X || all params: Y || trainable%: Z
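The X, Y, and Z placeholders above depend on the model and on the adapter configuration. As a rough guide, the sketch below works through the arithmetic, assuming each rank-r adapter on a d_in → d_out linear layer adds r × (d_in + d_out) weights. The layer shapes, the layer count, and the base parameter count are hypothetical values chosen for illustration, and the figure a library reports for a quantized model may differ depending on how packed weights are counted.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Weights added by one rank-r adapter: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


# Hypothetical decoder dimensions, used only for this illustration.
HIDDEN, INTERMEDIATE, NUM_LAYERS = 4096, 11008, 32
LAYER_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def trainable_summary(base_params: int, r: int):
    """Return (trainable, total, trainable percentage) for the shapes above."""
    per_block = sum(
        lora_param_count(d_in, d_out, r) for d_in, d_out in LAYER_SHAPES.values()
    )
    trainable = per_block * NUM_LAYERS
    total = base_params + trainable
    return trainable, total, 100.0 * trainable / total


# base_params is an assumed round number, not a measured value.
trainable, total, pct = trainable_summary(base_params=6_700_000_000, r=8)
print(f"trainable params: {trainable:,} || all params: {total:,} || trainable%: {pct:.4f}")
```

Under these assumptions the adapters amount to well under one percent of all parameters, which is the sense in which the trainable parameters are isolated from the frozen base model.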
