Implementation:Intel Ipex llm Alpaca Lora Finetune GPU
| Knowledge Sources | |
|---|---|
| Domains | Finetuning, LoRA, GPU |
| Last Updated | 2026-02-09 04:00 GMT |
Overview
Concrete tool for LoRA fine-tuning of causal language models on GPU using HuggingFace PEFT with 4-bit quantization.
Description
The train() function implements LoRA fine-tuning with the HuggingFace PEFT library on GPU, using 4-bit quantized base-model weights. It reads data in the Alpaca instruction-following format, builds prompts through a configurable template system (the Prompter class), and supports automatic gradient accumulation, distributed training, and Weights & Biases logging. Unlike the CPU variant, this script sets up 4-bit quantization through BitsAndBytesConfig.
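The "automatic gradient accumulation" mentioned above can be illustrated with a small sketch. It assumes the script follows the convention of the upstream alpaca-lora project, where the accumulation step count is batch_size divided by micro_batch_size and is divided again by the number of processes in distributed runs. The helper name and the lower bound of 1 are additions for this illustration, not taken from the script.

```python
def gradient_accumulation_steps(batch_size: int,
                                micro_batch_size: int,
                                world_size: int = 1) -> int:
    """Micro-batches to accumulate before one optimizer step (hypothetical helper)."""
    # Each device processes micro_batch_size samples per forward/backward pass.
    steps = batch_size // micro_batch_size
    if world_size > 1:
        # In distributed training every process contributes micro-batches,
        # so each one needs proportionally fewer accumulation steps.
        steps = steps // world_size
    # Safeguard added in this sketch: never return fewer than one step.
    return max(1, steps)


single_gpu = gradient_accumulation_steps(batch_size=128, micro_batch_size=4)
four_gpus = gradient_accumulation_steps(batch_size=128, micro_batch_size=4, world_size=4)
print(single_gpu, four_gpus)
```

With the signature defaults (batch_size=128, micro_batch_size=4), this gives 32 accumulation steps on one device and 8 per process across four devices, so the effective batch size stays at 128 in both cases.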
Usage
Use this when fine-tuning a causal language model with LoRA on GPU hardware using 4-bit quantized weights. It is designed for the Alpaca instruction-following format and supports both local JSON files and HuggingFace dataset loading.
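To make the Alpaca instruction-following format concrete, the sketch below writes a tiny local JSON file and renders each record into a prompt. The record fields (instruction, input, output) are the standard Alpaca ones. The template wording is modeled on the widely used Alpaca prompt and is an assumption, as are the build_prompt helper and the file name; the script's own Prompter class and template file may differ.

```python
import json
import tempfile
from pathlib import Path

# Assumed template text; the script's "alpaca" template may use other wording.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Render one Alpaca-style record into a full training prompt (hypothetical helper)."""
    # Records with an empty "input" field use the shorter template.
    template = PROMPT_WITH_INPUT if record.get("input") else PROMPT_NO_INPUT
    prompt = template.format(instruction=record["instruction"],
                             input=record.get("input", ""))
    # The expected answer is appended after the "### Response:" marker.
    return prompt + record.get("output", "")


records = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a primary color.", "input": "", "output": "Red"},
]

# A local dataset is a JSON array of such records.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "alpaca_sample.json"  # hypothetical file name
    data_file.write_text(json.dumps(records), encoding="utf-8")
    loaded = json.loads(data_file.read_text(encoding="utf-8"))

prompts = [build_prompt(r) for r in loaded]
print(prompts[0])
```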
Code Reference
Source Location
- Repository: Intel IPEX-LLM
- File: python/llm/example/GPU/LLM-Finetuning/HF-PEFT/alpaca-lora/finetune.py
- Lines: 1-319
Signature
def train(
    base_model: str = "",
    data_path: str = "yahma/alpaca-cleaned",
    output_dir: str = "./lora-alpaca",
    batch_size: int = 128,
    micro_batch_size: int = 4,
    num_epochs: int = 3,
    learning_rate: float = 3e-4,
    cutoff_len: int = 256,
    val_set_size: int = 2000,
    lora_r: int = 8,
    lora_alpha: int = 16,
    lora_dropout: float = 0.05,
    lora_target_modules: List[str] = ["q_proj", "v_proj"],
    train_on_inputs: bool = True,
    add_eos_token: bool = False,
    group_by_length: bool = False,
    wandb_project: str = "",
    resume_from_checkpoint: str = None,
    prompt_template_name: str = "alpaca",
):
    """LoRA fine-tuning on GPU with 4-bit quantization."""
Import
# Standalone script; run via fire CLI:
# python finetune.py --base_model "meta-llama/Llama-2-7b-hf"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| base_model | str | Yes | HuggingFace model ID or local path |
| data_path | str | No | HF dataset name or path to JSON (default: yahma/alpaca-cleaned) |
| output_dir | str | No | Directory for saved adapter weights (default: ./lora-alpaca) |
| lora_r | int | No | LoRA rank (default: 8) |
| lora_alpha | int | No | LoRA scaling factor (default: 16) |
| prompt_template_name | str | No | Prompt template (default: alpaca) |
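The effect of lora_r and lora_alpha can be worked through numerically. In PEFT's standard LoRA, the low-rank update is scaled by lora_alpha / lora_r, and each adapted linear layer of shape d_in x d_out adds r * (d_in + d_out) trainable parameters. The sketch below applies this to the signature defaults; the hidden size of 4096 and the 32 layers are assumptions chosen to resemble a 7B Llama-style model, and the helper names are hypothetical.

```python
def lora_scaling(lora_alpha: int, lora_r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / lora_r


def lora_params_per_layer(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


# Defaults from the train() signature.
lora_r, lora_alpha = 8, 16
target_modules = ["q_proj", "v_proj"]

# Assumed model shape (not stated in the source): 7B Llama-style.
hidden_size = 4096
num_layers = 32

scaling = lora_scaling(lora_alpha, lora_r)
total_trainable = (
    num_layers
    * len(target_modules)
    * lora_params_per_layer(hidden_size, hidden_size, lora_r)
)
print(f"scaling={scaling}, trainable adapter parameters={total_trainable:,}")
```

Under these assumptions the adapters add about 4.2 million trainable parameters, a small fraction of the frozen base model. Doubling lora_r doubles that count and, with lora_alpha unchanged, halves the scaling factor.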
Outputs
| Name | Type | Description |
|---|---|---|
| LoRA adapter weights | Files | Saved to output_dir |
| Training metrics | Console/WandB | Loss, learning rate, evaluation metrics |
Usage Examples
Basic GPU LoRA Fine-tuning
python finetune.py \
    --base_model "meta-llama/Llama-2-7b-hf" \
    --data_path "yahma/alpaca-cleaned" \
    --output_dir "./lora-output" \
    --num_epochs 3 \
    --lora_r 8