Implementation:Intel Ipex llm GaLore Finetuning

Knowledge Sources	Intel IPEX-LLM GaLore
Domains	Finetuning, Memory_Efficient_Training, GaLore
Last Updated	2026-02-09 04:00 GMT

Overview

Concrete tool for memory-efficient fine-tuning using the GaLore (Gradient Low-Rank Projection) optimizer with IPEX-LLM on Intel XPU.

Description

This script fine-tunes a causal language model with the GaLore optimizer, which trains the full weights but projects their gradients into a low-rank subspace, so the optimizer states take less memory. It loads the model with IPEX-LLM's AutoModelForCausalLM on XPU, sets the GaLore-specific optimizer parameters (rank, update_proj_gap, scale), and trains with TRL's SFTTrainer, using a completion-only data collator so that the loss is computed on the response part of each example.
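The completion-only collator restricts the loss to the response part of each example. As a rough illustration of that idea (not TRL's actual implementation), the sketch below masks every label up to and including a response marker with -100, the value that PyTorch's cross-entropy loss ignores by default. The function names and the toy token ids are hypothetical.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def find_subsequence(sequence, pattern):
    """Return the start index of the first occurrence of pattern, or -1."""
    for start in range(len(sequence) - len(pattern) + 1):
        if sequence[start:start + len(pattern)] == pattern:
            return start
    return -1


def mask_prompt_labels(input_ids, response_template_ids):
    """Copy input_ids into labels, masking the prompt and the response marker."""
    labels = list(input_ids)
    start = find_subsequence(labels, response_template_ids)
    if start == -1:
        # No marker found: mask everything so the example adds nothing to the loss.
        return [IGNORE_INDEX] * len(labels)
    response_start = start + len(response_template_ids)
    for i in range(response_start):
        labels[i] = IGNORE_INDEX
    return labels


# Toy example: ids 7, 8 stand in for a tokenized response marker.
example_ids = [1, 2, 3, 7, 8, 4, 5, 6]
print(mask_prompt_labels(example_ids, [7, 8]))
```

Only the tokens after the marker keep their original ids as labels, so only they contribute to the training loss.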

Usage

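Use this when fine-tuning on Intel XPU with limited device memory, particularly when a low-rank adapter such as LoRA does not give the model enough capacity. GaLore takes a different route from LoRA: instead of adding adapter parameters, it keeps the full weights trainable and applies the low-rank structure to the gradient updates.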
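To get a feel for the memory argument, the sketch below compares the number of optimizer-state elements for plain Adam and for GaLore on a single weight matrix. It follows the accounting given in the GaLore paper (a projector of size small-dimension × rank plus two Adam moments in the projected space); the function names and the 4096 × 11008 example shape are illustrative assumptions, not values taken from the script.

```python
def adam_state_elements(m, n):
    """Adam keeps two moment estimates per weight element."""
    return 2 * m * n


def galore_state_elements(m, n, rank):
    """Projector plus two Adam moments kept in the projected space.

    The projection is applied along the smaller dimension of the matrix.
    """
    small, large = min(m, n), max(m, n)
    return small * rank + 2 * large * rank


# Hypothetical layer shape, chosen only for illustration.
m, n, rank = 4096, 11008, 1024
full = adam_state_elements(m, n)
low_rank = galore_state_elements(m, n, rank)
print(f"Adam states:   {full:,} elements")
print(f"GaLore states: {low_rank:,} elements ({low_rank / full:.0%} of Adam)")
```

The saving grows as the rank shrinks relative to the matrix dimensions; the weights themselves are the same size in both cases.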

Code Reference

Source Location

Repository: Intel IPEX-LLM
File: python/llm/example/GPU/LLM-Finetuning/GaLore/galore_finetuning.py
Lines: 1-106

Signature

# Script-based execution with argparse
# Key configuration:
training_args = TrainingArguments(
    optim="galore_adamw",
    optim_target_modules=["attn", "mlp"],
    optim_args="rank=1024,update_proj_gap=200,scale=2",
    ...
)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=dataset,
    data_collator=collator,
)
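The optim_args value is a single comma-separated string of key=value pairs. The helpers below are a hypothetical convenience (not part of Transformers, TRL, or IPEX-LLM) that build such a string from a dictionary and parse it back, which may help when sweeping over rank or update_proj_gap.

```python
def format_optim_args(options):
    """Join a dict into the 'key=value,key=value' form used by optim_args."""
    return ",".join(f"{key}={value}" for key, value in options.items())


def parse_optim_args(text):
    """Split an optim_args string back into a dict of string values."""
    items = [item for item in text.replace(" ", "").split(",") if item]
    return dict(item.split("=", 1) for item in items)


galore_options = {"rank": 1024, "update_proj_gap": 200, "scale": 2}
optim_args = format_optim_args(galore_options)
print(optim_args)
print(parse_optim_args(optim_args))
```

The resulting string can be passed as optim_args in TrainingArguments; note that parsing yields string values, so numeric options need converting before use.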

Import

from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer, TrainingArguments
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM

I/O Contract

Inputs

Name	Type	Required	Description
repo-id-or-model-path	str	No	HuggingFace model ID or local model path (default: openlm-research/open_llama_3b_v2)
data-path	str	No	HuggingFace dataset name (default: HuggingFaceH4/helpful_instructions)
output-dir	str	No	Directory for saved model
GaLore rank	int (via optim_args)	No	Gradient projection rank (default: 1024)
update_proj_gap	int (via optim_args)	No	Steps between projection updates (default: 200)
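As a small illustration of what update_proj_gap controls, the sketch below lists the training steps at which the projection would be recomputed, assuming it is refreshed whenever the step count is a multiple of update_proj_gap (starting at step 0). The function name is hypothetical and the schedule is an assumption about the optimizer's behaviour, not code from the script.

```python
def projection_refresh_steps(total_steps, update_proj_gap):
    """Steps at which the low-rank projection is assumed to be recomputed."""
    return [step for step in range(total_steps) if step % update_proj_gap == 0]


print(projection_refresh_steps(1000, 200))
```

A smaller gap tracks the gradient subspace more closely at the cost of more frequent projection updates.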

Outputs

Name	Type	Description
Fine-tuned model	Files	Saved to output_dir
Training metrics	Console	Loss and training progress

Usage Examples

GaLore Fine-tuning on XPU

python galore_finetuning.py \
    --repo-id-or-model-path "openlm-research/open_llama_3b_v2" \
    --data-path "HuggingFaceH4/helpful_instructions" \
    --output-dir "./galore-output"
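The flags in this command map naturally onto an argparse parser. The following is a reconstruction from the Inputs table rather than the script's actual code; in particular, the default for --output-dir is not stated on this page, so the sketch leaves it as None.

```python
import argparse


def build_parser():
    """Parser mirroring the documented flags; defaults come from the Inputs table."""
    parser = argparse.ArgumentParser(description="GaLore fine-tuning (sketch)")
    parser.add_argument("--repo-id-or-model-path", type=str,
                        default="openlm-research/open_llama_3b_v2",
                        help="HuggingFace model ID")
    parser.add_argument("--data-path", type=str,
                        default="HuggingFaceH4/helpful_instructions",
                        help="HuggingFace dataset name")
    # Default not documented on this page; None is a placeholder.
    parser.add_argument("--output-dir", type=str, default=None,
                        help="Directory for the saved model")
    return parser


args = build_parser().parse_args(["--output-dir", "./galore-output"])
print(args.repo_id_or_model_path, args.data_path, args.output_dir)
```

argparse converts the dashes in each flag name to underscores, so --repo-id-or-model-path is read back as args.repo_id_or_model_path.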

Related Pages

Environment:Intel_Ipex_llm_XPU_Finetuning_Environment
