Implementation:Intel Ipex llm Merge Adapter
| Knowledge Sources | |
|---|---|
| Domains | NLP, Model_Deployment |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A concrete tool, provided by the IPEX-LLM common utilities, for merging a trained LoRA adapter into its base model.
Description
The merge_adapter function loads the original base model in float16 (half precision, without low-bit quantization), loads the trained LoRA adapter via PeftModel.from_pretrained, calls merge_and_unload() to fold the adapter weights into the base weights, strips LoRA-specific keys from the state dict, and saves the merged model. It also handles QA-LoRA adapters, converting their shapes by repeat-interleaving values across quantization blocks.
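The QA-LoRA shape conversion is the least obvious step, so here is a minimal pure-Python sketch of a repeat-interleave along columns. It assumes that each column of an adapter matrix stands for one quantization block and is duplicated block_size times; the helper name repeat_interleave_columns, the toy matrix, and the block size of 2 are illustrative only. The real function works on tensors, and any rescaling it applies is not shown here.

```python
from typing import List


def repeat_interleave_columns(matrix: List[List[float]], block_size: int) -> List[List[float]]:
    """Repeat every column block_size times, keeping order: [a, b] -> [a, a, b, b]."""
    if block_size < 1:
        raise ValueError("block_size must be a positive integer")
    return [[value for value in row for _ in range(block_size)] for row in matrix]


# Hypothetical adapter matrix with one column per quantization block.
pooled = [[1.0, 2.0], [3.0, 4.0]]
expanded = repeat_interleave_columns(pooled, block_size=2)
print(expanded)  # [[1.0, 1.0, 2.0, 2.0], [3.0, 3.0, 4.0, 4.0]]
```

After the expansion, the matrix has block_size times as many columns, which is what lets an adapter trained on block-level inputs line up with the full input dimension of the base layer.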
Usage
Use after QLoRA, LoRA, or DPO training to produce a standalone merged model. Run via the export_merged_model.py script or call merge_adapter() directly.
Code Reference
Source Location
- Repository: IPEX-LLM
- File: python/llm/example/GPU/LLM-Finetuning/common/utils/util.py
- Lines: 141-213
Signature
def merge_adapter(
    base_model: str,
    tokenizer: AutoTokenizer,
    adapter_path: str,
    output_path: str
) -> None:
    """Merge LoRA adapter into base model and save to output_path."""
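The signature omits the function body. The following is a rough sketch of the steps listed in the Description, not the IPEX-LLM source: it assumes the stock transformers and peft packages rather than the IPEX-LLM wrappers, skips the QA-LoRA branch, and assumes that LoRA-specific keys can be recognised by the substring "lora". The names merge_adapter_sketch and strip_lora_keys are hypothetical.

```python
from typing import Any, Dict


def strip_lora_keys(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Drop entries whose key mentions LoRA (assumed marker: the substring 'lora')."""
    return {key: value for key, value in state_dict.items() if "lora" not in key}


def merge_adapter_sketch(base_model: str, tokenizer, adapter_path: str, output_path: str) -> None:
    """Sketch of the merge flow; assumes stock transformers + peft are installed."""
    # Heavy dependencies are imported lazily so strip_lora_keys works without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    # 1. Load the original base model in float16, with no low-bit quantization.
    model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)
    # 2. Attach the trained LoRA adapter to the base model.
    lora_model = PeftModel.from_pretrained(model, adapter_path)
    # 3. Fold the adapter weights into the base weights and drop the PEFT wrapper.
    merged = lora_model.merge_and_unload()
    # 4. Remove any leftover LoRA-specific entries from the state dict.
    state_dict = strip_lora_keys(merged.state_dict())
    # 5. Save the merged weights and the tokenizer side by side.
    merged.save_pretrained(output_path, state_dict=state_dict)
    tokenizer.save_pretrained(output_path)
```

Keeping the key filter in its own small function makes that one step easy to check in isolation, without loading a model.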
Import
from common.utils import merge_adapter
# Or via export script:
# python export_merged_model.py --base_model MODEL --adapter_path ADAPTER --output_path OUTPUT
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| base_model | str | Yes | HuggingFace model ID or path to original base model |
| tokenizer | AutoTokenizer | Yes | Tokenizer for the base model |
| adapter_path | str | Yes | Path to saved LoRA adapter directory (containing adapter_model.bin) |
| output_path | str | Yes | Directory path where merged model will be saved |
Outputs
| Name | Type | Description |
|---|---|---|
| merged model | Files | Full model weights + tokenizer saved to output_path |
Usage Examples
from transformers import AutoTokenizer
from common.utils import merge_adapter
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
# Merge adapter into base model
merge_adapter(
    base_model="meta-llama/Llama-2-7b-hf",
    tokenizer=tokenizer,
    adapter_path="./qlora-output",
    output_path="./merged-model"
)
# Result: ./merged-model/ contains the full merged model ready for deployment
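After the call returns, a quick sanity check on the output directory can catch an incomplete export. The sketch below is a heuristic, not part of IPEX-LLM: it assumes the usual save_pretrained layout (a config.json, a tokenizer_config.json, and at least one weight file ending in .safetensors or .bin), and the helper name looks_like_merged_model is hypothetical.

```python
from typing import Iterable


def looks_like_merged_model(filenames: Iterable[str]) -> bool:
    """Heuristic check that a file listing contains config, tokenizer, and weights."""
    names = set(filenames)
    has_config = "config.json" in names
    has_tokenizer = "tokenizer_config.json" in names
    # Weight files may be sharded, so match on the extension only.
    has_weights = any(name.endswith((".safetensors", ".bin")) for name in names)
    return has_config and has_tokenizer and has_weights


# Typical use against the directory from the example above:
#   import os
#   looks_like_merged_model(os.listdir("./merged-model"))
```

A passing check only shows that the expected files exist; loading the directory with from_pretrained remains the real test.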