Implementation:Huggingface Transformers Add Adapter For QLoRA
| Knowledge Sources | |
|---|---|
| Domains | Model_Optimization, Quantization, Fine_Tuning, Parameter_Efficient_Fine_Tuning |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Concrete wrapper for injecting LoRA adapters into a (quantized) pretrained model for QLoRA fine-tuning, provided by Hugging Face Transformers via its PEFT integration.
Description
The add_adapter() method is defined in the PeftAdapterMixin class (integrations/peft.py, line 636) and is inherited by all PreTrainedModel subclasses. It serves as a thin wrapper around PEFT's inject_adapter_in_model() function, providing Transformers-native integration for adding LoRA (and other PEFT) adapters.
The method performs the following steps:
- Validates that the PEFT library is installed and meets the minimum version requirement.
- Assigns a default adapter name ("default") if none is provided.
- Checks that the adapter name is not already registered.
- Validates that the config is an instance of peft.PeftConfig.
- Sets the base model path on the adapter config for serialization.
- Calls peft.inject_adapter_in_model() to create and inject the adapter layers.
- Activates the new adapter via self.set_adapter(adapter_name).
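The control flow of these steps can be sketched in plain Python. This is a toy model, not the Transformers source: ToyPeftConfig, ToyAdapterMixin, and _inject are hypothetical stand-ins, the PEFT version check is omitted, and the exception types are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyPeftConfig:
    """Hypothetical stand-in for peft.PeftConfig."""
    r: int = 8
    base_model_name_or_path: Optional[str] = None


class ToyAdapterMixin:
    """Hypothetical stand-in that mirrors the lifecycle steps listed above."""

    def __init__(self, name_or_path: str) -> None:
        self.name_or_path = name_or_path
        self.peft_config = {}        # adapter name -> config
        self.active_adapter = None

    def _inject(self, adapter_config: ToyPeftConfig, adapter_name: str) -> None:
        # Placeholder for peft.inject_adapter_in_model(); here it only registers the config.
        self.peft_config[adapter_name] = adapter_config

    def set_adapter(self, adapter_name: str) -> None:
        if adapter_name not in self.peft_config:
            raise ValueError(f"Unknown adapter: {adapter_name}")
        self.active_adapter = adapter_name

    def add_adapter(self, adapter_config, adapter_name: Optional[str] = None) -> None:
        # Fall back to the default adapter name.
        adapter_name = adapter_name or "default"
        # Reject names that are already registered.
        if adapter_name in self.peft_config:
            raise ValueError(f"Adapter {adapter_name!r} already exists.")
        # Reject configs of the wrong type.
        if not isinstance(adapter_config, ToyPeftConfig):
            raise TypeError("adapter_config must be a ToyPeftConfig instance.")
        # Record the base model path so the adapter can be serialized later.
        adapter_config.base_model_name_or_path = self.name_or_path
        # Inject the adapter layers, then make the new adapter active.
        self._inject(adapter_config, adapter_name)
        self.set_adapter(adapter_name)


demo = ToyAdapterMixin("toy/base-model")
demo.add_adapter(ToyPeftConfig(r=8))
print(demo.active_adapter, demo.peft_config["default"].base_model_name_or_path)
```

The sketch returns None and mutates the object in place, which matches the I/O contract described for add_adapter().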
This method is a wrapper rather than a standalone API because the actual adapter injection logic resides in the PEFT library. The Transformers integration handles lifecycle management (registration, activation, naming) and ensures compatibility with the Transformers model ecosystem.
Usage
Use this API after loading a 4-bit quantized model to add LoRA adapters for QLoRA fine-tuning. It can also be used on non-quantized models for standard LoRA fine-tuning.
Code Reference
Source Location
- Repository: transformers
- File: src/transformers/integrations/peft.py (lines 636-675)
Signature
class PeftAdapterMixin:
    def add_adapter(
        self,
        adapter_config: PeftConfig,
        adapter_name: str | None = None,
    ) -> None: ...
Import
# add_adapter is a method on model instances
from transformers import AutoModelForCausalLM
from peft import LoraConfig
model = AutoModelForCausalLM.from_pretrained(...)
model.add_adapter(LoraConfig(...))
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| adapter_config | peft.PeftConfig | Yes | Configuration for the adapter to add. For QLoRA, this is a LoraConfig instance specifying rank, alpha, target modules, and dropout. |
| adapter_name | str or None | No (default: None, which resolves to "default") | Name for the adapter. Must be unique across all adapters on the model. |
Outputs
| Name | Type | Description |
|---|---|---|
| (return) | None | The method modifies the model in-place, injecting adapter layers and activating the new adapter. |
Usage Examples
Standard QLoRA Setup
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
# Step 1: Load model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
    # Keep non-quantized modules in the same dtype as the 4-bit compute dtype
    torch_dtype=torch.bfloat16,
)
# Step 2: Define LoRA configuration
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
# Step 3: Add the adapter
model.add_adapter(lora_config)
# The model now has trainable LoRA parameters on top of frozen 4-bit weights
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
total_params = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable_params:,} / {total_params:,} "
      f"({100 * trainable_params / total_params:.2f}%)")
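As a sanity check on the trainable count printed above, the number of LoRA parameters can be estimated by hand. The sketch below is plain arithmetic, not part of the Transformers or PEFT API. It assumes the Llama-2-7B dimensions (hidden size 4096, 32 decoder layers) and the standard LoRA parameterization, in which each targeted linear layer gains a matrix A of shape (r, in_features) and a matrix B of shape (out_features, r), with the update scaled by lora_alpha / r. The helper name lora_param_count is hypothetical.

```python
def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Parameters added to one linear layer: A is (r, in), B is (out, r)."""
    return r * (in_features + out_features)


# Assumed Llama-2-7B dimensions (not read from the model itself).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

# Values from the LoraConfig above.
r, lora_alpha = 8, 32
targets_per_layer = 2  # q_proj and v_proj, both HIDDEN_SIZE -> HIDDEN_SIZE

per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, r)
expected_trainable = per_module * targets_per_layer * NUM_LAYERS
scaling = lora_alpha / r  # factor applied to the low-rank update

print(f"LoRA parameters per module: {per_module:,}")
print(f"Expected trainable parameters: {expected_trainable:,}")
print(f"LoRA scaling factor: {scaling}")
```

If the trainable count reported by the model differs from this estimate, the usual causes are a different set of matched target modules or a model with different layer dimensions.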
Targeting All Linear Layers
from peft import LoraConfig
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model.add_adapter(lora_config)
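Targeting every linear layer raises the trainable parameter count considerably. The following back-of-the-envelope sketch estimates that cost. It assumes the Llama-2-7B decoder layout (hidden size 4096, MLP intermediate size 11008, 32 layers) and that "all-linear" matches the seven projection layers in each decoder block while leaving the output head untouched. The LINEAR_SHAPES table is written by hand for illustration and is not read from the model.

```python
# Assumed Llama-2-7B dimensions; adjust for other architectures.
HIDDEN, INTERMEDIATE, NUM_LAYERS = 4096, 11008, 32

# (in_features, out_features) for each linear layer in one decoder block.
LINEAR_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params_per_block(shapes: dict, r: int) -> int:
    """Sum r * (in + out) over every targeted linear layer in one block."""
    return sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())


r = 16  # rank from the LoraConfig above
per_block = lora_params_per_block(LINEAR_SHAPES, r)
all_linear_total = per_block * NUM_LAYERS

print(f"LoRA parameters per decoder block: {per_block:,}")
print(f"Estimated trainable parameters: {all_linear_total:,}")
```

Under these assumptions the estimate is roughly ten times the count for a rank-8 adapter on q_proj and v_proj only, which is the trade-off to weigh when choosing target modules.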
Multiple Named Adapters
from peft import LoraConfig
# Add a task-specific adapter
task_a_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model.add_adapter(task_a_config, adapter_name="task_a")
# Add another adapter for a different task
task_b_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model.add_adapter(task_b_config, adapter_name="task_b")
# Switch between adapters
model.set_adapter("task_a")
# ... inference or training for task A ...
model.set_adapter("task_b")
# ... inference or training for task B ...
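To make the effect of switching adapters concrete, here is a toy model of a single LoRA-wrapped linear layer that stores several named adapters and applies only the active one. It is a simplified illustration in pure Python, not PEFT's implementation: ToyLoraLinear and matvec are hypothetical names, dropout is ignored, and the forward pass assumes the standard LoRA form y = Wx + (lora_alpha / r) * B(Ax).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class ToyLoraLinear:
    """Hypothetical linear layer with a frozen weight and named low-rank adapters."""

    def __init__(self, weight):
        self.weight = weight   # frozen base weight, shape (out, in)
        self.adapters = {}     # adapter name -> (A, B, scaling)
        self.active = None

    def add_adapter(self, name, A, B, lora_alpha):
        if name in self.adapters:
            raise ValueError(f"Adapter {name!r} already exists.")
        r = len(A)             # A has shape (r, in); B has shape (out, r)
        self.adapters[name] = (A, B, lora_alpha / r)
        self.active = name     # a newly added adapter becomes the active one

    def set_adapter(self, name):
        if name not in self.adapters:
            raise ValueError(f"Unknown adapter: {name}")
        self.active = name

    def forward(self, x):
        y = matvec(self.weight, x)
        if self.active is None:
            return y
        A, B, scaling = self.adapters[self.active]
        delta = matvec(B, matvec(A, x))  # low-rank update B(Ax)
        return [yi + scaling * di for yi, di in zip(y, delta)]


layer = ToyLoraLinear(weight=[[1, 0], [0, 1]])
layer.add_adapter("task_a", A=[[1, 0]], B=[[1], [0]], lora_alpha=2)
layer.add_adapter("task_b", A=[[0, 1]], B=[[0], [1]], lora_alpha=1)

layer.set_adapter("task_a")
out_a = layer.forward([1, 2])
layer.set_adapter("task_b")
out_b = layer.forward([1, 2])
print(out_a, out_b)
```

The base weight is shared and never modified; only the selected pair of low-rank matrices changes the output, which is why several task-specific adapters can coexist on one quantized base model.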