Implementation:Huggingface Transformers Add Adapter For QLoRA

Knowledge Sources	Transformers PEFT Integration PEFT Documentation
Domains	Model_Optimization, Quantization, Fine_Tuning, Parameter_Efficient_Fine_Tuning
Last Updated	2026-02-13 00:00 GMT

Overview

A concrete wrapper, provided by Hugging Face Transformers through its PEFT integration, that injects LoRA adapters into a pretrained model (typically a 4-bit quantized one) for QLoRA fine-tuning.

Description

The add_adapter() method is defined in the PeftAdapterMixin class (integrations/peft.py, line 636) and is inherited by all PreTrainedModel subclasses. It serves as a thin wrapper around PEFT's inject_adapter_in_model() function, providing Transformers-native integration for adding LoRA (and other PEFT) adapters.

The method performs the following steps:

1. Validates that the PEFT library is installed and meets the minimum version requirement.
2. Assigns a default adapter name ("default") if none is provided.
3. Checks that the adapter name is not already registered.
4. Validates that the config is an instance of peft.PeftConfig.
5. Sets the base model path on the adapter config for serialization.
6. Calls peft.inject_adapter_in_model() to create and inject the adapter layers.
7. Activates the new adapter via self.set_adapter(adapter_name).
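
The control flow listed above can be illustrated with a small, library-free sketch. It is not the Transformers source: ToyPeftConfig and ToyModel are hypothetical stand-ins, the exception types are assumptions, the PEFT version check is omitted, and the call to peft.inject_adapter_in_model() is replaced by simply recording the config.

```python
class ToyPeftConfig:
    """Hypothetical stand-in for peft.PeftConfig (not the real class)."""

    def __init__(self, r: int = 8):
        self.r = r
        self.base_model_name_or_path = None


class ToyModel:
    """Hypothetical stand-in for a model that mixes in PeftAdapterMixin."""

    def __init__(self, name_or_path: str):
        self.name_or_path = name_or_path
        self.peft_config = {}        # adapter name -> adapter config
        self.active_adapter = None

    def add_adapter(self, adapter_config, adapter_name=None):
        # Fall back to the default adapter name when none is given.
        adapter_name = adapter_name or "default"
        # Refuse to register the same adapter name twice.
        if adapter_name in self.peft_config:
            raise ValueError(f"Adapter {adapter_name!r} already exists.")
        # Only accept config objects of the expected type.
        if not isinstance(adapter_config, ToyPeftConfig):
            raise TypeError("adapter_config must be a ToyPeftConfig.")
        # Record where the base model came from, for later serialization.
        adapter_config.base_model_name_or_path = self.name_or_path
        # Stand-in for layer injection: only the config is registered here.
        self.peft_config[adapter_name] = adapter_config
        # Make the newly added adapter the active one.
        self.set_adapter(adapter_name)

    def set_adapter(self, adapter_name):
        if adapter_name not in self.peft_config:
            raise ValueError(f"Unknown adapter {adapter_name!r}.")
        self.active_adapter = adapter_name


model = ToyModel("some/base-model")
model.add_adapter(ToyPeftConfig())                       # registered as "default"
model.add_adapter(ToyPeftConfig(r=16), adapter_name="task_b")
print(model.active_adapter)                              # task_b
```

The sketch keeps registration, naming, and activation on the model object, which mirrors the division of labor described here: the wrapper manages the adapter lifecycle while the injection itself is delegated elsewhere.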

This method is a wrapper rather than a standalone API because the actual adapter injection logic resides in the PEFT library. The Transformers integration handles lifecycle management (registration, activation, naming) and ensures compatibility with the Transformers model ecosystem.

Usage

Use this API after loading a 4-bit quantized model to add LoRA adapters for QLoRA fine-tuning. It can also be used on non-quantized models for standard LoRA fine-tuning.

Code Reference

Source Location

Repository: transformers
File: src/transformers/integrations/peft.py (lines 636-675)

Signature

class PeftAdapterMixin:
    def add_adapter(
        self,
        adapter_config: PeftConfig,
        adapter_name: str | None = None,
    ) -> None: ...

Import

# add_adapter is a method on model instances
from transformers import AutoModelForCausalLM
from peft import LoraConfig

model = AutoModelForCausalLM.from_pretrained(...)
model.add_adapter(LoraConfig(...))

I/O Contract

Inputs

Name	Type	Required	Description
adapter_config	`peft.PeftConfig`	Yes	Configuration for the adapter to add. For QLoRA, this is a `LoraConfig` instance specifying rank, alpha, target modules, and dropout.
adapter_name	`str` or `None`	No (default: `None`, resolved to `"default"`)	Name for the adapter. Must not already be registered on the model.

Outputs

Name	Type	Description
(return)	`None`	The method modifies the model in-place, injecting adapter layers and activating the new adapter.

Usage Examples

Standard QLoRA Setup

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig

# Step 1: Load model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # keep non-quantized modules in the same dtype as bnb_4bit_compute_dtype
)

# Step 2: Define LoRA configuration
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

# Step 3: Add the adapter
model.add_adapter(lora_config)

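# The model now has trainable LoRA parameters on top of frozen 4-bit weights.
# Note: bitsandbytes stores 4-bit weights packed two per byte, so numel() on
# those tensors undercounts the base model and the percentage below is inflated.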
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
total_params = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable_params:,} / {total_params:,} "
      f"({100 * trainable_params / total_params:.2f}%)")
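
As a sanity check on the trainable count printed above, the expected number of LoRA parameters can be worked out by hand. The sketch below assumes the published Llama-2-7B dimensions (hidden size 4096, 32 decoder layers, square q_proj and v_proj projections); those figures and the helper name lora_param_count are assumptions introduced for illustration, not values taken from this page. Each adapted linear layer gains two matrices, A of shape (r, in_features) and B of shape (out_features, r), and bias="none" adds nothing further.

```python
def lora_param_count(in_features: int, out_features: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


# Assumed Llama-2-7B dimensions (not stated in this page).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
TARGETS_PER_LAYER = 2  # q_proj and v_proj

per_module = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, r=8)
trainable = per_module * TARGETS_PER_LAYER * NUM_LAYERS

print(f"per module: {per_module:,}")   # per module: 65,536
print(f"trainable:  {trainable:,}")    # trainable:  4,194,304
```

If the value reported by the model differs from this estimate, the target modules or layer shapes probably differ from the assumptions made here.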

Targeting All Linear Layers

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model.add_adapter(lora_config)
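
To give a rough sense of what targeting every linear layer costs, the sketch below estimates the LoRA parameter count for r=16. It assumes the published Llama-2-7B shapes (hidden size 4096, MLP intermediate size 11008, 32 decoder layers) and assumes that "all-linear" covers the four attention projections and three MLP projections in each layer while leaving the output head untouched. The shape table and variable names are illustrative, not read from the model.

```python
# Assumed Llama-2-7B shapes (in_features, out_features) per decoder layer.
HIDDEN, INTERMEDIATE, NUM_LAYERS, RANK = 4096, 11008, 32, 16

linear_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Each adapted layer adds A (RANK x in) and B (out x RANK).
per_layer = sum(RANK * (n_in + n_out) for n_in, n_out in linear_shapes.values())
all_linear_total = per_layer * NUM_LAYERS

print(f"per layer: {per_layer:,}")         # per layer: 1,249,280
print(f"total:     {all_linear_total:,}")  # total:     39,976,960
```

Under these assumptions the adapter is roughly ten times larger than the r=8 q_proj/v_proj configuration, which is the trade-off to weigh when choosing between the two targeting strategies.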

Multiple Named Adapters

from peft import LoraConfig

# Add a task-specific adapter
task_a_config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
model.add_adapter(task_a_config, adapter_name="task_a")

# Add another adapter for a different task
task_b_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model.add_adapter(task_b_config, adapter_name="task_b")

# Switch between adapters
model.set_adapter("task_a")
# ... inference or training for task A ...

model.set_adapter("task_b")
# ... inference or training for task B ...

Related Pages

Implements Principle

Principle:Huggingface_Transformers_QLoRA_Fine_Tuning

Requires Environment
