Implementation: Hugging Face PEFT AdaLoraConfig
Metadata
| Field | Value |
|---|---|
| Source | [PEFT](https://github.com/huggingface/peft) |
| Domains | Deep_Learning, Parameter_Efficient_Finetuning |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
AdaLoraConfig is the configuration dataclass for the AdaLoRA (Adaptive Low-Rank Adaptation) PEFT method. It extends LoraConfig with additional hyperparameters that control the adaptive rank allocation mechanism: the SVD-based importance scoring, the three-phase training schedule, and the orthogonal regularization. This configuration must be provided to get_peft_model to create an AdaLoRA-adapted model.
Source
File: src/peft/tuners/adalora/config.py, lines 24-109
Repository: huggingface/peft
Signature
@dataclass
class AdaLoraConfig(LoraConfig):
    target_r: int = 8
    init_r: int = 12
    tinit: int = 0
    tfinal: int = 0
    deltaT: int = 1
    beta1: float = 0.85
    beta2: float = 0.85
    orth_reg_weight: float = 0.5
    total_step: Optional[int] = None
    rank_pattern: Optional[dict] = None
Import
from peft import AdaLoraConfig
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| target_r | int | 8 | The target average rank of the incremental matrices after pruning. This is the rank each adapted layer converges toward by the end of the rank reduction phase. The total target budget is target_r * n_adapted_layers. |
| init_r | int | 12 | The initial rank of each incremental matrix at the start of training. Must be greater than or equal to target_r; the difference init_r - target_r determines how much rank is available for pruning. |
| tinit | int | 0 | Number of initial warmup steps before rank reduction begins. During these steps all layers keep their full init_r rank, allowing importance scores to accumulate before pruning decisions are made. Must be less than total_step - tfinal. |
| tfinal | int | 0 | Number of final fine-tuning steps after rank reduction ends. During these steps the rank allocation is frozen and the model fine-tunes with its pruned configuration. Must satisfy tinit < total_step - tfinal. |
| deltaT | int | 1 | Step interval between budget allocation updates. Rank pruning is performed only every deltaT steps during the reduction phase. Higher values reduce the frequency of expensive masking operations but make rank reduction coarser. |
| beta1 | float | 0.85 | Hyperparameter for the exponential moving average (EMA) of importance sensitivity. Controls how quickly the smoothed importance score adapts to recent gradient information. Must be in (0, 1); higher values give more weight to historical importance. |
| beta2 | float | 0.85 | Hyperparameter for the EMA of importance uncertainty. Controls the smoothing of the variance estimate used in uncertainty quantification. Must be in (0, 1); higher values produce more stable uncertainty estimates. |
| orth_reg_weight | float | 0.5 | Coefficient for the orthogonal regularization loss applied to the P and Q matrices of the SVD triplet, encouraging orthogonality in the singular vector matrices. Set to 0.0 to disable. |
| total_step | Optional[int] | None | The total number of training steps, used to compute the three-phase schedule. Must be specified before training begins; a ValueError is raised if it is None or <= 0. |
| rank_pattern | Optional[dict] | None | The allocated rank pattern for each weight matrix, as determined by the RankAllocator during training. Populated automatically during training and can be saved/loaded for inference; not typically set by the user. |
In addition to these AdaLoRA-specific parameters, AdaLoraConfig inherits all parameters from LoraConfig including lora_alpha, lora_dropout, target_modules, task_type, and others. Note that the inherited r parameter is not used by AdaLoRA and will trigger a warning if set to a non-default value; use init_r instead.
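The beta1/beta2 EMAs described above can be sketched in plain Python. This is a simplified, per-element view of the importance update (the actual RankAllocator operates on tensors of sensitivities); the function name and scalar signature here are illustrative, not the peft API.

```python
def update_importance(ipt, ema_ipt, ema_unc, beta1=0.85, beta2=0.85):
    """One importance-score update for a single parameter element.

    ipt is the raw sensitivity (e.g. |weight * grad|).
    """
    # beta1: EMA of the sensitivity itself (smoothed importance).
    ema_ipt = beta1 * ema_ipt + (1 - beta1) * ipt
    # beta2: EMA of how far the raw signal deviates from its average
    # (the uncertainty estimate).
    ema_unc = beta2 * ema_unc + (1 - beta2) * abs(ipt - ema_ipt)
    # Pruning ranks elements by smoothed sensitivity * uncertainty.
    return ema_ipt, ema_unc, ema_ipt * ema_unc
```

With the default betas of 0.85, a single raw observation moves the smoothed score by only 15% of its value, which is why tinit warmup steps are needed before the scores are trustworthy.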
Behavior
On initialization (__post_init__), AdaLoraConfig performs the following:
- Sets peft_type to PeftType.ADALORA
- Validates that DoRA is not enabled (AdaLoRA does not support DoRA)
- Validates that LoftQ is not enabled (AdaLoRA does not support LoftQ)
- Converts target_modules to a set if provided as a list
- Converts exclude_modules to a set if provided as a list
- Validates that layers_to_transform is not used with regex-based target_modules
- Validates that layers_pattern is accompanied by layers_to_transform
- Emits a warning if r is set to a non-default value (should use init_r instead)
- Validates that total_step is not None and is greater than 0
- Validates that tinit < total_step - tfinal (ensuring a valid budget reduction phase exists)
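The two schedule checks can be illustrated with a standalone sketch. This is a hypothetical helper mirroring the constraints; the real checks live in AdaLoraConfig's __post_init__.

```python
def validate_schedule(tinit, tfinal, total_step):
    # Mirrors AdaLoraConfig's schedule validation (illustrative only).
    if total_step is None or total_step <= 0:
        raise ValueError("total_step must be a positive integer")
    if not tinit < total_step - tfinal:
        raise ValueError("tinit must be < total_step - tfinal")
```

For example, validate_schedule(200, 1000, 10000) passes, while validate_schedule(500, 800, 1000) raises, because warmup and final fine-tuning would leave no steps for the reduction phase.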
Usage Example
Basic AdaLoRA configuration:
from peft import AdaLoraConfig, get_peft_model
config = AdaLoraConfig(
init_r=12,
target_r=4,
tinit=200,
tfinal=1000,
deltaT=10,
total_step=10000,
beta1=0.85,
beta2=0.85,
orth_reg_weight=0.5,
lora_alpha=32,
lora_dropout=0.1,
target_modules=["q_proj", "v_proj"],
task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, config)
Aggressive rank reduction:
# Start with high rank, prune aggressively to a low target
config = AdaLoraConfig(
init_r=64,
target_r=4,
tinit=500,
tfinal=2000,
deltaT=5,
total_step=20000,
lora_alpha=64,
target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
task_type="SEQ_CLS",
)
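As a rough sanity check on deltaT, you can estimate how many allocation updates the reduction phase will run. This is illustrative arithmetic, not a peft API.

```python
def approx_allocation_updates(tinit, tfinal, total_step, deltaT):
    # Pruning runs only every deltaT steps inside the reduction
    # phase, which spans steps [tinit, total_step - tfinal).
    reduction_steps = (total_step - tfinal) - tinit
    return reduction_steps // deltaT
```

For the aggressive configuration above (tinit=500, tfinal=2000, total_step=20000, deltaT=5) this gives 3500 pruning updates spread over the reduction phase.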
Practical example with schedule arithmetic:
# tinit=10, tfinal=20, total_step=100
# Phase 1 (warmup): steps 0-10, no rank reduction, budget = init_bgt
# Phase 2 (reduction): steps 10-80, budget decreases cubically
# Phase 3 (fine-tuning): steps 80-100, budget frozen at target_bgt
config = AdaLoraConfig(
init_r=12,
target_r=8,
tinit=10,
tfinal=20,
total_step=100,
target_modules=["q_proj", "v_proj"],
task_type="CAUSAL_LM",
)
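The cubic decay in phase 2 can be sketched as follows. This is a simplified version of the budget schedule computed by peft's RankAllocator, under the assumption that init_bgt and target_bgt denote total ranks summed over all adapted layers; names are illustrative.

```python
def budget_at_step(step, init_bgt, target_bgt, tinit, tfinal, total_step):
    # Phase 1 (warmup): full initial budget.
    if step < tinit:
        return init_bgt
    # Phase 3 (fine-tuning): frozen target budget.
    if step >= total_step - tfinal:
        return target_bgt
    # Phase 2 (reduction): cubic interpolation down to target_bgt.
    mul = 1 - (step - tinit) / (total_step - tfinal - tinit)
    return int(target_bgt + (init_bgt - target_bgt) * mul**3)
```

With the schedule above (tinit=10, tfinal=20, total_step=100) and budgets init_bgt=24, target_bgt=16, the budget is 24 at step 5, 17 at the reduction-phase midpoint (step 45), and 16 at step 90: most of the pruning happens early in phase 2.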
Edge Cases and Notes
- total_step is required: Unlike LoRA, AdaLoRA requires knowing the total number of training steps upfront, because the cubic budget schedule needs to compute the rate of rank decrease. Passing None or a value <= 0 raises a ValueError.
- Schedule validation: The constraint tinit < total_step - tfinal is enforced. If it were violated, no steps would be allocated to the rank reduction phase and AdaLoRA would degenerate to standard LoRA with fixed rank.
- r parameter ignored: The LoRA r parameter inherited from LoraConfig is not used by AdaLoRA. Setting it to a non-default value triggers a warning directing users to init_r.
- DoRA incompatibility: AdaLoRA does not support weight-decomposed LoRA (DoRA). Enabling use_dora=True raises a ValueError.
- LoftQ incompatibility: AdaLoRA does not support LoftQ initialization. Providing a loftq_config raises a ValueError.
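The orthogonal regularization controlled by orth_reg_weight can be sketched with NumPy as a Frobenius-norm penalty on the P and Q factors of the SVD triplet. The shapes and helper name here are illustrative, and this is a simplification of the penalty the AdaLoRA model adds to the loss.

```python
import numpy as np

def orth_reg_loss(P, Q, weight=0.5):
    # P: (r, in_features) left factor; Q: (out_features, r) right factor.
    r = P.shape[0]
    eye = np.eye(r)
    # Penalize deviation of P's rows and Q's columns from orthonormality.
    loss = (np.linalg.norm(P @ P.T - eye, "fro")
            + np.linalg.norm(Q.T @ Q - eye, "fro"))
    return weight * loss
```

Factors with exactly orthonormal rows/columns incur zero penalty; setting orth_reg_weight=0.0 removes the term from the loss entirely.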