Implementation: Hugging Face PEFT AdaLoraConfig
Metadata
| Field | Value |
|---|---|
| Source | [PEFT](https://github.com/huggingface/peft) |
| Domains | Deep_Learning, Parameter_Efficient_Finetuning |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
AdaLoraConfig is the configuration dataclass for the AdaLoRA (Adaptive Low-Rank Adaptation) PEFT method. It extends LoraConfig with additional hyperparameters that control the adaptive rank allocation mechanism: the SVD-based importance scoring, the three-phase training schedule, and the orthogonal regularization. This configuration must be provided to get_peft_model to create an AdaLoRA-adapted model.
Source
File: src/peft/tuners/adalora/config.py, lines 24-109
Repository: huggingface/peft
Signature
@dataclass
class AdaLoraConfig(LoraConfig):
    target_r: int = 8
    init_r: int = 12
    tinit: int = 0
    tfinal: int = 0
    deltaT: int = 1
    beta1: float = 0.85
    beta2: float = 0.85
    orth_reg_weight: float = 0.5
    total_step: Optional[int] = None
    rank_pattern: Optional[dict] = None
Import
from peft import AdaLoraConfig
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| target_r | int | 8 | The target average rank of the incremental matrices after pruning. This is the rank each adapted layer converges toward by the end of the rank reduction phase. The total target budget is target_r * n_adapted_layers. |
| init_r | int | 12 | The initial rank of each incremental matrix at the start of training. Must be greater than or equal to target_r; the difference init_r - target_r determines how much rank is available for pruning. |
| tinit | int | 0 | Number of initial warmup steps before rank reduction begins. During these steps all layers keep their full init_r rank, allowing importance scores to accumulate before pruning decisions are made. Must be less than total_step - tfinal. |
| tfinal | int | 0 | Number of final fine-tuning steps after rank reduction ends. During these steps the rank allocation is frozen and the model fine-tunes with its pruned configuration. Must satisfy tinit < total_step - tfinal. |
| deltaT | int | 1 | Step interval between budget allocation updates. Rank pruning is performed only every deltaT steps during the reduction phase. Higher values reduce the frequency of expensive masking operations but make rank reduction coarser. |
| beta1 | float | 0.85 | Hyperparameter for the exponential moving average (EMA) of importance sensitivity. Controls how quickly the smoothed importance score adapts to recent gradient information. Must be in (0, 1); higher values give more weight to historical importance. |
| beta2 | float | 0.85 | Hyperparameter for the EMA of importance uncertainty. Controls the smoothing of the variance estimate used in uncertainty quantification. Must be in (0, 1); higher values produce more stable uncertainty estimates. |
| orth_reg_weight | float | 0.5 | Coefficient for the orthogonal regularization loss applied to the P and Q matrices of the SVD triplet, encouraging orthogonality in the singular vector matrices. Set to 0.0 to disable. |
| total_step | Optional[int] | None | The total number of training steps, used to compute the three-phase schedule. Must be specified before training begins; a ValueError is raised if it is None or <= 0. |
| rank_pattern | Optional[dict] | None | The allocated rank pattern for each weight matrix, as determined by the RankAllocator during training. Populated automatically during training and can be saved/loaded for inference; not typically set by the user. |
In addition to these AdaLoRA-specific parameters, AdaLoraConfig inherits all parameters from LoraConfig including lora_alpha, lora_dropout, target_modules, task_type, and others. Note that the inherited r parameter is not used by AdaLoRA and will trigger a warning if set to a non-default value; use init_r instead.
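The beta1/beta2 EMAs described above can be sketched in plain Python. This is a simplified, per-element view of the importance update (the actual RankAllocator operates on tensors of sensitivities); the function name and scalar signature here are illustrative, not the peft API.

```python
def update_importance(ipt, ema_ipt, ema_unc, beta1=0.85, beta2=0.85):
    """One importance-score update for a single parameter element.

    ipt is the raw sensitivity (e.g. |weight * grad|).
    """
    # beta1: EMA of the sensitivity itself (smoothed importance).
    ema_ipt = beta1 * ema_ipt + (1 - beta1) * ipt
    # beta2: EMA of how far the raw signal deviates from its average
    # (the uncertainty estimate).
    ema_unc = beta2 * ema_unc + (1 - beta2) * abs(ipt - ema_ipt)
    # Pruning ranks elements by smoothed sensitivity * uncertainty.
    return ema_ipt, ema_unc, ema_ipt * ema_unc
```

With the default betas of 0.85, a single raw observation moves the smoothed score by only 15% of its value, which is why tinit warmup steps are needed before the scores are trustworthy.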
Behavior
On initialization (__post_init__), AdaLoraConfig performs the following:
- Sets peft_type to PeftType.ADALORA
- Validates that DoRA is not enabled (AdaLoRA does not support DoRA)
- Validates that LoftQ is not enabled (AdaLoRA does not support LoftQ)
- Converts target_modules to a set if provided as a list
- Converts exclude_modules to a set if provided as a list
- Validates that layers_to_transform is not used with regex-based target_modules
- Validates that layers_pattern is accompanied by layers_to_transform
- Emits a warning if r is set to a non-default value (should use init_r instead)
- Validates that total_step is not None and is greater than 0
- Validates that tinit < total_step - tfinal (ensuring a valid budget reduction phase exists)
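The two schedule checks can be illustrated with a standalone sketch. This is a hypothetical helper mirroring the constraints; the real checks live in AdaLoraConfig's __post_init__.

```python
def validate_schedule(tinit, tfinal, total_step):
    # Mirrors AdaLoraConfig's schedule validation (illustrative only).
    if total_step is None or total_step <= 0:
        raise ValueError("total_step must be a positive integer")
    if not tinit < total_step - tfinal:
        raise ValueError("tinit must be < total_step - tfinal")
```

For example, validate_schedule(200, 1000, 10000) passes, while validate_schedule(500, 800, 1000) raises, because warmup and final fine-tuning would leave no steps for the reduction phase.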
Usage Example
Basic AdaLoRA configuration:
from peft import AdaLoraConfig, get_peft_model
config = AdaLoraConfig(
init_r=12,
target_r=4,
tinit=200,
tfinal=1000,
deltaT=10,
total_step=10000,
beta1=0.85,
beta2=0.85,
orth_reg_weight=0.5,
lora_alpha=32,
lora_dropout=0.1,
target_modules=["q_proj", "v_proj"],
task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, config)
Aggressive rank reduction:
# Start with high rank, prune aggressively to a low target
config = AdaLoraConfig(
init_r=64,
target_r=4,
tinit=500,
tfinal=2000,
deltaT=5,
total_step=20000,
lora_alpha=64,
target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
task_type="SEQ_CLS",
)
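As a rough sanity check on deltaT, you can estimate how many allocation updates the reduction phase will run. This is illustrative arithmetic, not a peft API.

```python
def approx_allocation_updates(tinit, tfinal, total_step, deltaT):
    # Pruning runs only every deltaT steps inside the reduction
    # phase, which spans steps [tinit, total_step - tfinal).
    reduction_steps = (total_step - tfinal) - tinit
    return reduction_steps // deltaT
```

For the aggressive configuration above (tinit=500, tfinal=2000, total_step=20000, deltaT=5) this gives 3500 pruning updates spread over the reduction phase.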
Practical example with schedule arithmetic:
# tinit=10, tfinal=20, total_step=100
# Phase 1 (warmup): steps 0-10, no rank reduction, budget = init_bgt
# Phase 2 (reduction): steps 10-80, budget decreases cubically
# Phase 3 (fine-tuning): steps 80-100, budget frozen at target_bgt
config = AdaLoraConfig(
init_r=12,
target_r=8,
tinit=10,
tfinal=20,
total_step=100,
target_modules=["q_proj", "v_proj"],
task_type="CAUSAL_LM",
)
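The cubic decay in phase 2 can be sketched as follows. This is a simplified version of the budget schedule computed by peft's RankAllocator, under the assumption that init_bgt and target_bgt denote total ranks summed over all adapted layers; names are illustrative.

```python
def budget_at_step(step, init_bgt, target_bgt, tinit, tfinal, total_step):
    # Phase 1 (warmup): full initial budget.
    if step < tinit:
        return init_bgt
    # Phase 3 (fine-tuning): frozen target budget.
    if step >= total_step - tfinal:
        return target_bgt
    # Phase 2 (reduction): cubic interpolation down to target_bgt.
    mul = 1 - (step - tinit) / (total_step - tfinal - tinit)
    return int(target_bgt + (init_bgt - target_bgt) * mul**3)
```

With the schedule above (tinit=10, tfinal=20, total_step=100) and budgets init_bgt=24, target_bgt=16, the budget is 24 at step 5, 17 at the reduction-phase midpoint (step 45), and 16 at step 90: most of the pruning happens early in phase 2.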
Edge Cases and Notes
- total_step is required: Unlike LoRA, AdaLoRA requires knowing the total number of training steps upfront, because the cubic budget schedule needs to compute the rate of rank decrease. Passing None or a value <= 0 raises a ValueError.
- Schedule validation: The constraint tinit < total_step - tfinal is enforced. If it were violated, no steps would be allocated to the rank reduction phase and AdaLoRA would degenerate to standard LoRA with fixed rank.
- r parameter ignored: The LoRA r parameter inherited from LoraConfig is not used by AdaLoRA. Setting it to a non-default value triggers a warning directing users to init_r.
- DoRA incompatibility: AdaLoRA does not support weight-decomposed LoRA (DoRA). Enabling use_dora=True raises a ValueError.
- LoftQ incompatibility: AdaLoRA does not support LoftQ initialization. Providing a loftq_config raises a ValueError.
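The orthogonal regularization controlled by orth_reg_weight can be sketched with NumPy as a Frobenius-norm penalty on the P and Q factors of the SVD triplet. The shapes and helper name here are illustrative, and this is a simplification of the penalty the AdaLoRA model adds to the loss.

```python
import numpy as np

def orth_reg_loss(P, Q, weight=0.5):
    # P: (r, in_features) left factor; Q: (out_features, r) right factor.
    r = P.shape[0]
    eye = np.eye(r)
    # Penalize deviation of P's rows and Q's columns from orthonormality.
    loss = (np.linalg.norm(P @ P.T - eye, "fro")
            + np.linalg.norm(Q.T @ Q - eye, "fro"))
    return weight * loss
```

Factors with exactly orthonormal rows/columns incur zero penalty; setting orth_reg_weight=0.0 removes the term from the loss entirely.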