
Implementation:Huggingface Peft AdaLoraConfig

From Leeroopedia


Metadata

Field Value
Source PEFT | https://github.com/huggingface/peft
Domains Deep_Learning, Parameter_Efficient_Finetuning
Last Updated 2026-02-07 00:00 GMT

Overview

AdaLoraConfig is the configuration dataclass for the AdaLoRA (Adaptive Low-Rank Adaptation) PEFT method. It extends LoraConfig with additional hyperparameters that control the adaptive rank allocation mechanism: the SVD-based importance scoring, the three-phase training schedule, and the orthogonal regularization. This configuration must be provided to get_peft_model to create an AdaLoRA-adapted model.

Source

File: src/peft/tuners/adalora/config.py, lines 24-109

Repository: huggingface/peft

Signature

@dataclass
class AdaLoraConfig(LoraConfig):
    target_r: int = 8
    init_r: int = 12
    tinit: int = 0
    tfinal: int = 0
    deltaT: int = 1
    beta1: float = 0.85
    beta2: float = 0.85
    orth_reg_weight: float = 0.5
    total_step: Optional[int] = None
    rank_pattern: Optional[dict] = None

Import

from peft import AdaLoraConfig

Parameters

Parameter Type Default Description
target_r int 8 The target average rank of the incremental matrices after pruning. This is the rank each adapted layer will converge toward by the end of the rank reduction phase. The total target budget is target_r * n_adapted_layers.
init_r int 12 The initial rank for each incremental matrix at the start of training. Must be greater than or equal to target_r. The difference init_r - target_r determines how much rank is available for pruning.
tinit int 0 Number of initial warmup steps before rank reduction begins. During these steps, all layers maintain their full init_r rank. This allows importance scores to accumulate before pruning decisions are made. Must be less than total_step - tfinal.
tfinal int 0 Number of final fine-tuning steps after rank reduction ends. During these steps, the rank allocation is frozen and the model fine-tunes with its pruned configuration. Must satisfy tinit < total_step - tfinal.
deltaT int 1 Step interval between budget allocation updates. Rank pruning is only performed every deltaT steps during the reduction phase. Higher values reduce the frequency of expensive masking operations but make rank reduction coarser.
beta1 float 0.85 Hyperparameter for the exponential moving average (EMA) of importance sensitivity. Controls how quickly the smoothed importance score adapts to recent gradient information. Must be in (0, 1). Higher values give more weight to historical importance.
beta2 float 0.85 Hyperparameter for the EMA of importance uncertainty. Controls the smoothing of the variance estimate used in uncertainty quantification. Must be in (0, 1). Higher values produce more stable uncertainty estimates.
orth_reg_weight float 0.5 Coefficient for the orthogonal regularization loss applied to the P and Q matrices of the SVD triplet. This regularization encourages orthogonality in the singular vector matrices. Set to 0.0 to disable.
total_step Optional[int] None The total number of training steps. Must be specified before training begins. Used to compute the three-phase schedule. Raises ValueError if None or <= 0.
rank_pattern Optional[dict] None The allocated rank pattern for each weight matrix, as determined by the RankAllocator during training. This is populated automatically during training and can be saved/loaded for inference. Not typically set by the user.

In addition to these AdaLoRA-specific parameters, AdaLoraConfig inherits all parameters from LoraConfig including lora_alpha, lora_dropout, target_modules, task_type, and others. Note that the inherited r parameter is not used by AdaLoRA and will trigger a warning if set to a non-default value; use init_r instead.
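To make the roles of beta1 and beta2 concrete, the following is a minimal scalar sketch of the two exponential moving averages described above, following the AdaLoRA paper's sensitivity/uncertainty formulation. The function name and scalar simplification are illustrative; the actual RankAllocator in peft operates on per-triplet tensors.

```python
# Illustrative sketch: beta1 smooths the raw sensitivity signal,
# beta2 smooths the uncertainty (deviation from the smoothed value).
# Scalars stand in for the per-parameter tensors peft actually uses.

def smooth_importance(sensitivities, beta1=0.85, beta2=0.85):
    """Return a smoothed importance score for a stream of raw
    sensitivity values (|weight * grad| in AdaLoRA)."""
    ema_sens = 0.0      # EMA of sensitivity, controlled by beta1
    ema_uncert = 0.0    # EMA of uncertainty, controlled by beta2
    for s in sensitivities:
        ema_sens = beta1 * ema_sens + (1 - beta1) * s
        ema_uncert = beta2 * ema_uncert + (1 - beta2) * abs(s - ema_sens)
    # Final importance combines smoothed sensitivity and uncertainty
    return ema_sens * ema_uncert
```

Higher beta values make both averages respond more slowly to any single training step, which stabilizes the pruning decisions made every deltaT steps.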

Behavior

On initialization (__post_init__), AdaLoraConfig performs the following:

  1. Sets peft_type to PeftType.ADALORA
  2. Validates that DoRA is not enabled (AdaLoRA does not support DoRA)
  3. Validates that LoftQ is not enabled (AdaLoRA does not support LoftQ)
  4. Converts target_modules to a set if provided as a list
  5. Converts exclude_modules to a set if provided as a list
  6. Validates that layers_to_transform is not used with regex-based target_modules
  7. Validates that layers_pattern is accompanied by layers_to_transform
  8. Emits a warning if r is set to a non-default value (should use init_r instead)
  9. Validates that total_step is not None and is greater than 0
  10. Validates that tinit < total_step - tfinal (ensuring a valid budget reduction phase exists)
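Steps 9 and 10 can be sketched as a standalone check; this is a hypothetical re-implementation of the schedule validation, not peft's own code, and the real ValueError messages differ.

```python
# Hypothetical sketch of the schedule checks performed in
# AdaLoraConfig.__post_init__ (steps 9-10 above).

def validate_schedule(total_step, tinit, tfinal):
    if total_step is None or total_step <= 0:
        raise ValueError("total_step must be a positive integer")
    if not tinit < total_step - tfinal:
        raise ValueError(
            "tinit must be < total_step - tfinal so that a rank "
            "reduction phase exists"
        )

validate_schedule(total_step=100, tinit=10, tfinal=20)  # passes
```

A configuration such as tinit=80, tfinal=20, total_step=100 fails the second check: the warmup and final phases would consume the entire run, leaving no steps for rank reduction.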

Usage Example

Basic AdaLoRA configuration:

from peft import AdaLoraConfig, get_peft_model

config = AdaLoraConfig(
    init_r=12,
    target_r=4,
    tinit=200,
    tfinal=1000,
    deltaT=10,
    total_step=10000,
    beta1=0.85,
    beta2=0.85,
    orth_reg_weight=0.5,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, config)

Aggressive rank reduction:

# Start with high rank, prune aggressively to a low target
config = AdaLoraConfig(
    init_r=64,
    target_r=4,
    tinit=500,
    tfinal=2000,
    deltaT=5,
    total_step=20000,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="SEQ_CLS",
)

Practical example with schedule arithmetic:

# tinit=10, tfinal=20, total_step=100
# Phase 1 (warmup): steps 0-10, no rank reduction, budget = init_bgt
# Phase 2 (reduction): steps 10-80, budget decreases cubically
# Phase 3 (fine-tuning): steps 80-100, budget frozen at target_bgt
config = AdaLoraConfig(
    init_r=12,
    target_r=8,
    tinit=10,
    tfinal=20,
    total_step=100,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
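The three-phase schedule sketched in the comments above can be made explicit. The cubic decay below follows the AdaLoRA paper (Zhang et al., 2023); note that peft's RankAllocator applies this curve to the total rank budget across all layers, whereas the per-layer values here are simplified for illustration.

```python
# Sketch of the cubic budget schedule (assumed from the AdaLoRA
# paper): warmup at init budget, cubic decay, then frozen at target.

def budget_at(step, init_b, target_b, tinit, tfinal, total_step):
    if step < tinit:                    # Phase 1: warmup
        return init_b
    if step >= total_step - tfinal:     # Phase 3: fine-tuning, frozen
        return target_b
    # Phase 2: cubic decay from init_b down to target_b
    frac = (step - tinit) / (total_step - tfinal - tinit)
    return target_b + (init_b - target_b) * (1 - frac) ** 3

# With the example values above (tinit=10, tfinal=20, total_step=100):
print(budget_at(0, 12, 8, 10, 20, 100))    # 12 (warmup)
print(budget_at(90, 12, 8, 10, 20, 100))   # 8 (frozen)
```

The cubic exponent prunes quickly early in the reduction phase, when many redundant ranks remain, and slows as the budget approaches target_r.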

Edge Cases and Notes

  • total_step is required: Unlike LoRA, AdaLoRA requires knowing the total training steps upfront. This is because the cubic budget schedule needs to compute the rate of rank decrease. Passing None or a value <= 0 raises a ValueError.
  • Schedule validation: The constraint tinit < total_step - tfinal is enforced. If violated, there would be no steps allocated to the rank reduction phase, making AdaLoRA degenerate to standard LoRA with fixed rank.
  • r parameter ignored: The LoRA r parameter inherited from LoraConfig is not used in AdaLoRA. Setting it to a non-default value triggers a warning directing users to init_r.
  • DoRA incompatibility: AdaLoRA does not support weight-decomposed LoRA (DoRA). Enabling use_dora=True raises a ValueError.
  • LoftQ incompatibility: AdaLoRA does not support LoftQ initialization. Providing a loftq_config raises a ValueError.
