Implementation: Hugging Face PEFT update_and_allocate
Metadata
| Field | Value |
|---|---|
| Source | PEFT (https://github.com/huggingface/peft) |
| Domains | Deep_Learning, Parameter_Efficient_Finetuning |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
update_and_allocate is the method on AdaLoraModel that orchestrates the runtime rank allocation mechanism for AdaLoRA. It is the primary interface between the user's training loop and the internal RankAllocator class, handling the three-phase training schedule (warmup, rank reduction, final fine-tuning) and delegating importance score updates and budget masking to the allocator. This method must be called at every training step after loss.backward() and before optimizer.zero_grad().
Source
Files:
- src/peft/tuners/adalora/model.py, lines 294-334 -- AdaLoraModel.update_and_allocate method
- src/peft/tuners/adalora/layer.py, lines 196-362 -- RankAllocator class
Repository: huggingface/peft
Signature
def update_and_allocate(self, global_step: int) -> None:
"""
This method updates Adalora budget and mask.
This should be called in every training step after loss.backward() and before zero_grad().
tinit, tfinal and deltaT are handled within the method.
Args:
global_step (int): The current training step, used to calculate adalora budget.
"""
Import
This is a method on the AdaLoraModel class, accessed via the base_model attribute of a PEFT-wrapped model. It is not imported directly.
# Access via model.base_model
model.base_model.update_and_allocate(global_step)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| global_step | int | required | The current training step (0-indexed). Used to determine which phase of the three-phase schedule training is in and to compute the current budget via the cubic decay schedule. |
Return Value
Returns None. The method operates by mutating the model's parameters in-place (masking singular values to zero) and updating the internal state of the RankAllocator.
Behavior
The method implements three distinct code paths based on the current global_step relative to the schedule parameters total_step and tfinal:
Phase 1 and 2: Before Final Fine-tuning
When global_step < total_step - tfinal:
if global_step < lora_config.total_step - lora_config.tfinal:
_, rank_pattern = self.rankallocator.update_and_allocate(self.model, global_step)
if rank_pattern:
lora_config.rank_pattern = rank_pattern
This delegates to RankAllocator.update_and_allocate, which:
- Calls update_ipt(model) to update importance scores using the current gradients
- Calls budget_schedule(global_step) to compute the current budget and whether masking should be applied
- If masking is triggered (every deltaT steps during the reduction phase), calls mask_to_budget(model, budget) to zero out the least important singular values
- Returns the budget and the resulting rank pattern
During the warmup phase (step <= tinit), importance scores are still updated but mask_ind is False, so no masking is applied.
Transition to Final Phase
When global_step == total_step - tfinal:
elif global_step == lora_config.total_step - lora_config.tfinal:
_, rank_pattern = self.rankallocator.update_and_allocate(self.model, global_step, force_mask=True)
lora_config.rank_pattern = rank_pattern
self.rankallocator.reset_ipt()
This is the transition step where:
- A final masking pass is performed with force_mask=True, ensuring the budget is enforced regardless of the deltaT interval
- The resulting rank pattern is saved to the config (for later serialization)
- The importance tracking state is reset via reset_ipt() to free memory
Final Fine-tuning Phase
When global_step > total_step - tfinal:
elif global_step > lora_config.total_step - lora_config.tfinal:
self.rankallocator.mask_using_rank_pattern(self.model, lora_config.rank_pattern)
During the final phase, no importance scores are computed. Instead, the previously-saved rank_pattern is applied directly to mask the appropriate singular values. This is more efficient than recomputing importance but achieves the same result: maintaining the frozen rank allocation.
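The three branches above can be condensed into a small dispatch sketch. This is an illustration, not the PEFT implementation: it uses plain integers in place of the real config object, and the returned phase names are made up for readability.

```python
# Illustrative sketch of the three-phase dispatch in update_and_allocate,
# using plain integers instead of the real config object.
def phase_for_step(global_step: int, total_step: int, tfinal: int) -> str:
    boundary = total_step - tfinal
    if global_step < boundary:
        return "update_and_mask"       # warmup + rank-reduction phases
    elif global_step == boundary:
        return "final_mask_and_reset"  # force_mask=True, then reset_ipt()
    else:
        return "apply_saved_pattern"   # final fine-tuning phase

# With total_step=10000 and tfinal=1000, the boundary falls at step 9000:
print(phase_for_step(500, 10000, 1000))   # update_and_mask
print(phase_for_step(9000, 10000, 1000))  # final_mask_and_reset
print(phase_for_step(9500, 10000, 1000))  # apply_saved_pattern
```

Note that the boundary step itself takes the middle branch exactly once; every later step only reapplies the frozen pattern.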
The RankAllocator Class
The RankAllocator class (defined in layer.py) manages the internal state for importance scoring and budget scheduling:
Initialization:
class RankAllocator:
def __init__(self, model, peft_config, adapter_name):
self.beta1 = peft_config.beta1
self.beta2 = peft_config.beta2
self.reset_ipt()
self._set_budget_scheduler(model)
Internal state:
- ipt: Dictionary of instantaneous importance scores per parameter
- exp_avg_ipt: Dictionary of EMA-smoothed importance (sensitivity)
- exp_avg_unc: Dictionary of EMA-smoothed uncertainty
- init_bgt: Total initial budget (init_r * n_layers)
- target_bgt: Total target budget (target_r * n_layers)
- name_set: Sorted set of adapted layer names
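As a toy illustration of how the two budget totals relate to the config: each adapted weight matrix contributes init_r (respectively target_r) singular values. The layer names below are hypothetical, not taken from the source.

```python
# Each adapted weight matrix contributes init_r (resp. target_r) singular
# values to the total budget. Layer names here are hypothetical.
init_r, target_r = 12, 4
adapted_layers = [
    f"model.layers.{i}.self_attn.{proj}"
    for i in range(32)                 # assume a 32-layer model
    for proj in ("q_proj", "v_proj")   # two adapted matrices per layer
]
init_bgt = init_r * len(adapted_layers)      # 12 * 64 = 768
target_bgt = target_r * len(adapted_layers)  # 4 * 64 = 256
print(init_bgt, target_bgt)
```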
Importance update (update_ipt):
def update_ipt(self, model):
for n, p in model.named_parameters():
if "lora_" in n and self.adapter_name in n:
with torch.no_grad():
self.ipt[n] = (p * p.grad).abs().detach()
# Sensitivity smoothing
self.exp_avg_ipt[n] = self.beta1 * self.exp_avg_ipt[n] + (1 - self.beta1) * self.ipt[n]
# Uncertainty quantification
self.exp_avg_unc[n] = (
self.beta2 * self.exp_avg_unc[n]
+ (1 - self.beta2) * (self.ipt[n] - self.exp_avg_ipt[n]).abs()
)
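A scalar walk-through of the two EMAs above may help. This sketch uses beta1 = beta2 = 0.85 (the AdaLoraConfig defaults) and hand-picked importance values rather than real gradient tensors; note that exp_avg_ipt is updated first, and the uncertainty term then measures the deviation from the freshly updated average, mirroring the order in update_ipt.

```python
# Scalar version of update_ipt's smoothing, with made-up importance values.
beta1 = beta2 = 0.85  # AdaLoraConfig defaults
exp_avg_ipt = 0.0     # sensitivity EMA
exp_avg_unc = 0.0     # uncertainty EMA
for ipt in (1.0, 0.2, 0.9):  # stands in for |p * p.grad| at three steps
    exp_avg_ipt = beta1 * exp_avg_ipt + (1 - beta1) * ipt
    exp_avg_unc = beta2 * exp_avg_unc + (1 - beta2) * abs(ipt - exp_avg_ipt)

# The score used for ranking combines sensitivity and uncertainty.
score = exp_avg_ipt * exp_avg_unc
print(exp_avg_ipt, exp_avg_unc)
```

With a high beta1, the sensitivity EMA reacts slowly to the noisy per-step values, while the uncertainty EMA grows when importance fluctuates between steps.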
Budget schedule (budget_schedule):
def budget_schedule(self, step: int):
tinit = self.peft_config.tinit
tfinal = self.peft_config.tfinal
total_step = self.peft_config.total_step
if step <= tinit:
budget = self.init_bgt
mask_ind = False
elif step > total_step - tfinal:
budget = self.target_bgt
mask_ind = True
else:
mul_coeff = 1 - (step - tinit) / (total_step - tfinal - tinit)
budget = int((self.init_bgt - self.target_bgt) * (mul_coeff ** 3) + self.target_bgt)
mask_ind = True if step % self.peft_config.deltaT == 0 else False
return budget, mask_ind
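To see the cubic decay in action, the schedule can be replicated outside the class with the config values passed as plain arguments. The budget totals used below (288 decaying to 96) are made-up values for illustration, not derived from the source.

```python
# Standalone copy of the budget schedule above, runnable in isolation.
def budget_schedule(step, tinit, tfinal, total_step, init_bgt, target_bgt, deltaT):
    if step <= tinit:
        return init_bgt, False                # warmup: full budget, no masking
    elif step > total_step - tfinal:
        return target_bgt, True               # final phase: target budget
    mul_coeff = 1 - (step - tinit) / (total_step - tfinal - tinit)
    budget = int((init_bgt - target_bgt) * (mul_coeff ** 3) + target_bgt)
    return budget, step % deltaT == 0         # mask only every deltaT steps

# Budget decays cubically from 288 down to 96 between step 200 and step 9000.
for step in (0, 200, 2000, 5000, 8000, 9500):
    budget, mask_ind = budget_schedule(
        step, tinit=200, tfinal=1000, total_step=10000,
        init_bgt=288, target_bgt=96, deltaT=10,
    )
    print(step, budget, mask_ind)
```

Because the coefficient is cubed, most of the budget is removed early in the reduction phase, leaving the allocation nearly frozen well before the final phase begins.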
Masking (mask_to_budget):
The masking process aggregates triplet importance scores across all adapted layers, computes a global threshold via torch.kthvalue, and zeros out singular values (lora_E entries) that fall below the threshold using masked_fill_.
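The thresholding step can be illustrated with a simplified, torch-free sketch. Two simplifications to keep in mind: the real code ranks EMA-based triplet scores and zeroes the corresponding lora_E singular values in-place, whereas here the scores themselves stand in for both, and ties at the threshold are not handled specially.

```python
# Simplified global-threshold masking: pool scores across layers, find the
# k-th smallest value so that exactly `budget` entries survive, zero the rest.
def mask_to_budget(scores_per_layer, budget):
    all_scores = [s for scores in scores_per_layer.values() for s in scores]
    n_prune = len(all_scores) - budget
    if n_prune <= 0:
        return scores_per_layer  # budget covers everything; nothing to mask
    threshold = sorted(all_scores)[n_prune - 1]  # plays the role of torch.kthvalue
    return {
        name: [0.0 if s <= threshold else s for s in scores]
        for name, scores in scores_per_layer.items()
    }

masked = mask_to_budget(
    {"layer0": [0.9, 0.1, 0.5], "layer1": [0.3, 0.8, 0.05]}, budget=3
)
print(masked)  # the three lowest-scoring entries are zeroed across both layers
```

The key point the sketch preserves is that the threshold is global: a layer keeps as many singular values as its scores earn in the cross-layer ranking, which is how AdaLoRA ends up assigning different ranks to different layers.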
Usage Example
Standard training loop with AdaLoRA:
from peft import AdaLoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Setup
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
total_steps = 10000
config = AdaLoraConfig(
init_r=12,
target_r=4,
tinit=200,
tfinal=1000,
deltaT=10,
total_step=total_steps,
lora_alpha=32,
target_modules=["q_proj", "v_proj"],
task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
# Training loop
for step, batch in enumerate(dataloader):
# Forward pass
outputs = model(**batch)
loss = outputs.loss
# Backward pass
loss.backward()
# Optimizer step
optimizer.step()
# CRITICAL: update_and_allocate must be called here
# After backward (gradients available) and before zero_grad (gradients cleared)
model.base_model.update_and_allocate(step)
# Clear gradients
optimizer.zero_grad()
if step >= total_steps:
break
Monitoring rank allocation during training:
for step, batch in enumerate(dataloader):
outputs = model(**batch)
loss = outputs.loss
loss.backward()
optimizer.step()
model.base_model.update_and_allocate(step)
optimizer.zero_grad()
# Log the current rank pattern periodically
if step % 500 == 0:
rank_pattern = model.peft_config["default"].rank_pattern
if rank_pattern:
for name, mask in rank_pattern.items():
active_rank = sum(mask)
total_rank = len(mask)
print(f"Step {step} | {name}: rank {active_rank}/{total_rank}")
Edge Cases and Notes
- Call ordering is critical: The method must be called after loss.backward() (so gradients are available for importance scoring) and before optimizer.zero_grad() (so gradients are not erased before being read). Calling it in the wrong order will produce zero importance scores and incorrect rank allocation.
- DeepSpeed compatibility: When DeepSpeed is detected, the importance update uses deepspeed.utils.safe_get_full_grad(p) instead of p.grad to correctly retrieve gradients in distributed training settings with ZeRO optimization.
- Global step tracking: The global_step must be a monotonically increasing integer starting from 0. Passing incorrect step values (e.g., epoch-level counters instead of step-level counters) will produce incorrect budget scheduling.
- Mask application in final phase: During the final fine-tuning phase, the method applies the saved rank_pattern at every step using mask_using_rank_pattern. This is necessary because the optimizer may still update the masked singular values (they retain gradients as adapter parameters), so the mask must be reapplied after each optimizer step.
- rank_pattern serialization: The rank_pattern stored in the config maps parameter names to boolean lists. When the adapter name is truncated from parameter names (e.g., during checkpoint loading), the mask application code handles this by checking for and stripping the adapter name suffix.