
Implementation:Microsoft LoRA Mark Only LoRA Trainable



Knowledge Sources
Domains Training, Parameter_Efficient_Fine_Tuning
Last Updated 2026-02-10 05:00 GMT

Overview

A utility function that freezes all model parameters except the LoRA matrices and, optionally, bias terms.

Description

The mark_only_lora_as_trainable function iterates over all named parameters in a model and sets requires_grad = False for every parameter whose name does not contain the string "lora_". It then optionally re-enables gradients for bias parameters based on the specified bias mode. This is the mechanism that ensures only LoRA parameters receive gradient updates during training.
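
As a quick illustration of why the name-based check works, the sketch below (assuming loralib is installed; the layer size and rank are arbitrary) prints the parameter names of a single LoRA-augmented layer:

import loralib as lora

# A LoRA-augmented linear layer registers its low-rank factors as parameters
# named 'lora_A' and 'lora_B', alongside the frozen 'weight' and 'bias'.
layer = lora.Linear(768, 768, r=8)
print([name for name, _ in layer.named_parameters()])
# Expected output (roughly): ['weight', 'bias', 'lora_A', 'lora_B']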

Usage

Call this function once after model construction (with LoRA layers in place) and before creating the optimizer. The bias parameter should match the bias mode used in lora_state_dict for checkpoint consistency.
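
A minimal ordering sketch, assuming a model already built with loralib layers and using loralib's lora_state_dict helper when saving (the checkpoint filename is hypothetical):

import torch
import loralib as lora

bias_mode = 'lora_only'  # keep consistent between training and checkpointing

# Freeze non-LoRA parameters before the optimizer is constructed
lora.mark_only_lora_as_trainable(model, bias=bias_mode)
optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=2e-4
)

# ... training loop ...

# Save only the trainable (LoRA + bias) parameters, using the same bias mode
torch.save(lora.lora_state_dict(model, bias=bias_mode), 'lora_ckpt.pt')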

Code Reference

Source Location

Signature

def mark_only_lora_as_trainable(model: nn.Module, bias: str = 'none') -> None:
    """Freeze all parameters except LoRA matrices and optionally biases.

    Args:
        model: The PyTorch model containing LoRA layers
        bias: Bias handling mode - 'none', 'all', or 'lora_only'

    Returns:
        None (modifies model in-place)
    """

Import

from loralib import mark_only_lora_as_trainable
# or
import loralib as lora
# then use lora.mark_only_lora_as_trainable

I/O Contract

Inputs

Name  | Type      | Required            | Description
model | nn.Module | Yes                 | PyTorch model containing LoRA-augmented layers
bias  | str       | No (default 'none') | Bias handling mode: 'none', 'all', or 'lora_only'

Outputs

Name | Type | Description
None | None | Modifies model in-place by setting requires_grad on parameters

Bias Mode Details

bias Value | Parameters with requires_grad=True
none       | Only parameters with "lora_" in name
all        | Parameters with "lora_" in name + all parameters with "bias" in name
lora_only  | Parameters with "lora_" in name + bias parameters in LoRA-augmented modules only
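
One way to confirm which parameters a given mode leaves trainable (a hypothetical check over an already-constructed LoRA model):

import loralib as lora

lora.mark_only_lora_as_trainable(model, bias='lora_only')

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print([n for n in trainable if 'lora_' in n])  # LoRA matrices, trainable in every mode
print([n for n in trainable if 'bias' in n])   # biases re-enabled by this mode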

Implementation Details

The function operates in two passes:

Pass 1: Iterate over all named parameters. Set requires_grad = False for any parameter whose name does not contain "lora_".

Pass 2 (if bias != 'none'): Re-enable gradients for bias parameters according to the mode:

  • all: Set requires_grad = True for any parameter with "bias" in its name
  • lora_only: Set requires_grad = True for bias parameters only in modules that also contain lora_ parameters
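
The sketch below mirrors this two-pass logic. It is an illustration of the described behavior rather than a verbatim copy of the library source; the 'lora_only' branch assumes loralib's LoRALayer base class identifies LoRA-augmented modules:

import torch.nn as nn
from loralib.layers import LoRALayer

def mark_only_lora_as_trainable_sketch(model: nn.Module, bias: str = 'none') -> None:
    # Pass 1: freeze every parameter whose name lacks the 'lora_' marker
    for name, param in model.named_parameters():
        if 'lora_' not in name:
            param.requires_grad = False
    if bias == 'none':
        return
    # Pass 2: selectively re-enable bias gradients
    if bias == 'all':
        for name, param in model.named_parameters():
            if 'bias' in name:
                param.requires_grad = True
    elif bias == 'lora_only':
        # Only biases that belong to LoRA-augmented modules
        for module in model.modules():
            if isinstance(module, LoRALayer) and getattr(module, 'bias', None) is not None:
                module.bias.requires_grad = True
    else:
        raise NotImplementedError(f"Unknown bias mode: {bias}")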

Usage Examples

Basic Usage (No Bias Training)

import loralib as lora

# After model construction with LoRA layers
lora.mark_only_lora_as_trainable(model)

# Verify: count trainable parameters
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable:,} / {total:,} ({100 * trainable / total:.2f}%)")
# Example output: Trainable: 294,912 / 124,439,808 (0.24%)

With All Biases Trainable

import loralib as lora

lora.mark_only_lora_as_trainable(model, bias='all')

With Only LoRA Layer Biases Trainable

import loralib as lora

lora.mark_only_lora_as_trainable(model, bias='lora_only')

Complete Workflow

import torch
import loralib as lora

# 1. Build model with LoRA layers
model = build_model_with_lora(r=8, lora_alpha=16)

# 2. Freeze non-LoRA parameters
lora.mark_only_lora_as_trainable(model, bias='none')

# 3. Create optimizer (only receives LoRA params due to requires_grad filtering)
optimizer = torch.optim.AdamW(
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=2e-4
)

# 4. Train as usual
for batch in dataloader:
    loss = model(batch)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
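
To round out the workflow, LoRA checkpoints are typically saved with loralib's lora_state_dict and restored on top of the pretrained weights with strict=False. A sketch assuming bias='none' to match step 2; the file names and the build_model_with_lora helper are placeholders:

# 5. Save only the LoRA parameters (a small checkpoint)
torch.save(lora.lora_state_dict(model, bias='none'), 'lora_only.pt')

# 6. Later: rebuild the model, load pretrained weights, then apply the LoRA weights
model = build_model_with_lora(r=8, lora_alpha=16)
model.load_state_dict(torch.load('pretrained.pt'), strict=False)
model.load_state_dict(torch.load('lora_only.pt'), strict=False)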

Related Pages

Implements Principle
