
Implementation:Microsoft LoRA Run GLUE LoRA Config

From Leeroopedia


Overview

Run GLUE LoRA Config is an API document covering the LoRA configuration and injection logic in run_glue.py in the microsoft/LoRA repository. It documents the ModelArguments dataclass, the config-creation step that propagates LoRA settings, and the parameter-freezing logic that ensures only LoRA weights are trainable.

Source File

File | Lines | Description
examples/NLU/examples/text-classification/run_glue.py | 142-219 | ModelArguments dataclass definition
examples/NLU/examples/text-classification/run_glue.py | 345-361 | Config creation with LoRA settings
examples/NLU/examples/text-classification/run_glue.py | 378-418 | LoRA weight loading and parameter freezing logic

CLI Flags

python examples/text-classification/run_glue.py \
    --model_name_or_path roberta-base \
    --task_name mnli \
    --apply_lora \
    --lora_r 8 \
    --lora_alpha 16 \
    --lora_path ./path/to/lora_weights.pt

Input / Output

Direction | Description
Input | Model name or path + LoRA hyperparameters (rank, alpha) via CLI flags
Output | Model config with LoRA settings applied; model with lora.Linear layers in attention; frozen pretrained parameters

ModelArguments Dataclass

The ModelArguments dataclass (lines 142-219) defines all model-related CLI arguments, including LoRA-specific fields:

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModelArguments:
    model_name_or_path: str = field(
        metadata={"help": "Path to pretrained model or model identifier from huggingface.co/models"}
    )
    config_name: Optional[str] = field(default=None)
    tokenizer_name: Optional[str] = field(default=None)
    cache_dir: Optional[str] = field(default=None)
    use_fast_tokenizer: bool = field(default=True)
    model_revision: str = field(default="main")
    use_auth_token: bool = field(default=False)

    # LoRA-specific arguments
    apply_lora: Optional[bool] = field(
        default=False,
        metadata={"help": "Whether to apply LoRA or not."},
    )
    lora_alpha: Optional[int] = field(
        default=None,
        metadata={"help": "LoRA alpha"},
    )
    lora_r: Optional[int] = field(
        default=None,
        metadata={"help": "LoRA r"},
    )
    lora_path: Optional[str] = field(
        default=None,
        metadata={"help": "The file path of LoRA parameters."},
    )

    # Other adaptation methods (for comparison)
    apply_adapter: Optional[bool] = field(default=False)
    adapter_path: Optional[str] = field(default=None)
    adapter_type: Optional[str] = field(default='houlsby')
    adapter_size: Optional[int] = field(default=64)
    apply_bitfit: Optional[bool] = field(default=False)
    reg_loss_wgt: Optional[float] = field(default=0.0)
    masking_prob: Optional[float] = field(default=0.0)
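
These fields behave like any other dataclass defaults: values parsed from the CLI surface as plain attributes. A minimal standalone sketch (trimmed to just the LoRA fields; the class name LoraArgs is illustrative, not from the repository) shows this:

```python
from dataclasses import dataclass, field
from typing import Optional

# Trimmed-down stand-in for ModelArguments, keeping only the LoRA fields.
@dataclass
class LoraArgs:
    apply_lora: Optional[bool] = field(default=False)
    lora_r: Optional[int] = field(default=None)
    lora_alpha: Optional[int] = field(default=None)
    lora_path: Optional[str] = field(default=None)

# Mirrors passing --apply_lora --lora_r 8 --lora_alpha 16 on the command line.
args = LoraArgs(apply_lora=True, lora_r=8, lora_alpha=16)
print(args.apply_lora, args.lora_r, args.lora_alpha)  # True 8 16
```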

Config Creation (Lines 345-361)

The LoRA settings are propagated to the model via AutoConfig.from_pretrained():

config = AutoConfig.from_pretrained(
    model_args.config_name if model_args.config_name else model_args.model_name_or_path,
    num_labels=num_labels,
    finetuning_task=data_args.task_name,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
    use_auth_token=True if model_args.use_auth_token else None,
    cls_dropout=training_args.cls_dropout,
    apply_lora=model_args.apply_lora,
    lora_alpha=model_args.lora_alpha,
    lora_r=model_args.lora_r,
    apply_adapter=model_args.apply_adapter,
    adapter_type=model_args.adapter_type,
    adapter_size=model_args.adapter_size,
    reg_loss_wgt=model_args.reg_loss_wgt,
    masking_prob=model_args.masking_prob,
)

The apply_lora, lora_alpha, and lora_r fields become attributes on the config object, which the model's self-attention layers read during __init__.
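
The exact layer construction lives in the repository's modified modeling code, but the pattern can be sketched with plain classes (the names SelfAttentionSketch, PlainLinear, and LoraLinear are illustrative stand-ins for torch.nn.Linear and loralib's lora.Linear, not the repo's actual classes):

```python
# Sketch: when config.apply_lora is set, the attention projection is built
# as a LoRA-wrapped linear layer instead of a plain linear layer.
class PlainLinear:
    def __init__(self, d_in, d_out):
        self.kind = "nn.Linear"

class LoraLinear:
    def __init__(self, d_in, d_out, r, lora_alpha):
        self.kind = f"lora.Linear(r={r}, alpha={lora_alpha})"

class SelfAttentionSketch:
    def __init__(self, config, hidden=768):
        if getattr(config, "apply_lora", False):
            self.query = LoraLinear(hidden, hidden, config.lora_r, config.lora_alpha)
        else:
            self.query = PlainLinear(hidden, hidden)

class Cfg:  # stands in for the AutoConfig object with LoRA attrs attached
    apply_lora, lora_r, lora_alpha = True, 8, 16

attn = SelfAttentionSketch(Cfg())
print(attn.query.kind)  # lora.Linear(r=8, alpha=16)
```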

LoRA Weight Loading (Lines 379-385)

When a pretrained LoRA checkpoint is provided via --lora_path, the weights are loaded with strict=False:

if model_args.apply_lora:
    if model_args.lora_path is not None:
        lora_state_dict = torch.load(model_args.lora_path)
        logger.info(f"Apply LoRA state dict from {model_args.lora_path}.")
        logger.info(lora_state_dict.keys())
        model.load_state_dict(lora_state_dict, strict=False)
    trainable_params.append('lora')

Loading with strict=False is essential because the LoRA checkpoint contains only the LoRA parameters (and possibly the classifier head), not the full model weights; the missing keys (the pretrained backbone weights) are silently ignored.
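
A checkpoint of this shape is produced by keeping only the parameters whose names contain 'lora' (loralib provides a lora_state_dict helper for this). A dict-based sketch of the filtering, with illustrative key names:

```python
# Sketch: filter a full state dict down to the LoRA-only checkpoint that
# --lora_path expects. Keys and values here are illustrative placeholders.
full_state = {
    "roberta.encoder.layer.0.attention.self.query.weight": "...",
    "roberta.encoder.layer.0.attention.self.query.lora_A": "...",
    "roberta.encoder.layer.0.attention.self.query.lora_B": "...",
    "classifier.dense.weight": "...",
}

lora_state = {k: v for k, v in full_state.items() if "lora" in k}

# Backbone and classifier keys are absent from lora_state, which is why
# model.load_state_dict(lora_state, strict=False) is required.
print(len(lora_state))  # 2
```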

Parameter Freezing (Lines 409-418)

After model creation and optional LoRA weight loading, all backbone parameters are frozen and only LoRA parameters are made trainable:

if len(trainable_params) > 0:
    for name, param in model.named_parameters():
        if name.startswith('deberta') or name.startswith('roberta'):
            param.requires_grad = False
            for trainable_param in trainable_params:
                if trainable_param in name:
                    param.requires_grad = True
                    break
        else:
            param.requires_grad = True

This logic:

  • Freezes all parameters whose names start with deberta or roberta (the pretrained backbone)
  • Unfreezes backbone parameters containing 'lora' in their name (the injected LoRA matrices)
  • Leaves trainable all parameters that do not belong to the backbone (e.g., the classification head)
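
The same rule can be exercised on a toy parameter list mimicking model.named_parameters(); the tiny Param class below stands in for torch parameters, and the names are illustrative:

```python
# Sketch: apply the freezing rule to a toy set of named parameters.
class Param:
    def __init__(self):
        self.requires_grad = True

params = {
    "roberta.encoder.layer.0.attention.self.query.weight": Param(),
    "roberta.encoder.layer.0.attention.self.query.lora_A": Param(),
    "classifier.dense.weight": Param(),
}

trainable_params = ["lora"]
for name, param in params.items():
    if name.startswith("deberta") or name.startswith("roberta"):
        param.requires_grad = False  # freeze the backbone...
        for trainable_param in trainable_params:
            if trainable_param in name:
                param.requires_grad = True  # ...except LoRA matrices
                break
    else:
        param.requires_grad = True  # non-backbone params stay trainable

trainable = [n for n, p in params.items() if p.requires_grad]
# Only the lora_A matrix and the classifier head remain trainable.
print(trainable)
```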

Argument Parsing

All three argument groups are parsed by HfArgumentParser (line 227):

parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
model_args, data_args, training_args = parser.parse_args_into_dataclasses()

This allows LoRA flags to coexist with standard HuggingFace TrainingArguments (learning rate, batch size, epochs, etc.) and DataTrainingArguments (task name, sequence length, etc.).
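
Conceptually, HfArgumentParser generates one CLI flag per dataclass field. A stdlib argparse sketch of the same mapping (an approximation of HfArgumentParser's behavior, limited to the LoRA fields; flag names and defaults are drawn from ModelArguments above):

```python
import argparse

# Stdlib sketch of the flag mapping: each dataclass field becomes a
# --<field_name> flag carrying the field's type and default.
parser = argparse.ArgumentParser()
parser.add_argument("--model_name_or_path", required=True)
parser.add_argument("--apply_lora", action="store_true", default=False)
parser.add_argument("--lora_r", type=int, default=None)
parser.add_argument("--lora_alpha", type=int, default=None)
parser.add_argument("--lora_path", default=None)

args = parser.parse_args(
    ["--model_name_or_path", "roberta-base", "--apply_lora", "--lora_r", "8"]
)
print(args.apply_lora, args.lora_r)  # True 8
```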
