
Principle:WAInjectBench LoRA Adapter Injection

From Leeroopedia
Knowledge Sources
Domains Deep_Learning, Parameter_Efficient_Finetuning
Last Updated 2026-02-14 16:00 GMT

Overview

A parameter-efficient fine-tuning technique that injects low-rank adapter matrices into a frozen pre-trained model, enabling training with a fraction of the original parameter count.

Description

Low-Rank Adaptation (LoRA) adds trainable low-rank decomposition matrices A and B to selected weight matrices in a pre-trained model while keeping the original weights frozen. For a weight matrix W, the modified forward pass becomes h = Wx + (α/r)BAx, where r is the rank and α is the scaling factor.

In the WAInjectBench LLaVA fine-tuning pipeline, LoRA is applied to a comprehensive set of target modules spanning both the language model and vision components: attention projections (q_proj, k_proj, v_proj, o_proj), MLP layers (gate_proj, up_proj, down_proj), and vision encoder layers (fc1, fc2, Wqkv, out_proj, proj, dense).
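A configuration covering the module set above can be sketched with the Hugging Face PEFT library. The module names are taken from the list in this section; the rank, alpha, and dropout values are illustrative assumptions, not values confirmed by this page:

```python
# Sketch of a LoRA config spanning language-model and vision-encoder modules.
# r=16, lora_alpha=32, lora_dropout=0.05 are assumed defaults for illustration.
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        # language model: attention projections and MLP layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        # vision encoder layers
        "fc1", "fc2", "Wqkv", "out_proj", "proj", "dense",
    ],
)
```

Passing explicit module names rather than a preset means every listed projection in both towers receives an adapter; modules not named stay fully frozen.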

Usage

Use this when fine-tuning large models with limited GPU memory. LoRA reduces the trainable parameter count from billions to millions while maintaining most of the model's pre-trained knowledge.
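The "billions to millions" reduction follows directly from the shapes involved. A quick arithmetic sketch for a single square weight matrix (dimensions chosen for illustration, not taken from this page):

```python
# Trainable parameters for one 4096 x 4096 weight matrix:
# full fine-tuning updates every entry of W; LoRA trains only A and B.
d, k, r = 4096, 4096, 16

full_params = d * k          # every entry of W
lora_params = r * k + d * r  # A is r x k, B is d x r

print(full_params)                # 16777216
print(lora_params)                # 131072
print(full_params // lora_params) # 128x fewer trainable parameters
```

Summed over every adapted matrix in a multi-billion-parameter model, the same ratio shrinks the trainable set to the millions while W itself stays untouched.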

Theoretical Basis

W′ = W + (α/r)BA

Where:

  • W ∈ ℝ^(d×k) is the frozen pre-trained weight
  • A ∈ ℝ^(r×k) and B ∈ ℝ^(d×r) are the trainable low-rank matrices
  • r ≪ min(d, k) is the rank (typically 4-64)
  • α is the scaling factor
# LoRA injection with the PEFT library
from peft import LoraConfig, get_peft_model

config = LoraConfig(r=rank, lora_alpha=alpha, target_modules=[...])
model = get_peft_model(model, config)
# get_peft_model freezes all base params and leaves only lora_* params trainable
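
The update formula above can also be checked numerically without any framework. A minimal NumPy sketch of the forward pass h = Wx + (α/r)BAx, using small illustrative dimensions and the standard LoRA initialization (A random, B zero):

```python
# NumPy sketch of the LoRA forward pass; shapes follow the definitions above.
import numpy as np

d, k, r, alpha = 8, 6, 2, 4
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))  # frozen pre-trained weight, d x k
A = rng.standard_normal((r, k))  # trainable, r x k, random init
B = np.zeros((d, r))             # trainable, d x r, zero init
x = rng.standard_normal(k)

h = W @ x + (alpha / r) * (B @ (A @ x))

# With B initialized to zero, the adapter is a no-op:
# the adapted forward pass equals the frozen model's output.
assert np.allclose(h, W @ x)
```

Because B starts at zero, training begins exactly at the pre-trained model's behavior, and the low-rank product BA learns only the deviation from it.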

Related Pages

Implemented By

Uses Heuristic
