Principle: WAInjectBench LoRA Adapter Injection
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Parameter_Efficient_Finetuning |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
A parameter-efficient fine-tuning technique that injects low-rank adapter matrices into a frozen pre-trained model, enabling training with a fraction of the original parameter count.
Description
Low-Rank Adaptation (LoRA) adds trainable low-rank decomposition matrices $B$ and $A$ to selected weight matrices in a pre-trained model while keeping the original weights frozen. For a weight matrix $W_0 \in \mathbb{R}^{d \times k}$, the modified forward pass becomes $h = W_0 x + \frac{\alpha}{r} B A x$, where $r$ is the rank and $\alpha$ is the scaling factor.
In the WAInjectBench LLaVA fine-tuning pipeline, LoRA is applied to a comprehensive set of target modules spanning both the language model and vision components: attention projections (q_proj, k_proj, v_proj, o_proj), MLP layers (gate_proj, up_proj, down_proj), and vision encoder layers (fc1, fc2, Wqkv, out_proj, proj, dense).
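A minimal sketch of how such a configuration might look with the Hugging Face PEFT library; the module names come from the list above, while the rank, alpha, and dropout values are illustrative assumptions, not the pipeline's actual settings:

```python
from peft import LoraConfig

# Illustrative config covering the language-model and vision-encoder modules
# named above; r, lora_alpha, and lora_dropout are assumed values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        # language model: attention projections and MLP layers
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
        # vision encoder layers
        "fc1", "fc2", "Wqkv", "out_proj", "proj", "dense",
    ],
    lora_dropout=0.05,
    bias="none",
)
```

Listing modules from both towers means the adapters can adjust visual feature extraction as well as language generation, at a small additional parameter cost.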
Usage
Use LoRA when fine-tuning large models under limited GPU memory. It reduces the trainable parameter count from billions to millions while preserving most of the model's pre-trained knowledge in the frozen base weights.
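To see the scale of that reduction, the trainable-parameter count for a single adapted layer can be computed directly; the layer dimensions and rank below are hypothetical:

```python
def lora_param_count(d: int, k: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d x k weight: B (d x r) plus A (r x k)."""
    return r * (d + k)

# Hypothetical 4096 x 4096 attention projection, rank 16:
full = 4096 * 4096                       # 16,777,216 frozen weights
lora = lora_param_count(4096, 4096, 16)  # 131,072 trainable weights
print(f"LoRA trains {lora / full:.2%} of the layer's parameters")
```

Because the count grows linearly in `r` rather than with the product of the layer dimensions, even modest ranks keep the trainable fraction below one percent per layer.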
Theoretical Basis
$$h = W_0 x + \Delta W x = W_0 x + \frac{\alpha}{r} B A x$$

Where:
- $W_0 \in \mathbb{R}^{d \times k}$ is the frozen pre-trained weight
- $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ are the trainable low-rank matrices
- $r$ is the rank (typically 4-64)
- $\alpha$ is the scaling factor
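The forward pass described above can be sketched numerically with NumPy; the zero initialization of $B$ follows the standard LoRA setup, while the dimensions chosen here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r, alpha = 64, 32, 8, 16

W0 = rng.normal(size=(d, k))         # frozen pre-trained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                 # trainable, zero init -> delta starts at 0
x = rng.normal(size=(k,))

# Modified forward pass: h = W0 x + (alpha / r) * B A x
h = W0 @ x + (alpha / r) * (B @ (A @ x))

# With B initialized to zero, the adapted model initially matches the base model
assert np.allclose(h, W0 @ x)
```

Initializing $B$ to zero guarantees that training starts from exactly the pre-trained model's behavior, so the adapters only gradually steer the output as their weights are updated.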
```python
# LoRA injection via Hugging Face PEFT
from peft import LoraConfig, get_peft_model

config = LoraConfig(r=rank, lora_alpha=alpha, target_modules=[...])
model = get_peft_model(model, config)
# get_peft_model freezes all base params and enables only lora_* params
```