Principle:Unslothai Unsloth Vision LoRA Injection

From Leeroopedia


Knowledge Sources
Domains Vision, NLP, Parameter_Efficient_Finetuning
Last Updated 2026-02-07 00:00 GMT

Overview

A parameter-efficient fine-tuning technique that selectively injects LoRA adapters into the vision encoder, the language decoder, or both, in a vision-language model.

Description

Vision LoRA injection extends the standard LoRA principle to multimodal architectures. The key distinction is selective layer targeting: a vision-language model (VLM) has separate vision and language towers, and the practitioner chooses which towers to adapt and which module types to adapt within them:

  1. Vision Layers: Apply LoRA to the vision encoder (e.g., ViT attention/MLP blocks) to learn new visual representations.
  2. Language Layers: Apply LoRA to the language decoder to learn new text generation behaviors.
  3. Attention vs. MLP: Within the selected towers, choose whether LoRA targets the attention projections, the MLP layers, or both.

The get_peft_regex utility automatically detects which layers belong to vision vs. language towers and generates the appropriate PEFT target module regex based on the user's preferences.
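
The idea of turning tower and module preferences into a single PEFT target regex can be sketched as follows. This is a minimal illustration and not Unsloth's get_peft_regex implementation: the function build_target_regex, the tag tuples, and the module names in the usage lines are all hypothetical, and real checkpoints name their modules differently. It relies on the fact that PEFT treats a string target_modules as a regex that must match the full module name.

```python
import re

# Hypothetical substrings for recognising towers and module types.
# These are assumptions for illustration; adjust them per checkpoint.
VISION_TAGS = ("vision", "visual", "image_encoder")
LANGUAGE_TAGS = ("language_model", "text_model", "decoder")
ATTENTION_TAGS = ("q_proj", "k_proj", "v_proj", "o_proj")
MLP_TAGS = ("gate_proj", "up_proj", "down_proj", "fc1", "fc2")


def build_target_regex(vision=True, language=True, attention=True, mlp=True):
    """Build a full-match regex selecting module names to wrap with LoRA."""
    towers = (VISION_TAGS if vision else ()) + (LANGUAGE_TAGS if language else ())
    modules = (ATTENTION_TAGS if attention else ()) + (MLP_TAGS if mlp else ())
    if not towers:
        raise ValueError("Select the vision tower, the language tower, or both.")
    if not modules:
        raise ValueError("Select attention modules, MLP modules, or both.")
    tower_alt = "|".join(re.escape(t) for t in towers)
    module_alt = "|".join(re.escape(m) for m in modules)
    # A tower tag somewhere in the path, then a module-type leaf name at the end.
    return rf".*(?:{tower_alt}).*\.(?:{module_alt})"


# Hypothetical module names, shaped like a typical VLM's named_modules() keys.
names = [
    "model.vision_tower.encoder.layers.0.self_attn.q_proj",
    "model.vision_tower.encoder.layers.0.mlp.fc1",
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.mlp.down_proj",
    "model.language_model.layers.0.input_layernorm",
]

# Adapt only the attention projections of the language tower.
pattern = build_target_regex(vision=False, language=True, attention=True, mlp=False)
selected = [n for n in names if re.fullmatch(pattern, n)]
```

Substring tags keep the sketch short, but they are fragile: a production utility would more likely inspect the model's actual module tree than rely on naming conventions.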

Usage

Apply this principle when fine-tuning vision-language models. Set finetune_vision_layers=True to adapt the vision encoder (necessary for tasks requiring new visual understanding, like OCR on new fonts). Set finetune_language_layers=True for text generation adaptation. Both can be enabled simultaneously.
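
A configuration along these lines might look like the sketch below. The two tower flags come from the text above; finetune_attention_modules, finetune_mlp_modules, and the remaining keyword names follow Unsloth's public vision fine-tuning notebooks and should be treated as assumptions if your installed version differs. The rank and alpha values are illustrative, not tuned recommendations, and apply_vision_lora is a hypothetical wrapper.

```python
# Keyword arguments intended for Unsloth's FastVisionModel.get_peft_model.
# Values are illustrative only; names beyond the two tower flags are
# assumed from Unsloth's public notebooks.
lora_kwargs = dict(
    finetune_vision_layers=True,      # adapt the vision encoder
    finetune_language_layers=True,    # adapt the language decoder
    finetune_attention_modules=True,  # target attention projections
    finetune_mlp_modules=True,        # target MLP layers
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
)


def apply_vision_lora(model, **overrides):
    """Hypothetical wrapper: attach LoRA adapters to an already loaded VLM."""
    # Imported lazily so this snippet can be read and tested without Unsloth.
    from unsloth import FastVisionModel

    return FastVisionModel.get_peft_model(model, **{**lora_kwargs, **overrides})
```

For a task that only changes output style, calling apply_vision_lora(model, finetune_vision_layers=False) would leave the vision encoder untouched and train fewer parameters.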

Theoretical Basis

The LoRA mathematics are identical to text-only LoRA (see LoRA_Adapter_Injection), but applied selectively:

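# Abstract selective LoRA for VLMs (pseudocode)
target_modules = []
if finetune_vision_layers:
    target_modules += vision_encoder.attention_and_mlp_layers
if finetune_language_layers:
    target_modules += language_decoder.attention_and_mlp_layers

# Apply LoRA only to selected targets; every other layer stays frozen
for layer in target_modules:
    # W_frozen is never overwritten: only the low-rank factors
    # A (r x d_in) and B (d_out x r) receive gradients
    layer.effective_weight = layer.W_frozen + (alpha / r) * layer.B @ layer.A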
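
To make the update rule concrete, here is a small numeric sketch in plain Python. The dimensions, the helper names matmul and lora_effective_weight, and the initialization scale are illustrative assumptions. The zero initialization of B follows the standard LoRA setup, so the adapted layer starts out identical to the frozen one.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * B @ A without modifying W."""
    scale = alpha / r
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


d_out, d_in, r, alpha = 8, 8, 2, 4
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen weight
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                                # trainable, d_out x r

# With B at zero, the low-rank update vanishes and the layer is unchanged.
W_eff_initial = lora_effective_weight(W, A, B, alpha, r)

# Trainable parameters: the full matrix versus the two low-rank factors.
full_params = d_out * d_in
lora_params = r * (d_in + d_out)
```

In this toy setting the low-rank factors hold 32 parameters against 64 for the full matrix; the saving grows quickly with layer width because the adapter size scales with d_in + d_out rather than d_in * d_out.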

Related Pages

Implemented By
