Implementation:Unslothai Unsloth FastVisionModel Get Peft Model

From Leeroopedia
Revision as of 17:02, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Unslothai_Unsloth_FastVisionModel_Get_Peft_Model.md)

Knowledge Sources
Domains Vision, NLP, Parameter_Efficient_Finetuning
Last Updated 2026-02-07 00:00 GMT

Overview

A concrete tool, provided by the Unsloth library, for injecting LoRA adapters into vision-language models (VLMs) with selective targeting of vision and language layers.

Description

FastVisionModel.get_peft_model (implemented in FastBaseModel.get_peft_model) extends standard LoRA injection with VLM-specific parameters that control which model components receive adapters. When target_modules is None, it calls get_peft_regex() to distinguish vision layers from language layers and to generate the PEFT target-module pattern from four flags: finetune_vision_layers, finetune_language_layers, finetune_attention_modules, and finetune_mlp_modules.
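
The flag-to-pattern idea described above can be illustrated with a minimal sketch. This is not Unsloth's get_peft_regex(): the function name build_target_regex, the tag tuples, and the sample module paths are all hypothetical, chosen only to show how four booleans might combine into one regular expression. PEFT treats a string target_modules as a regular expression matched against module names, which is why a single pattern can express the selection.

```python
import re

# Hypothetical name fragments (assumptions for illustration only);
# real VLM architectures name their submodules differently.
VISION_TAGS = ("vision", "visual", "image_encoder")
LANGUAGE_TAGS = ("language_model", "text_model", "decoder")
ATTENTION_TAGS = ("q_proj", "k_proj", "v_proj", "o_proj")
MLP_TAGS = ("gate_proj", "up_proj", "down_proj")


def build_target_regex(
    finetune_vision_layers=True,
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
):
    """Return a regex string selecting module names according to the four flags."""
    towers = []
    if finetune_vision_layers:
        towers += VISION_TAGS
    if finetune_language_layers:
        towers += LANGUAGE_TAGS

    kinds = []
    if finetune_attention_modules:
        kinds += ATTENTION_TAGS
    if finetune_mlp_modules:
        kinds += MLP_TAGS

    if not towers or not kinds:
        raise ValueError("Enable at least one layer flag and one module flag")

    tower_alt = "|".join(re.escape(t) for t in towers)
    kind_alt = "|".join(re.escape(k) for k in kinds)
    # Match a dotted path that contains a tower tag and ends in a selected leaf name.
    return rf".*(?:{tower_alt}).*\.(?:{kind_alt})"


# Language-only selection: vision paths no longer match.
pattern = build_target_regex(finetune_vision_layers=False)
print(bool(re.fullmatch(pattern, "model.language_model.layers.0.self_attn.q_proj")))  # True
print(bool(re.fullmatch(pattern, "model.visual.blocks.0.attn.q_proj")))  # False
```

Combining the two flag pairs as an AND (component and module type) mirrors how the parameters are described here; the actual matching rules in Unsloth may differ per architecture.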

Usage

Call after FastVisionModel.from_pretrained and before configuring the trainer. For most VLM fine-tuning, enable both vision and language layers. For text-only adaptation of a VLM (e.g., changing output format), disable vision layers.

Code Reference

Source Location

  • Repository: unsloth
  • File: unsloth/models/vision.py
  • Lines: L941-1130

Signature

class FastBaseModel:
    @staticmethod
    def get_peft_model(
        model,
        r = 16,
        target_modules = None,
        lora_alpha = 16,
        lora_dropout = 0.0,
        bias = "none",
        finetune_vision_layers = True,
        finetune_language_layers = True,
        finetune_attention_modules = True,
        finetune_mlp_modules = True,
        layers_to_transform = None,
        layers_pattern = None,
        use_gradient_checkpointing = "unsloth",
        random_state = 3407,
        max_seq_length = 2048,
        use_rslora = False,
        modules_to_save = None,
        init_lora_weights = True,
        loftq_config = {},
        task_type = TaskType.CAUSAL_LM,
        temporary_location = "_unsloth_temporary_saved_buffers",
        qat_scheme = None,
        target_parameters = None,
        ensure_weight_tying = False,
        **kwargs,
    ) -> PeftModel:
        """
        Args:
            model: VLM from FastVisionModel.from_pretrained.
            finetune_vision_layers: Apply LoRA to vision encoder. Default True.
            finetune_language_layers: Apply LoRA to language decoder. Default True.
            finetune_attention_modules: Target attention projections. Default True.
            finetune_mlp_modules: Target MLP layers. Default True.
            target_modules: Auto-detected via get_peft_regex() if None.
        """

Import

from unsloth import FastVisionModel
# Called as: FastVisionModel.get_peft_model(model, ...)

I/O Contract

Inputs

| Name | Type | Required | Description |
| model | PreTrainedModel | Yes | VLM from FastVisionModel.from_pretrained |
| r | int | No | LoRA rank (default: 16) |
| finetune_vision_layers | bool | No | Apply LoRA to vision encoder (default: True) |
| finetune_language_layers | bool | No | Apply LoRA to language decoder (default: True) |
| finetune_attention_modules | bool | No | Target attention projections (default: True) |
| finetune_mlp_modules | bool | No | Target MLP layers (default: True) |
| lora_alpha | int | No | LoRA scaling factor (default: 16) |
| use_gradient_checkpointing | str or bool | No | "unsloth" selects Unsloth's optimized gradient checkpointing (default: "unsloth") |
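
How r and lora_alpha interact is easy to miss in a table. In standard LoRA, the adapter update is multiplied by lora_alpha / r; with rank-stabilized LoRA (use_rslora=True) PEFT uses lora_alpha / sqrt(r) instead. The helper below is a hypothetical illustration of that arithmetic, not a function from Unsloth or PEFT.

```python
import math


def lora_scaling(r, lora_alpha, use_rslora=False):
    """Return the factor applied to the LoRA update for a given rank and alpha."""
    if r <= 0:
        raise ValueError("r must be a positive integer")
    if use_rslora:
        # Rank-stabilized LoRA divides by sqrt(r) rather than r.
        return lora_alpha / math.sqrt(r)
    return lora_alpha / r


print(lora_scaling(r=16, lora_alpha=16))                   # 1.0
print(lora_scaling(r=16, lora_alpha=16, use_rslora=True))  # 4.0
print(lora_scaling(r=64, lora_alpha=16))                   # 0.25
```

With the defaults (r = 16, lora_alpha = 16) the factor is 1.0, so raising r without also raising lora_alpha shrinks the effective contribution of the adapters.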

Outputs

| Name | Type | Description |
| model | PeftModel | VLM with LoRA adapters on selected vision/language layers, gradient checkpointing configured |
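
To check that the returned model carries adapters only where intended, one option is to tally trainable LoRA parameters by component. The sketch below is a hypothetical helper, not part of Unsloth: it relies on PEFT's convention that LoRA weights have "lora_" in their parameter names, and it assumes that vision-encoder parameter names contain "vision" or "visual", which varies by architecture. Everything that does not match a vision tag is counted as language.

```python
from collections import Counter


def summarize_lora_parameters(named_parameters, vision_tags=("vision", "visual")):
    """Count trainable LoRA parameter elements per component.

    named_parameters: iterable of (name, param) pairs where each param exposes
        .numel() and .requires_grad, as torch.nn.Parameter does.
    vision_tags: assumed substrings that mark vision-encoder parameters.
    """
    counts = Counter()
    for name, param in named_parameters:
        # Skip frozen base weights and anything that is not a LoRA matrix.
        if not param.requires_grad or "lora_" not in name:
            continue
        is_vision = any(tag in name for tag in vision_tags)
        counts["vision" if is_vision else "language"] += param.numel()
    return dict(counts)
```

A typical call would be summarize_lora_parameters(model.named_parameters()). After a language-only configuration, the result would be expected to contain no "vision" key. For a quick overall total, PeftModel also provides print_trainable_parameters().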

Usage Examples

Full VLM LoRA (Vision + Language)

from unsloth import FastVisionModel

model, processor = FastVisionModel.from_pretrained(
    model_name="unsloth/Qwen2-VL-7B-Instruct-bnb-4bit",
    load_in_4bit=True,
)

model = FastVisionModel.get_peft_model(
    model,
    r=16,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    use_gradient_checkpointing="unsloth",
)

Language-Only LoRA (Freeze Vision Encoder)

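# Assumes `model` is a freshly loaded base model from FastVisionModel.from_pretrained
# (as in the previous example), not one that already has adapters attached.
model = FastVisionModel.get_peft_model(
    model,
    r=16,
    finetune_vision_layers=False,   # No adapters on the vision encoder (it stays frozen)
    finetune_language_layers=True,  # Only adapt the language decoder
)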

Related Pages

Implements Principle

Requires Environment
