Principle: OpenGVLab InternVL LoRA Adapter Merging
| Knowledge Sources | |
|---|---|
| Domains | Parameter_Efficient_Finetuning, Model_Deployment |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A weight consolidation technique that folds trained LoRA adapter matrices back into the base model weights, producing a standard model without adapter overhead.
Description
After LoRA fine-tuning, the model consists of frozen base weights plus separate adapter matrices. LoRA merging computes the effective weight and stores it as the new base weight, then removes the adapter wrappers. This produces a standard model that:
- Has no inference overhead from adapter computation
- Can be loaded without the PEFT library
- Can serve as a base for further fine-tuning or deployment
- Has the same architecture as the original pretrained model
Usage
Apply this principle after LoRA fine-tuning, once training is complete and the adapter machinery is no longer needed: either to deploy the model directly or to use the merged weights as the starting point for a further round of training.
Theoretical Basis
For each LoRA-adapted layer with frozen base weight W, down-projection A, up-projection B, rank r, and scaling factor α, the merge computes the effective weight W' = W + (α/r)·B·A, stores W' as the new base weight, and removes the PEFT wrapper. Because the low-rank update is absorbed into W', the resulting model is architecturally identical to the original pretrained model.
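The equivalence of the adapter path and the merged path can be checked numerically. The dimensions and scaling below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16           # hidden size, LoRA rank, scaling alpha (illustrative)
W = rng.normal(size=(d, d))      # frozen base weight
A = rng.normal(size=(r, d))      # LoRA down-projection
B = rng.normal(size=(d, r))      # LoRA up-projection
x = rng.normal(size=(d,))

# Adapter path: base matmul plus the scaled low-rank update.
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merged path: fold the update into the base weight once, then a single matmul.
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x
```

The two paths produce identical outputs, which is why merging removes the adapter compute without changing model behavior.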
In InternVL, merging is done separately for:
- Vision model LoRA (if use_backbone_lora was set)
- Language model LoRA (if use_llm_lora was set)
After merging, the config flags are reset to 0 to indicate no adapters are present.
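The two-part merge and the flag reset can be sketched as follows. The attribute names (`vision_model`, `language_model`, `use_backbone_lora`, `use_llm_lora`) follow the InternVL codebase, but this helper is an assumed illustration of the procedure, not InternVL's exact API:

```python
def merge_internvl_loras(model):
    """Fold any active LoRA adapters into the base weights, then clear the flags.

    Assumes model.vision_model / model.language_model are PEFT-wrapped
    modules exposing merge_and_unload() when the corresponding flag is set.
    """
    # Vision-encoder LoRA (only present if use_backbone_lora was set).
    if getattr(model.config, "use_backbone_lora", 0):
        model.vision_model = model.vision_model.merge_and_unload()
        model.config.use_backbone_lora = 0  # signal: no adapters remain

    # Language-model LoRA (only present if use_llm_lora was set).
    if getattr(model.config, "use_llm_lora", 0):
        model.language_model = model.language_model.merge_and_unload()
        model.config.use_llm_lora = 0

    return model
```

Resetting the flags matters because downstream loading code typically inspects them to decide whether to re-wrap the submodules with adapters; leaving them set on a merged checkpoint would cause fresh, empty adapters to be attached on load.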