Principle: OpenGVLab InternVL LoRA Adapter Merging
| Knowledge Sources | |
|---|---|
| Domains | Parameter_Efficient_Finetuning, Model_Deployment |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A weight consolidation technique that folds trained LoRA adapter matrices back into the base model weights, producing a standard model without adapter overhead.
Description
After LoRA fine-tuning, the model consists of frozen base weights plus separate adapter matrices. LoRA merging computes the effective weight and stores it as the new base weight, then removes the adapter wrappers. This produces a standard model that:
- Has no inference overhead from adapter computation
- Can be loaded without the PEFT library
- Can serve as a base for further fine-tuning or deployment
- Has the same architecture as the original pretrained model
Usage
Apply this principle after LoRA fine-tuning, once training is complete and the adapter machinery is no longer needed: either to deploy the model directly or to use the merged weights as the starting point for a further round of training.
Theoretical Basis
For each LoRA-adapted layer with frozen base weight W, down-projection A, up-projection B, rank r, and scaling factor α, the merge computes the effective weight W' = W + (α/r)·B·A, stores W' as the new base weight, and removes the PEFT wrapper. Because the low-rank update is absorbed into W', the resulting model is architecturally identical to the original pretrained model.
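The equivalence of the adapter path and the merged path can be checked numerically. The dimensions and scaling below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16           # hidden size, LoRA rank, scaling alpha (illustrative)
W = rng.normal(size=(d, d))      # frozen base weight
A = rng.normal(size=(r, d))      # LoRA down-projection
B = rng.normal(size=(d, r))      # LoRA up-projection
x = rng.normal(size=(d,))

# Adapter path: base matmul plus the scaled low-rank update.
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merged path: fold the update into the base weight once, then a single matmul.
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x
```

The two paths produce identical outputs, which is why merging removes the adapter compute without changing model behavior.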
In InternVL, merging is done separately for:
- Vision model LoRA (if use_backbone_lora was set)
- Language model LoRA (if use_llm_lora was set)
After merging, the config flags are reset to 0 to indicate no adapters are present.
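The two-part merge and the flag reset can be sketched as follows. The attribute names (`vision_model`, `language_model`, `use_backbone_lora`, `use_llm_lora`) follow the InternVL codebase, but this helper is an assumed illustration of the procedure, not InternVL's exact API:

```python
def merge_internvl_loras(model):
    """Fold any active LoRA adapters into the base weights, then clear the flags.

    Assumes model.vision_model / model.language_model are PEFT-wrapped
    modules exposing merge_and_unload() when the corresponding flag is set.
    """
    # Vision-encoder LoRA (only present if use_backbone_lora was set).
    if getattr(model.config, "use_backbone_lora", 0):
        model.vision_model = model.vision_model.merge_and_unload()
        model.config.use_backbone_lora = 0  # signal: no adapters remain

    # Language-model LoRA (only present if use_llm_lora was set).
    if getattr(model.config, "use_llm_lora", 0):
        model.language_model = model.language_model.merge_and_unload()
        model.config.use_llm_lora = 0

    return model
```

Resetting the flags matters because downstream loading code typically inspects them to decide whether to re-wrap the submodules with adapters; leaving them set on a merged checkpoint would cause fresh, empty adapters to be attached on load.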