Implementation:Lm sys FastChat Make Delta Weights
| Knowledge Sources | |
|---|---|
| Domains | Model Weights, LLM, Weight Distribution |
| Last Updated | 2026-02-07 06:00 GMT |
Overview
Creates delta weights by computing the element-wise difference between a fine-tuned target model and its base model, enabling license-compliant weight distribution.
Description
The make_delta module provides the inverse operation of apply_delta. It takes a fully fine-tuned target model (e.g., Vicuna) and its original base model (e.g., LLaMA), then computes the element-wise difference delta = target - base across all model parameters. The resulting delta weights capture only what the fine-tuning process changed, and can be distributed separately without redistributing the proprietary base model weights.
Both models are loaded in float16 precision with low_cpu_mem_usage=True to reduce memory consumption during loading and subtraction. The function iterates over every entry in the target model's state dictionary, asserts that the entry's name also exists in the base model, and subtracts the base tensor in place (param.data -= base.state_dict()[name]). Because the subtraction is in place, the target model object itself holds the delta afterwards. The function then saves that model and the target model's tokenizer to the specified output path.
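The subtraction loop can be illustrated without loading any real checkpoint. The following is a minimal sketch, assuming plain Python lists stand in for tensors and dictionaries stand in for state dictionaries; `subtract_in_place` is a hypothetical helper and not part of FastChat.

```python
def subtract_in_place(target_state, base_state):
    """Turn target_state into a delta by subtracting base_state in place.

    Both arguments are hypothetical stand-ins for state dictionaries:
    they map parameter names to flat lists of floats.
    """
    for name, values in target_state.items():
        # Mirror the name check: every target entry must exist in the base.
        assert name in base_state, f"{name} missing from base model"
        # In-place update: the target's own storage now holds the delta.
        for i, base_value in enumerate(base_state[name]):
            values[i] -= base_value


base = {"linear.weight": [1.0, 2.0, 3.0], "linear.bias": [0.5]}
target = {"linear.weight": [1.25, 1.5, 3.0], "linear.bias": [0.75]}

subtract_in_place(target, base)
print(target)
```

Since the loop walks the target's names, an entry that exists only in the base model is never visited, while an entry that exists only in the target trips the assertion.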
The module also supports pushing the resulting delta weights directly to the HuggingFace Hub via the optional --hub-repo-id argument. When provided, both model.save_pretrained() and tokenizer.save_pretrained() are called with push_to_hub=True and the specified repository ID, automating the upload process.
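The optional Hub upload amounts to passing a few extra keyword arguments to the two save calls. The sketch below models only that decision, under stated assumptions: `build_save_kwargs` is a hypothetical helper, the `repo_id` keyword name is an assumption about how the repository ID is passed, and the path arguments are left out of the parser.

```python
import argparse


def build_save_kwargs(hub_repo_id):
    """Return the extra keyword arguments for the save_pretrained() calls."""
    if hub_repo_id:
        # Assumed keyword names; the source states only push_to_hub=True
        # plus the repository ID.
        return {"push_to_hub": True, "repo_id": hub_repo_id}
    return {}


parser = argparse.ArgumentParser()
# Only the optional Hub flag is modeled here; path arguments are omitted.
parser.add_argument("--hub-repo-id", type=str)

with_hub = parser.parse_args(["--hub-repo-id", "lmsys/vicuna-13b-delta-v1.1"])
without_hub = parser.parse_args([])

print(build_save_kwargs(with_hub.hub_repo_id))
print(build_save_kwargs(without_hub.hub_repo_id))
```

The returned dictionary would then be unpacked into both save calls, so omitting the flag leaves them as plain local saves.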
Usage
Use this module when you have fine-tuned a model and want to distribute the weight differences rather than the full model. This is the standard workflow for releasing Vicuna model deltas. The output of this module is consumed by apply_delta to reconstruct the full model. Run it as python3 -m fastchat.model.make_delta with the required path arguments.
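The relationship between the two modules is a round trip: subtracting the base yields the delta, and adding the base back recovers the target. A minimal sketch of that property follows, assuming lists of floats in place of tensors; `make_delta_values` and `apply_delta_values` are hypothetical names chosen to avoid confusion with the real functions.

```python
def make_delta_values(base, target):
    """delta = target - base, computed per parameter name."""
    return {
        name: [t - b for t, b in zip(values, base[name])]
        for name, values in target.items()
    }


def apply_delta_values(base, delta):
    """target = base + delta, computed per parameter name."""
    return {
        name: [b + d for b, d in zip(base[name], values)]
        for name, values in delta.items()
    }


# Values are chosen to be exactly representable in binary floating point,
# so the equality check below holds exactly.
base = {"w": [1.0, 2.0], "b": [0.5]}
target = {"w": [1.5, 1.75], "b": [0.25]}

delta = make_delta_values(base, target)
restored = apply_delta_values(base, delta)
print(delta)
print(restored == target)
```

With real float16 weights, the restored values may match the original target only up to rounding rather than bit for bit.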
Code Reference
Source Location
- Repository: Lm_sys_FastChat
- File: fastchat/model/make_delta.py
- Lines: 1-48
Signature
def make_delta(base_model_path: str, target_model_path: str, delta_path: str) -> None:
    """Computes delta = target - base and saves the result."""
    ...
Import
from fastchat.model.make_delta import make_delta
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| base_model_path | str | Yes | Path to the base model weights directory or HuggingFace model ID (e.g., llama-13b) |
| target_model_path | str | Yes | Path to the fine-tuned target model weights directory (e.g., vicuna-13b) |
| delta_path | str | Yes | Output path where the computed delta weights will be saved |
| --hub-repo-id | str | No | Command-line argument only, not a parameter of make_delta(); HuggingFace Hub repository ID for pushing the delta (e.g., lmsys/vicuna-13b-delta-v1.1) |
Outputs
| Name | Type | Description |
|---|---|---|
| delta_path (on disk) | Directory | Saved delta model directory containing the weight differences and tokenizer files |
| HuggingFace Hub (optional) | Remote repository | Delta weights pushed to the specified Hub repository when --hub-repo-id is provided |
Usage Examples
# Command-line usage
# python3 -m fastchat.model.make_delta \
# --base ~/model_weights/llama-13b \
# --target ~/model_weights/vicuna-13b \
# --delta ~/model_weights/vicuna-13b-delta \
# --hub-repo-id lmsys/vicuna-13b-delta-v1.1
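# Programmatic usage
import os

from fastchat.model.make_delta import make_delta

# Python does not expand "~" in path strings, so expand it explicitly.
make_delta(
    base_model_path=os.path.expanduser("~/model_weights/llama-13b"),
    target_model_path=os.path.expanduser("~/model_weights/vicuna-13b"),
    delta_path=os.path.expanduser("~/model_weights/vicuna-13b-delta"),
)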
Related Pages
- Implements: Principle:Lm_sys_FastChat_Model_Weight_Delta_Distribution
- Lm_sys_FastChat_Apply_Delta_Weights - Reconstructs the target model from base + delta (the inverse operation)
- Lm_sys_FastChat_Huggingface_API_Inference - Runs inference on reconstructed models