Implementation:Lm sys FastChat Make Delta Weights

Knowledge Sources	Lm_sys_FastChat
Domains	Model Weights, LLM, Weight Distribution
Last Updated	2026-02-07 06:00 GMT

Overview

Creates delta weights by computing the element-wise difference between a fine-tuned target model and its base model, enabling license-compliant weight distribution.

Description

The make_delta module provides the inverse operation of apply_delta. It takes a fully fine-tuned target model (e.g., Vicuna) and its original base model (e.g., LLaMA), then computes the element-wise difference delta = target - base for every tensor in the model. The resulting delta weights capture only what fine-tuning changed, so they can be distributed on their own without redistributing the license-restricted base model weights. Anyone who already has the base weights can recover the target model as base + delta.
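The relationship between the two operations can be illustrated with a minimal sketch. It uses plain Python lists in place of tensors, and the helper names toy_make_delta and toy_apply_delta are hypothetical stand-ins, not FastChat functions.

```python
# Toy illustration of delta = target - base and target = base + delta.
# Plain lists stand in for tensors; the helper names are hypothetical.

def toy_make_delta(target, base):
    """Return a new dict holding target - base for every named entry."""
    return {
        name: [t - b for t, b in zip(values, base[name])]
        for name, values in target.items()
    }


def toy_apply_delta(base, delta):
    """Return a new dict holding base + delta for every named entry."""
    return {
        name: [b + d for b, d in zip(values, delta[name])]
        for name, values in base.items()
    }


base = {"w": [0.5, -1.0], "b": [0.25]}
target = {"w": [0.75, -1.5], "b": [0.5]}

delta = toy_make_delta(target, base)
restored = toy_apply_delta(base, delta)

print(delta)
print(restored == target)
```

The values here are chosen to be exactly representable in binary floating point, so the round trip is exact. With real float16 tensors the reconstruction is subject to rounding.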

Both models are loaded in float16 precision with low_cpu_mem_usage=True to reduce memory use while loading. The function iterates over every entry in the target model's state dictionary, asserts that the same name exists in the base model's state dictionary, and subtracts the base tensor in place (param.data -= base.state_dict()[name]). Because the tensors in a state dictionary share storage with the model's parameters, the target model object itself holds the delta once the loop finishes. The function then saves this modified model, together with the target model's tokenizer, to the specified output path.
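The loop described above can be sketched without loading any real model. This is a simplified illustration under stated assumptions: dictionaries of lists stand in for state dictionaries, and subtract_in_place is a hypothetical name rather than part of FastChat.

```python
# Sketch of the in-place subtraction loop, with dicts of lists standing in
# for state dictionaries. subtract_in_place is a hypothetical helper.

def subtract_in_place(target_state, base_state):
    """Overwrite target_state with target - base, entry by entry."""
    for name, values in target_state.items():
        # Same check as described above: every target entry needs a base entry.
        assert name in base_state, f"{name} is missing from the base model"
        for i, base_value in enumerate(base_state[name]):
            values[i] -= base_value  # mutate the existing list, no extra copy


base_state = {"layer.weight": [0.5, -1.0], "layer.bias": [0.25]}
target_state = {"layer.weight": [0.75, -1.5], "layer.bias": [0.25]}

subtract_in_place(target_state, base_state)
print(target_state)  # target_state now holds the delta, not the target weights
```

Mutating the target in place avoids holding a third copy of the weights in memory, at the cost of the original target values no longer being available afterwards.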

The module also supports pushing the resulting delta weights directly to the HuggingFace Hub via the optional --hub-repo-id argument. When provided, both model.save_pretrained() and tokenizer.save_pretrained() are called with push_to_hub=True and the specified repository ID, automating the upload process.
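One way the optional Hub arguments might be assembled is sketched below. The helper build_save_kwargs is hypothetical and only illustrates the branching described above. The commented-out calls assume model and tokenizer objects from the transformers library, which are not created in this sketch.

```python
# Hypothetical helper: build the extra keyword arguments for save_pretrained()
# depending on whether a Hub repository ID was supplied.

def build_save_kwargs(hub_repo_id=None):
    """Return push-to-hub kwargs when a repo ID is given, else an empty dict."""
    if hub_repo_id:
        return {"push_to_hub": True, "repo_id": hub_repo_id}
    return {}


kwargs = build_save_kwargs("lmsys/vicuna-13b-delta-v1.1")
print(kwargs)

# With real transformers objects, the same kwargs would be passed to both calls:
# model.save_pretrained(delta_path, **kwargs)
# tokenizer.save_pretrained(delta_path, **kwargs)
```

Passing an empty dictionary leaves both calls as plain local saves, so a single code path covers the local-only and the upload case.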

Usage

Use this module when you have fine-tuned a model and want to distribute the weight differences rather than the full model. This is the standard workflow for releasing Vicuna model deltas. The output of this module is consumed by apply_delta to reconstruct the full model. Run it as python3 -m fastchat.model.make_delta with the required path arguments.

Code Reference

Source Location

Repository: Lm_sys_FastChat
File: fastchat/model/make_delta.py
Lines: 1-48

Signature

def make_delta(base_model_path: str, target_model_path: str, delta_path: str) -> None:
    """Computes delta = target - base and saves the result."""
    ...

Import

from fastchat.model.make_delta import make_delta

I/O Contract

Inputs

Name	Type	Required	Description
base_model_path	str	Yes	Path to the base model weights directory or HuggingFace model ID (e.g., llama-13b)
target_model_path	str	Yes	Path to the fine-tuned target model weights directory (e.g., vicuna-13b)
delta_path	str	Yes	Output path where the computed delta weights will be saved
--hub-repo-id	str	No	Command-line option only (not a parameter of make_delta()): HuggingFace Hub repository ID for pushing the delta (e.g., lmsys/vicuna-13b-delta-v1.1)

Outputs

Name	Type	Description
delta_path (on disk)	Directory	Saved delta model directory containing the weight differences and tokenizer files
HuggingFace Hub (optional)	Remote repository	Delta weights pushed to the specified Hub repository when --hub-repo-id is provided

Usage Examples

# Command-line usage
# python3 -m fastchat.model.make_delta \
#     --base ~/model_weights/llama-13b \
#     --target ~/model_weights/vicuna-13b \
#     --delta ~/model_weights/vicuna-13b-delta \
#     --hub-repo-id lmsys/vicuna-13b-delta-v1.1

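# Programmatic usage
import os

from fastchat.model.make_delta import make_delta

# Python does not expand "~" the way a shell does, so expand it explicitly.
weights_dir = os.path.expanduser("~/model_weights")

# Hub upload is controlled by the --hub-repo-id command-line option,
# which is not one of the three parameters in the signature above.
make_delta(
    base_model_path=os.path.join(weights_dir, "llama-13b"),
    target_model_path=os.path.join(weights_dir, "vicuna-13b"),
    delta_path=os.path.join(weights_dir, "vicuna-13b-delta"),
)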

Related Pages

Implements: Principle:Lm_sys_FastChat_Model_Weight_Delta_Distribution
Lm_sys_FastChat_Apply_Delta_Weights - Reconstructs the target model from base + delta (the inverse operation)
Lm_sys_FastChat_Huggingface_API_Inference - Runs inference on reconstructed models
