Implementation:Lm sys FastChat Make Delta Weights

From Leeroopedia
Revision as of 15:34, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Lm_sys_FastChat_Make_Delta_Weights.md)


Knowledge Sources
Domains: Model Weights, LLM, Weight Distribution
Last Updated: 2026-02-07 06:00 GMT

Overview

Creates delta weights by computing the element-wise difference between a fine-tuned target model and its base model, enabling license-compliant weight distribution.

Description

The make_delta module provides the inverse operation of apply_delta. It takes a fully fine-tuned target model (e.g., Vicuna) and its original base model (e.g., LLaMA), then computes the element-wise difference delta = target - base across all model parameters. The resulting delta weights capture only what the fine-tuning process changed, and can be distributed separately without redistributing the proprietary base model weights.
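
To make the arithmetic concrete, here is a minimal sketch that uses plain Python dictionaries of lists in place of real tensors. The helper names compute_delta and reconstruct are hypothetical and are not part of FastChat; the sketch only illustrates that adding the delta back to the base recovers the target.

```python
# Illustrative sketch only: plain lists stand in for model tensors.
# compute_delta and reconstruct are hypothetical helpers, not FastChat APIs.

def compute_delta(target, base):
    """Return delta[name] = target[name] - base[name], element by element."""
    return {
        name: [t - b for t, b in zip(values, base[name])]
        for name, values in target.items()
    }


def reconstruct(base, delta):
    """Return base[name] + delta[name], the inverse of compute_delta."""
    return {
        name: [b + d for b, d in zip(base[name], values)]
        for name, values in delta.items()
    }


# Values are chosen to be exactly representable in binary floating point.
base = {"layer.weight": [1.0, 2.0, -0.5], "layer.bias": [0.25]}
target = {"layer.weight": [1.5, 2.0, -1.0], "layer.bias": [0.75]}

delta = compute_delta(target, base)
print(delta)
print(reconstruct(base, delta) == target)
```

Entries that fine-tuning left unchanged come out as zero in the delta, which is why the delta alone is of no use without the base weights.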

Both models are loaded in float16 precision with low_cpu_mem_usage=True to minimize memory consumption during the subtraction. The function iterates over all parameters in the target model's state dictionary, asserts that each parameter name exists in the base model, and performs in-place subtraction (param.data -= base.state_dict()[name]). After computing the delta, it saves both the modified model weights and the target model's tokenizer to the specified output path.
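
The loop described above can be sketched on two tiny torch.nn.Linear modules instead of full checkpoints. This is an illustration under stated assumptions, not the FastChat source: subtract_base_in_place is a hypothetical name, and the toy modules stand in for models loaded with from_pretrained.

```python
import torch


# Hypothetical helper for illustration; not part of FastChat.
def subtract_base_in_place(target: torch.nn.Module, base: torch.nn.Module) -> None:
    """Overwrite every tensor in target's state dict with target - base."""
    base_state = base.state_dict()  # build the base state dict once
    for name, param in target.state_dict().items():
        # Fail early if the two models do not share a parameter name.
        assert name in base_state, f"{name} is missing from the base model"
        # state_dict() tensors share storage with the module's parameters,
        # so this in-place update changes the target model itself.
        param.data -= base_state[name]


# Toy stand-ins for the base and fine-tuned models (assumed shapes).
base = torch.nn.Linear(2, 1)
target = torch.nn.Linear(2, 1)
with torch.no_grad():
    base.weight.copy_(torch.tensor([[1.0, 2.0]]))
    base.bias.copy_(torch.tensor([0.5]))
    target.weight.copy_(torch.tensor([[1.5, 1.0]]))
    target.bias.copy_(torch.tensor([0.5]))

subtract_base_in_place(target, base)
print(target.weight.data)  # now holds the delta, not the fine-tuned weights
print(target.bias.data)
```

Because the subtraction is in place, the target object holds the delta afterwards and can no longer be used as the fine-tuned model without reloading it.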

The module also supports pushing the resulting delta weights directly to the HuggingFace Hub via the optional --hub-repo-id argument. When provided, both model.save_pretrained() and tokenizer.save_pretrained() are called with push_to_hub=True and the specified repository ID, automating the upload process.
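
A minimal sketch of how the optional Hub arguments could be assembled before the two save_pretrained() calls. build_save_kwargs is a hypothetical helper, and passing the repository ID under the keyword repo_id is an assumption about the transformers API that should be checked against the installed version.

```python
from typing import Any, Dict, Optional


# Hypothetical helper for illustration; not part of FastChat or transformers.
def build_save_kwargs(hub_repo_id: Optional[str]) -> Dict[str, Any]:
    """Return extra keyword arguments for save_pretrained()."""
    if hub_repo_id:
        # Assumed keyword names: push_to_hub and repo_id.
        return {"push_to_hub": True, "repo_id": hub_repo_id}
    # No repository given: save locally only.
    return {}


print(build_save_kwargs(None))
print(build_save_kwargs("lmsys/vicuna-13b-delta-v1.1"))

# Intended use (model and tokenizer objects are not created in this sketch):
#   model.save_pretrained(delta_path, **build_save_kwargs(hub_repo_id))
#   tokenizer.save_pretrained(delta_path, **build_save_kwargs(hub_repo_id))
```

Using one shared dictionary for both calls keeps the model and tokenizer uploads pointed at the same repository.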

Usage

Use this module when you have fine-tuned a model and want to distribute the weight differences rather than the full model. This is the standard workflow for releasing Vicuna model deltas. The output of this module is consumed by apply_delta to reconstruct the full model. Run it as python3 -m fastchat.model.make_delta with the required path arguments.
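
When the module has to be launched from another Python script, the command can be assembled as an argument list. This is a sketch, not an official entry point: build_make_delta_command is a hypothetical helper, the flag names are copied from the command-line example on this page, and os.path.expanduser is used because ~ is normally expanded by the shell rather than by Python.

```python
import os
import sys
from typing import List, Optional


# Hypothetical helper for illustration; not part of FastChat.
def build_make_delta_command(
    base: str, target: str, delta: str, hub_repo_id: Optional[str] = None
) -> List[str]:
    """Build the argument list for the make_delta command-line entry point."""
    cmd = [
        sys.executable, "-m", "fastchat.model.make_delta",
        # expanduser replaces a leading "~" with the home directory,
        # which a shell would otherwise do for us.
        "--base", os.path.expanduser(base),
        "--target", os.path.expanduser(target),
        "--delta", os.path.expanduser(delta),
    ]
    if hub_repo_id:
        cmd += ["--hub-repo-id", hub_repo_id]
    return cmd


cmd = build_make_delta_command(
    "~/model_weights/llama-13b",
    "~/model_weights/vicuna-13b",
    "~/model_weights/vicuna-13b-delta",
)
print(cmd)

# To launch it (requires FastChat and both sets of weights to be present):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

The function signature shown on this page has no Hub parameter, so going through the command line is one way to pass --hub-repo-id from a script.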

Code Reference

Source Location

Signature

def make_delta(base_model_path: str, target_model_path: str, delta_path: str) -> None:
    """Computes delta = target - base and saves the result."""
    ...

Import

from fastchat.model.make_delta import make_delta

I/O Contract

Inputs

Name | Type | Required | Description
base_model_path | str | Yes | Path to the base model weights directory or HuggingFace model ID (e.g., llama-13b)
target_model_path | str | Yes | Path to the fine-tuned target model weights directory (e.g., vicuna-13b)
delta_path | str | Yes | Output path where the computed delta weights will be saved
--hub-repo-id (command line only) | str | No | HuggingFace Hub repository ID for pushing the delta (e.g., lmsys/vicuna-13b-delta-v1.1); not a parameter of make_delta() in the signature above

Outputs

Name | Type | Description
delta_path (on disk) | Directory | Saved delta model directory containing the weight differences and tokenizer files
HuggingFace Hub (optional) | Remote repository | Delta weights pushed to the specified Hub repository when --hub-repo-id is provided

Usage Examples

# Command-line usage
# python3 -m fastchat.model.make_delta \
#     --base ~/model_weights/llama-13b \
#     --target ~/model_weights/vicuna-13b \
#     --delta ~/model_weights/vicuna-13b-delta \
#     --hub-repo-id lmsys/vicuna-13b-delta-v1.1

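# Programmatic usage
import os

from fastchat.model.make_delta import make_delta

# Python does not expand "~" in plain strings (the shell does that for the
# command-line form above), so expand it explicitly before building paths.
weights_dir = os.path.expanduser("~/model_weights")

make_delta(
    base_model_path=os.path.join(weights_dir, "llama-13b"),
    target_model_path=os.path.join(weights_dir, "vicuna-13b"),
    delta_path=os.path.join(weights_dir, "vicuna-13b-delta"),
)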
