Implementation:Axolotl ai cloud Axolotl Merge Fsdp Weights

From Leeroopedia


Knowledge Sources
Domains Distributed_Training, Model_Export
Last Updated 2026-02-06 23:00 GMT

Overview

Concrete tool, provided by the Axolotl framework, for consolidating a sharded FSDP checkpoint into a single model.

Description

The merge_fsdp_weights function reads a sharded FSDP checkpoint directory, uses torch.distributed.checkpoint to load the full state dict, instantiates the base model architecture, loads the consolidated weights, and saves the result as a standard HuggingFace model. It optionally saves the tokenizer alongside the model and can remove the sharded checkpoint directory after successful merging.
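
To make the idea of consolidation concrete, the following is a toy sketch in plain Python, not Axolotl's code. It assumes each rank holds one contiguous slice of every flattened parameter and rebuilds the full values by concatenating slices in rank order. The helper name consolidate_shards and the list-based shard layout are hypothetical; real FSDP shards are tensors with layout metadata, and torch.distributed.checkpoint handles the reassembly.

```python
from typing import Dict, List

# Hypothetical shard layout: parameter name -> this rank's slice of the flattened values.
Shard = Dict[str, List[float]]


def consolidate_shards(shards: List[Shard]) -> Dict[str, List[float]]:
    """Concatenate each parameter's per-rank slices, in rank order."""
    if not shards:
        raise ValueError("no shards to merge")
    names = set(shards[0])
    merged: Dict[str, List[float]] = {name: [] for name in shards[0]}
    for rank, shard in enumerate(shards):
        # Every rank must hold a slice of the same set of parameters.
        if set(shard) != names:
            raise ValueError(f"rank {rank} holds a different set of parameters")
        for name, values in shard.items():
            merged[name].extend(values)
    return merged


rank0 = {"linear.weight": [0.1, 0.2], "linear.bias": [0.5]}
rank1 = {"linear.weight": [0.3, 0.4], "linear.bias": [0.6]}
full_state = consolidate_shards([rank0, rank1])
print(full_state["linear.weight"])  # prints [0.1, 0.2, 0.3, 0.4]
```

The consolidated dictionary plays the role of the full state dict that is then loaded into the instantiated base model.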

Usage

Invoke via CLI: axolotl merge-sharded-fsdp-weights config.yml --output_path ./merged_model. Run it after FSDP training completes to produce a deployable model.

Code Reference

Source Location

  • Repository: axolotl
  • File: src/axolotl/cli/merge_sharded_fsdp_weights.py
  • Lines: L108-166

Signature

def merge_fsdp_weights(
    cfg: DictDefault,
    output_path: str,
    save_tokenizer: bool = True,
) -> None:
    """Merge sharded FSDP checkpoint into a single model file.

    Args:
        cfg: Configuration with base_model, output_dir (containing sharded checkpoint).
        output_path: Directory to save the merged model.
        save_tokenizer: Whether to save the tokenizer alongside the model.
    """

Import

from axolotl.cli.merge_sharded_fsdp_weights import merge_fsdp_weights

I/O Contract

Inputs

Name | Type | Required | Description
cfg | DictDefault | Yes | Config with base_model (model architecture) and output_dir (sharded checkpoint location)
output_path | str | Yes | Directory to write the merged model
save_tokenizer | bool | No (default: True) | Whether to save tokenizer files alongside the model

Outputs

Name | Type | Description
merged model | Directory | Single consolidated model at output_path in HuggingFace format
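
One way to sanity-check the output is to confirm that the directory contains the files a HuggingFace-format model normally has. The sketch below is an assumption-laden check, not part of Axolotl: missing_hf_artifacts is a hypothetical helper, and it only looks for config.json plus at least one weight file following the usual naming (*.safetensors or pytorch_model*.bin). It does not validate the file contents.

```python
import tempfile
from pathlib import Path
from typing import List


def missing_hf_artifacts(model_dir: str) -> List[str]:
    """Return a description of each expected artifact absent from model_dir."""
    root = Path(model_dir)
    problems: List[str] = []
    if not (root / "config.json").is_file():
        problems.append("config.json")
    # Weights may be a single file or several numbered shards in either format.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("pytorch_model*.bin"))
    if not has_weights:
        problems.append("weight files (*.safetensors or pytorch_model*.bin)")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a merged model directory; real files would hold actual data.
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / "model.safetensors").write_bytes(b"")
    print(missing_hf_artifacts(tmp))  # prints []
```

An empty list means both kinds of artifact were found; pointing the helper at the real output_path after a merge gives a quick pass or fail before attempting to load the model.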

Usage Examples

CLI Usage

# After FSDP training, merge sharded weights
axolotl merge-sharded-fsdp-weights examples/llama-3/fft-8b.yaml \
    --output_path ./merged_model
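
When the merge is one step in a scripted pipeline, the same command can be assembled as an argument list in Python. This is a minimal sketch: build_merge_command is a hypothetical helper, and the subcommand and --output_path flag are copied from the CLI example above rather than checked against a specific Axolotl release.

```python
import shlex
from typing import List


def build_merge_command(config_path: str, output_path: str) -> List[str]:
    """Return the argv list for the merge command shown in the CLI example."""
    return [
        "axolotl",
        "merge-sharded-fsdp-weights",
        config_path,
        "--output_path",
        output_path,
    ]


cmd = build_merge_command("examples/llama-3/fft-8b.yaml", "./merged_model")
print(shlex.join(cmd))

# To execute it (requires axolotl on PATH and a finished FSDP training run):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems when paths contain spaces.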

Programmatic Usage

from axolotl.cli.config import load_cfg
from axolotl.cli.merge_sharded_fsdp_weights import merge_fsdp_weights

cfg = load_cfg("examples/llama-3/fft-8b.yaml")
merge_fsdp_weights(cfg, output_path="./merged_model", save_tokenizer=True)
