Implementation:Axolotl ai cloud Axolotl Merge Fsdp Weights

Knowledge Sources	Axolotl, PyTorch Distributed Checkpoint
Domains	Distributed_Training, Model_Export
Last Updated	2026-02-06 23:00 GMT

Overview

Concrete tool, provided by the Axolotl framework, for consolidating sharded FSDP checkpoints into a single model.

Description

The merge_fsdp_weights function reads a sharded FSDP checkpoint directory, uses torch.distributed.checkpoint to load the full state dict, instantiates the base model architecture, loads the consolidated weights, and saves the result as a standard HuggingFace model. It optionally saves the tokenizer alongside the model and can remove the sharded checkpoint directory after successful merging.
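
The steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Axolotl's actual implementation: it assumes PyTorch 2.2 or later (for dcp_to_torch_save), assumes the base model is a causal LM that AutoModelForCausalLM can load, and uses the hypothetical helper names unwrap_state_dict and merge_sketch.

```python
from typing import Any, Dict


def unwrap_state_dict(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return the inner dict when a checkpoint nests everything under one key.

    Assumption: some trainers save the weights as {"model": {...}}; this
    helper peels off that single wrapper and otherwise returns its input.
    """
    if len(state_dict) == 1:
        (inner,) = state_dict.values()
        if isinstance(inner, dict):
            return inner
    return state_dict


def merge_sketch(checkpoint_dir: str, base_model: str, output_path: str) -> None:
    """Hypothetical end-to-end merge; heavy imports are deferred until called."""
    import os
    import tempfile

    import torch
    from torch.distributed.checkpoint.format_utils import dcp_to_torch_save
    from transformers import AutoModelForCausalLM

    with tempfile.TemporaryDirectory() as tmp:
        merged_file = os.path.join(tmp, "merged.pt")
        # 1. Consolidate the sharded checkpoint into a single torch.save file.
        dcp_to_torch_save(checkpoint_dir, merged_file)
        # 2. Load the consolidated state dict onto the CPU.
        loaded = torch.load(merged_file, map_location="cpu", weights_only=True)
        state_dict = unwrap_state_dict(loaded)

    # 3. Instantiate the base architecture and load the merged weights.
    model = AutoModelForCausalLM.from_pretrained(base_model)
    model.load_state_dict(state_dict)
    # 4. Save in the standard HuggingFace layout.
    model.save_pretrained(output_path)
```

Deferring the torch and transformers imports keeps the small unwrapping helper usable on its own; the real function may structure these steps differently.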

Usage

Invoke via the CLI: axolotl merge-sharded-fsdp-weights config.yml --output_path ./merged_model. Run this after FSDP training completes to produce a deployable model.

Code Reference

Source Location

Repository: axolotl
File: src/axolotl/cli/merge_sharded_fsdp_weights.py
Lines: L108-166

Signature

def merge_fsdp_weights(
    cfg: DictDefault,
    output_path: str,
    save_tokenizer: bool = True,
) -> None:
    """Merge sharded FSDP checkpoint into a single model file.

    Args:
        cfg: Configuration with base_model, output_dir (containing sharded checkpoint).
        output_path: Directory to save the merged model.
        save_tokenizer: Whether to save the tokenizer alongside the model.
    """

Import

from axolotl.cli.merge_sharded_fsdp_weights import merge_fsdp_weights

I/O Contract

Inputs

Name	Type	Required	Description
cfg	DictDefault	Yes	Config with base_model (model architecture), output_dir (sharded checkpoint location)
output_path	str	Yes	Directory to write the merged model
save_tokenizer	bool	No (default: True)	Whether to save tokenizer files alongside model
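
Because the table above lists base_model and output_dir as the config fields the merge relies on, a caller might check for them before starting. The sketch below is illustrative only: missing_merge_keys is a hypothetical helper, and it treats the config as a plain mapping rather than relying on any DictDefault-specific behavior.

```python
from typing import Any, List, Mapping

# Config fields the Inputs table says the merge depends on.
REQUIRED_KEYS = ("base_model", "output_dir")


def missing_merge_keys(cfg: Mapping[str, Any]) -> List[str]:
    """Return the required keys that are absent or empty in cfg."""
    return [key for key in REQUIRED_KEYS if not cfg.get(key)]
```

An empty return value means both fields are present and non-empty.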

Outputs

Name	Type	Description
merged model	Directory	Single consolidated model at output_path in HuggingFace format
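
One way to sanity-check the output is to confirm that the directory contains a config.json plus at least one weight file. This is a rough sketch: looks_like_hf_model_dir is a hypothetical helper, and the file patterns reflect the names that save_pretrained commonly writes, not a guarantee about what this tool produces.

```python
from pathlib import Path

# Weight file patterns commonly written by save_pretrained (assumption).
WEIGHT_PATTERNS = ("*.safetensors", "pytorch_model*.bin")


def looks_like_hf_model_dir(path: str) -> bool:
    """Return True if path holds a config.json and at least one weight file."""
    root = Path(path)
    if not (root / "config.json").is_file():
        return False
    return any(any(root.glob(pattern)) for pattern in WEIGHT_PATTERNS)
```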

Usage Examples

CLI Usage

# After FSDP training, merge sharded weights
axolotl merge-sharded-fsdp-weights examples/llama-3/fft-8b.yaml \
    --output_path ./merged_model
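
When the merge has to run from a script or pipeline step, the same command can be launched with subprocess. This is a small sketch that only reuses the subcommand and flag shown above; build_merge_command and run_merge are hypothetical names, and it assumes the axolotl executable is on PATH.

```python
import subprocess
from typing import List


def build_merge_command(config_path: str, output_path: str) -> List[str]:
    """Build the argument list for the CLI call shown above."""
    return [
        "axolotl",
        "merge-sharded-fsdp-weights",
        config_path,
        "--output_path",
        output_path,
    ]


def run_merge(config_path: str, output_path: str) -> None:
    """Run the merge and raise CalledProcessError on a non-zero exit."""
    subprocess.run(build_merge_command(config_path, output_path), check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.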

Programmatic Usage

from axolotl.cli.config import load_cfg
from axolotl.cli.merge_sharded_fsdp_weights import merge_fsdp_weights

cfg = load_cfg("examples/llama-3/fft-8b.yaml")
merge_fsdp_weights(cfg, output_path="./merged_model", save_tokenizer=True)
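
Before calling the function programmatically, it can help to confirm that the directory really holds a sharded checkpoint. The sketch below assumes the layout written by the default file-system writer of torch.distributed.checkpoint (a .metadata file plus one or more .distcp shard files); has_dcp_shards is a hypothetical helper, and the exact sub-directory under output_dir is not specified on this page.

```python
from pathlib import Path


def has_dcp_shards(checkpoint_dir: str) -> bool:
    """Return True if the directory has a .metadata file and .distcp shards."""
    root = Path(checkpoint_dir)
    return (root / ".metadata").is_file() and any(root.glob("*.distcp"))
```

A caller could skip or abort the merge with a clear message when this returns False, instead of failing partway through loading.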
