Implementation:Axolotl ai cloud Axolotl Merge Fsdp Weights
| Knowledge Sources | |
|---|---|
| Domains | Distributed_Training, Model_Export |
| Last Updated | 2026-02-06 23:00 GMT |
Overview
A concrete tool, provided by the Axolotl framework, for consolidating sharded FSDP checkpoints into a single model file.
Description
The merge_fsdp_weights function reads a sharded FSDP checkpoint directory, uses torch.distributed.checkpoint to load the full state dict, instantiates the base model architecture, loads the consolidated weights, and saves the result as a standard HuggingFace model. It optionally saves the tokenizer alongside the model and can remove the sharded checkpoint directory after successful merging.
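To make the consolidation step concrete, the following toy sketch shows in pure Python what "merging shards into a full state dict" means. It is not Axolotl's implementation and does not use torch.distributed.checkpoint. The shard layout (each rank holds an equal, zero-padded slice of every flattened parameter) and the names `shard_state_dict` and `consolidate_shards` are assumptions chosen for illustration only.

```python
from typing import Dict, List

# Toy stand-in for a state dict: parameter name -> flattened values.
StateDict = Dict[str, List[float]]


def shard_state_dict(full: StateDict, world_size: int) -> List[StateDict]:
    """Split every flattened parameter into `world_size` equal, zero-padded slices."""
    shards: List[StateDict] = [{} for _ in range(world_size)]
    for name, values in full.items():
        # Ceil-divide so every rank receives a slice of the same length.
        chunk = -(-len(values) // world_size)
        padded = values + [0.0] * (chunk * world_size - len(values))
        for rank in range(world_size):
            shards[rank][name] = padded[rank * chunk:(rank + 1) * chunk]
    return shards


def consolidate_shards(shards: List[StateDict], numel: Dict[str, int]) -> StateDict:
    """Concatenate per-rank slices in rank order, then strip the padding."""
    merged: StateDict = {}
    for name, size in numel.items():
        flat: List[float] = []
        for shard in shards:
            flat.extend(shard[name])
        merged[name] = flat[:size]
    return merged


# Hypothetical two-parameter "model" sharded across four ranks.
full = {"embed.weight": [0.1, 0.2, 0.3, 0.4, 0.5], "lm_head.bias": [1.0, 2.0]}
shards = shard_state_dict(full, world_size=4)
merged = consolidate_shards(shards, {k: len(v) for k, v in full.items()})
```

The point of the sketch is that no single rank's shard is a usable model on its own: the original element counts are needed to undo the padding, which is why a merge step sits between FSDP training and deployment.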
Usage
Invoke via the CLI: axolotl merge-sharded-fsdp-weights config.yml --output_path ./merged_model. Run it after FSDP training completes to produce a deployable model.
Code Reference
Source Location
- Repository: axolotl
- File: src/axolotl/cli/merge_sharded_fsdp_weights.py
- Lines: L108-166
Signature
def merge_fsdp_weights(
    cfg: DictDefault,
    output_path: str,
    save_tokenizer: bool = True,
) -> None:
    """Merge sharded FSDP checkpoint into a single model file.

    Args:
        cfg: Configuration with base_model, output_dir (containing sharded checkpoint).
        output_path: Directory to save the merged model.
        save_tokenizer: Whether to save the tokenizer alongside the model.
    """
Import
from axolotl.cli.merge_sharded_fsdp_weights import merge_fsdp_weights
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cfg | DictDefault | Yes | Config with base_model (model architecture), output_dir (sharded checkpoint location) |
| output_path | str | Yes | Directory to write the merged model |
| save_tokenizer | bool | No (default: True) | Whether to save tokenizer files alongside model |
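A small pre-flight check can catch a missing field before a long merge starts. The sketch below is an assumption-laden illustration, not part of Axolotl: `check_merge_inputs` is a hypothetical helper, and a plain mapping stands in for `DictDefault`. It only verifies the two config keys and the one path named in the table above.

```python
from typing import Any, List, Mapping


def check_merge_inputs(cfg: Mapping[str, Any], output_path: str) -> List[str]:
    """Return human-readable problems; an empty list means the inputs look usable."""
    problems: List[str] = []
    # Both keys are the ones the inputs table lists as part of cfg.
    for key in ("base_model", "output_dir"):
        if not cfg.get(key):
            problems.append(f"cfg.{key} is missing or empty")
    if not output_path:
        problems.append("output_path is empty")
    return problems


# Hypothetical values, for illustration only.
ok = check_merge_inputs({"base_model": "my-org/my-model", "output_dir": "./out"}, "./merged_model")
bad = check_merge_inputs({"base_model": "my-org/my-model"}, "")
```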
Outputs
| Name | Type | Description |
|---|---|---|
| merged model | Directory | Single consolidated model at output_path in HuggingFace format |
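After a merge, it can be useful to confirm that the output directory looks like a Hugging Face model directory. The following is a minimal sketch under stated assumptions: `missing_model_files` is a hypothetical helper, and the file names are the ones conventionally written by Hugging Face `save_pretrained` (a `config.json` plus either a single weights file or a sharded index). Which weights file Axolotl actually writes is not specified here, so the check accepts any of them.

```python
import tempfile
from pathlib import Path
from typing import List

# Conventional Hugging Face weight file names (single-file or sharded index).
WEIGHT_FILES = (
    "model.safetensors",
    "model.safetensors.index.json",
    "pytorch_model.bin",
    "pytorch_model.bin.index.json",
)


def missing_model_files(output_path: str) -> List[str]:
    """List what a Hugging Face style model directory appears to be missing."""
    root = Path(output_path)
    missing: List[str] = []
    if not (root / "config.json").is_file():
        missing.append("config.json")
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append("weights (one of: " + ", ".join(WEIGHT_FILES) + ")")
    return missing


# Demonstrate on a throwaway directory with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_model_files(tmp)
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / "model.safetensors").write_bytes(b"")
    after = missing_model_files(tmp)
```

This only checks that the expected files exist; it does not validate that the weights load.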
Usage Examples
CLI Usage
# After FSDP training, merge sharded weights
axolotl merge-sharded-fsdp-weights examples/llama-3/fft-8b.yaml \
    --output_path ./merged_model
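The same command can be launched from a Python script, for example as the last step of a training pipeline. This is a sketch, not an Axolotl API: `build_merge_command` and `run_merge` are hypothetical helpers, and the argument layout simply mirrors the CLI invocation shown above.

```python
import shlex
import subprocess
from typing import List


def build_merge_command(config_path: str, output_path: str) -> List[str]:
    """Build the argv for the merge command (nothing is executed here)."""
    return [
        "axolotl",
        "merge-sharded-fsdp-weights",
        config_path,
        "--output_path",
        output_path,
    ]


def run_merge(config_path: str, output_path: str) -> None:
    """Run the merge; raises CalledProcessError if the CLI exits non-zero."""
    subprocess.run(build_merge_command(config_path, output_path), check=True)


cmd = build_merge_command("examples/llama-3/fft-8b.yaml", "./merged_model")
print(shlex.join(cmd))  # shell-safe rendering of the command, for logging
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.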
Programmatic Usage
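from axolotl.cli.config import load_cfg
from axolotl.cli.merge_sharded_fsdp_weights import merge_fsdp_weights

# Load the Axolotl YAML config (provides base_model and output_dir).
cfg = load_cfg("examples/llama-3/fft-8b.yaml")

# Merge the sharded checkpoint into ./merged_model and save the tokenizer with it.
merge_fsdp_weights(cfg, output_path="./merged_model", save_tokenizer=True)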