Principle: Axolotl Model Saving
| Knowledge Sources | |
|---|---|
| Domains | Model_Persistence, Training_Pipeline |
| Last Updated | 2026-02-06 23:00 GMT |
Overview
A model persistence pattern that saves trained model weights, adapter parameters, and associated configuration to disk in a format compatible with the HuggingFace ecosystem.
Description
Model Saving is the final step of the training pipeline that persists trained weights to disk. The complexity arises from the variety of training configurations: LoRA adapter saving (only adapter weights), full model saving, FSDP sharded saving, DeepSpeed ZeRO checkpoint saving, and safe serialization formats. Each requires different handling to produce a correct, loadable artifact.
Key considerations include:
- Adapter-only saving: For LoRA/QLoRA, only the small adapter weights are saved (not the frozen base model)
- Distributed saving: FSDP and DeepSpeed require coordinated saving across processes
- Safe serialization: Using SafeTensors format for security and performance
- Checkpoint management: Saving intermediate checkpoints during training for recovery
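To make the adapter-only case concrete, the sketch below filters a state dict down to its LoRA tensors. It is a minimal stdlib illustration: the key names follow the PEFT `lora_A`/`lora_B` naming convention, and plain strings stand in for tensors.

```python
# Hypothetical LoRA state dict; key names follow the PEFT "lora_A"/"lora_B"
# convention. Strings stand in for tensors to keep the sketch dependency-free.
state_dict = {
    "model.layers.0.self_attn.q_proj.weight": "frozen base weight",
    "model.layers.0.self_attn.q_proj.lora_A.weight": "adapter A matrix",
    "model.layers.0.self_attn.q_proj.lora_B.weight": "adapter B matrix",
}

def adapter_only(state_dict):
    """Keep only the trainable LoRA tensors; the frozen base model is dropped."""
    return {k: v for k, v in state_dict.items() if "lora_" in k}

adapter = adapter_only(state_dict)
```

Because only the small A/B matrices survive the filter, the saved artifact is typically megabytes rather than the gigabytes of the full base model.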
Usage
Use this principle at the end of training or at regular intervals during training to:
- Persist trained model/adapter weights for inference or continued training
- Save in HuggingFace-compatible format for easy sharing and deployment
- Handle distributed training scenarios (FSDP, DeepSpeed) correctly
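The interval-saving use case can be sketched as a small helper that writes a `checkpoint-<step>` directory every `save_steps` steps. The directory naming mirrors the HuggingFace Trainer convention, but the function itself is a hypothetical stdlib-only illustration; a real trainer would also persist weights and optimizer state.

```python
import json
import os
import tempfile

def maybe_save_checkpoint(step, save_steps, output_dir):
    """Write a checkpoint-<step> directory every `save_steps` steps (sketch only)."""
    if save_steps <= 0 or step % save_steps != 0:
        return None
    ckpt_dir = os.path.join(output_dir, f"checkpoint-{step}")
    os.makedirs(ckpt_dir, exist_ok=True)
    # A real trainer would also write model weights and optimizer state here.
    with open(os.path.join(ckpt_dir, "trainer_state.json"), "w") as f:
        json.dump({"global_step": step}, f)
    return ckpt_dir

out = tempfile.mkdtemp()
saved = maybe_save_checkpoint(100, 50, out)    # on-interval: writes a checkpoint
skipped = maybe_save_checkpoint(101, 50, out)  # off-interval: no-op
```

Keeping the interval check inside the helper lets the training loop call it unconditionally after every step, which is how periodic checkpointing is usually wired in.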
Theoretical Basis
Model saving follows the serialization pattern with distributed coordination:
Pseudo-code:

```python
# Abstract model saving algorithm
if is_adapter_training:
    model.save_pretrained(output_dir)  # saves only adapter weights
    tokenizer.save_pretrained(output_dir)
elif is_fsdp:
    gather_full_state_dict(model)  # gather sharded weights from all ranks
    if is_main_process:
        model.save_pretrained(output_dir, safe_serialization=True)
elif is_deepspeed:
    deepspeed_save_checkpoint(model, output_dir)
else:
    model.save_pretrained(output_dir, safe_serialization=True)
tokenizer.save_pretrained(output_dir)
```
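The branching above can be exercised as a runnable sketch. `FakeModel`, `save_model`, and their flags are hypothetical stand-ins (not Axolotl's API): `FakeModel` mimics a HuggingFace-style `save_pretrained`, and the dispatcher shows the rank-0-only write that FSDP saving requires.

```python
import os
import tempfile

class FakeModel:
    """Stand-in exposing a HuggingFace-style save_pretrained; illustrative only."""
    def save_pretrained(self, output_dir, safe_serialization=False):
        os.makedirs(output_dir, exist_ok=True)
        name = "model.safetensors" if safe_serialization else "pytorch_model.bin"
        with open(os.path.join(output_dir, name), "w") as f:
            f.write("weights")

def save_model(model, output_dir, *, is_adapter=False, is_fsdp=False,
               is_main_process=True):
    if is_adapter:
        # LoRA/QLoRA: a PEFT model's save_pretrained writes only adapter weights.
        model.save_pretrained(output_dir)
    elif is_fsdp:
        # After gathering the full state dict, only rank 0 writes to disk.
        if is_main_process:
            model.save_pretrained(output_dir, safe_serialization=True)
    else:
        model.save_pretrained(output_dir, safe_serialization=True)

out = tempfile.mkdtemp()
save_model(FakeModel(), out)  # default path: safe serialization

rank1 = tempfile.mkdtemp()
save_model(FakeModel(), rank1, is_fsdp=True, is_main_process=False)  # non-zero rank
```

The non-zero FSDP rank produces no files, which is the coordination property the pseudo-code encodes: every process participates in gathering, but exactly one process serializes.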