
Principle:Axolotl ai cloud Axolotl Model Saving

From Leeroopedia


Knowledge Sources
Domains Model_Persistence, Training_Pipeline
Last Updated 2026-02-06 23:00 GMT

Overview

A model persistence pattern that saves trained model weights, adapter parameters, and associated configuration to disk in a format compatible with the HuggingFace ecosystem.

Description

Model Saving is the final step of the training pipeline that persists trained weights to disk. The complexity arises from the variety of training configurations: LoRA adapter saving (only adapter weights), full model saving, FSDP sharded saving, DeepSpeed ZeRO checkpoint saving, and safe serialization formats. Each requires different handling to produce a correct, loadable artifact.

Key considerations include:

  • Adapter-only saving: For LoRA/QLoRA, only the small adapter weights are saved (not the frozen base model)
  • Distributed saving: FSDP and DeepSpeed require coordinated saving across processes
  • Safe serialization: Using SafeTensors format for security and performance
  • Checkpoint management: Saving intermediate checkpoints during training for recovery
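The last bullet, checkpoint management, usually comes down to rotating old checkpoint directories so disk use stays bounded. A minimal stdlib-only sketch, assuming the HuggingFace Trainer's `checkpoint-<step>` directory naming convention (the rotation helper itself is illustrative, not an Axolotl API):

```python
import os
import shutil


def rotate_checkpoints(output_dir: str, keep_last: int = 3) -> list:
    """Delete all but the `keep_last` newest checkpoint-* directories.

    Checkpoints are ordered by their numeric step suffix, not
    lexicographically, so checkpoint-1000 is newer than checkpoint-200.
    """
    ckpts = sorted(
        (d for d in os.listdir(output_dir)
         if d.startswith("checkpoint-") and d.split("-")[-1].isdigit()),
        key=lambda d: int(d.split("-")[-1]),  # oldest first
    )
    for stale in ckpts[:-keep_last]:
        shutil.rmtree(os.path.join(output_dir, stale))
    return ckpts[-keep_last:]
```

Numeric ordering matters here: a plain string sort would wrongly treat `checkpoint-1000` as older than `checkpoint-200` and delete the wrong directory.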

Usage

Use this principle at the end of training or at regular intervals during training to:

  • Persist trained model/adapter weights for inference or continued training
  • Save in HuggingFace-compatible format for easy sharing and deployment
  • Handle distributed training scenarios (FSDP, DeepSpeed) correctly
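To see why adapter-only saving matters in practice, a rough parameter count helps. This sketch assumes a Llama-7B-like shape (32 layers, hidden size 4096) with LoRA rank 8 applied to the q_proj and v_proj matrices; the specific numbers are illustrative assumptions, not measurements of any particular checkpoint:

```python
# Back-of-envelope: LoRA adapter size vs. full model size.
hidden = 4096    # model hidden dimension (assumed)
layers = 32      # transformer layers (assumed)
rank = 8         # LoRA rank (assumed)
targets = 2      # q_proj and v_proj targeted per layer (assumed)

# Each targeted d x d weight gains two low-rank factors: A (r x d) and B (d x r).
adapter_params = layers * targets * 2 * rank * hidden
full_params = 7_000_000_000  # nominal "7B" base model

adapter_mb = adapter_params * 2 / 1e6  # fp16: 2 bytes per parameter
full_mb = full_params * 2 / 1e6

print(adapter_params)     # 4194304 adapter parameters
print(round(adapter_mb))  # ~8 MB, vs ~14000 MB for the full fp16 model
```

Under these assumptions the adapter artifact is roughly three orders of magnitude smaller than the full model, which is exactly why LoRA/QLoRA runs save only the adapter weights.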

Theoretical Basis

Model saving follows the serialization pattern with distributed coordination:

Pseudo-code:

# Abstract model-saving algorithm
if is_adapter_training:
    # LoRA/QLoRA: persist only the small adapter weights, not the frozen base
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
elif is_fsdp:
    gather_full_state_dict(model)  # all ranks participate in the gather
    if is_main_process:            # only rank 0 writes to disk
        model.save_pretrained(output_dir, safe_serialization=True)
        tokenizer.save_pretrained(output_dir)
elif is_deepspeed:
    deepspeed_save_checkpoint(model, output_dir)  # engine writes sharded state
else:
    model.save_pretrained(output_dir, safe_serialization=True)
    tokenizer.save_pretrained(output_dir)
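The branching above can be packaged as a single save entry point. A minimal sketch, using stdlib Python only; the `model`/`tokenizer` objects and the `gather_full_state_dict`/`deepspeed_save_checkpoint` method names are stand-ins mirroring the pseudo-code, not the actual Axolotl API:

```python
def save_model(model, tokenizer, output_dir, *,
               is_adapter_training=False, is_fsdp=False,
               is_deepspeed=False, is_main_process=True):
    """Dispatch to the correct save path; returns which path was taken."""
    if is_adapter_training:
        # LoRA/QLoRA: persist only adapter weights plus tokenizer files.
        model.save_pretrained(output_dir)
        tokenizer.save_pretrained(output_dir)
        return "adapter"
    if is_fsdp:
        # FSDP: every rank participates in the gather; only rank 0 writes.
        model.gather_full_state_dict()
        if is_main_process:
            model.save_pretrained(output_dir, safe_serialization=True)
            tokenizer.save_pretrained(output_dir)
        return "fsdp"
    if is_deepspeed:
        # DeepSpeed ZeRO: the engine coordinates its own sharded checkpoint.
        model.deepspeed_save_checkpoint(output_dir)
        return "deepspeed"
    # Single-process full save in SafeTensors format.
    model.save_pretrained(output_dir, safe_serialization=True)
    tokenizer.save_pretrained(output_dir)
    return "full"
```

Note that every branch except DeepSpeed ends in `save_pretrained`, which is what keeps the resulting artifact loadable with the standard HuggingFace `from_pretrained` flow.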
