Implementation:Huggingface Transformers Save Pretrained For Adapters

Knowledge Sources	Transformers PEFT Docs, Transformers Docs
Domains	Parameter_Efficient_Fine_Tuning, NLP, Model_Serialization
Last Updated	2026-02-13 00:00 GMT

Overview

Concrete tool for saving adapter weights and configuration to disk in the PEFT-compatible format, provided by the PreTrainedModel.save_pretrained method with automatic PEFT detection.

Description

model.save_pretrained() is the standard Transformers method for persisting model weights. When the model has PEFT adapters attached (detected via the _hf_peft_config_loaded flag), the method automatically switches to an adapter-only saving mode.

The PEFT-aware saving logic (in src/transformers/modeling_utils.py, lines 3247-3272) performs:

Adapter detection: Checks _hf_peft_config_loaded to determine if adapters are present
State dict extraction: Calls model.get_adapter_state_dict(state_dict=state_dict) to extract only the active adapter's parameters. This method (from PeftAdapterMixin) delegates to PEFT's get_peft_model_state_dict.
Key prefixing: When save_peft_format=True (the default), all state dict keys are prefixed with base_model.model. for PEFT library compatibility
Multi-adapter validation: Raises an error if multiple adapters are active simultaneously (only one adapter can be saved at a time)
Config saving: Saves the active adapter's PeftConfig as adapter_config.json via current_peft_config.save_pretrained(save_directory)
Base config skipping: When adapters are detected, the base model config is not written to the directory, because the adapter config already references the base model
Weight serialization: Saves the adapter state dict as adapter_model.safetensors (or .bin)

The method also handles generation config saving (independent of PEFT) and supports push_to_hub for direct upload.
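The multi-adapter validation and key-prefixing steps listed above can be illustrated with a minimal sketch. It is an assumption-laden stand-in, not the code in modeling_utils.py: plain dictionaries replace tensor state dicts, and the names PEFT_PREFIX, select_single_adapter, and prefix_adapter_keys are hypothetical helpers introduced only for illustration.

```python
# Hypothetical illustration of two steps from the list above.
# Plain dict values stand in for tensors; the real logic may differ.

PEFT_PREFIX = "base_model.model."  # prefix named in the description


def select_single_adapter(active_adapters):
    """Return the only active adapter name, or raise if that is ambiguous."""
    if len(active_adapters) == 0:
        raise ValueError("No active adapter to save.")
    if len(active_adapters) > 1:
        raise ValueError(
            "Multiple active adapters found; activate exactly one before saving."
        )
    return active_adapters[0]


def prefix_adapter_keys(state_dict, save_peft_format=True):
    """Prepend the PEFT prefix to every key when save_peft_format is True."""
    if not save_peft_format:
        return dict(state_dict)
    return {f"{PEFT_PREFIX}{key}": value for key, value in state_dict.items()}
```

With save_peft_format=False the keys are returned unchanged, which mirrors the description of the flag rather than any guarantee about the library's exact behavior.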

Usage

Use save_pretrained for adapters when you want to:

Save trained adapter weights after fine-tuning
Create PEFT-compatible adapter checkpoints that can be loaded with model.load_adapter()
Push adapter weights to the Hugging Face Hub
Save specific adapters by first calling model.set_adapter(name)

Code Reference

Source Location

Repository: transformers
File: src/transformers/modeling_utils.py (lines 3125-3290, PEFT logic at 3247-3272)
Supporting file: src/transformers/integrations/peft.py (lines 783-810 for get_adapter_state_dict)

Signature

def save_pretrained(
    self,
    save_directory: str | os.PathLike,
    is_main_process: bool = True,
    state_dict: dict | None = None,
    push_to_hub: bool = False,
    max_shard_size: int | str = "50GB",
    variant: str | None = None,
    token: str | bool | None = None,
    save_peft_format: bool = True,
    save_original_format: bool = True,
    **kwargs,
)

Import

# save_pretrained is a method on PreTrainedModel
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("model-name")
# ... add and train adapter ...
model.save_pretrained("./adapter-output")

I/O Contract

Inputs

Name	Type	Required	Description
save_directory	`str` or `os.PathLike`	Yes	Directory path where the adapter weights and config will be saved. Created if it does not exist.
is_main_process	`bool`	No	Whether this is the main process in distributed training. Only the main process saves files. Default: `True`.
state_dict	`dict` or `None`	No	A pre-computed state dict. If `None`, the method extracts the adapter state dict automatically. Default: `None`.
push_to_hub	`bool`	No	Whether to push the saved adapter to the Hugging Face Hub. Default: `False`.
save_peft_format	`bool`	No	Whether to prepend `base_model.model.` to state dict keys for PEFT compatibility. Default: `True`.
token	`str` or `bool` or `None`	No	Authentication token for Hub uploads. Default: `None`.
max_shard_size	`int` or `str`	No	Maximum file size for sharded checkpoints. Default: `"50GB"`.
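In distributed runs, the is_main_process argument is usually derived from the process rank. The sketch below assumes the launcher exports a RANK environment variable (as torchrun does); is_main_process_from_env is a hypothetical helper, not part of Transformers.

```python
import os


def is_main_process_from_env(environ=None):
    """Treat rank 0 as the main process; a missing RANK means single-process.

    Assumption: the launcher sets RANK. Other launchers may use other names.
    """
    env = os.environ if environ is None else environ
    return int(env.get("RANK", "0")) == 0
```

The result could then be passed as model.save_pretrained(save_directory, is_main_process=is_main_process_from_env()), so that only one process writes the adapter files.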

Outputs

Name	Type	Description
(files on disk)	Files	The following files are written to `save_directory`: `adapter_config.json` (the PEFT adapter configuration); `adapter_model.safetensors` (the adapter weights in safetensors format); and, optionally, `generation_config.json` if the model supports generation.
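A quick way to confirm the output contract is to inspect the directory after saving. The following sketch uses only the standard library; inspect_adapter_dir is a hypothetical helper, and it assumes the adapter config contains a base_model_name_or_path field, as PEFT configs typically do.

```python
import json
from pathlib import Path

# Weight file names taken from the description above.
WEIGHT_FILES = ("adapter_model.safetensors", "adapter_model.bin")


def inspect_adapter_dir(save_directory):
    """Check that an adapter checkpoint directory holds the expected files."""
    directory = Path(save_directory)
    config_path = directory / "adapter_config.json"
    if not config_path.is_file():
        raise FileNotFoundError(f"Missing adapter_config.json in {directory}")

    weights = [name for name in WEIGHT_FILES if (directory / name).is_file()]
    if not weights:
        raise FileNotFoundError(f"No adapter weight file found in {directory}")

    config = json.loads(config_path.read_text(encoding="utf-8"))
    return {
        "weights_file": weights[0],
        # Assumed field name; returns None if the config does not carry it.
        "base_model": config.get("base_model_name_or_path"),
        "has_generation_config": (directory / "generation_config.json").is_file(),
    }
```

The absence of a base model config.json in the directory is expected for adapter checkpoints, so the helper does not look for one.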

Usage Examples

Basic Usage: Save Trained Adapter

from transformers import AutoModelForCausalLM
from peft import LoraConfig

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model.add_adapter(LoraConfig(r=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# ... train the adapter ...

# Save only adapter weights and config
model.save_pretrained("./my-lora-adapter")
# Creates:
#   ./my-lora-adapter/adapter_config.json
#   ./my-lora-adapter/adapter_model.safetensors

Save and Push to Hub

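# Assumes `model` is the adapter-equipped model from the previous example.
# `repo_id` is forwarded through **kwargs to the Hub upload step.
model.save_pretrained(
    "./my-lora-adapter",
    push_to_hub=True,
    repo_id="my-org/llama-2-lora-adapter",
    token="hf_...",
)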

Save Specific Adapter from Multi-Adapter Model

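# Assumes adapters named "summarization" and "translation" were already
# added to `model`; only the currently active adapter is saved.

# Save the summarization adapter
model.set_adapter("summarization")
model.save_pretrained("./summarization-adapter")

# Save the translation adapter
model.set_adapter("translation")
model.save_pretrained("./translation-adapter")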
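When a model carries many adapters, the activate-then-save pattern above can be wrapped in a loop. This is a minimal sketch: save_each_adapter is a hypothetical helper, the "<name>-adapter" directory naming is an assumption, and the function only relies on the model exposing set_adapter and save_pretrained as used in the example above.

```python
from pathlib import Path


def save_each_adapter(model, adapter_names, root_directory):
    """Activate each named adapter in turn and save it to its own directory.

    Returns a mapping of adapter name to the directory it was saved in.
    """
    saved = {}
    for name in adapter_names:
        target = Path(root_directory) / f"{name}-adapter"
        model.set_adapter(name)  # only one adapter is active while saving
        model.save_pretrained(str(target))
        saved[name] = target
    return saved
```

Note that the last adapter in adapter_names remains active after the loop, so callers that depend on a particular active adapter may want to call set_adapter again afterwards.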

Related Pages

Implements Principle

Principle:Huggingface_Transformers_Adapter_Weight_Saving

Requires Environment

Environment:Huggingface_Transformers_PEFT_Adapter_Env
