Implementation:Huggingface Diffusers From Single File
| Field | Value |
|---|---|
| Type | API Doc |
| Overview | The from_single_file class method on FromOriginalModelMixin that loads and converts original checkpoints to Diffusers models in a single call |
| Domains | Model Loading, Checkpoint Conversion |
| Workflow | Checkpoint_Conversion |
| Related Principle | Huggingface_Diffusers_Single_File_Loading |
| Source | src/diffusers/loaders/single_file_model.py:L230-L529 |
| Last Updated | 2026-02-13 00:00 GMT |
Code Reference
FromOriginalModelMixin.from_single_file
Source: src/diffusers/loaders/single_file_model.py:L237-L529
    class FromOriginalModelMixin:
        """Load pretrained weights saved in .ckpt or .safetensors format into a model."""

        @classmethod
        @validate_hf_hub_args
        def from_single_file(cls, pretrained_model_link_or_path_or_dict=None, **kwargs) -> Self:
            mapping_class_name = _get_single_file_loadable_mapping_class(cls)
            if mapping_class_name is None:
                raise ValueError(
                    f"FromOriginalModelMixin is currently only compatible with "
                    f"{', '.join(SINGLE_FILE_LOADABLE_CLASSES.keys())}"
                )

            config = kwargs.pop("config", None)
            original_config = kwargs.pop("original_config", None)
            torch_dtype = kwargs.pop("torch_dtype", None)
            quantization_config = kwargs.pop("quantization_config", None)
            low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", _LOW_CPU_MEM_USAGE_DEFAULT)
            device_map = kwargs.pop("device_map", None)

            # 1. Load checkpoint
            if isinstance(pretrained_model_link_or_path_or_dict, dict):
                checkpoint = pretrained_model_link_or_path_or_dict
            else:
                checkpoint = load_single_file_checkpoint(pretrained_model_link_or_path_or_dict, ...)

            # 2. Set up quantization if requested
            if quantization_config is not None:
                hf_quantizer = DiffusersAutoQuantizer.from_config(quantization_config)
                torch_dtype = hf_quantizer.update_torch_dtype(torch_dtype)

            # 3. Get config (from original_config, user config, or auto-detected)
            mapping_functions = SINGLE_FILE_LOADABLE_CLASSES[mapping_class_name]
            checkpoint_mapping_fn = mapping_functions["checkpoint_mapping_fn"]
            if original_config is not None:
                config_mapping_fn = mapping_functions.get("config_mapping_fn")
                diffusers_model_config = config_mapping_fn(original_config=original_config, ...)
            else:
                if config is None:
                    # No user config: infer the matching Diffusers repo from the checkpoint
                    config = fetch_diffusers_config(checkpoint)
                    default_pretrained = config["pretrained_model_name_or_path"]
                else:
                    # User-supplied repo ID or local path
                    default_pretrained = config
                diffusers_model_config = cls.load_config(pretrained_model_name_or_path=default_pretrained, ...)

            # 4. Initialize model (empty weights for low memory)
            ctx = init_empty_weights if low_cpu_mem_usage else nullcontext
            with ctx():
                model = cls.from_config(diffusers_model_config)

            # 5. Convert checkpoint if needed
            if _should_convert_state_dict_to_diffusers(model.state_dict(), checkpoint):
                diffusers_format_checkpoint = checkpoint_mapping_fn(
                    config=diffusers_model_config, checkpoint=checkpoint, ...
                )
            else:
                diffusers_format_checkpoint = checkpoint

            # 6. Load weights into model
            if low_cpu_mem_usage:
                load_model_dict_into_meta(model, diffusers_format_checkpoint, dtype=torch_dtype, ...)
            else:
                model.load_state_dict(diffusers_format_checkpoint, strict=False)

            # 7. Post-processing
            if torch_dtype is not None:
                model.to(torch_dtype)
            model.eval()
            return model
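Step 3 of the listing picks the model config from one of three sources. The precedence can be sketched in isolation as below. This is an illustrative reduction, not the library's code: `resolve_config` and its three callable parameters (`map_original`, `load_by_name`, `detect`) are hypothetical stand-ins for the config mapping function, `cls.load_config`, and the auto-detection step.

```python
from typing import Callable, Optional


def resolve_config(
    checkpoint: dict,
    config: Optional[str] = None,
    original_config: Optional[dict] = None,
    *,
    map_original: Callable[[dict], dict],
    load_by_name: Callable[[str], dict],
    detect: Callable[[dict], str],
) -> dict:
    """Choose a model config: original_config first, then user config, then auto-detect."""
    if original_config is not None:
        # Highest precedence: translate the original training config.
        return map_original(original_config)
    if config is None:
        # Nothing supplied: infer a config name from the checkpoint itself.
        config = detect(checkpoint)
    # Either the user-supplied or the detected name is loaded the same way.
    return load_by_name(config)


# Toy stand-ins (hypothetical) to exercise the three branches.
stubs = dict(
    map_original=lambda oc: {"source": "original"},
    load_by_name=lambda name: {"source": name},
    detect=lambda ckpt: "auto-detected-repo",
)
print(resolve_config({}, original_config={"a": 1}, **stubs))
print(resolve_config({}, config="my-org/my-wan-config", **stubs))
print(resolve_config({}, **stubs))
```

Passing the helpers as callables keeps the sketch runnable without the library installed and makes the branch order easy to test.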
Import
    # The mixin is inherited by model classes; use them directly:
    from diffusers import WanTransformer3DModel, FluxTransformer2DModel, AutoencoderKLWan
    from diffusers import StableDiffusionPipeline  # Pipeline-level from_single_file is different
Key Parameters
| Parameter | Type | Description | Default |
|---|---|---|---|
| pretrained_model_link_or_path_or_dict | str or dict | URL, local path to .safetensors/.ckpt, or pre-loaded state dict | (required) |
| config | str or None | Repo ID or local path for Diffusers config | Auto-detected |
| original_config | dict or None | Original training YAML config | None |
| torch_dtype | torch.dtype or None | Target dtype for model weights | None |
| quantization_config | quantization config object or None | Quantization settings (BitsAndBytes, GGUF, etc.) | None |
| low_cpu_mem_usage | bool | Use empty weights initialization | True (if accelerate available) |
| device | None | Target device for loading | None |
| device_map | dict or None | Device placement strategy | None |
| subfolder | str or None | Config subfolder | Auto-detected from registry |
| force_download | bool | Re-download even if cached | False |
| disable_mmap | bool | Disable memory mapping for safetensors | False |
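The defaults in the table follow the same `kwargs.pop(name, default)` pattern used in the Code Reference listing. A minimal sketch of that pattern is shown below. The helper names `accelerate_available` and `split_loader_kwargs` are hypothetical, and deriving the `low_cpu_mem_usage` default purely from whether `accelerate` is importable is an assumption based on the table entry; the library's own check may involve more conditions.

```python
import importlib.util
from typing import Any, Dict, Tuple


def accelerate_available() -> bool:
    """Return True if the accelerate package can be found on this system."""
    return importlib.util.find_spec("accelerate") is not None


def split_loader_kwargs(kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate the documented loader options from any other keyword arguments."""
    defaults = {
        "config": None,            # None stands for "auto-detect later"
        "original_config": None,
        "torch_dtype": None,
        "quantization_config": None,
        # Assumption: mirrors "True (if accelerate available)" from the table.
        "low_cpu_mem_usage": accelerate_available(),
        "device": None,
        "device_map": None,
        "subfolder": None,         # None stands for "auto-detect later"
        "force_download": False,
        "disable_mmap": False,
    }
    remaining = dict(kwargs)  # copy so the caller's dict is not mutated
    options = {name: remaining.pop(name, default) for name, default in defaults.items()}
    return options, remaining


options, extra = split_loader_kwargs({"torch_dtype": "bfloat16", "my_flag": 1})
print(options["torch_dtype"], options["force_download"], extra)
```

Unrecognized keys are returned separately rather than dropped, so a caller can forward them or reject them explicitly.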
I/O Contract
Inputs
- pretrained_model_link_or_path_or_dict: One of:
  - A HuggingFace Hub URL (e.g., "https://huggingface.co/repo/blob/main/model.safetensors")
  - A local file path (e.g., "/path/to/model.safetensors")
  - A pre-loaded state dict (dict[str, torch.Tensor])
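The three accepted input forms can be told apart with simple type and scheme checks. The sketch below illustrates the distinction only; `classify_source` is a hypothetical helper, and the library's own dispatch (an `isinstance(..., dict)` check followed by `load_single_file_checkpoint`) may resolve URLs and paths differently.

```python
import os
from urllib.parse import urlparse


def classify_source(source) -> str:
    """Label an input as 'state_dict', 'url', or 'local_path'."""
    if isinstance(source, dict):
        # Already-loaded weights: no file access needed.
        return "state_dict"
    if isinstance(source, (str, os.PathLike)):
        text = os.fspath(source)
        # Only http(s) schemes count as URLs; Windows drive letters do not.
        if urlparse(text).scheme in ("http", "https"):
            return "url"
        return "local_path"
    raise TypeError(f"Unsupported source type: {type(source).__name__}")


print(classify_source("https://huggingface.co/repo/blob/main/model.safetensors"))
print(classify_source("/path/to/model.safetensors"))
print(classify_source({"weight": 0.0}))
```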
Outputs
- Model instance of the calling class (e.g., WanTransformer3DModel), in eval mode, with weights loaded and optionally cast to the specified dtype.
Execution Flow
- Validate that the calling class is in SINGLE_FILE_LOADABLE_CLASSES
- Load checkpoint from file/URL/dict
- Set up quantization if requested
- Determine model config (auto-detect, user-specified, or original config)
- Initialize empty model from config
- Check if conversion is needed (compare keys)
- Run conversion function if needed
- Load converted weights into model (meta device or standard)
- Apply quantization post-processing if needed
- Cast to target dtype and set eval mode
- Apply device_map if specified
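The validate, check, convert, and load steps above can be illustrated with a toy model whose state is a plain dict. Everything in this sketch is hypothetical (`TOY_LOADABLE_CLASSES`, `convert_toy_checkpoint`, `should_convert`, `load_toy`, and the `model.diffusion_model.` prefix used as the "original" key format); it shows the shape of the flow, not the library's conversion logic.

```python
from typing import Dict, Iterable


def convert_toy_checkpoint(checkpoint: Dict[str, float]) -> Dict[str, float]:
    """Rename original-style keys ('model.diffusion_model.x') to target-style keys ('x')."""
    prefix = "model.diffusion_model."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in checkpoint.items()
    }


# Hypothetical registry: class name -> functions used for conversion dispatch.
TOY_LOADABLE_CLASSES = {"ToyModel": {"checkpoint_mapping_fn": convert_toy_checkpoint}}


def should_convert(model_keys: Iterable[str], checkpoint: Dict[str, float]) -> bool:
    """Toy rule: convert unless every model key is already present in the checkpoint."""
    return not set(model_keys).issubset(checkpoint.keys())


def load_toy(class_name: str, checkpoint: Dict[str, float]) -> Dict[str, float]:
    # Validate that the class is registered.
    if class_name not in TOY_LOADABLE_CLASSES:
        raise ValueError(f"{class_name} is not single-file loadable")
    # Initialize an "empty" model from a fixed layout.
    model_state = {"proj.weight": 0.0, "proj.bias": 0.0}
    # Compare keys and convert only if needed.
    if should_convert(model_state, checkpoint):
        checkpoint = TOY_LOADABLE_CLASSES[class_name]["checkpoint_mapping_fn"](checkpoint)
    # Non-strict load: copy only the keys the model knows about.
    for key in model_state:
        if key in checkpoint:
            model_state[key] = checkpoint[key]
    return model_state


original = {"model.diffusion_model.proj.weight": 1.5, "model.diffusion_model.proj.bias": 0.5}
print(load_toy("ToyModel", original))
```

A checkpoint that already uses the target key names skips the conversion branch and is loaded as-is.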
External Dependencies
- safetensors (for .safetensors file loading)
- accelerate (for init_empty_weights, dispatch_model, cpu_offload_with_hook)
- huggingface_hub (for file downloading and Hub API)
Usage Examples
Loading a Wan Transformer from Single File
    import torch
    from diffusers import WanTransformer3DModel

    # From a local .safetensors file
    model = WanTransformer3DModel.from_single_file(
        "wan-14b-t2v.safetensors",
        torch_dtype=torch.bfloat16,
    )
    model.to("cuda")
Loading from a HuggingFace URL
    import torch
    from diffusers import FluxTransformer2DModel

    model = FluxTransformer2DModel.from_single_file(
        "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors",
        torch_dtype=torch.bfloat16,
    )
Loading with Quantization
    from diffusers import FluxTransformer2DModel, BitsAndBytesConfig

    quantization_config = BitsAndBytesConfig(load_in_4bit=True)
    model = FluxTransformer2DModel.from_single_file(
        "flux1-dev.safetensors",
        quantization_config=quantization_config,
    )
Loading with Custom Config
    import torch
    from diffusers import WanTransformer3DModel

    model = WanTransformer3DModel.from_single_file(
        "custom_wan_model.safetensors",
        config="my-org/my-wan-config",  # HuggingFace repo with config.json
        torch_dtype=torch.bfloat16,
    )
Related Pages
- Huggingface_Diffusers_Single_File_Loading (principle for this implementation) - Theory of on-the-fly conversion
- Huggingface_Diffusers_Infer_Model_Type (used internally) - Identifies checkpoint type for config fetching
- Huggingface_Diffusers_Single_File_Loadable_Classes (used internally) - Registry for conversion dispatch
- Huggingface_Diffusers_Convert_Checkpoint_To_Diffusers (used internally) - Actual weight conversion
- Huggingface_Diffusers_Save_Pretrained_And_Push (next step) - Saving the loaded model in Diffusers format