Implementation:Huggingface Diffusers From Single File

Field	Value
Type	API Doc
Overview	The `from_single_file` class method on `FromOriginalModelMixin` loads an original checkpoint and converts it to a Diffusers model in a single call
Domains	Model Loading, Checkpoint Conversion
Workflow	Checkpoint_Conversion
Related Principle	Huggingface_Diffusers_Single_File_Loading
Source	`src/diffusers/loaders/single_file_model.py:L230-L529`
Last Updated	2026-02-13 00:00 GMT

Code Reference

FromOriginalModelMixin.from_single_file

Source: src/diffusers/loaders/single_file_model.py:L237-L529

class FromOriginalModelMixin:
    """Load pretrained weights saved in .ckpt or .safetensors format into a model."""

    @classmethod
    @validate_hf_hub_args
    def from_single_file(cls, pretrained_model_link_or_path_or_dict=None, **kwargs) -> Self:
        mapping_class_name = _get_single_file_loadable_mapping_class(cls)
        if mapping_class_name is None:
            raise ValueError(
                f"FromOriginalModelMixin is currently only compatible with "
                f"{', '.join(SINGLE_FILE_LOADABLE_CLASSES.keys())}"
            )

        config = kwargs.pop("config", None)
        original_config = kwargs.pop("original_config", None)
        torch_dtype = kwargs.pop("torch_dtype", None)
        quantization_config = kwargs.pop("quantization_config", None)
        low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", _LOW_CPU_MEM_USAGE_DEFAULT)
        device_map = kwargs.pop("device_map", None)

        # 1. Load checkpoint
        if isinstance(pretrained_model_link_or_path_or_dict, dict):
            checkpoint = pretrained_model_link_or_path_or_dict
        else:
            checkpoint = load_single_file_checkpoint(pretrained_model_link_or_path_or_dict, ...)

        # 2. Set up quantization if requested
        if quantization_config is not None:
            hf_quantizer = DiffusersAutoQuantizer.from_config(quantization_config)
            torch_dtype = hf_quantizer.update_torch_dtype(torch_dtype)

        # 3. Get config (from original_config, user config, or auto-detected)
        mapping_functions = SINGLE_FILE_LOADABLE_CLASSES[mapping_class_name]
        checkpoint_mapping_fn = mapping_functions["checkpoint_mapping_fn"]

        if original_config is not None:
            config_mapping_fn = mapping_functions.get("config_mapping_fn")
            diffusers_model_config = config_mapping_fn(original_config=original_config, ...)
        else:
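            if config is None:
                # No user config: detect a default repo from the checkpoint
                config = fetch_diffusers_config(checkpoint)
                default_pretrained = config["pretrained_model_name_or_path"]
            else:
                # User-supplied repo ID or local path
                default_pretrained = config
            diffusers_model_config = cls.load_config(pretrained_model_name_or_path=default_pretrained, ...)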

        # 4. Initialize model (empty weights for low memory)
        ctx = init_empty_weights if low_cpu_mem_usage else nullcontext
        with ctx():
            model = cls.from_config(diffusers_model_config)

        # 5. Convert checkpoint if needed
        if _should_convert_state_dict_to_diffusers(model.state_dict(), checkpoint):
            diffusers_format_checkpoint = checkpoint_mapping_fn(
                config=diffusers_model_config, checkpoint=checkpoint, ...
            )
        else:
            diffusers_format_checkpoint = checkpoint

        # 6. Load weights into model
        if low_cpu_mem_usage:
            load_model_dict_into_meta(model, diffusers_format_checkpoint, dtype=torch_dtype, ...)
        else:
            model.load_state_dict(diffusers_format_checkpoint, strict=False)

        # 7. Post-processing
        if torch_dtype is not None:
            model.to(torch_dtype)
        model.eval()

        return model
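
The opening lines of the method look the calling class up in a registry and fail fast when it is not listed. The following library-free sketch illustrates that dispatch pattern only. `TOY_LOADABLE_CLASSES`, `lookup_mapping_class`, `ToyMixin`, and the two model classes are hypothetical stand-ins, and the MRO-based lookup is an assumption rather than the way Diffusers resolves the class.

```python
# Toy registry keyed by class name. The real SINGLE_FILE_LOADABLE_CLASSES
# maps to conversion callables for actual Diffusers models.
TOY_LOADABLE_CLASSES = {
    "ToyTransformer": {"checkpoint_mapping_fn": lambda checkpoint: dict(checkpoint)},
}


def lookup_mapping_class(cls):
    """Return the first registered class name in cls's MRO, or None (hypothetical)."""
    for base in cls.__mro__:
        if base.__name__ in TOY_LOADABLE_CLASSES:
            return base.__name__
    return None


class ToyMixin:
    @classmethod
    def from_single_file(cls, checkpoint):
        name = lookup_mapping_class(cls)
        if name is None:
            # Fail fast, listing the classes that are supported
            raise ValueError(
                "ToyMixin is only compatible with " + ", ".join(TOY_LOADABLE_CLASSES)
            )
        convert = TOY_LOADABLE_CLASSES[name]["checkpoint_mapping_fn"]
        return cls(convert(checkpoint))


class ToyTransformer(ToyMixin):
    def __init__(self, state_dict):
        self.state_dict = state_dict


class Unregistered(ToyMixin):
    def __init__(self, state_dict):
        self.state_dict = state_dict


model = ToyTransformer.from_single_file({"w": 1})
```

Calling `Unregistered.from_single_file({})` in this sketch raises `ValueError`, mirroring the guard at the top of the excerpt.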

Import

# The mixin is inherited by model classes; use them directly:
from diffusers import WanTransformer3DModel, FluxTransformer2DModel, AutoencoderKLWan
from diffusers import StableDiffusionPipeline  # Pipeline-level from_single_file is different

Key Parameters

Parameter	Type	Description	Default
`pretrained_model_link_or_path_or_dict`	str | dict	URL, local path to .safetensors/.ckpt, or pre-loaded state dict	(required)
`config`	str | None	Repo ID or local path for Diffusers config	Auto-detected
`original_config`	dict | None	Original training YAML config	`None`
`torch_dtype`	torch.dtype | None	Target dtype for model weights	`None`
`quantization_config`	config object | None	Quantization settings (BitsAndBytes, GGUF, etc.)	`None`
`low_cpu_mem_usage`	`bool`	Use empty weights initialization	`True` (if accelerate available)
`device`	str | torch.device | None	Target device for loading	`None`
`device_map`	dict | None	Device placement strategy	`None`
`subfolder`	str | None	Config subfolder	Auto-detected from registry
`force_download`	`bool`	Re-download even if cached	`False`
`disable_mmap`	`bool`	Disable memory mapping for safetensors	`False`
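
The `config` and `original_config` parameters both influence where the model config comes from. A minimal sketch of the precedence, following the branch order shown in the Code Reference excerpt, might look like this. `resolve_config_source` is a hypothetical helper written for illustration and is not part of Diffusers.

```python
def resolve_config_source(config=None, original_config=None):
    """Report which source the model config would come from (hypothetical helper)."""
    if original_config is not None:
        # An original training config is mapped to a Diffusers config
        return "original_config"
    if config is not None:
        # A user-supplied repo ID or local path is loaded directly
        return "config"
    # Nothing supplied: fall back to detection from the checkpoint
    return "auto-detected"


for kwargs in (
    {},
    {"config": "my-org/my-wan-config"},
    {"original_config": {"model": {}}},
):
    print(kwargs, "->", resolve_config_source(**kwargs))
```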

I/O Contract

Inputs

pretrained_model_link_or_path_or_dict: One of:
- A HuggingFace Hub URL (e.g., "https://huggingface.co/<org>/<repo>/blob/main/model.safetensors")
- A local file path (e.g., "/path/to/model.safetensors")
- A pre-loaded state dict (dict[str, torch.Tensor])
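
To make the three accepted input forms concrete, the sketch below classifies a value the way a caller might before handing it to `from_single_file`. `classify_checkpoint_source` is a hypothetical helper; it assumes that only `http`/`https` strings count as URLs, which may be narrower than what the library itself accepts.

```python
from urllib.parse import urlparse


def classify_checkpoint_source(source):
    """Classify an input as 'state_dict', 'url', or 'local_path' (hypothetical helper)."""
    if isinstance(source, dict):
        # Already-loaded weights: no file access needed
        return "state_dict"
    if isinstance(source, str):
        # Assumption: anything with an http(s) scheme is a Hub URL
        if urlparse(source).scheme in ("http", "https"):
            return "url"
        return "local_path"
    raise TypeError(f"Unsupported checkpoint source type: {type(source).__name__}")


print(classify_checkpoint_source("/path/to/model.safetensors"))
```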

Outputs

Model instance of the calling class (e.g., WanTransformer3DModel), in eval mode, with weights loaded and optionally cast to the specified dtype.

Execution Flow

1. Validate that the calling class is in SINGLE_FILE_LOADABLE_CLASSES
2. Load checkpoint from file/URL/dict
3. Set up quantization if requested
4. Determine model config (auto-detect, user-specified, or original config)
5. Initialize empty model from config
6. Check if conversion is needed (compare keys)
7. Run conversion function if needed
8. Load converted weights into model (meta device or standard)
9. Apply quantization post-processing if needed
10. Cast to target dtype and set eval mode
11. Apply device_map if specified
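
The key-comparison and conversion steps can be illustrated with plain dictionaries. This is a toy sketch under stated assumptions: `should_convert` treats any difference between the two key sets as a reason to convert, and `strip_prefix` stands in for a real `checkpoint_mapping_fn`. Both helpers and the example keys are hypothetical.

```python
def should_convert(model_keys, checkpoint_keys):
    """Assumption: conversion is skipped only when both key sets match exactly."""
    return set(model_keys) != set(checkpoint_keys)


def strip_prefix(checkpoint, prefix):
    """Toy conversion: drop a leading prefix from every key that has it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in checkpoint.items()
    }


# Keys the (toy) Diffusers-style model expects
model_keys = ["proj.weight", "proj.bias"]

# Keys as they appear in the (toy) original checkpoint
original = {
    "model.diffusion_model.proj.weight": 1.0,
    "model.diffusion_model.proj.bias": 0.0,
}

if should_convert(model_keys, original):
    converted = strip_prefix(original, "model.diffusion_model.")
else:
    converted = original

print(sorted(converted))
```

A checkpoint that is already in the target layout passes through unchanged, which corresponds to the `else` branch of step 5 in the Code Reference excerpt.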

External Dependencies

- safetensors (for .safetensors file loading)
- accelerate (for init_empty_weights, dispatch_model, cpu_offload_with_hook)
- huggingface_hub (for file downloading and Hub API)
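
Because `low_cpu_mem_usage` defaults to `True` only when accelerate is available, it can help to check which of these packages are installed before loading. The sketch below uses only the standard library; `optional_dependency_status` is a hypothetical helper, not a Diffusers function.

```python
from importlib.util import find_spec


def optional_dependency_status(names=("safetensors", "accelerate", "huggingface_hub")):
    """Map each top-level package name to whether it can be imported here."""
    return {name: find_spec(name) is not None for name in names}


status = optional_dependency_status()
for name, available in status.items():
    print(f"{name}: {'installed' if available else 'missing'}")
```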

Usage Examples

Loading a Wan Transformer from Single File

import torch
from diffusers import WanTransformer3DModel

# From a local .safetensors file
model = WanTransformer3DModel.from_single_file(
    "wan-14b-t2v.safetensors",
    torch_dtype=torch.bfloat16,
)
model.to("cuda")

Loading from a HuggingFace URL

import torch
from diffusers import FluxTransformer2DModel

model = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors",
    torch_dtype=torch.bfloat16,
)

Loading with Quantization

from diffusers import FluxTransformer2DModel, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = FluxTransformer2DModel.from_single_file(
    "flux1-dev.safetensors",
    quantization_config=quantization_config,
)

Loading with Custom Config

import torch
from diffusers import WanTransformer3DModel

model = WanTransformer3DModel.from_single_file(
    "custom_wan_model.safetensors",
    config="my-org/my-wan-config",  # HuggingFace repo with config.json
    torch_dtype=torch.bfloat16,
)

Related Pages

Huggingface_Diffusers_Single_File_Loading (principle for this implementation) - Theory of on-the-fly conversion
Huggingface_Diffusers_Infer_Model_Type (used internally) - Identifies checkpoint type for config fetching
Huggingface_Diffusers_Single_File_Loadable_Classes (used internally) - Registry for conversion dispatch
Huggingface_Diffusers_Convert_Checkpoint_To_Diffusers (used internally) - Actual weight conversion
Huggingface_Diffusers_Save_Pretrained_And_Push (next step) - Saving the loaded model in Diffusers format
