Implementation:Huggingface Diffusers From Single File

From Leeroopedia
Field              Value
Type               API Doc
Overview           The from_single_file class method on FromOriginalModelMixin that loads and converts original checkpoints to Diffusers models in a single call
Domains            Model Loading, Checkpoint Conversion
Workflow           Checkpoint_Conversion
Related Principle  Huggingface_Diffusers_Single_File_Loading
Source             src/diffusers/loaders/single_file_model.py:L230-L529
Last Updated       2026-02-13 00:00 GMT

Code Reference

FromOriginalModelMixin.from_single_file

Source: src/diffusers/loaders/single_file_model.py:L237-L529

class FromOriginalModelMixin:
    """Load pretrained weights saved in .ckpt or .safetensors format into a model."""

    @classmethod
    @validate_hf_hub_args
    def from_single_file(cls, pretrained_model_link_or_path_or_dict=None, **kwargs) -> Self:
        mapping_class_name = _get_single_file_loadable_mapping_class(cls)
        if mapping_class_name is None:
            raise ValueError(
                f"FromOriginalModelMixin is currently only compatible with "
                f"{', '.join(SINGLE_FILE_LOADABLE_CLASSES.keys())}"
            )

        config = kwargs.pop("config", None)
        original_config = kwargs.pop("original_config", None)
        torch_dtype = kwargs.pop("torch_dtype", None)
        quantization_config = kwargs.pop("quantization_config", None)
        low_cpu_mem_usage = kwargs.pop("low_cpu_mem_usage", _LOW_CPU_MEM_USAGE_DEFAULT)
        device_map = kwargs.pop("device_map", None)

        # 1. Load checkpoint
        if isinstance(pretrained_model_link_or_path_or_dict, dict):
            checkpoint = pretrained_model_link_or_path_or_dict
        else:
            checkpoint = load_single_file_checkpoint(pretrained_model_link_or_path_or_dict, ...)

        # 2. Set up quantization if requested
        if quantization_config is not None:
            hf_quantizer = DiffusersAutoQuantizer.from_config(quantization_config)
            torch_dtype = hf_quantizer.update_torch_dtype(torch_dtype)

        # 3. Get config (from original_config, user config, or auto-detected)
        mapping_functions = SINGLE_FILE_LOADABLE_CLASSES[mapping_class_name]
        checkpoint_mapping_fn = mapping_functions["checkpoint_mapping_fn"]

        if original_config is not None:
            config_mapping_fn = mapping_functions.get("config_mapping_fn")
            diffusers_model_config = config_mapping_fn(original_config=original_config, ...)
        else:
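            if config is None:
                # No config supplied: infer the matching Diffusers repo from the checkpoint
                config = fetch_diffusers_config(checkpoint)
                default_pretrained = config["pretrained_model_name_or_path"]
            else:
                # A user-supplied config is a repo ID or local path
                default_pretrained = config
            diffusers_model_config = cls.load_config(pretrained_model_name_or_path=default_pretrained, ...)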

        # 4. Initialize model (empty weights for low memory)
        ctx = init_empty_weights if low_cpu_mem_usage else nullcontext
        with ctx():
            model = cls.from_config(diffusers_model_config)

        # 5. Convert checkpoint if needed
        if _should_convert_state_dict_to_diffusers(model.state_dict(), checkpoint):
            diffusers_format_checkpoint = checkpoint_mapping_fn(
                config=diffusers_model_config, checkpoint=checkpoint, ...
            )
        else:
            diffusers_format_checkpoint = checkpoint

        # 6. Load weights into model
        if low_cpu_mem_usage:
            load_model_dict_into_meta(model, diffusers_format_checkpoint, dtype=torch_dtype, ...)
        else:
            model.load_state_dict(diffusers_format_checkpoint, strict=False)

        # 7. Post-processing
        if torch_dtype is not None:
            model.to(torch_dtype)
        model.eval()

        return model
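
The conversion check in step 5 of the listing can be illustrated with a toy model. The sketch below is not the library's implementation: needs_conversion, convert_keys, and TOY_PREFIX_MAP are hypothetical names, and the rule "convert whenever the key sets differ" is a simplifying assumption. Plain integers stand in for tensors so the example runs without torch.

```python
from typing import Dict, Iterable

# Hypothetical prefix map from an "original" key layout to a Diffusers-style layout.
TOY_PREFIX_MAP: Dict[str, str] = {
    "model.diffusion_model.": "",
}


def needs_conversion(model_keys: Iterable[str], checkpoint_keys: Iterable[str]) -> bool:
    """Return True when the checkpoint keys differ from the keys the model expects."""
    return set(model_keys) != set(checkpoint_keys)


def convert_keys(checkpoint: Dict[str, object]) -> Dict[str, object]:
    """Rename keys by replacing a known original prefix with its target prefix."""
    converted: Dict[str, object] = {}
    for key, value in checkpoint.items():
        new_key = key
        for old_prefix, new_prefix in TOY_PREFIX_MAP.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        converted[new_key] = value
    return converted


# Keys the (toy) model expects, and a checkpoint saved in the "original" layout.
model_keys = ["proj_in.weight", "proj_in.bias"]
original = {
    "model.diffusion_model.proj_in.weight": 1,
    "model.diffusion_model.proj_in.bias": 2,
}

# Convert only when the layouts disagree; otherwise use the checkpoint as-is.
state = convert_keys(original) if needs_conversion(model_keys, original) else original
```

A checkpoint that already uses the target layout passes through unchanged, which matches the else branch in the listing.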

Import

# The mixin is inherited by model classes; use them directly:
from diffusers import WanTransformer3DModel, FluxTransformer2DModel, AutoencoderKLWan
from diffusers import StableDiffusionPipeline  # Pipeline-level from_single_file is different

Key Parameters

Parameter                              Type                        Description                                                       Default
pretrained_model_link_or_path_or_dict  str | dict                  URL, local path to .safetensors/.ckpt, or pre-loaded state dict   (required)
config                                 str | None                  Repo ID or local path for Diffusers config                        Auto-detected
original_config                        dict | None                 Original training YAML config                                     None
torch_dtype                            torch.dtype | None          Target dtype for model weights                                    None
quantization_config                    quantization config | None  Quantization settings (BitsAndBytes, GGUF, etc.)                  None
low_cpu_mem_usage                      bool                        Use empty weights initialization                                  True (if accelerate available)
device                                 str | None                  Target device for loading                                         None
device_map                             dict | None                 Device placement strategy                                         None
subfolder                              str | None                  Config subfolder                                                  Auto-detected from registry
force_download                         bool                        Re-download even if cached                                        False
disable_mmap                           bool                        Disable memory mapping for safetensors                            False
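
The config and original_config parameters interact: in the listing above, original_config is checked first, then a user-supplied config, and auto-detection is the fallback. The following sketch shows only that precedence. pick_config_source and the placeholder repo name are hypothetical and do not exist in Diffusers.

```python
from typing import Optional, Tuple


def pick_config_source(
    original_config: Optional[dict] = None,
    config: Optional[str] = None,
    detected_repo: str = "detected/placeholder-repo",  # hypothetical auto-detected value
) -> Tuple[str, object]:
    """Return (source_name, value) following the precedence in the listing."""
    if original_config is not None:
        # An original training config takes priority over everything else.
        return ("original_config", original_config)
    if config is not None:
        # Next, a user-supplied repo ID or local path.
        return ("config", config)
    # Fallback: whatever was inferred from the checkpoint itself.
    return ("auto-detected", detected_repo)


source_a = pick_config_source(original_config={"model": {}}, config="my-org/my-wan-config")
source_b = pick_config_source(config="my-org/my-wan-config")
source_c = pick_config_source()
```

Passing both arguments selects original_config, so a stray config value is silently ignored in this model; whether the library warns in that case is not stated in the source.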

I/O Contract

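Inputs

  • A checkpoint given as a URL, a local path to a .safetensors/.ckpt file, or a pre-loaded state dict, plus the optional keyword arguments listed under Key Parameters.
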
Outputs

  • Model instance of the calling class (e.g., WanTransformer3DModel), in eval mode, with weights loaded and optionally cast to the specified dtype.

Execution Flow

  1. Validate that the calling class is in SINGLE_FILE_LOADABLE_CLASSES
  2. Load checkpoint from file/URL/dict
  3. Set up quantization if requested
  4. Determine model config (auto-detect, user-specified, or original config)
  5. Initialize the model from config (with empty weights when low_cpu_mem_usage is set)
  6. Check if conversion is needed (compare keys)
  7. Run conversion function if needed
  8. Load converted weights into model (meta device or standard)
  9. Apply quantization post-processing if needed
  10. Cast to target dtype and set eval mode
  11. Apply device_map if specified
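
Step 2 of the flow branches on what kind of input was passed. A minimal sketch of that dispatch, using only the standard library, might look like the following. classify_checkpoint_source is a hypothetical helper, and the rule "http or https scheme means URL" is an assumption; the library's own loader may accept other forms.

```python
from typing import Union
from urllib.parse import urlparse


def classify_checkpoint_source(source: Union[str, dict]) -> str:
    """Label the input as a pre-loaded state dict, a URL, or a local path."""
    if isinstance(source, dict):
        # Already loaded: no file or network access is needed.
        return "state_dict"
    scheme = urlparse(source).scheme
    if scheme in ("http", "https"):
        return "url"
    # Anything else is treated as a path on disk in this sketch.
    return "local_path"


kinds = [
    classify_checkpoint_source({"proj_in.weight": 0}),
    classify_checkpoint_source(
        "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors"
    ),
    classify_checkpoint_source("wan-14b-t2v.safetensors"),
]
```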

External Dependencies

  • safetensors (for .safetensors file loading)
  • accelerate (for init_empty_weights, dispatch_model, cpu_offload_with_hook)
  • huggingface_hub (for file downloading and Hub API)
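
Because these dependencies are optional at import time, it can help to probe for them before choosing loading options. The sketch below uses importlib.util.find_spec from the standard library. is_installed and default_low_cpu_mem_usage are hypothetical helpers, and the second one only mirrors the "True (if accelerate available)" note from the parameter table, not the library's actual default logic.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(package) is not None


def default_low_cpu_mem_usage() -> bool:
    """Assumed rule: enable low-memory loading only when accelerate is present."""
    return is_installed("accelerate")


# Availability of each dependency named in the list above.
status = {
    name: is_installed(name)
    for name in ("safetensors", "accelerate", "huggingface_hub")
}
```

Printing status before calling from_single_file gives a quick indication of whether the low-memory path is likely to be available in the current environment.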

Usage Examples

Loading a Wan Transformer from Single File

import torch
from diffusers import WanTransformer3DModel

# From a local .safetensors file
model = WanTransformer3DModel.from_single_file(
    "wan-14b-t2v.safetensors",
    torch_dtype=torch.bfloat16,
)
model.to("cuda")

Loading from a Hugging Face URL

import torch
from diffusers import FluxTransformer2DModel

model = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/flux1-dev.safetensors",
    torch_dtype=torch.bfloat16,
)

Loading with Quantization

from diffusers import FluxTransformer2DModel, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_4bit=True)
model = FluxTransformer2DModel.from_single_file(
    "flux1-dev.safetensors",
    quantization_config=quantization_config,
)

Loading with Custom Config

import torch
from diffusers import WanTransformer3DModel

model = WanTransformer3DModel.from_single_file(
    "custom_wan_model.safetensors",
    config="my-org/my-wan-config",  # HuggingFace repo with config.json
    torch_dtype=torch.bfloat16,
)

Related Pages

Principle:Huggingface_Diffusers_Single_File_Loading
