Implementation: Hugging Face Diffusers ModelMixin.from_pretrained
| Knowledge Sources | |
|---|---|
| Domains | Diffusion_Models, Model_Loading, Transfer_Learning |
| Last Updated | 2026-02-13 21:00 GMT |
Overview
Concrete tool for loading pretrained diffusion model components from the Hugging Face Hub or local directories, provided by the ModelMixin base class in the Diffusers library.
Description
ModelMixin.from_pretrained is the class method that all Diffusers model classes inherit for loading pretrained weights. It handles downloading model files from the Hub, loading configuration, instantiating the model architecture, and loading the state dict. In the LoRA training workflow, it is called separately for each component (AutoencoderKL, UNet2DConditionModel, CLIPTextModel) with the subfolder argument pointing to the component's directory within the pipeline repository.
After loading, the training script freezes all parameters via requires_grad_(False), casts non-trainable models to the weight dtype (fp16/bf16), and moves them to the accelerator device. Only after LoRA adapters are injected into the UNet do any parameters become trainable again.
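A minimal sketch of that freeze-then-inject sequence, assuming the peft package is installed and using the Stable Diffusion UNet attention projection names as LoRA targets (hyperparameters here are illustrative, not prescribed):
```python
# Assumes `unet`, `vae`, and `text_encoder` are already loaded
# (see Basic Usage below) and that `peft` is installed.
from peft import LoraConfig

for model in (unet, vae, text_encoder):
    model.requires_grad_(False)  # freeze everything first

# Injecting adapters makes only the LoRA matrices trainable again.
unet.add_adapter(
    LoraConfig(
        r=4,
        lora_alpha=4,
        init_lora_weights="gaussian",
        target_modules=["to_k", "to_q", "to_v", "to_out.0"],
    )
)
```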
Usage
Use from_pretrained for loading individual model components when:
- Setting up a LoRA fine-tuning pipeline
- Loading specific model variants (fp16, bf16 weight files)
- Loading from a specific revision or branch
- Loading models from local directories or the Hub (see the sketch after this list)
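A short sketch of the last two patterns; the revision and the local path are illustrative placeholders:
```python
from diffusers import UNet2DConditionModel

# Pin the download to a git revision (branch, tag, or commit hash)
unet = UNet2DConditionModel.from_pretrained(
    "stabilityai/stable-diffusion-2-1", subfolder="unet", revision="main"
)

# Load from a local directory without touching the network
unet_local = UNet2DConditionModel.from_pretrained(
    "./checkpoints/my-sd-model", subfolder="unet", local_files_only=True
)
```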
Code Reference
Source Location
- Repository: diffusers
- File: src/diffusers/models/modeling_utils.py
- Lines: 836-1370
Signature
```python
@classmethod
@validate_hf_hub_args
def from_pretrained(
    cls,
    pretrained_model_name_or_path: str | os.PathLike | None,
    **kwargs,
) -> Self:
```
Import
```python
from diffusers import AutoencoderKL, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| pretrained_model_name_or_path | str or os.PathLike | Yes | Model ID on the Hub (e.g., "stabilityai/stable-diffusion-2-1") or local path to a directory containing model weights. |
| subfolder | str | No | Subfolder within the model repository containing the component's weights (e.g., "unet", "vae", "text_encoder"). |
| revision | str | No | Git revision (branch, tag, or commit hash) to load. Defaults to "main". |
| variant | str | No | Weight-file variant suffix; e.g., "fp16" loads diffusion_pytorch_model.fp16.safetensors. |
| torch_dtype | torch.dtype | No | Override the default dtype for model weights (e.g., torch.float16). |
| cache_dir | str or os.PathLike | No | Custom cache directory for downloaded models. |
| local_files_only | bool | No | If True, load only from the local cache without downloading. |
| token | str or bool | No | Authentication token for private model repositories. |
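A sketch combining several of these inputs; the cache directory is a placeholder, and the commented token line applies only to private repositories:
```python
import torch
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    subfolder="unet",
    variant="fp16",             # loads diffusion_pytorch_model.fp16.safetensors
    torch_dtype=torch.float16,  # keep the weights in half precision
    cache_dir="./hf_cache",     # placeholder path
    # token="hf_...",           # needed only for private repositories
)
```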
Outputs
| Name | Type | Description |
|---|---|---|
| model | Self (subclass of ModelMixin) | Instantiated model with pretrained weights loaded, set to evaluation mode by default. |
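Since the return value is an ordinary torch.nn.Module subclass, evaluation mode and dtype can be checked directly; a quick sanity check, reusing the unet loaded in the sketch above:
```python
assert isinstance(unet, UNet2DConditionModel)
print(unet.training)  # False: the model is returned in evaluation mode
print(unet.dtype)     # torch.float16 when fp16 weights were requested
```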
Usage Examples
Basic Usage
```python
import torch
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer

pretrained_model = "stable-diffusion-v1-5/stable-diffusion-v1-5"

# Load each component individually, selecting it with `subfolder`
noise_scheduler = DDPMScheduler.from_pretrained(
    pretrained_model, subfolder="scheduler"
)
tokenizer = CLIPTokenizer.from_pretrained(
    pretrained_model, subfolder="tokenizer"
)
text_encoder = CLIPTextModel.from_pretrained(
    pretrained_model, subfolder="text_encoder"
)
vae = AutoencoderKL.from_pretrained(
    pretrained_model, subfolder="vae", variant="fp16"
)
unet = UNet2DConditionModel.from_pretrained(
    pretrained_model, subfolder="unet", variant="fp16"
)

# Freeze all parameters for the fine-tuning setup
unet.requires_grad_(False)
vae.requires_grad_(False)
text_encoder.requires_grad_(False)

# Cast the non-trainable models to half precision and move them to the GPU
vae.to("cuda", dtype=torch.float16)
text_encoder.to("cuda", dtype=torch.float16)
```
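In mixed-precision training the UNet is usually moved to the device as well; a common follow-up, shown here as a hedged sketch, is to cast whatever parameters remain trainable (the injected LoRA matrices) back to float32, since optimizer updates in fp16 are numerically fragile:
```python
# Sketch: assumes LoRA adapters have already been injected into `unet`.
unet.to("cuda", dtype=torch.float16)

# Keep trainable parameters in full precision for stable gradient updates.
for param in unet.parameters():
    if param.requires_grad:
        param.data = param.data.to(torch.float32)
```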