Implementation:AUTOMATIC1111 Stable diffusion webui DDPM Edit Model

From Leeroopedia


Knowledge Sources
Domains: Diffusion Models, Image Editing, Latent Diffusion
Last Updated: 2025-05-15 00:00 GMT

Overview

Implements the Denoising Diffusion Probabilistic Model (DDPM) and Latent Diffusion Model classes modified for InstructPix2Pix-style image editing, providing the core diffusion training and inference pipeline for instruction-based image-to-image generation.

Description

This module defines several key classes for the diffusion pipeline:

  • DDPM: A PyTorch Lightning module that manages the noise schedule (the betas and the cumulative products of the alphas), EMA (Exponential Moving Average) weights, and the forward and reverse diffusion processes. It supports both epsilon-prediction and x0-prediction parameterizations.
  • LatentDiffusion: Extends DDPM to operate in VAE latent space with cross-attention conditioning. It handles first-stage encoding/decoding via a VAE or VQ-VAE, cond-stage processing through CLIP or other text encoders, and supports multiple conditioning keys (concat, crossattn, adm).
  • DiffusionWrapper: Routes conditioning keys to the underlying UNet model, handling concatenation of image conditions, cross-attention text conditions, and ADM vector conditions.
  • Layout2ImgDiffusion: A specialized variant for layout-to-image generation.
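
The noise-schedule bookkeeping described for DDPM can be illustrated with a minimal sketch. It uses plain Python lists instead of PyTorch tensors so that it runs without dependencies. The function names, the default values `linear_start=1e-4` and `linear_end=2e-2`, and the choice to space the "linear" schedule linearly in sqrt(beta) are assumptions based on the CompVis code this module derives from; they are not stated on this page.

```python
import math
import random

def make_linear_betas(timesteps=1000, linear_start=1e-4, linear_end=2e-2):
    """Betas spaced linearly in sqrt(beta), then squared (assumed 'linear' schedule)."""
    lo, hi = math.sqrt(linear_start), math.sqrt(linear_end)
    step = (hi - lo) / (timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(timesteps)]

def cumulative_alphas(betas):
    """alphas_cumprod[t] is the product of (1 - beta_s) for all s <= t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def q_sample(x0, t, noise, alphas_cumprod):
    """Forward diffusion: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    a_bar = alphas_cumprod[t]
    return [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * n
            for x, n in zip(x0, noise)]

def predict_x0_from_eps(x_t, t, eps, alphas_cumprod):
    """Invert q_sample: recover x0 from x_t and the noise (epsilon) estimate."""
    a_bar = alphas_cumprod[t]
    return [(x - math.sqrt(1.0 - a_bar) * e) / math.sqrt(a_bar)
            for x, e in zip(x_t, eps)]

betas = make_linear_betas()
alphas_cumprod = cumulative_alphas(betas)

# Noise a toy "image" of three values at timestep 500, then undo it with the true noise.
rng = random.Random(0)
x0 = [0.5, -1.0, 0.25]
noise = [rng.gauss(0.0, 1.0) for _ in x0]
x_t = q_sample(x0, 500, noise, alphas_cumprod)
x0_rec = predict_x0_from_eps(x_t, 500, noise, alphas_cumprod)
```

The last two functions show why the two parameterizations are interchangeable: a network that predicts epsilon implies an x0 estimate through the same closed-form relation, and vice versa.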

The module is the InstructPix2Pix authors' modification of the original CompVis stable-diffusion implementation. It adds input channels to the first UNet layer so that the model can be conditioned on an input image.
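
One way to picture the extra input channels is the sketch below. It reduces the first convolution to a 1x1 convolution at a single pixel and uses plain Python lists. The zero initialization of the new channels is an assumption made for illustration (this page does not say how they are initialized), and the helper names and weight values are hypothetical.

```python
def extend_input_channels(weight, extra_in):
    """Append `extra_in` zero-initialized input channels to every output row."""
    return [row + [0.0] * extra_in for row in weight]

def pointwise_conv(weight, x):
    """1x1 convolution at one pixel: out[o] = sum over i of weight[o][i] * x[i]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

# Hypothetical pretrained first layer: 2 output channels, 4 latent input channels.
pretrained = [[0.1, -0.2, 0.3, 0.4],
              [0.5, 0.6, -0.7, 0.8]]
extended = extend_input_channels(pretrained, extra_in=4)

latent = [1.0, 2.0, 3.0, 4.0]        # noisy latent channels
image_cond = [9.0, -9.0, 9.0, -9.0]  # encoded input-image channels

before = pointwise_conv(pretrained, latent)
after = pointwise_conv(extended, latent + image_cond)
```

Under this assumption the extended layer initially ignores the image condition, so `after` matches `before` and the pretrained behaviour is preserved until fine-tuning updates the new weights.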

Usage

Use this module when working with InstructPix2Pix-style image editing workflows. The model is loaded when the user selects an InstructPix2Pix checkpoint, providing the core diffusion backbone for instruction-guided image modification.

Code Reference

Source Location

Signature

class DDPM(pl.LightningModule):
    def __init__(self, unet_config, timesteps=1000, beta_schedule="linear",
                 loss_type="l2", ckpt_path=None, ignore_keys=None, ...):
    def forward(self, x, *args, **kwargs):

class LatentDiffusion(DDPM):
    def __init__(self, first_stage_config, cond_stage_config,
                 num_timesteps_cond=None, cond_stage_key="image", ...):

class DiffusionWrapper(pl.LightningModule):
    def __init__(self, diff_model_config, conditioning_key):
    def forward(self, x, t, c_concat=None, c_crossattn=None, c_adm=None):

Import

from modules.models.diffusion.ddpm_edit import DDPM, LatentDiffusion, DiffusionWrapper

I/O Contract

Inputs

Name | Type | Required | Description
x | torch.Tensor | Yes | Input image tensor or latent representation with shape (N, C, H, W)
t | torch.Tensor | Yes | Diffusion timestep indices as a long tensor with shape (N,)
c_concat | list[torch.Tensor] | No | Image conditions concatenated to the UNet input along the channel dimension
c_crossattn | list[torch.Tensor] | No | Cross-attention text conditioning tensors
c_adm | torch.Tensor | No | Vector conditioning in the style of ADM (the Ablated Diffusion Model), for example a class embedding
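
The routing of these inputs can be sketched with stand-ins. Here a "tensor" is a plain list of channel or token labels, `fake_unet` is a hypothetical stub that only reports what it received, and `route` is an illustrative helper, not the module's API. The real wrapper is expected to concatenate tensors instead (the channel axis for `c_concat`, the token axis for `c_crossattn`), which is an assumption based on the upstream CompVis code. The `adm` path is left out.

```python
def concat_all(base, conds):
    """Stand-in for tensor concatenation: append every condition's entries to base."""
    out = list(base)
    for cond in conds:
        out.extend(cond)
    return out

def route(model, x, t, conditioning_key, c_concat=None, c_crossattn=None):
    """Dispatch conditioning to the model according to conditioning_key."""
    if conditioning_key is None:
        return model(x, t)
    if conditioning_key == "concat":
        return model(concat_all(x, c_concat), t)
    if conditioning_key == "crossattn":
        return model(x, t, context=concat_all([], c_crossattn))
    if conditioning_key == "hybrid":
        # Image condition joins the input; text condition goes to cross-attention.
        return model(concat_all(x, c_concat), t,
                     context=concat_all([], c_crossattn))
    raise NotImplementedError(conditioning_key)

def fake_unet(x, t, context=None):
    """Hypothetical stub: report the shapes a real UNet would see."""
    return {
        "in_channels": len(x),
        "t": t,
        "context_tokens": None if context is None else len(context),
    }

x_noisy = ["z0", "z1", "z2", "z3"]
img_cond = ["i0", "i1", "i2", "i3"]
text_cond = ["tok%d" % i for i in range(77)]

hybrid_out = route(fake_unet, x_noisy, 10, "hybrid",
                   c_concat=[img_cond], c_crossattn=[text_cond])
text_only_out = route(fake_unet, x_noisy, 10, "crossattn",
                      c_crossattn=[text_cond])
```

In the hybrid case the stub sees eight input channels (four latent plus four image-condition channels), which is why the first UNet layer needs the additional input channels.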

Outputs

Name | Type | Description
loss | torch.Tensor | Training loss: the simple loss plus the VLB loss, with the VLB term scaled by an optional ELBO weight
loss_dict | dict | Dictionary of named loss components for logging
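
How the two outputs relate can be shown with a small sketch on plain floats. The names `l_simple_weight`, `original_elbo_weight`, and `lvlb_weights`, their defaults, and the `train/...` dictionary keys are assumptions borrowed from the upstream CompVis code and are not listed on this page; `combine_losses` itself is a hypothetical helper.

```python
def combine_losses(per_sample_loss, lvlb_weights, t,
                   l_simple_weight=1.0, original_elbo_weight=0.0):
    """Combine a per-sample loss into the total loss and a logging dictionary."""
    n = len(per_sample_loss)
    # Simple loss: unweighted mean over the batch, scaled by l_simple_weight.
    loss_simple = l_simple_weight * sum(per_sample_loss) / n
    # VLB loss: each sample is weighted by the schedule-dependent weight of its timestep.
    loss_vlb = sum(lvlb_weights[ti] * l for ti, l in zip(t, per_sample_loss)) / n
    loss = loss_simple + original_elbo_weight * loss_vlb
    loss_dict = {
        "train/loss_simple": loss_simple,
        "train/loss_vlb": loss_vlb,
        "train/loss": loss,
    }
    return loss, loss_dict

# Two samples drawn at timesteps 0 and 1 of a toy two-step schedule.
loss, loss_dict = combine_losses([1.0, 3.0], lvlb_weights=[0.5, 2.0], t=[0, 1])
loss_elbo, _ = combine_losses([1.0, 3.0], lvlb_weights=[0.5, 2.0], t=[0, 1],
                              original_elbo_weight=0.1)
```

With an ELBO weight of zero the total reduces to the simple loss, while the VLB term is still available in `loss_dict` for logging.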

Usage Examples

from modules.models.diffusion.ddpm_edit import LatentDiffusion

# The model is typically instantiated from a config and checkpoint
# by the model loading infrastructure, not directly by user code.

# Example of how the DiffusionWrapper routes conditioning:
# wrapper = DiffusionWrapper(unet_config, conditioning_key="hybrid")
# output = wrapper(x_noisy, timesteps, c_concat=[img_cond], c_crossattn=[text_cond])

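# The LatentDiffusion class handles encoding/decoding:
# z = model.get_first_stage_encoding(model.encode_first_stage(image))
# c = model.get_learned_conditioning(prompt)
# With hybrid conditioning, pass both conditions as a dict of lists:
# cond = {"c_concat": [img_cond], "c_crossattn": [c]}
# noise_pred = model.apply_model(z_noisy, t, cond=cond)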
