Implementation:AUTOMATIC1111 Stable diffusion webui SD3 Inferencer

From Leeroopedia


Knowledge Sources
Domains: Stable_Diffusion_3, Inference
Last Updated: 2025-05-15 00:00 GMT

Overview

Implements the Stable Diffusion 3 inference pipeline, wrapping the SD3 diffusion model, VAE, and text encoders into a unified interface compatible with the WebUI's model handling.

Description

The SD3 Inferencer module provides two classes: SD3Denoiser and SD3Inferencer.

SD3Denoiser extends k-diffusion's DiscreteSchedule and forwards calls to the wrapped model's apply_model method, so that k-diffusion samplers can drive the SD3 model.

SD3Inferencer is a torch.nn.Module that assembles the complete SD3 pipeline. It initializes the base diffusion model with a configurable shift parameter, sets up the SD3 VAE for encoding and decoding in a 16-channel latent space, and configures the SD3 text conditioning system (SD3Cond). It also derives alphas_cumprod from the model's sigma schedule so that the existing sampling infrastructure keeps working. Its methods cover first-stage encoding and decoding (including latent format processing), learned conditioning, denoiser creation, noise addition (a flow-matching-style linear interpolation between latent and noise), dimension fixing (to multiples of 16), the list of fields that medvram mode uses for memory optimization, and Diffusers weight mapping for the joint transformer blocks.
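The three small numeric helpers mentioned above can be illustrated with a minimal sketch. This is not the module's code: the function bodies are assumptions inferred from the description. In particular, the alphas_cumprod formula assumes the usual k-diffusion relation sigma = sqrt((1 - alpha) / alpha), and rounding dimensions down (rather than up) is also an assumption. Plain Python numbers are used here; the same arithmetic applies elementwise to tensors.

```python
def alphas_cumprod_from_sigmas(sigmas):
    """Invert sigma = sqrt((1 - a) / a) to get a = 1 / (sigma^2 + 1).

    Assumption: the module follows this k-diffusion-style convention.
    """
    return [1.0 / (s * s + 1.0) for s in sigmas]


def add_noise_to_latent(x, noise, amount):
    """Linear interpolation between a latent and noise.

    amount=0 returns x unchanged; amount=1 returns pure noise.
    """
    return x * (1 - amount) + noise * amount


def fix_dimensions(width, height, multiple=16):
    """Snap both dimensions to a multiple of 16.

    Assumption: values are rounded DOWN via integer division.
    """
    return width // multiple * multiple, height // multiple * multiple


if __name__ == "__main__":
    print(alphas_cumprod_from_sigmas([0.0, 1.0]))  # [1.0, 0.5]
    print(add_noise_to_latent(2.0, 10.0, 0.25))    # 4.0
    print(fix_dimensions(1000, 520))               # (992, 512)
```

The interpolation form means the noise amount directly sets how much of the original latent survives, which is what makes it convenient for image-to-image style workflows.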

Usage

Use this module when loading and running inference with Stable Diffusion 3 models. It provides the model interface expected by the WebUI's sampling and generation pipeline.

Code Reference

Source Location

Signature

class SD3Denoiser(k_diffusion.external.DiscreteSchedule):
    def __init__(self, inner_model, sigmas) -> None
    def forward(self, input, sigma, **kwargs) -> torch.Tensor

class SD3Inferencer(torch.nn.Module):
    def __init__(self, state_dict, shift=3, use_ema=False) -> None
    @property
    def cond_stage_model(self) -> SD3Cond
    def before_load_weights(self, state_dict) -> None
    def ema_scope(self) -> contextlib.nullcontext
    def get_learned_conditioning(self, batch: list[str]) -> object
    def apply_model(self, x, t, cond) -> torch.Tensor
    def decode_first_stage(self, latent) -> torch.Tensor
    def encode_first_stage(self, image) -> torch.Tensor
    def get_first_stage_encoding(self, x) -> torch.Tensor
    def create_denoiser(self) -> SD3Denoiser
    def medvram_fields(self) -> list[tuple]
    def add_noise_to_latent(self, x, noise, amount) -> torch.Tensor
    def fix_dimensions(self, width, height) -> tuple[int, int]
    def diffusers_weight_mapping(self) -> Generator[tuple[str, str]]
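
The relationship between the two classes is a delegation pattern: the denoiser holds a model plus its sigma table and passes forward calls on to apply_model. The sketch below illustrates that pattern only. StubModel and DelegatingDenoiser are hypothetical names, and the sketch does not subclass k_diffusion.external.DiscreteSchedule, which in the real class would supply the sigma and timestep handling.

```python
class StubModel:
    """Hypothetical stand-in for SD3Inferencer; only apply_model is modelled."""

    def apply_model(self, x, t, cond):
        # The real method runs the diffusion model. This stub only checks that
        # both conditioning entries are present and echoes the input back.
        for key in ("crossattn", "vector"):
            if key not in cond:
                raise KeyError(f"missing conditioning entry: {key}")
        return x


class DelegatingDenoiser:
    """Sketch of the wrapper: keep a model and its sigmas, delegate forward."""

    def __init__(self, inner_model, sigmas):
        self.inner_model = inner_model
        self.sigmas = sigmas

    def forward(self, input, sigma, **kwargs):
        # Extra keyword arguments (here: cond) pass straight through.
        return self.inner_model.apply_model(input, sigma, **kwargs)

    # Make instances callable, as torch.nn.Module instances are.
    __call__ = forward


denoiser = DelegatingDenoiser(StubModel(), sigmas=[1.0, 0.5, 0.0])
out = denoiser([0.1, 0.2], 0.5, cond={"crossattn": None, "vector": None})
```

Keeping the wrapper this thin means sampler-facing concerns stay in the schedule base class while the model-facing call stays in apply_model.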

Import

from modules.models.sd3.sd3_model import SD3Inferencer

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| state_dict | dict | Yes | Model state dictionary containing pretrained weights |
| shift | int | No | Noise schedule shift parameter (default 3) |
| use_ema | bool | No | Whether to use exponential moving average weights (default False) |
| x | torch.Tensor | Yes | Input latent tensor for apply_model |
| t | torch.Tensor | Yes | Timestep or sigma values |
| cond | dict | Yes | Conditioning dictionary with "crossattn" and "vector" keys |
| batch | list[str] | Yes | List of text prompts for get_learned_conditioning |

Outputs

| Name | Type | Description |
|------|------|-------------|
| prediction | torch.Tensor | Model prediction (noise or denoised output) from apply_model |
| decoded | torch.Tensor | Decoded image tensor from decode_first_stage |
| encoded | torch.Tensor | Encoded latent tensor from encode_first_stage |
| denoiser | SD3Denoiser | A denoiser instance compatible with k-diffusion sampling |

Usage Examples

from modules.models.sd3.sd3_model import SD3Inferencer

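# Initialize with pretrained weights
# (`weights` is a placeholder for an SD3 checkpoint state dict loaded elsewhere)
model = SD3Inferencer(state_dict=weights, shift=3)

# Get text conditioning for a batch of prompts
cond = model.get_learned_conditioning(["a photo of a cat"])

# Create a k-diffusion-compatible denoiser for sampling
denoiser = model.create_denoiser()

# Encode an image to latent space
# (`image_tensor` is a placeholder for an input image batch)
latent = model.encode_first_stage(image_tensor)

# Decode the latent back to an image tensor
image = model.decode_first_stage(latent)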
