
Implementation:AUTOMATIC1111 Stable diffusion webui Decode latent batch

From Leeroopedia


Knowledge Sources
Domains Diffusion Models, Variational Autoencoders, Image Post-Processing
Last Updated 2026-02-08 00:00 GMT

Overview

A concrete tool, provided by the AUTOMATIC1111 stable-diffusion-webui repository, for decoding a batch of latent tensors from the diffusion model's latent space into pixel-space image tensors, with NaN detection and automatic precision recovery.

Description

decode_latent_batch() takes a batch of latent tensors and decodes them one-by-one through the Stable Diffusion model's VAE (first_stage_model). The function implements several safety features:

  1. Per-sample decoding -- Each sample in the batch is decoded individually (rather than as a full batch) to limit peak GPU memory usage.
  2. NaN detection -- When check_for_nans=True, both the incoming latent batch (the UNet output) and the decoded VAE output are tested for NaN values using devices.test_for_nans().
  3. Automatic precision recovery -- If NaN values are detected in the VAE output, the function attempts automatic recovery:
    • First checks whether auto_vae_precision_bfloat16 is enabled and tries bfloat16
    • Otherwise checks whether auto_vae_precision is enabled and tries float32
    • Converts the VAE to the new dtype and re-decodes the current sample
    • If the VAE is already at the target dtype, or no auto-fix is enabled, re-raises the NaN exception
  4. Device transfer -- Decoded samples can optionally be moved to a target device (typically the CPU) to free GPU memory.
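The control flow of the steps above can be sketched in plain Python. This is a hedged illustration, not the real implementation: vae_decode, the dtype strings, and decode_batch_sketch are stand-ins for the actual VAE call, torch dtypes, and function body.

```python
import math

# Stand-in for the precision fallbacks tried when the VAE output contains NaNs.
RECOVERY_DTYPES = ("bfloat16", "float32")

def decode_batch_sketch(vae_decode, batch, dtype="float16"):
    """Decode each latent sample individually, retrying at higher precision on NaN."""
    decoded = []
    for sample in batch:
        out = vae_decode(sample, dtype)
        if any(math.isnan(v) for v in out):       # NaN detection
            for fallback in RECOVERY_DTYPES:      # automatic precision recovery
                if fallback == dtype:
                    continue
                dtype = fallback                  # "convert the VAE" and retry
                out = vae_decode(sample, dtype)
                if not any(math.isnan(v) for v in out):
                    break
            else:
                raise ValueError("VAE produced NaNs even after precision recovery")
        decoded.append(out)                       # device transfer would happen here
    return decoded

# Demo with a fake decoder that only succeeds at float32:
def fake_decode(sample, dtype):
    return [0.5] if dtype == "float32" else [float("nan")]

result = decode_batch_sketch(fake_decode, [[0.0], [1.0]])
print(result)  # [[0.5], [0.5]]
```

Note that once the sketch upgrades the dtype for one sample, later samples in the batch are decoded at the recovered precision, mirroring the fact that the real function converts the VAE itself.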

The function returns a DecodedSamples object (a list subclass with already_decoded = True), which signals to downstream code that the samples are in pixel space rather than latent space.
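A minimal sketch of that return type: DecodedSamples is simply a list subclass carrying a class-level flag that downstream code checks instead of re-running the VAE.

```python
# Sketch of the DecodedSamples return type described above.
class DecodedSamples(list):
    already_decoded = True

samples = DecodedSamples(["pixel_tensor_0", "pixel_tensor_1"])

# Downstream code branches on the flag rather than inspecting tensor shapes:
if getattr(samples, "already_decoded", False):
    print("samples are pixel-space; skip VAE decoding")
```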

The complementary save_image() function in modules/images.py handles saving decoded images to disk, with full metadata embedding, filename generation, and format-specific handling.

Usage

This function is called at the end of the sampling pipeline:

  • In StableDiffusionProcessingTxt2Img.sample() after the first pass (when hires fix is needed for pixel-space upscaling)
  • In sample_hr_pass() after the second denoising pass
  • In process_images_inner() as the final decoding step before image saving

Code Reference

Source Location

  • Repository: stable-diffusion-webui
  • File: modules/processing.py
  • Lines: 625-672
  • Image saving: modules/images.py, Lines: 624-700

Signature

def decode_latent_batch(model, batch, target_device=None, check_for_nans=False):
    """
    Decode a batch of latent tensors into pixel-space image tensors.

    Args:
        model: The Stable Diffusion model containing the VAE (first_stage_model).
        batch: Latent tensor of shape (B, 4, H/8, W/8) to decode.
        target_device: Optional device to move decoded samples to (e.g., torch.device('cpu')).
        check_for_nans: If True, test for NaN values in both UNet output and VAE output,
                        with automatic precision recovery.

    Returns:
        DecodedSamples: A list of decoded image tensors, each of shape (3, H, W).
    """

Image saving signature:

def save_image(image, path, basename, seed=None, prompt=None, extension='png',
               info=None, short_filename=False, no_prompt=False, grid=False,
               pnginfo_section_name='parameters', p=None, existing_info=None,
               forced_filename=None, suffix="", save_to_dirs=None):
    """
    Save an image to disk with generation metadata.

    Args:
        image: PIL.Image to save.
        path: Directory path for saving.
        basename: Base filename for the pattern.
        seed: Generation seed for filename.
        prompt: Generation prompt for filename.
        extension: Image format extension (default 'png').
        info: Generation info string to embed as metadata.
        p: StableDiffusionProcessing instance for filename generation.
        forced_filename: Override filename (ignores basename and pattern).
        save_to_dirs: Whether to save into a subdirectory.

    Returns:
        (fullfn, txt_fullfn): Tuple of saved image path and optional text file path.
    """

Import

from modules.processing import decode_latent_batch, DecodedSamples
from modules.images import save_image

I/O Contract

Inputs

  • model (WebuiSdModel, required) -- The loaded Stable Diffusion model containing the VAE decoder as model.first_stage_model.
  • batch (torch.Tensor, required) -- Latent tensor of shape (B, 4, H/8, W/8), where B is the batch size.
  • target_device (torch.device or None, optional) -- Device to transfer decoded samples to; typically torch.device('cpu') to free GPU memory. None keeps samples on the current device.
  • check_for_nans (bool, optional) -- Whether to test for NaN values and attempt automatic precision recovery. Default False.

Outputs

  • return (DecodedSamples) -- A list of decoded image tensors, each of shape (3, H, W), holding pixel values. The list carries the attribute already_decoded = True to indicate pixel-space representation.

Usage Examples

Basic Usage

from modules.processing import decode_latent_batch
import modules.shared as shared
import modules.devices as devices

# Assume `latent_samples` is a tensor of shape (1, 4, 64, 64)
# from the sampling step (representing a 512x512 image)

decoded = decode_latent_batch(
    model=shared.sd_model,
    batch=latent_samples,
    target_device=devices.cpu,
    check_for_nans=True,
)
# decoded is a DecodedSamples list with one tensor of shape (3, 512, 512)
# decoded.already_decoded == True

Saving with Metadata

import torch
from modules.images import save_image
from PIL import Image
import numpy as np

# Convert decoded tensor to PIL Image
sample = decoded[0]
sample = torch.clamp((sample + 1.0) / 2.0, min=0.0, max=1.0)
sample = 255.0 * sample.cpu().numpy().transpose(1, 2, 0)
image = Image.fromarray(sample.astype(np.uint8))

# Save with generation metadata
fullfn, txt_fullfn = save_image(
    image=image,
    path="/output/txt2img-images",
    basename="",
    seed=42,
    prompt="a beautiful sunset",
    extension="png",
    info="a beautiful sunset\nNegative prompt: blurry\nSteps: 30, Sampler: DPM++ 2M, CFG scale: 7",
)
# fullfn: "/output/txt2img-images/00001-42-a beautiful sunset.png"
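The conversion above assumes the decoded tensor holds values in [-1, 1]; the clamp, shift, scale, and uint8 cast map that range to 8-bit pixels. A pure-Python spot check of the same arithmetic (to_uint8 is an illustrative helper, not part of the webui API):

```python
# Spot-check of the [-1, 1] -> [0, 255] mapping used above:
# clamp to [-1, 1], shift/scale to [0, 1], scale to [0, 255], truncate.
def to_uint8(v):
    clamped = min(max((v + 1.0) / 2.0, 0.0), 1.0)
    return int(255.0 * clamped)

print(to_uint8(-1.0), to_uint8(0.0), to_uint8(1.0))  # 0 127 255
```

The midpoint lands on 127 rather than 128 because 255 * 0.5 = 127.5 truncates, just as numpy's astype(np.uint8) does in the snippet above.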

