Implementation: AUTOMATIC1111 stable-diffusion-webui `decode_latent_batch`
| Knowledge Sources | |
|---|---|
| Domains | Diffusion Models, Variational Autoencoders, Image Post-Processing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for decoding a batch of latent tensors from the diffusion model's latent space into pixel-space image tensors, with NaN detection and automatic precision recovery, provided by the AUTOMATIC1111 stable-diffusion-webui repository.
Description
decode_latent_batch() takes a batch of latent tensors and decodes them one-by-one through the Stable Diffusion model's VAE (first_stage_model). The function implements several safety features:
- Per-sample decoding -- Each sample in the batch is decoded individually (rather than as a full batch) to manage GPU memory usage.
- NaN detection -- When `check_for_nans=True`, both the incoming latent (the UNet/sampler output) and the decoded VAE output are tested for NaN values using `devices.test_for_nans()`.
- Automatic precision recovery -- If NaN values are detected in the VAE output, the function attempts automatic recovery:
  - First checks whether `auto_vae_precision_bfloat16` is enabled and, if so, tries bfloat16
  - Otherwise checks whether `auto_vae_precision` is enabled and, if so, tries float32
  - Converts the VAE to the new dtype and re-decodes the current sample
  - If the VAE is already at the target dtype, or no auto-fix is enabled, re-raises the NaN exception
- Device transfer -- Decoded samples can optionally be moved to a target device (typically CPU) to free GPU memory.
The function returns a DecodedSamples object (a list subclass with already_decoded = True), which signals to downstream code that the samples are in pixel space rather than latent space.
The complementary save_image() function at modules/images.py handles saving decoded images to disk with full metadata embedding, filename generation, and format-specific handling.
Usage
This function is called at the end of the sampling pipeline:
- In `StableDiffusionProcessingTxt2Img.sample()` after the first pass, when the hires fix requires pixel-space upscaling
- In `sample_hr_pass()` after the second denoising pass
- In `process_images_inner()` as the final decoding step before image saving
Code Reference
Source Location
- Repository: stable-diffusion-webui
- File: `modules/processing.py`, Lines: 625-672
- Image saving: `modules/images.py`, Lines: 624-700
Signature
```python
def decode_latent_batch(model, batch, target_device=None, check_for_nans=False):
    """
    Decode a batch of latent tensors into pixel-space image tensors.

    Args:
        model: The Stable Diffusion model containing the VAE (first_stage_model).
        batch: Latent tensor of shape (B, 4, H/8, W/8) to decode.
        target_device: Optional device to move decoded samples to (e.g., torch.device('cpu')).
        check_for_nans: If True, test for NaN values in both UNet output and VAE output,
            with automatic precision recovery.

    Returns:
        DecodedSamples: A list of decoded image tensors, each of shape (3, H, W).
    """
```
Image saving signature:
```python
def save_image(image, path, basename, seed=None, prompt=None, extension='png',
               info=None, short_filename=False, no_prompt=False, grid=False,
               pnginfo_section_name='parameters', p=None, existing_info=None,
               forced_filename=None, suffix="", save_to_dirs=None):
    """
    Save an image to disk with generation metadata.

    Args:
        image: PIL.Image to save.
        path: Directory path for saving.
        basename: Base filename for the pattern.
        seed: Generation seed for filename.
        prompt: Generation prompt for filename.
        extension: Image format extension (default 'png').
        info: Generation info string to embed as metadata.
        p: StableDiffusionProcessing instance for filename generation.
        forced_filename: Override filename (ignores basename and pattern).
        save_to_dirs: Whether to save into a subdirectory.

    Returns:
        (fullfn, txt_fullfn): Tuple of saved image path and optional text file path.
    """
```
Import
```python
from modules.processing import decode_latent_batch, DecodedSamples
from modules.images import save_image
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | WebuiSdModel | Yes | The loaded Stable Diffusion model containing the VAE decoder as `model.first_stage_model` |
| batch | torch.Tensor | Yes | Latent tensor of shape (B, 4, H/8, W/8) where B is batch size |
| target_device | torch.device or None | No | Device to transfer decoded samples to; typically `torch.device('cpu')` to free GPU memory. `None` keeps samples on the current device. |
| check_for_nans | bool | No | Whether to test for NaN values and attempt automatic precision recovery. Default `False`. |
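As a quick sanity check on the shape contract, the latent spatial dimensions are the target image dimensions divided by the VAE's 8x downscale factor:

```python
# Latent spatial dims are the image dims divided by the VAE downscale factor (8)
batch, width, height = 1, 512, 512
latent_shape = (batch, 4, height // 8, width // 8)
print(latent_shape)  # (1, 4, 64, 64)
```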
Outputs
| Name | Type | Description |
|---|---|---|
| return | DecodedSamples | A list of decoded image tensors, each of shape (3, H, W) with pixel values. The list has attribute `already_decoded = True` to indicate pixel-space representation. |
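Downstream code can branch on the `already_decoded` flag to avoid sending pixel-space samples through the VAE a second time. A minimal sketch of that contract, where `needs_decode` is a hypothetical helper (not a webui function) and `DecodedSamples` is a stand-in matching the documented shape:

```python
class DecodedSamples(list):
    """Stand-in matching the documented contract: a list subclass with a flag."""
    already_decoded = True

def needs_decode(samples):
    # Plain lists/tensors of latents lack the flag and still need the VAE
    return not getattr(samples, "already_decoded", False)

print(needs_decode([0.1, 0.2]))             # True: raw latents
print(needs_decode(DecodedSamples([0.1])))  # False: already pixel space
```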
Usage Examples
Basic Usage
```python
from modules.processing import decode_latent_batch
import modules.shared as shared
import modules.devices as devices

# Assume `latent_samples` is a tensor of shape (1, 4, 64, 64)
# from the sampling step (representing a 512x512 image)
decoded = decode_latent_batch(
    model=shared.sd_model,
    batch=latent_samples,
    target_device=devices.cpu,
    check_for_nans=True,
)

# decoded is a DecodedSamples list with one tensor of shape (3, 512, 512)
# decoded.already_decoded == True
```
Saving with Metadata
```python
import numpy as np
import torch
from PIL import Image

from modules.images import save_image

# Convert decoded tensor to PIL Image
sample = decoded[0]
sample = torch.clamp((sample + 1.0) / 2.0, min=0.0, max=1.0)
sample = 255.0 * sample.cpu().numpy().transpose(1, 2, 0)
image = Image.fromarray(sample.astype(np.uint8))

# Save with generation metadata
fullfn, txt_fullfn = save_image(
    image=image,
    path="/output/txt2img-images",
    basename="",
    seed=42,
    prompt="a beautiful sunset",
    extension="png",
    info="a beautiful sunset\nNegative prompt: blurry\nSteps: 30, Sampler: DPM++ 2M, CFG scale: 7",
)

# fullfn: "/output/txt2img-images/00001-42-a beautiful sunset.png"
```
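The clamp-and-scale step above maps VAE output from [-1, 1] to 8-bit pixel values. A scalar version of that arithmetic, for reference:

```python
def to_uint8(x):
    """Map a VAE output value in [-1, 1] to an 8-bit pixel value."""
    x = min(max((x + 1.0) / 2.0, 0.0), 1.0)  # same effect as torch.clamp
    return int(255.0 * x)

print(to_uint8(-1.0), to_uint8(0.0), to_uint8(1.0))  # 0 127 255
```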