Implementation:AUTOMATIC1111 Stable diffusion webui Decode latent batch for img2img

Knowledge Sources	stable-diffusion-webui
Domains	Variational Autoencoders, Image Generation, Inpainting, Image Processing
Last Updated	2026-02-08 00:00 GMT

Overview

Concrete tool for decoding latent tensors back to pixel space with NaN detection and automatic dtype recovery, followed by mask overlay compositing and color correction for img2img output assembly, provided by the AUTOMATIC1111 stable-diffusion-webui repository.

Description

The decode_latent_batch() function decodes a batch of latent tensors one sample at a time through the VAE decoder, with built-in NaN detection and automatic dtype recovery. After decoding, the process_images() function applies a multi-stage post-processing pipeline that includes face restoration, color correction, and overlay compositing specifically designed for img2img and inpainting workflows.

decode_latent_batch(): This function iterates over the batch dimension, decoding each latent sample individually via decode_first_stage(). When check_for_nans is True, each decoded sample is tested for NaN values. If NaNs are found and an auto-correction setting is enabled, the VAE is converted to bfloat16 or float32 (depending on which setting is active) and that sample is decoded again; if no auto-correction is enabled, the NaN error propagates to the caller. Each decoded sample is then optionally moved to target_device (typically the CPU) to free GPU memory. The function returns a DecodedSamples object, a list subclass that carries an already_decoded flag.
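
The detect-and-retry control flow can be illustrated without a GPU. The following is a minimal pure-Python sketch, not the repository's code: toy_decode, NansError, decode_batch, and the string dtype labels are hypothetical stand-ins for the VAE decoder, the NaN exception, and torch dtypes.

```python
import math


class NansError(Exception):
    """Hypothetical stand-in for the error raised when a sample contains NaNs."""


def has_nans(values):
    return any(math.isnan(v) for v in values)


def toy_decode(latent, dtype):
    # Hypothetical decoder: pretends that half precision overflows to NaN.
    if dtype == "float16":
        return [float("nan")] * len(latent)
    return [v * 0.5 for v in latent]


def decode_batch(batch, dtype="float16", fallback_dtype="float32", check_for_nans=True):
    """Decode one sample at a time; on NaNs, switch dtype once and retry that sample."""
    decoded = []
    for latent in batch:
        sample = toy_decode(latent, dtype)
        if check_for_nans and has_nans(sample):
            if fallback_dtype is None or dtype == fallback_dtype:
                raise NansError("NaNs in decoded sample and no fallback dtype left")
            dtype = fallback_dtype  # the switch persists for the remaining samples
            sample = toy_decode(latent, dtype)
        decoded.append(sample)
    return decoded, dtype


samples, final_dtype = decode_batch([[1.0, 2.0], [4.0, 6.0]])
print(samples, final_dtype)
```

In this sketch the dtype switch happens on the first failing sample and stays in effect for the rest of the batch, so only one sample is decoded twice.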

Post-processing pipeline (in process_images()): After decoding, the following steps are applied per-image:

Pixel conversion: Decoded tensors are rescaled from [-1,1] to [0,1], clamped, converted to uint8 numpy arrays, and wrapped as PIL Images.
Face restoration: If enabled, modules.face_restoration.restore_faces() is applied to improve facial features.
Script postprocessing: postprocess_image() callbacks allow extensions to modify each image.
Mask overlay handling: For inpainting, the mask and overlay image are retrieved and optionally modified by postprocess_maskoverlay() callbacks.
Color correction: If color corrections were captured during init(), apply_color_correction() matches the LAB histogram of the generated image to the source using skimage.exposure.match_histograms(), then applies luminosity blending via blendLayers().
Overlay compositing: apply_overlay() handles un-cropping (for inpaint-full-res mode) and alpha-compositing the overlay image (containing original unmasked regions) over the generated image using Image.alpha_composite().
Post-composite script hook: postprocess_image_after_composite() allows final modifications.
Mask output: Optionally, the mask image and a mask composite visualization are generated and added to the output.
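
To see why overlay compositing preserves the original unmasked regions, it helps to look at the "over" operator for a single pixel. The sketch below is an illustration under stated assumptions: composite_over_opaque is a hypothetical helper, the generated pixel is treated as fully opaque, and the rounding may differ slightly from Pillow's integer arithmetic.

```python
def composite_over_opaque(generated_rgb, overlay_rgba):
    """Blend one RGBA overlay pixel over one opaque RGB pixel (8-bit channels)."""
    r, g, b, a = overlay_rgba
    alpha = a / 255.0
    return tuple(
        round(o * alpha + gen * (1.0 - alpha))
        for o, gen in zip((r, g, b), generated_rgb)
    )


generated = (10, 200, 30)

# Unmasked region: the overlay is opaque, so the original pixel wins.
kept_original = composite_over_opaque(generated, (255, 0, 0, 255))

# Masked region: the overlay is transparent, so the generated pixel shows through.
kept_generated = composite_over_opaque(generated, (0, 0, 0, 0))

print(kept_original, kept_generated)
```

Partial alpha values, for example along a blurred mask edge, produce a weighted mix of the two pixels, which is where inpainting seams are usually smoothed or introduced.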

Usage

decode_latent_batch() is called by process_images() after sampling completes. The post-processing pipeline runs automatically for each generated image. Understanding this pipeline is important for:

Debugging color shifts between source and generated images (check color correction settings)
Diagnosing inpainting seam artifacts (check overlay compositing and mask blur)
Understanding NaN-related generation failures (check VAE dtype auto-correction)
Extending the pipeline via script hooks

Code Reference

Source Location

Repository: stable-diffusion-webui
File: modules/processing.py
Lines: 625-672 (decode_latent_batch), 995-1007 (decode invocation and normalization), 1046-1106 (post-processing with overlay and color correction)
Also: modules/processing.py lines 49-62 (apply_color_correction), 65-72 (uncrop), 75-88 (apply_overlay)

Signature

def decode_latent_batch(
    model,
    batch,
    target_device=None,
    check_for_nans=False
) -> DecodedSamples:

def apply_color_correction(correction, original_image) -> PIL.Image:

def apply_overlay(image, paste_loc, overlay) -> tuple[PIL.Image, PIL.Image]:

Import

from modules.processing import decode_latent_batch, DecodedSamples
from modules.processing import apply_color_correction, apply_overlay, uncrop

I/O Contract

Inputs

decode_latent_batch:

Name	Type	Required	Description
model	StableDiffusionModel	Yes	The Stable Diffusion model containing the VAE first_stage_model
batch	torch.Tensor	Yes	Latent tensor batch, shape [B, C, H/8, W/8]
target_device	torch.device	No	Device to move decoded samples to (default None, stays on current device)
check_for_nans	bool	No	Whether to check for NaN values and attempt auto-correction (default False)

apply_color_correction:

Name	Type	Required	Description
correction	np.ndarray	Yes	LAB-space histogram target captured from source image via setup_color_correction()
original_image	PIL.Image	Yes	The generated image to apply color correction to

apply_overlay:

Name	Type	Required	Description
image	PIL.Image	Yes	The generated image
paste_loc	tuple or None	No	(x, y, w, h) paste coordinates for inpaint-full-res un-cropping
overlay	PIL.Image or None	No	RGBA overlay image with original unmasked regions
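
The meaning of paste_loc can be illustrated with a toy un-crop on nested lists. This is a sketch only: uncrop_sketch is a hypothetical helper, it assumes the crop is already w x h and fits inside the canvas, and the actual uncrop() operates on PIL images.

```python
def uncrop_sketch(crop, dest_size, paste_loc, fill=None):
    """Place a 2D grid `crop` on a blank canvas of dest_size at paste_loc = (x, y, w, h)."""
    dest_w, dest_h = dest_size
    x, y, w, h = paste_loc
    if len(crop) != h or any(len(row) != w for row in crop):
        raise ValueError("crop must already be w x h")
    canvas = [[fill] * dest_w for _ in range(dest_h)]
    for row_idx, row in enumerate(crop):
        canvas[y + row_idx][x:x + w] = row  # same-length slice assignment
    return canvas


crop = [["G", "G"], ["G", "G"]]  # generated 2x2 region
canvas = uncrop_sketch(crop, dest_size=(4, 3), paste_loc=(1, 1, 2, 2))
for row in canvas:
    print(row)
```

Cells left at the fill value correspond to the area outside the inpainted crop, which the overlay covers during compositing.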

Outputs

decode_latent_batch:

Name	Type	Description
samples	DecodedSamples	List of decoded sample tensors, each shape [3, H, W], nominally in range [-1, 1] (values are clamped later, during pixel conversion). Has already_decoded=True attribute.
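
A list subclass with a class-level flag lets callers tell decoded samples from raw latents without a separate return value. The following is an illustrative sketch: DecodedSamplesSketch and needs_decoding are hypothetical names, and the getattr check is an assumption about how a caller might use the flag.

```python
class DecodedSamplesSketch(list):
    """Illustrative stand-in: a list whose class-level flag marks its items as decoded."""
    already_decoded = True


def needs_decoding(samples):
    # Plain lists lack the flag, so getattr falls back to False.
    return not getattr(samples, "already_decoded", False)


decoded = DecodedSamplesSketch(["img0", "img1"])
raw_latents = ["latent0", "latent1"]

print(needs_decoding(decoded), needs_decoding(raw_latents))
```

Because the object is still a list, indexing, iteration, and len() work unchanged.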

apply_color_correction:

Name	Type	Description
image	PIL.Image	Color-corrected RGB image with matched LAB histogram

apply_overlay:

Name	Type	Description
image	PIL.Image	Final composited image with overlay applied
original_denoised_image	PIL.Image	Copy of the image before overlay compositing (for mask composite visualization)

Usage Examples

Basic Usage

import numpy as np
import torch
from PIL import Image

from modules import shared
from modules.processing import decode_latent_batch

# Decode latent samples after img2img sampling.
# latent_samples is assumed to be the sampler output, e.g. shape [1, 4, 64, 64].
decoded_samples = decode_latent_batch(
    model=shared.sd_model,
    batch=latent_samples,
    target_device=torch.device('cpu'),
    check_for_nans=True,
)

# decoded_samples is a list of tensors, each shape [3, 512, 512].
# Convert the first one to a PIL image:
sample = decoded_samples[0].float()  # cast up first: NumPy cannot represent bfloat16
x = torch.clamp((sample + 1.0) / 2.0, min=0.0, max=1.0)  # [-1, 1] -> [0, 1]
x = 255.0 * np.moveaxis(x.cpu().numpy(), 0, 2)  # CHW -> HWC
image = Image.fromarray(x.astype(np.uint8))

Overlay Compositing for Inpainting

from PIL import Image

from modules.processing import apply_overlay, apply_color_correction

# Assumed to exist in the calling scope (as in process_images()):
#   image, color_corrections, i, paste_loc, overlay, mask_for_overlay

# Apply color correction to match source image colors
if color_corrections is not None:
    image = apply_color_correction(color_corrections[i], image)

# Composite original unmasked regions over generated image
# paste_loc is (x, y, w, h) from inpaint-full-res cropping
# overlay is RGBA image with transparency in masked regions
final_image, original_denoised = apply_overlay(image, paste_loc, overlay)

# For mask composite visualization.
# Use final_image.size: after un-cropping, the result can be larger than the input image.
# mask_for_overlay is assumed to have the same size as final_image.
mask_composite = Image.composite(
    original_denoised.convert('RGBA').convert('RGBa'),
    Image.new('RGBa', final_image.size),
    mask_for_overlay.convert('L')
).convert('RGBA')

Related Pages

Implements Principle

Principle:AUTOMATIC1111_Stable_diffusion_webui_VAE_decoding_and_output_composition
