Implementation:AUTOMATIC1111 Stable diffusion webui Decode latent batch for img2img
| Knowledge Sources | |
|---|---|
| Domains | Variational Autoencoders, Image Generation, Inpainting, Image Processing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for decoding latent tensors back to pixel space with NaN detection and automatic dtype recovery, followed by mask overlay compositing and color correction for img2img output assembly, provided by the AUTOMATIC1111 stable-diffusion-webui repository.
Description
The decode_latent_batch() function decodes a batch of latent tensors one sample at a time through the VAE decoder, with built-in NaN detection and automatic dtype recovery. After decoding, the process_images() function applies a multi-stage post-processing pipeline that includes face restoration, color correction, and overlay compositing specifically designed for img2img and inpainting workflows.
decode_latent_batch(): This function iterates over the batch dimension, decoding each latent sample individually via decode_first_stage(). For each decoded sample, if check_for_nans is True, it tests for NaN values. If NaNs are detected and auto-correction is enabled, the VAE is automatically converted to either bfloat16 or float32, and the decode is retried. Successfully decoded samples are optionally moved to a target device (typically CPU) to free GPU memory. The function returns a DecodedSamples object (a list subclass with an already_decoded flag).
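The decode, check, and retry flow described above can be illustrated without PyTorch. This is a minimal sketch under stated assumptions, not the repository's code: `toy_decode`, `check_for_nans`, `decode_batch_with_retry`, `NansError`, and the string precision names are all hypothetical stand-ins, and the toy decoder simply returns NaN for large inputs at "float16" to imitate an overflow.

```python
import math


class NansError(Exception):
    """Raised when a decoded sample contains NaN (hypothetical stand-in)."""


def toy_decode(latent, precision):
    # Hypothetical decoder: pretends that half precision overflows to NaN
    # for large-magnitude latents, while wider precisions decode normally.
    if precision == "float16" and any(abs(v) > 4.0 for v in latent):
        return [float("nan")] * len(latent)
    return [math.tanh(v) for v in latent]


def check_for_nans(sample):
    if any(math.isnan(v) for v in sample):
        raise NansError("decoded sample contains NaN")


def decode_batch_with_retry(batch, precision="float16", fallback=None):
    """Decode one sample at a time; on NaN, switch precision once and retry.

    Returns (decoded_samples, precision_in_use).
    """
    decoded = []
    for latent in batch:
        sample = toy_decode(latent, precision)
        try:
            check_for_nans(sample)
        except NansError:
            # No fallback configured, or already running at the fallback:
            # nothing left to try, so let the error propagate.
            if fallback is None or fallback == precision:
                raise
            precision = fallback  # the switch persists for later samples
            sample = toy_decode(latent, precision)
        decoded.append(sample)
    return decoded, precision
```

In this sketch the precision switch is sticky: once one sample forces the fallback, the remaining samples in the batch are decoded at the fallback precision as well, so the failure is paid for at most once per batch.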
Post-processing pipeline (in process_images()): After decoding, the following steps are applied per-image:
- Pixel conversion: Decoded tensors are rescaled from [-1,1] to [0,1], clamped, converted to uint8 numpy arrays, and wrapped as PIL Images.
- Face restoration: If enabled, modules.face_restoration.restore_faces() is applied to improve facial features.
- Script postprocessing: postprocess_image() callbacks allow extensions to modify each image.
- Mask overlay handling: For inpainting, the mask and overlay image are retrieved and optionally modified by postprocess_maskoverlay() callbacks.
- Color correction: If color corrections were captured during init(), apply_color_correction() matches the LAB histogram of the generated image to the source using skimage.exposure.match_histograms(), then applies luminosity blending via blendLayers().
- Overlay compositing: apply_overlay() handles un-cropping (for inpaint-full-res mode) and alpha-compositing the overlay image (containing original unmasked regions) over the generated image using Image.alpha_composite().
- Post-composite script hook: postprocess_image_after_composite() allows final modifications.
- Mask output: Optionally, the mask image and a mask composite visualization are generated and added to the output.
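The idea behind the histogram matching used in the color correction step can be shown on a single channel. The following is a simplified pure-Python sketch, not the skimage implementation: `match_histogram` is a hypothetical name, it works on a flat list of values (for example one LAB channel), and it ignores the interpolation details and the later luminosity blend.

```python
from bisect import bisect_right


def match_histogram(source, reference):
    """Remap `source` values so their distribution follows `reference`.

    Both arguments are flat lists of channel values. Each source value is
    replaced by the reference value found at the same quantile.
    """
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n_src, n_ref = len(src_sorted), len(ref_sorted)
    matched = []
    for value in source:
        # Fraction of source values that are <= this value (its CDF position).
        quantile = bisect_right(src_sorted, value) / n_src
        # Pick the reference value sitting at the same CDF position.
        index = min(n_ref - 1, max(0, round(quantile * n_ref) - 1))
        matched.append(ref_sorted[index])
    return matched
```

Because the mapping is driven by rank rather than absolute value, the relative ordering of the generated pixels is preserved while their overall distribution moves toward that of the reference.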
Usage
decode_latent_batch() is called by process_images() after sampling completes. The post-processing pipeline runs automatically for each generated image. Understanding this pipeline is important for:
- Debugging color shifts between source and generated images (check color correction settings)
- Diagnosing inpainting seam artifacts (check overlay compositing and mask blur)
- Understanding NaN-related generation failures (check VAE dtype auto-correction)
- Extending the pipeline via script hooks
Code Reference
Source Location
- Repository: stable-diffusion-webui
- File: modules/processing.py
- Lines: 625-672 (decode_latent_batch), 995-1007 (decode invocation and normalization), 1046-1106 (post-processing with overlay and color correction)
- Also: modules/processing.py lines 49-62 (apply_color_correction), 65-72 (uncrop), 75-88 (apply_overlay)
Signature
def decode_latent_batch(
    model,
    batch,
    target_device=None,
    check_for_nans=False
) -> DecodedSamples:

def apply_color_correction(correction, original_image) -> PIL.Image:

def apply_overlay(image, paste_loc, overlay) -> tuple[PIL.Image, PIL.Image]:
Import
from modules.processing import decode_latent_batch, DecodedSamples
from modules.processing import apply_color_correction, apply_overlay, uncrop
I/O Contract
Inputs
decode_latent_batch:
| Name | Type | Required | Description |
|---|---|---|---|
| model | StableDiffusionModel | Yes | The Stable Diffusion model containing the VAE first_stage_model |
| batch | torch.Tensor | Yes | Latent tensor batch, shape [B, C, H/8, W/8] |
| target_device | torch.device | No | Device to move decoded samples to (default None, stays on current device) |
| check_for_nans | bool | No | Whether to check for NaN values and attempt auto-correction (default False) |
apply_color_correction:
| Name | Type | Required | Description |
|---|---|---|---|
| correction | np.ndarray | Yes | LAB-space reference array captured from the source image via setup_color_correction(); used as the histogram-matching target |
| original_image | PIL.Image | Yes | The generated image to apply color correction to |
apply_overlay:
| Name | Type | Required | Description |
|---|---|---|---|
| image | PIL.Image | Yes | The generated image |
| paste_loc | tuple or None | No | (x, y, w, h) paste coordinates for inpaint-full-res un-cropping |
| overlay | PIL.Image or None | No | RGBA overlay image with original unmasked regions |
Outputs
decode_latent_batch:
| Name | Type | Description |
|---|---|---|
| samples | DecodedSamples | List of decoded sample tensors, each shape [3, H, W] in range [-1, 1]. Has already_decoded=True attribute. |
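A list subclass with a marker flag lets downstream code tell pixel-space samples apart from latents without inspecting tensor shapes. The sketch below illustrates that pattern only; `ensure_decoded` and `decode_fn` are hypothetical names, and the flag is assumed here to be a class attribute.

```python
class DecodedSamples(list):
    """A plain list whose flag marks its items as already in pixel space."""
    already_decoded = True  # assumption: defined at class level


def ensure_decoded(samples, decode_fn):
    # Lists carrying the marker are passed through untouched; anything else
    # is treated as latents and decoded item by item.
    if getattr(samples, "already_decoded", False):
        return samples
    decoded = DecodedSamples()
    for item in samples:
        decoded.append(decode_fn(item))
    return decoded
```

Using `getattr` with a default keeps the check safe for ordinary lists and tensors, which have no such attribute.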
apply_color_correction:
| Name | Type | Description |
|---|---|---|
| image | PIL.Image | Color-corrected RGB image with matched LAB histogram |
apply_overlay:
| Name | Type | Description |
|---|---|---|
| image | PIL.Image | Final composited image with overlay applied |
| original_denoised_image | PIL.Image | Copy of the image before overlay compositing (for mask composite visualization) |
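The compositing step amounts to the standard "over" operator applied with the overlay's alpha channel. The following pure-Python sketch shows the arithmetic on an opaque background; `composite_over` and `composite_image` are hypothetical names, images are represented as nested lists of pixel tuples, and Pillow's internal integer rounding may differ slightly from the floating-point rounding used here.

```python
def composite_over(generated_rgb, overlay_rgba):
    """Blend one RGBA overlay pixel over one opaque RGB pixel."""
    r, g, b, a = overlay_rgba
    alpha = a / 255.0
    return tuple(
        round(o * alpha + p * (1.0 - alpha))
        for o, p in zip((r, g, b), generated_rgb)
    )


def composite_image(generated, overlay):
    # Both images are lists of rows of pixel tuples with identical dimensions.
    return [
        [composite_over(g_px, o_px) for g_px, o_px in zip(g_row, o_row)]
        for g_row, o_row in zip(generated, overlay)
    ]
```

Where the overlay is fully opaque (the original unmasked regions) the source pixel wins, and where it is fully transparent (the masked region) the generated pixel shows through; partial alpha from mask blur yields a gradual transition between the two.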
Usage Examples
Basic Usage
import numpy as np
import torch
from PIL import Image

from modules import shared
from modules.processing import decode_latent_batch

# Decode latent samples after img2img sampling.
# latent_samples is assumed to come from the sampler, shape [1, 4, 64, 64].
decoded_samples = decode_latent_batch(
    model=shared.sd_model,
    batch=latent_samples,
    target_device=torch.device('cpu'),
    check_for_nans=True,
)

# decoded_samples is a list of tensors, each shape [3, 512, 512] in [-1, 1].
# Convert the first one to a PIL image:
sample = decoded_samples[0].float()  # float32, so .numpy() also works for bfloat16
x = torch.clamp((sample + 1.0) / 2.0, min=0.0, max=1.0)
x = 255.0 * np.moveaxis(x.cpu().numpy(), 0, 2)  # CHW -> HWC
image = Image.fromarray(x.astype(np.uint8))
Overlay Compositing for Inpainting
from PIL import Image
from modules.processing import apply_overlay, apply_color_correction

# image, color_corrections, i, paste_loc, overlay, and mask_for_overlay are
# assumed to come from the surrounding process_images() context.

# Apply color correction to match source image colors
if color_corrections is not None:
    image = apply_color_correction(color_corrections[i], image)

# Composite original unmasked regions over the generated image.
# paste_loc is (x, y, w, h) from inpaint-full-res cropping;
# overlay is an RGBA image with transparency in masked regions.
final_image, original_denoised = apply_overlay(image, paste_loc, overlay)

# For mask composite visualization. All three images must share one size,
# so use final_image.size (image may still be the cropped region) and
# assume mask_for_overlay already matches the output size.
mask_composite = Image.composite(
    original_denoised.convert('RGBA').convert('RGBa'),
    Image.new('RGBa', final_image.size),
    mask_for_overlay.convert('L'),
).convert('RGBA')