Implementation:AUTOMATIC1111 Stable diffusion webui Img2img sample

From Leeroopedia


Knowledge Sources
Domains: Diffusion Models, Image Generation, Image Editing, Inpainting
Last Updated: 2026-02-08 00:00 GMT

Overview

Concrete tool for performing noise injection, text-guided denoising, and mask-based latent compositing in the image-to-image sampling pipeline provided by the AUTOMATIC1111 stable-diffusion-webui repository.

Description

The sample() method of StableDiffusionProcessingImg2Img implements the core SDEdit sampling loop for image-to-image generation. It generates noise, optionally scales it by initial_noise_multiplier, invokes the sampler to denoise the noisy latent under text guidance, and then composites the result with the original latent using the inpainting mask.

The method proceeds through these stages:

1. Noise generation: A noise tensor x is drawn from the seeded ImageRNG instance via self.rng.next(). If initial_noise_multiplier is not 1.0, the noise is scaled and this parameter is recorded in extra_generation_params.

2. Script pre-processing: The process_before_every_sampling script callback is invoked, passing the init_latent, noise, and conditioning tensors. This allows extensions to modify any of these before sampling begins.

3. Sampler invocation: self.sampler.sample_img2img() is called with:

  • self.init_latent as the encoded source image
  • x as the noise tensor
  • conditioning and unconditional_conditioning for classifier-free guidance
  • self.image_conditioning for inpainting-model-specific conditioning

The sampler internally determines the starting timestep from self.denoising_strength, adds noise to the init_latent, and runs the iterative denoising loop.

4. Mask compositing: If self.mask is not None (inpainting mode), the method blends the denoised samples with the original init_latent:

blended_samples = samples * self.nmask + self.init_latent * self.mask

The on_mask_blend script hook is then called with a MaskBlendArgs object, allowing extensions to override the blending result.

5. Cleanup: The noise tensor is deleted and GPU memory is freed via devices.torch_gc().
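The stages above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repository's code: NumPy arrays stand in for torch tensors, `fake_sampler`, `sample_sketch`, and the `on_mask_blend` callable are hypothetical stand-ins for the real sampler and script hooks, and the `"Noise multiplier"` key name is assumed for the recorded parameter.

```python
import numpy as np

def fake_sampler(init_latent, noise):
    # Hypothetical stand-in for self.sampler.sample_img2img(); a real sampler
    # would noise init_latent and run the iterative denoising loop.
    return init_latent + 0.1 * noise

def sample_sketch(init_latent, noise, mask=None, nmask=None,
                  initial_noise_multiplier=1.0, extra_generation_params=None,
                  on_mask_blend=None):
    params = {} if extra_generation_params is None else extra_generation_params

    # Stage 1: scale the noise only when the multiplier differs from 1.0,
    # and record the value (key name is an assumption of this sketch).
    x = noise.copy()
    if initial_noise_multiplier != 1.0:
        params["Noise multiplier"] = initial_noise_multiplier
        x *= initial_noise_multiplier

    # Stage 3: denoise (stage 2, the pre-sampling script callback, is omitted).
    samples = fake_sampler(init_latent, x)

    # Stage 4: keep the original latent wherever mask == 1.
    if mask is not None:
        blended = samples * nmask + init_latent * mask
        if on_mask_blend is not None:
            # A hook may replace the blended result.
            blended = on_mask_blend(samples, nmask, init_latent, mask, blended)
        samples = blended

    # Stage 5 (freeing the noise tensor and GPU memory) has no NumPy analogue.
    return samples, params

# Usage: regenerate only the top row of a tiny 2x2 latent.
rng = np.random.default_rng(0)
init = np.zeros((1, 4, 2, 2))
noise = rng.standard_normal(init.shape)
nmask = np.zeros_like(init)
nmask[..., 0, :] = 1.0
mask = 1.0 - nmask
out, params = sample_sketch(init, noise, mask, nmask,
                            initial_noise_multiplier=1.5)
```

In this run the bottom row of `out` equals the original latent, while the top row carries the stand-in sampler's output computed from the scaled noise.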

Usage

This method is called by process_images() once per batch iteration. It runs inside a torch.no_grad() context and, depending on the UNet's dtype requirements, inside either an autocast or a without-autocast context. The returned latent tensor is subsequently decoded by the VAE to produce pixel-space images.
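The context nesting described above follows a common Python pattern: pick one of two context managers with a conditional expression, then enter it alongside the gradient-disabling context. The sketch below uses only the standard library. The `ctx` helper, the `run_sample` function, and the `unet_needs_upcast` flag are hypothetical names chosen for illustration, and the logging contexts merely stand in for the real torch and webui context managers.

```python
import contextlib

events = []

@contextlib.contextmanager
def ctx(name):
    # Logging stand-in for a real context manager such as torch.no_grad().
    events.append(f"enter {name}")
    try:
        yield
    finally:
        events.append(f"exit {name}")

def run_sample(sample_fn, unet_needs_upcast):
    # Choose the precision context first, then nest it inside "no_grad".
    precision = ctx("without_autocast") if unet_needs_upcast else ctx("autocast")
    with ctx("no_grad"), precision:
        return sample_fn()

result = run_sample(lambda: "latents", unet_needs_upcast=True)
```

The contexts are entered left to right and exited in reverse order, so the precision context always sits inside the gradient-disabling one.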

Code Reference

Source Location

Signature

def sample(
    self,
    conditioning,
    unconditional_conditioning,
    seeds,
    subseeds,
    subseed_strength,
    prompts
) -> torch.Tensor:

Import

from modules.processing import StableDiffusionProcessingImg2Img
# sample() is called as a method: p.sample(conditioning, unconditional_conditioning, seeds, subseeds, subseed_strength, prompts)

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| conditioning | tuple | Yes | Positive text conditioning tensors (output of prompt encoding) |
| unconditional_conditioning | tuple | Yes | Negative/unconditional text conditioning tensors |
| seeds | list[int] | Yes | Per-image seeds for the current batch |
| subseeds | list[int] | Yes | Per-image subseeds for the current batch |
| subseed_strength | float | Yes | Strength of subseed blending (0.0 = no blending) |
| prompts | list[str] | Yes | Text prompts for the current batch |

Instance state read:

| Name | Type | Description |
|---|---|---|
| self.rng | ImageRNG | Seeded random number generator for noise |
| self.init_latent | torch.Tensor | VAE-encoded source image latent |
| self.initial_noise_multiplier | float | Noise amplitude scaling factor |
| self.sampler | Sampler | The selected diffusion sampler instance |
| self.image_conditioning | torch.Tensor | Inpainting model conditioning |
| self.mask | torch.Tensor or None | Latent mask (1.0 in preserved regions) |
| self.nmask | torch.Tensor or None | Inverse latent mask (1.0 in regenerated regions) |
| self.scripts | ScriptRunner | Script runner for callbacks |

Outputs

| Name | Type | Description |
|---|---|---|
| samples | torch.Tensor | Denoised latent tensor, shape [B, C, H/8, W/8], ready for VAE decoding |
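As a quick check of the output shape, the helper below computes the expected latent dimensions for a given image size. It is a hypothetical utility, not part of the repository. It assumes four latent channels and a downscale factor of 8, and the divisibility check is this sketch's own precaution rather than documented behavior.

```python
def latent_shape(batch_size, height, width, channels=4, downscale=8):
    """Return the expected [B, C, H/8, W/8] shape as a tuple."""
    # Assumption of this sketch: reject sizes that do not divide evenly.
    if height % downscale or width % downscale:
        raise ValueError("height and width must be multiples of the downscale factor")
    return (batch_size, channels, height // downscale, width // downscale)

shape = latent_shape(batch_size=2, height=512, width=768)
# shape == (2, 4, 64, 96)
```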

Usage Examples

Basic Usage

# Called internally by process_images() in the generation loop:
# (shown here for reference; normally not called directly)

# After p.init() has been called and conditioning is set up:
samples = p.sample(
    conditioning=p.c,
    unconditional_conditioning=p.uc,
    seeds=p.seeds,
    subseeds=p.subseeds,
    subseed_strength=p.subseed_strength,
    prompts=p.prompts,
)

# samples is a torch.Tensor of shape [batch_size, 4, height//8, width//8]
# It is then passed to decode_latent_batch() for VAE decoding

Understanding the Mask Compositing

# The mask compositing ensures that regions outside the inpainting area are preserved:
# Given:
#   self.nmask: 1.0 where content should be regenerated
#   self.mask:  1.0 where content should be preserved
#   self.init_latent: original encoded image

# After denoising:
blended = samples * self.nmask + self.init_latent * self.mask

# Per-element result for a binary mask:
# Regenerated element (nmask = 1, mask = 0): 1.0 * denoised + 0.0 * original = denoised
# Preserved element   (nmask = 0, mask = 1): 0.0 * denoised + 1.0 * original = original
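The blend can be checked numerically on a tiny example. The sketch below uses NumPy arrays in place of torch tensors, with made-up constant values for the denoised and original latents. It also includes one fractional mask value to show the linear mix that results if the mask is not strictly binary.

```python
import numpy as np

init_latent = np.full((1, 4, 1, 3), 2.0)   # stand-in for the original latent
samples = np.full((1, 4, 1, 3), 10.0)      # stand-in for the denoised latent

# One mask value per column: regenerate, preserve, half-and-half.
nmask = np.array([1.0, 0.0, 0.5]).reshape(1, 1, 1, 3)
mask = 1.0 - nmask

# The mask broadcasts across the batch and channel axes.
blended = samples * nmask + init_latent * mask

row = blended[0, 0, 0].tolist()
# row == [10.0, 2.0, 6.0]
```

The first column takes the denoised value, the second keeps the original, and the third is the equal-weight average of the two.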

Related Pages

Implements Principle
