Implementation:AUTOMATIC1111 Stable diffusion webui VQModel Autoencoder

From Leeroopedia


Knowledge Sources
Domains: Autoencoder, VQ_VAE, LDSR
Last Updated: 2025-05-15 00:00 GMT

Overview

Provides the VQModel and VQModelInterface classes and patches them back into the ldm.models.autoencoder module so that the LDSR upscaler extension can use them.

Description

This module re-introduces the VQModel and VQModelInterface classes that were originally present in the CompVis stable-diffusion repository but were removed when the codebase migrated to the stability-ai/stablediffusion repository.

The VQModel class is a PyTorch Lightning module implementing a Vector Quantized Variational Autoencoder (VQ-VAE) with an encoder, a decoder, and a vector quantizer. It supports EMA (Exponential Moving Average) weights, checkpoint loading, configurable loss functions with a discriminator, batch resizing, and learning rate scheduling.

The VQModelInterface subclass simplifies the encode/decode interface by deferring quantization to the decode step: encode returns the latent before quantization, and decode quantizes it first unless force_not_quantize=True is passed.

At module load time, both classes are monkey-patched into ldm.models.autoencoder to restore compatibility with the LDSR upscaler.
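The deferred-quantization behaviour described above can be illustrated with a small pure-Python sketch. ToyInterface, nearest_code, and quantize are hypothetical names invented for this illustration; the real classes operate on torch tensors and run encoder and decoder networks, which are omitted here. Only the point at which the latent is snapped to the codebook is shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_code(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))


def quantize(latents: Sequence[Vector], codebook: Sequence[Vector]) -> List[List[float]]:
    """Replace every latent vector with its nearest codebook entry."""
    return [list(codebook[nearest_code(v, codebook)]) for v in latents]


class ToyInterface:
    """Hypothetical stand-in for the encode/decode split; not the real class."""

    def __init__(self, codebook: Sequence[Vector]):
        self.codebook = codebook

    def encode(self, latents: Sequence[Vector]) -> List[List[float]]:
        # No quantization here: the continuous latent is returned unchanged.
        return [list(v) for v in latents]

    def decode(self, h: Sequence[Vector], force_not_quantize: bool = False):
        # Quantization happens at decode time unless explicitly skipped.
        return [list(v) for v in h] if force_not_quantize else quantize(h, self.codebook)


codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
toy = ToyInterface(codebook)
h = toy.encode([[0.9, 1.2], [1.8, 0.1]])
snapped = toy.decode(h)                       # latents snapped to the codebook
raw = toy.decode(h, force_not_quantize=True)  # latents passed through as-is
```

Keeping the latent continuous until decode lets a caller work in the unquantized space and decide at the last step whether to snap to the codebook.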

Usage

This module is used internally by the LDSR (Latent Diffusion Super Resolution) built-in extension. It is loaded automatically when the LDSR upscaler is invoked, ensuring that the required VQModel and VQModelInterface classes are available in the ldm.models.autoencoder namespace. End users do not interact with this module directly; it is a compatibility shim for the LDSR pipeline.
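The shim mechanism can be sketched with the standard library alone. The module name toy_ldm_autoencoder, the function install_shim, and the empty placeholder classes are assumptions made for this illustration, and the dotted-path lookup at the end is an assumed usage pattern rather than a quotation of the LDSR code. The sketch shows how assigning classes onto an already-imported module makes them resolvable by later lookups.

```python
import importlib
import sys
import types

# Stand-in for the upstream module, which in this sketch lacks the two classes.
upstream = types.ModuleType("toy_ldm_autoencoder")  # hypothetical module name
sys.modules[upstream.__name__] = upstream


class VQModel:
    """Placeholder for the re-introduced class."""


class VQModelInterface(VQModel):
    """Placeholder for the re-introduced subclass."""


def install_shim(module: types.ModuleType) -> None:
    """Attach both classes to the target module's namespace."""
    module.VQModel = VQModel
    module.VQModelInterface = VQModelInterface


install_shim(upstream)

# Code that resolves a class by module path and attribute name now succeeds.
resolved = getattr(importlib.import_module("toy_ldm_autoencoder"), "VQModelInterface")
```

Because the assignment targets the module object itself, every piece of code that later looks up the attribute sees the patched class, with no change to the upstream package on disk.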

Code Reference

Source Location

Signature

class VQModel(pl.LightningModule):
    def __init__(self, ddconfig, lossconfig, n_embed, embed_dim, ckpt_path=None,
                 ignore_keys=None, image_key="image", colorize_nlabels=None,
                 monitor=None, batch_resize_range=None, scheduler_config=None,
                 lr_g_factor=1.0, remap=None, sane_index_shape=False, use_ema=False):

    def encode(self, x):
    def encode_to_prequant(self, x):
    def decode(self, quant):
    def decode_code(self, code_b):
    def forward(self, input, return_pred_indices=False):

class VQModelInterface(VQModel):
    def __init__(self, embed_dim, *args, **kwargs):
    def encode(self, x):
    def decode(self, h, force_not_quantize=False):

Import

from ldm.models.autoencoder import VQModel, VQModelInterface

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| ddconfig | dict | Yes | Configuration dictionary for the encoder and decoder architecture |
| lossconfig | dict | Yes | Configuration for the loss function, instantiated via instantiate_from_config |
| n_embed | int | Yes | Number of embedding vectors in the codebook |
| embed_dim | int | Yes | Dimensionality of each embedding vector |
| ckpt_path | str | No | Path to a checkpoint file for weight initialization |
| ignore_keys | list | No | List of key prefixes to ignore when loading checkpoint state dict |
| image_key | str | No | Key to extract images from batch dict, defaults to "image" |
| use_ema | bool | No | Whether to use Exponential Moving Average weights |
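The ignore_keys parameter is described as a list of key prefixes. A minimal sketch of prefix-based filtering follows, assuming the state dict is an ordinary mapping from parameter names to values; drop_ignored_keys and the example key names are hypothetical and are not taken from the module.

```python
from typing import Any, Dict, Iterable, Optional


def drop_ignored_keys(
    state_dict: Dict[str, Any],
    ignore_keys: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Return a copy of state_dict without entries whose key starts with an ignored prefix."""
    prefixes = tuple(ignore_keys or ())
    # str.startswith accepts a tuple of prefixes; an empty tuple never matches.
    return {k: v for k, v in state_dict.items() if not k.startswith(prefixes)}


# Illustrative key names only; the values would be tensors in a real checkpoint.
checkpoint = {
    "encoder.conv_in.weight": 1,
    "loss.discriminator.main.0.weight": 2,
    "quantize.embedding.weight": 3,
}
kept = drop_ignored_keys(checkpoint, ignore_keys=["loss."])
```

Filtering by prefix makes it possible to load encoder, decoder, and codebook weights from a training checkpoint while discarding parts, such as loss or discriminator weights, that are not needed for inference.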

Outputs

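Values returned by VQModel.forward:

| Name | Type | Description |
| --- | --- | --- |
| dec | torch.Tensor | Decoded/reconstructed image tensor |
| diff | torch.Tensor | Quantization embedding loss |
| ind | torch.Tensor | Predicted codebook indices (returned only when return_pred_indices=True) |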

Usage Examples

# Assumes the LDSR hijack module has already been imported, so that both
# classes are present in ldm.models.autoencoder.
# dd_config, loss_config and image_tensor are placeholders defined elsewhere.
import ldm.models.autoencoder

model = ldm.models.autoencoder.VQModel(
    ddconfig=dd_config,
    lossconfig=loss_config,
    n_embed=8192,
    embed_dim=4,
)

# Encode and decode an image; encode returns the quantized latent,
# the embedding loss, and additional quantizer info
quant, emb_loss, info = model.encode(image_tensor)
reconstructed = model.decode(quant)

# Using VQModelInterface (deferred quantization)
interface = ldm.models.autoencoder.VQModelInterface(
    embed_dim=4,
    ddconfig=dd_config,
    lossconfig=loss_config,
    n_embed=8192,
)
h = interface.encode(image_tensor)  # latent before quantization
decoded = interface.decode(h, force_not_quantize=False)  # quantize, then decode
