Implementation:AUTOMATIC1111 Stable diffusion webui VectorQuantizer2

Knowledge Sources	AUTOMATIC1111_Stable_diffusion_webui
Domains	Vector_Quantization, VQ_VAE, LDSR
Last Updated	2025-05-15 00:00 GMT

Overview

Implements an improved vector quantizer module for VQ-VAE models, vendored from the CompVis taming-transformers repository.

Description

VectorQuantizer2 is a PyTorch nn.Module that performs vector quantization by maintaining a learned codebook of n_e embedding vectors, each of dimension e_dim. Given a continuous latent tensor, it finds the nearest codebook entry for each spatial position and returns the quantized representation along with a commitment loss. Rather than materializing every pairwise difference between latents and codebook entries, it computes squared distances through the expansion (z - e)^2 = z^2 + e^2 - 2*e*z, so the only cross term is a single matrix product between the flattened latents and the codebook. It supports post-hoc index remapping to a subset of used codebook entries, a configurable beta weight for the commitment loss term, and a legacy mode (enabled by default) that preserves the beta placement of an earlier, buggy loss computation for backward compatibility. The get_codebook_entry method converts codebook indices back into quantized latent vectors.
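As a minimal sketch of the lookup described above, the following plain-Python function picks the nearest codebook entry using the expanded form of the squared distance. It is illustrative only: the name nearest_code and the list-based data layout are assumptions made for this example, and the real module operates on batched PyTorch tensors.

```python
def nearest_code(z, codebook):
    """Return the index of the codebook row closest to vector z.

    Uses ||z - e||^2 = ||z||^2 + ||e||^2 - 2 * <z, e>, so no per-pair
    difference vector is built. Hypothetical helper, not the module's API.
    """
    z_sq = sum(v * v for v in z)  # ||z||^2, shared by every candidate
    best_index, best_dist = -1, float("inf")
    for j, e in enumerate(codebook):
        e_sq = sum(v * v for v in e)            # ||e_j||^2
        dot = sum(a * b for a, b in zip(z, e))  # <z, e_j>
        dist = z_sq + e_sq - 2.0 * dot
        if dist < best_dist:
            best_index, best_dist = j, dist
    return best_index


codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
index = nearest_code([0.9, 1.2], codebook)
z_q = codebook[index]  # quantized vector that replaces the input
```

In a tensor implementation the loop over codebook rows collapses into one matrix product for the cross term, which is where the expansion pays off.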

Usage

This module is used internally by the VQModel autoencoder in the LDSR extension. It is instantiated as the quantization layer within VQModel and is not typically used directly by end users. It provides the core vector quantization logic needed for VQ-VAE based image generation and super-resolution pipelines.

Code Reference

Source Location

Repository: AUTOMATIC1111_Stable_diffusion_webui
File: extensions-builtin/LDSR/vqvae_quantize.py
Lines: 1-147

Signature

class VectorQuantizer2(nn.Module):
    def __init__(self, n_e, e_dim, beta, remap=None, unknown_index="random",
                 sane_index_shape=False, legacy=True):

    def remap_to_used(self, inds):
    def unmap_to_all(self, inds):
    def forward(self, z, temp=None, rescale_logits=False, return_logits=False):
    def get_codebook_entry(self, indices, shape):

Import

from vqvae_quantize import VectorQuantizer2

I/O Contract

Inputs

Name	Type	Required	Description
n_e	int	Yes	Number of embedding vectors in the codebook
e_dim	int	Yes	Dimensionality of each embedding vector
beta	float	Yes	Commitment loss weight factor
remap	str	No	Path to a numpy file containing used codebook indices for remapping (default None, meaning no remapping)
unknown_index	str or int	No	Strategy for handling indices not found in the remap file: "random" (default), "extra", or a fixed integer index
sane_index_shape	bool	No	If True, returns indices shaped as (batch, height, width) instead of flattened (default False)
legacy	bool	No	If True (the default), uses the original (buggy) beta term ordering for backward compatibility
z	torch.Tensor	Yes	Input latent tensor of shape (batch, channels, height, width), passed to forward
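The interaction between remap and unknown_index can be sketched in plain Python as follows. This is a simplified illustration, not an excerpt of the module: remap_to_used_sketch is a hypothetical name, used stands in for the index array loaded from the remap file, and the treatment of "extra" as one additional slot placed after the used entries is an assumption about how that option behaves.

```python
import random


def remap_to_used_sketch(indices, used, unknown_index="random"):
    """Map full-codebook indices to positions within the `used` subset."""
    position = {full: pos for pos, full in enumerate(used)}
    n_used = len(used)
    remapped = []
    for idx in indices:
        if idx in position:
            remapped.append(position[idx])
        elif unknown_index == "random":
            remapped.append(random.randrange(n_used))  # any valid used slot
        elif unknown_index == "extra":
            remapped.append(n_used)  # assumed: one extra slot for unknowns
        else:
            remapped.append(int(unknown_index))  # fixed fallback index
    return remapped


used = [3, 7, 42]
known = remap_to_used_sketch([7, 42, 3], used)
with_extra = remap_to_used_sketch([7, 5], used, unknown_index="extra")
```

Mapping in the opposite direction, which unmap_to_all is named for, would amount to indexing back into used.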

Outputs

Name	Type	Description
z_q	torch.Tensor	Quantized latent tensor of shape (batch, channels, height, width)
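loss	torch.Tensor	Scalar loss combining the commitment (encoder) term and the codebook term, with beta weighting one of the two depending on legacy
info	tuple	Tuple of (perplexity, min_encodings, min_encoding_indices); in this class perplexity and min_encodings are returned as None, so only min_encoding_indices carries data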
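For reference, the two placements of beta that the legacy flag selects between can be written out as below, where sg[.] denotes the stop-gradient (detach) operator and the mean runs over all tensor elements. This is a sketch that assumes the vendored file matches the upstream taming-transformers loss; the subscript labels are chosen here for illustration.

```latex
\begin{aligned}
\mathcal{L}_{\text{legacy=True}}  &= \operatorname{mean}\big((\operatorname{sg}[z_q] - z)^2\big) + \beta \, \operatorname{mean}\big((z_q - \operatorname{sg}[z])^2\big) \\
\mathcal{L}_{\text{legacy=False}} &= \beta \, \operatorname{mean}\big((\operatorname{sg}[z_q] - z)^2\big) + \operatorname{mean}\big((z_q - \operatorname{sg}[z])^2\big)
\end{aligned}
```

Both forms evaluate to the same number in the forward pass, since the two squared-difference terms are numerically equal. They differ only in how the gradient is split: with legacy=True, beta scales the term that updates the codebook, whereas with legacy=False it scales the term that pulls the encoder output toward the codebook.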

Usage Examples

import torch
from vqvae_quantize import VectorQuantizer2

# Create a quantizer with 8192 codebook entries of dimension 4
quantizer = VectorQuantizer2(n_e=8192, e_dim=4, beta=0.25)

# Example latent tensor (batch=1, channels=4, height=32, width=32)
z = torch.randn(1, 4, 32, 32)

# Quantize it; indices come back flattened unless sane_index_shape=True
z_q, loss, (perplexity, min_encodings, min_encoding_indices) = quantizer(z)

# Retrieve codebook entries from indices; shape is (batch, height, width, channels)
z_q_from_indices = quantizer.get_codebook_entry(min_encoding_indices, shape=(1, 32, 32, 4))
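The relationship between flattened indices such as min_encoding_indices and the (batch, height, width) layout selected by sane_index_shape can be sketched without PyTorch. The helper below is hypothetical and assumes row-major ordering over batch, then height, then width, which is what flattening a (batch, height, width, channels) tensor along its first three axes would produce.

```python
def unflatten_indices(flat, batch, height, width):
    """Reshape a flat list of codebook indices to [batch][height][width]."""
    if len(flat) != batch * height * width:
        raise ValueError("index count does not match batch * height * width")
    return [
        [
            flat[(b * height + h) * width:(b * height + h + 1) * width]
            for h in range(height)
        ]
        for b in range(batch)
    ]


flat = list(range(12))  # stand-in for a flat tensor of codebook indices
grid = unflatten_indices(flat, batch=2, height=2, width=3)
```

Under that assumption, the entry at grid[b][h][w] is the codebook index chosen for spatial position (h, w) of batch item b.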

Related Pages

Principle:AUTOMATIC1111_Stable_diffusion_webui_VQ_VAE_Quantization
