
Implementation:AUTOMATIC1111 Stable diffusion webui VectorQuantizer2

From Leeroopedia


Knowledge Sources
Domains Vector_Quantization, VQ_VAE, LDSR
Last Updated 2025-05-15 00:00 GMT

Overview

Implements an improved vector quantizer module for VQ-VAE models, vendored from the CompVis taming-transformers repository.

Description

VectorQuantizer2 is a PyTorch nn.Module that performs vector quantization against a learned codebook of embedding vectors. Given a continuous latent tensor, it finds the nearest codebook entry for each spatial position and returns the quantized tensor together with a loss. Squared distances to all codebook entries are computed with the expansion (z - e)^2 = z^2 + e^2 - 2*e*z, so only the cross term e*z requires a matrix product and the pairwise differences are never materialized. The module supports post-hoc remapping of indices to a subset of used codebook entries, a configurable beta weight in the loss, and a legacy mode (the default, legacy=True) that preserves an earlier bug in which beta was applied to the wrong loss term. It also provides get_codebook_entry, which converts codebook indices back into quantized latent vectors.
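
The nearest-entry lookup described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the vendored code: the helper name nearest_codebook_indices and the tensor sizes are assumptions chosen for the example. The cross term of the expansion is a single matrix product, so no (positions x entries x dims) difference tensor is built.

```python
import torch

def nearest_codebook_indices(z_flat, codebook):
    """Return the index of the closest codebook row for each row of z_flat.

    z_flat:   (N, D) flattened latent vectors
    codebook: (K, D) embedding vectors
    Hypothetical helper, for illustration only.
    """
    # ||z - e||^2 = ||z||^2 + ||e||^2 - 2 z.e, evaluated for all (z, e) pairs
    z_sq = (z_flat ** 2).sum(dim=1, keepdim=True)   # (N, 1)
    e_sq = (codebook ** 2).sum(dim=1)               # (K,)
    cross = z_flat @ codebook.t()                   # (N, K)
    d = z_sq + e_sq - 2.0 * cross                   # (N, K) squared distances
    return d.argmin(dim=1), d

torch.manual_seed(0)
z_flat = torch.randn(6, 4)     # 6 spatial positions, 4 channels
codebook = torch.randn(10, 4)  # 10 codebook entries
indices, d = nearest_codebook_indices(z_flat, codebook)
z_q = codebook[indices]        # quantized vectors, shape (6, 4)
```

The result can be checked against a brute-force computation of all pairwise differences, which gives the same distances at a higher memory cost.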

Usage

This module is used internally by the VQModel autoencoder in the LDSR extension. It is instantiated as the quantization layer within VQModel and is not typically used directly by end users. It provides the core vector quantization logic needed for VQ-VAE based image generation and super-resolution pipelines.

Code Reference

Source Location

Signature

class VectorQuantizer2(nn.Module):
    def __init__(self, n_e, e_dim, beta, remap=None, unknown_index="random",
                 sane_index_shape=False, legacy=True): ...

    def remap_to_used(self, inds): ...
    def unmap_to_all(self, inds): ...
    def forward(self, z, temp=None, rescale_logits=False, return_logits=False): ...
    def get_codebook_entry(self, indices, shape): ...

Import

from vqvae_quantize import VectorQuantizer2

I/O Contract

Inputs

Name | Type | Required | Description
n_e | int | Yes | Number of embedding vectors in the codebook
e_dim | int | Yes | Dimensionality of each embedding vector
beta | float | Yes | Weight applied to one of the two loss terms (which one depends on legacy)
remap | str | No | Path to a numpy file containing the used codebook indices for remapping (default: None)
unknown_index | str or int | No | Strategy for handling unknown indices during remapping: "random", "extra", or an integer (default: "random")
sane_index_shape | bool | No | If True, returns indices shaped as (batch, height, width) instead of flattened (default: False)
legacy | bool | No | If True (the default), uses the original (buggy) beta term ordering for backward compatibility
z | torch.Tensor | Yes | Input to forward: latent tensor of shape (batch, channels, height, width), where channels equals e_dim
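
The effect of the legacy flag is easiest to see in the gradients. The sketch below assumes the usual two-term VQ-VAE loss, with squared-error terms separated by stop-gradients (detach); vq_loss and grads are hypothetical helpers, not part of the module's API. Under that assumption both variants return the same loss value and differ only in which side receives the beta-scaled gradient.

```python
import torch

def vq_loss(z, z_q, beta, legacy=True):
    """Hypothetical helper: two-term VQ loss with stop-gradients."""
    commit = torch.mean((z_q.detach() - z) ** 2)    # gradient reaches the encoder output z only
    codebook = torch.mean((z_q - z.detach()) ** 2)  # gradient reaches the codebook vectors z_q only
    if legacy:
        return commit + beta * codebook   # beta scales the codebook term
    return beta * commit + codebook       # beta scales the commitment term

def grads(legacy, beta=0.25):
    """Return (loss value, d loss / d z, d loss / d z_q) for a tiny fixed input."""
    z = torch.tensor([[1.0, 2.0]], requires_grad=True)
    z_q = torch.tensor([[0.0, 0.0]], requires_grad=True)
    loss = vq_loss(z, z_q, beta, legacy=legacy)
    loss.backward()
    return loss.item(), z.grad.clone(), z_q.grad.clone()

loss_legacy, gz_legacy, gq_legacy = grads(legacy=True)
loss_fixed, gz_fixed, gq_fixed = grads(legacy=False)
```

In this sketch, with beta=0.25, the legacy ordering passes the full gradient to the encoder output and a quarter of it to the codebook; with legacy=False the roles swap.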

Outputs

Name | Type | Description
z_q | torch.Tensor | Quantized latent tensor of shape (batch, channels, height, width)
loss | torch.Tensor | Scalar loss: the sum of the commitment (encoder) term and the codebook term, one of them weighted by beta
info | tuple | Tuple of (perplexity, min_encodings, min_encoding_indices)
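
The relationship between flattened indices and the (batch, height, width) layout selected by sane_index_shape can be illustrated with plain tensor reshapes. This sketch assumes the latent is moved to channels-last before flattening, so that indices run in (batch, height, width) order; the variable names and the arange stand-in for real indices are illustrative only.

```python
import torch

batch, channels, height, width = 2, 4, 3, 5
z = torch.randn(batch, channels, height, width)

# Move channels last and flatten: one row per spatial position
z_flat = z.permute(0, 2, 3, 1).reshape(-1, channels)   # (batch*height*width, channels)

# Stand-in for nearest-entry indices: one integer per row of z_flat
flat_indices = torch.arange(z_flat.shape[0])            # (30,)

# Layout corresponding to sane_index_shape=True
grid_indices = flat_indices.reshape(batch, height, width)

# Row grid_indices[b, h, w] of z_flat is the latent vector at z[b, :, h, w]
b, h, w = 1, 2, 4
assert torch.equal(z_flat[grid_indices[b, h, w]], z[b, :, h, w])
```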

Usage Examples

import torch
from vqvae_quantize import VectorQuantizer2

# Create a quantizer with 8192 codebook entries of dimension 4
quantizer = VectorQuantizer2(n_e=8192, e_dim=4, beta=0.25)

# Quantize a latent tensor (batch=1, channels=4, height=32, width=32)
z = torch.randn(1, 4, 32, 32)
z_q, loss, (perplexity, min_encodings, min_encoding_indices) = quantizer(z)

# Retrieve codebook entries from indices; shape is (batch, height, width, channels)
z_q = quantizer.get_codebook_entry(min_encoding_indices, shape=(1, 32, 32, 4))
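
For readers without the LDSR extension on their import path, the index-to-latent conversion can be approximated with a bare nn.Embedding. This is a sketch of the idea, assuming shape is given as (batch, height, width, channels) and the result is returned channels-first; lookup_entries is a hypothetical stand-in, not the module's own method.

```python
import torch
import torch.nn as nn

def lookup_entries(embedding, indices, shape):
    """Hypothetical stand-in: map flat indices to a (B, C, H, W) latent tensor."""
    z_q = embedding(indices.reshape(-1))      # (B*H*W, C)
    z_q = z_q.view(shape)                     # (B, H, W, C)
    return z_q.permute(0, 3, 1, 2).contiguous()

embedding = nn.Embedding(16, 4)               # 16 entries of dimension 4
indices = torch.randint(0, 16, (1 * 8 * 8,))  # one index per spatial position
z_q = lookup_entries(embedding, indices, shape=(1, 8, 8, 4))
```

Each spatial position of the output holds the embedding row selected by the corresponding index, with the channel axis moved back to position 1.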
