Implementation:Deepseek ai Janus VQModel Decode Code

Knowledge Sources	Janus
Domains	Image_Generation, Generative_Models
Last Updated	2026-02-10 09:30 GMT

Overview

Concrete tool, provided by the Janus VQModel class, for decoding VQ codebook indices into pixel images.

Description

The VQModel.decode_code method takes a tensor of codebook indices, looks up their embedding vectors via VectorQuantizer.get_codebook_entry, and passes them through the CNN decoder (post_quant_conv → Decoder) to reconstruct pixel images. The VQ-VAE is stored as the gen_vision_model attribute of MultiModalityCausalLM.
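The lookup-and-reshape step described above can be sketched with a toy codebook. This is a minimal illustration, not the Janus source: the function name lookup_and_reshape, the codebook size of 32, and the random tensors are assumptions chosen for the example, and the CNN decoder stage is omitted.

```python
import torch

def lookup_and_reshape(codebook, indices, shape):
    """Toy version of the lookup step: indices -> latent grid in NCHW order.

    codebook: [K, C] embedding table (hypothetical stand-in for the VQ codebook)
    indices:  [B, N] integer codes, with N == H * W
    shape:    [B, C, H, W] target layout
    """
    b, c, h, w = shape
    flat = indices.reshape(-1).long()      # [B*N]
    quant = codebook[flat]                 # [B*N, C], one vector per code
    quant = quant.reshape(b, h, w, c)      # restore the spatial grid, channels last
    return quant.permute(0, 3, 1, 2).contiguous()  # [B, C, H, W]

codebook = torch.randn(32, 8)              # 32 entries of dimension 8 (toy sizes)
indices = torch.randint(0, 32, (2, 576))   # 2 images, 24 * 24 codes each
latent = lookup_and_reshape(codebook, indices, [2, 8, 24, 24])
print(latent.shape)                        # torch.Size([2, 8, 24, 24])
```

In the real model, the resulting latent would then pass through post_quant_conv and the Decoder to produce pixels.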

Usage

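Call this method after the autoregressive generation loop produces VQ token indices. The shape argument must describe the latent grid that the indices fill: with shape = [B, C, H, W], each row of code_b must hold H × W indices (24 × 24 = 576 for 384 px images with a patch size of 16).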
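As a sanity check before decoding, the relationship between image size, patch size, and token count can be written out explicitly. The helpers latent_shape and check_token_count below are hypothetical names introduced for this sketch and are not part of Janus; the default of 8 channels follows the shape used on this page.

```python
from typing import List

def latent_shape(batch: int, img_size: int, patch_size: int, channels: int = 8) -> List[int]:
    """Build the [B, C, H, W] list for `shape` (hypothetical helper)."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    side = img_size // patch_size
    return [batch, channels, side, side]

def check_token_count(num_tokens: int, shape: List[int]) -> None:
    """Raise if the per-image token count does not fill the H x W grid."""
    _, _, h, w = shape
    if num_tokens != h * w:
        raise ValueError(f"expected {h * w} tokens per image, got {num_tokens}")

shape = latent_shape(batch=16, img_size=384, patch_size=16)
check_token_count(576, shape)   # passes: 24 * 24 == 576
print(shape)                    # [16, 8, 24, 24]
```

Failing early on a mismatched token count tends to give a clearer error than a reshape failure inside the model.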

Code Reference

Source Location

Repository: Janus
File: janus/models/vq_model.py
Lines: L505-508 (decode_code), L500-503 (decode), L284-299 (get_codebook_entry), L466-513 (VQModel class)

Signature

from typing import List, Optional

import torch
from torch import nn


class VQModel(nn.Module):
    def decode_code(
        self,
        code_b: torch.LongTensor,            # [B, N] codebook indices
        shape: Optional[List[int]] = None,   # [B, C, H, W] spatial reshape target
        channel_first: bool = True,
    ) -> torch.Tensor:
        """
        Decode VQ codebook indices to pixel images.

        Args:
            code_b: Codebook indices [B, 576]
            shape: Reshape target [B, 8, 24, 24] for 384px/16patch
            channel_first: Whether shape is given in [B, C, H, W] order, so the
                looked-up latent is arranged as NCHW for the decoder (default True)

        Returns:
            Decoded images [B, 3, img_size, img_size] in range [-1, 1]
        """

Import

# Accessed via model attribute:
# vl_gpt.gen_vision_model.decode_code(...)

I/O Contract

Inputs

Name	Type	Required	Description
code_b	torch.LongTensor [B, 576]	Yes	VQ codebook indices from generation loop
shape	List[int]	No	Spatial reshape: [B, 8, img_size//patch_size, img_size//patch_size]
channel_first	bool	No	Whether shape is given in [B, C, H, W] order, so the latent is arranged as NCHW for the decoder (default True)

Outputs

Name	Type	Description
decoded	torch.Tensor [B, 3, img_size, img_size]	Decoded pixel images in range [-1, 1]

Usage Examples

Decode Generated Tokens

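import torch

# After autoregressive generation produces generated_tokens [16, 576]
parallel_size = 16   # number of images generated in parallel (batch size B)
img_size = 384
patch_size = 16

# vl_gpt is the loaded MultiModalityCausalLM; gen_vision_model is its VQModel
dec = vl_gpt.gen_vision_model.decode_code(
    generated_tokens.to(dtype=torch.int),
    shape=[parallel_size, 8, img_size // patch_size, img_size // patch_size],
)
# dec shape: [16, 3, 384, 384], range [-1, 1]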
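
Because the decoded tensor is in the range [-1, 1], it usually has to be rescaled before it can be saved or displayed. The sketch below shows one common way to do this, assuming the output layout stated above; to_uint8_images is a hypothetical helper, the small dec tensor is a stand-in for real decoder output, and the clamp is a precaution in case values fall slightly outside the nominal range.

```python
import torch

def to_uint8_images(dec: torch.Tensor) -> torch.Tensor:
    """Map [B, 3, H, W] floats in roughly [-1, 1] to [B, H, W, 3] uint8."""
    img = (dec.detach().float() + 1.0) / 2.0      # [-1, 1] -> [0, 1]
    img = img.clamp(0.0, 1.0)                     # guard against overshoot
    img = (img * 255.0).round().to(torch.uint8)   # [0, 1] -> [0, 255]
    return img.permute(0, 2, 3, 1).contiguous().cpu()  # NCHW -> NHWC

# Stand-in for decoder output: one 2x2 image with three identical channels
dec = torch.tensor([-1.5, -1.0, 0.0, 1.0]).reshape(1, 1, 2, 2).repeat(1, 3, 1, 1)
images = to_uint8_images(dec)
print(images.shape, images.dtype)
```

The channels-last uint8 result can be converted with .numpy() and handed to an image library of your choice.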
