Implementation:Deepseek ai Janus VQModel Decode Code

From Leeroopedia


Knowledge Sources
Domains: Image_Generation, Generative_Models
Last Updated: 2026-02-10 09:30 GMT

Overview

Concrete tool, provided by the Janus VQModel, for decoding VQ codebook indices into pixel images.

Description

The VQModel.decode_code method takes a tensor of codebook indices, looks up their embedding vectors via VectorQuantizer.get_codebook_entry, and passes them through the CNN decoder (post_quant_conv → Decoder) to reconstruct pixel images. The VQ-VAE is stored as the gen_vision_model attribute of MultiModalityCausalLM.
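The lookup step described above can be sketched in isolation. This is a minimal NumPy stand-in, not the Janus code: the function name `lookup_codebook` and the toy codebook are hypothetical, and the sketch assumes that the lookup is a plain table index followed by a reshape to a channel-first grid. The real `get_codebook_entry` may do more (for example, normalizing embeddings), and the `post_quant_conv` and `Decoder` stages are omitted entirely.

```python
import numpy as np


def lookup_codebook(indices, codebook, shape):
    """Map flat indices [B, N] to a channel-first latent grid [B, C, H, W].

    Hypothetical helper: assumes tokens are laid out row by row (N = H * W).
    """
    b, c, h, w = shape
    z_q = codebook[indices]            # [B, N, C]: one embedding per index
    z_q = z_q.reshape(b, h, w, c)      # [B, H, W, C]: restore the spatial grid
    return z_q.transpose(0, 3, 1, 2)   # [B, C, H, W]: channel-first for a CNN


# Toy sizes for illustration only (not the Janus configuration).
rng = np.random.default_rng(0)
codebook = rng.standard_normal((32, 8))        # 32 entries, 8 channels each
indices = rng.integers(0, 32, size=(2, 16))    # 2 images, 4 x 4 tokens each
latents = lookup_codebook(indices, codebook, shape=[2, 8, 4, 4])
```

Under this assumed layout, the latent vector at grid position (i, j) of image b is the codebook row selected by token number i * W + j of that image.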

Usage

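Call this method after the autoregressive generation loop produces VQ token indices. The shape parameter must match the spatial arrangement of the tokens: its height times its width must equal the number of indices per image (24 × 24 = 576 for 384 px images with a patch size of 16).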
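Because a mismatched shape is an easy mistake to make, it can help to derive the shape argument from the image settings and check the token count before decoding. The helpers below are hypothetical (they are not part of Janus); the default values of 384, 16, and 8 are taken from the example figures on this page.

```python
from typing import List


def decode_shape(batch_size: int, img_size: int = 384,
                 patch_size: int = 16, code_dim: int = 8) -> List[int]:
    """Build the [B, C, H, W] list to pass as `shape` (hypothetical helper)."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be a multiple of patch_size")
    grid = img_size // patch_size      # tokens per side of the latent grid
    return [batch_size, code_dim, grid, grid]


def tokens_per_image(shape: List[int]) -> int:
    """Number of codebook indices each image must supply (H * W)."""
    return shape[2] * shape[3]


shape = decode_shape(batch_size=16)
```

A caller could compare `tokens_per_image(shape)` against the second dimension of the generated token tensor and raise early if they differ.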

Code Reference

Source Location

  • Repository: Janus
  • File: janus/models/vq_model.py
  • Lines: L505-508 (decode_code), L500-503 (decode), L284-299 (get_codebook_entry), L466-513 (VQModel class)

Signature

class VQModel(nn.Module):
    def decode_code(
        self,
        code_b: torch.LongTensor,       # [B, N] codebook indices
        shape: List[int] = None,         # [B, C, H, W] spatial reshape target
        channel_first: bool = True,
    ) -> torch.Tensor:
        """
        Decode VQ codebook indices to pixel images.

        Args:
            code_b: Codebook indices [B, 576]
            shape: Reshape target [B, 8, 24, 24] for 384 px images with patch size 16
            channel_first: Whether output is NCHW (default True)

        Returns:
            Decoded images [B, 3, img_size, img_size] in range [-1, 1]
        """

Import

# Accessed via model attribute:
# vl_gpt.gen_vision_model.decode_code(...)

I/O Contract

Inputs

Name | Type | Required | Description
code_b | torch.LongTensor [B, 576] | Yes | VQ codebook indices from generation loop
shape | List[int] | No | Spatial reshape: [B, 8, img_size//patch_size, img_size//patch_size]
channel_first | bool | No | Output channel order (default True = NCHW)

Outputs

Name | Type | Description
decoded | torch.Tensor [B, 3, img_size, img_size] | Decoded pixel images in range [-1, 1]

Usage Examples

Decode Generated Tokens

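import torch

# After autoregressive generation produces generated_tokens [16, 576]
parallel_size = 16  # number of images generated in parallel (batch size B)
img_size = 384
patch_size = 16

# 384 // 16 = 24, so the 576 tokens per image form a 24 x 24 grid of 8-channel codes
dec = vl_gpt.gen_vision_model.decode_code(
    generated_tokens.to(dtype=torch.int),
    shape=[parallel_size, 8, img_size // patch_size, img_size // patch_size],
)
# dec shape: [16, 3, 384, 384], range [-1, 1]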
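
Since the decoded tensor is channel-first and lies in [-1, 1], it usually has to be rescaled before it can be saved or displayed. The following is a minimal sketch of one common conversion, assuming the tensor has already been moved to a NumPy array (for example with `dec.to(torch.float32).cpu().numpy()`); the helper name `to_uint8_images` is hypothetical and not part of Janus.

```python
import numpy as np


def to_uint8_images(dec: np.ndarray) -> np.ndarray:
    """Convert [B, 3, H, W] floats in [-1, 1] to [B, H, W, 3] uint8 in [0, 255]."""
    hwc = np.transpose(dec, (0, 2, 3, 1))                  # NCHW -> NHWC
    scaled = np.clip((hwc + 1.0) / 2.0 * 255.0, 0, 255)    # [-1, 1] -> [0, 255]
    return scaled.astype(np.uint8)                         # truncates fractions


# Dummy stand-in for decoder output: one all-dark and one all-bright image.
dummy = np.stack([np.full((3, 4, 4), -1.0), np.full((3, 4, 4), 1.0)])
images = to_uint8_images(dummy)
```

The clip guards against decoder outputs that slightly overshoot the nominal [-1, 1] range.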
