Implementation:Deepseek ai Janus AutoencoderKL Decode

Knowledge Sources	Janus Diffusers AutoencoderKL
Domains	Image_Generation, Generative_Models
Last Updated	2026-02-10 09:30 GMT

Overview

The AutoencoderKL.decode method from Hugging Face Diffusers, used in the JanusFlow pipeline to convert denoised latents into pixel images.

Description

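The AutoencoderKL class from the diffusers library provides the SDXL VAE decoder. In JanusFlow, the denoised latent is divided by the VAE's config.scaling_factor (0.13025) before being passed to decode(). The output is nominally in the range [-1, 1]; decode() itself does not clamp its result, so values are typically clipped before conversion to 8-bit pixels.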

External Reference

Diffusers AutoencoderKL Documentation

Usage

Call after the ODE denoising loop completes. The VAE must be loaded separately from the main model.
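A minimal sketch of loading the VAE on its own, separate from the main model. The helper name load_vae, the checkpoint ID stabilityai/sdxl-vae, and the default device are assumptions made for illustration, not values confirmed by this page; the bfloat16 dtype follows the dtype heuristic listed under Related Pages.

```python
def load_vae(repo_id="stabilityai/sdxl-vae", device="cuda"):
    """Hypothetical helper: load the SDXL VAE as a standalone module."""
    # Imported lazily so that defining the helper does not require the GPU stack.
    import torch
    from diffusers.models import AutoencoderKL

    # repo_id is an assumption; substitute the SDXL VAE checkpoint your setup uses.
    vae = AutoencoderKL.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    # Inference only: move to the target device and switch to eval mode.
    return vae.to(device).eval()
```

The returned module can then be used as vae in the decode call shown on this page.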

Code Reference

Source Location

Repository: External — HuggingFace Diffusers
Usage reference: demo/app_janusflow.py:L134

Signature

# Called via: vae.decode(z / vae.config.scaling_factor).sample
AutoencoderKL.decode(
    z: torch.Tensor,           # [B, 4, 48, 48] latent divided by scaling_factor
    return_dict: bool = True,  # default; wraps the result in a DecoderOutput
) -> DecoderOutput  # .sample gives [B, 3, 384, 384]
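The shapes in the signature imply an 8x spatial upsampling from latent to pixel space (48 x 8 = 384). The following sketch shows that relationship as plain arithmetic; decoded_shape is a hypothetical helper, and the factor of 8 and the 3 output channels are assumptions taken from the shapes listed here rather than read from the VAE config.

```python
def decoded_shape(latent_shape, upsample_factor=8, out_channels=3):
    """Hypothetical helper: expected pixel-space shape for a given latent shape."""
    batch, _latent_channels, height, width = latent_shape
    # The decoder keeps the batch size, maps latent channels to RGB,
    # and enlarges each spatial dimension by the upsample factor.
    return (batch, out_channels, height * upsample_factor, width * upsample_factor)


# Latents from the ODE loop in this page's example: [5, 4, 48, 48]
print(decoded_shape((5, 4, 48, 48)))
```

A check like this is useful as an assertion before decoding, since a mismatched latent resolution otherwise only surfaces as an unexpectedly sized image.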

Import

from diffusers.models import AutoencoderKL

I/O Contract

Inputs

Name	Type	Required	Description
z	torch.Tensor [B, 4, 48, 48]	Yes	Denoised latent divided by scaling_factor (0.13025)

Outputs

Name	Type	Description
decoded_image	torch.Tensor [B, 3, 384, 384]	Pixel images in range [-1, 1]

Usage Examples

VAE Decode After ODE Loop

# z: denoised latent [5, 4, 48, 48] from ODE loop
decoded_image = vae.decode(z / vae.config.scaling_factor).sample
# decoded_image shape: [5, 3, 384, 384], range [-1, 1]
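The decoded tensor still has to be turned into displayable images. Below is a minimal sketch of that post-processing step, assuming the tensor has already been moved to the CPU and converted to a NumPy array; to_uint8_images is a hypothetical helper name, not part of Diffusers or Janus.

```python
import numpy as np


def to_uint8_images(decoded):
    """Hypothetical helper: [B, 3, H, W] floats near [-1, 1] -> [B, H, W, 3] uint8."""
    # Clip first: the decoder output is not guaranteed to stay inside [-1, 1].
    clipped = np.clip(np.asarray(decoded, dtype=np.float32), -1.0, 1.0)
    # Channels-first to channels-last, the layout most image libraries expect.
    hwc = np.transpose(clipped, (0, 2, 3, 1))
    # Map [-1, 1] linearly onto [0, 255]; the cast truncates toward zero.
    return ((hwc + 1.0) / 2.0 * 255.0).astype(np.uint8)
```

With a bfloat16 VAE, the tensor would need a cast before conversion, for example to_uint8_images(decoded_image.float().cpu().numpy()), because NumPy has no bfloat16 dtype.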

Related Pages

Implements Principle

Principle:Deepseek_ai_Janus_VAE_Decoding

Requires Environment

Uses Heuristic

Heuristic:Deepseek_ai_Janus_Bfloat16_Dtype_Selection
