Principle:Deepseek ai Janus VAE Decoding
| Knowledge Sources | |
|---|---|
| Domains | Image_Generation, Generative_Models |
| Last Updated | 2026-02-10 09:30 GMT |
Overview
A procedure for converting continuous latent representations into pixel images using the SDXL VAE decoder.
Description
VAE (Variational Autoencoder) decoding is the final image reconstruction step in the JanusFlow pipeline. After the ODE denoising loop produces a clean latent representation, the SDXL VAE's decoder converts it from the 4-channel, 48×48 latent space to a 3-channel, 384×384 pixel image.
The latent must be divided by the VAE's scaling_factor (0.13025 for SDXL) before decoding. This undoes the scaling applied to latents during training, so the decoder receives values in the range it expects.
Usage
Use this principle after the ODE denoising loop completes and before post-processing the output images.
Theoretical Basis
The VAE decoder maps from latent space to pixel space:
$$x = \mathrm{Decoder}\left(\frac{z}{\sigma}\right)$$
where σ = 0.13025 is the SDXL VAE scaling factor, z is the denoised latent from the ODE loop, and x is the decoded image.
The SDXL VAE has an 8× spatial downscaling factor, so 48×48 latents produce 384×384 pixel images.
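The shape arithmetic above can be checked with a short sketch. The function name `decoded_shape` and its parameters are illustrative assumptions; the code only computes the expected output shape and does not run a decoder.

```python
def decoded_shape(latent_shape, downscale=8, out_channels=3):
    """Return the pixel-space shape for a (batch, channels, height, width) latent.

    `downscale` is the VAE's spatial factor (8 for the SDXL VAE);
    the decoder outputs `out_channels` image channels (3 for RGB).
    """
    batch, _latent_channels, height, width = latent_shape
    return (batch, out_channels, height * downscale, width * downscale)


# A single 4-channel 48x48 latent maps to a 3-channel 384x384 image.
image_shape = decoded_shape((1, 4, 48, 48))
```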