
Environment:Deepseek ai Janus JanusFlow Diffusers Environment

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Deep_Learning, Generative_Models
Last Updated 2026-02-10 09:30 GMT

Overview

Extended environment for JanusFlow rectified flow image generation, requiring the diffusers library and the Stability AI SDXL VAE model.

Description

JanusFlow uses a rectified flow ODE solver for image generation instead of the autoregressive VQ-token approach used by Janus/Janus-Pro. This requires the diffusers library to load the AutoencoderKL (SDXL VAE) model from Stability AI. The SDXL VAE operates in a continuous latent space (4-channel, 48x48) rather than discrete VQ tokens. This environment extends the base CUDA GPU environment with the additional diffusers dependency.
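To make the rectified-flow idea concrete, the sampler integrates an ODE dz/dt = v(z, t) from noise at t=0 to a clean latent at t=1. The sketch below is a hedged toy illustration, not JanusFlow's actual sampler: the real velocity field is a neural network, whereas here a simple analytic field driving the state toward a fixed target is used so the Euler update rule is visible.

```python
# Minimal, hedged sketch of rectified-flow sampling via Euler integration.
# The real JanusFlow velocity field is a neural network; the toy field
# below drives z toward a fixed target so the update rule is visible.
def euler_sample(z0, velocity, steps=30):
    """Integrate dz/dt = velocity(z, t) from t=0 to t=1 with Euler steps."""
    z, dt = z0, 1.0 / steps
    for i in range(steps):
        t = i * dt
        z = z + dt * velocity(z, t)
    return z

# Toy velocity for a straight-line path from z toward a target of 1.0:
target = 1.0
v = lambda z, t: (target - z) / max(1.0 - t, 1e-6)

print(euler_sample(0.0, v))  # converges to the target, ~1.0
```

In JanusFlow, the state `z` is a (4, 48, 48) latent tensor rather than a scalar, and the result is handed to the SDXL VAE for decoding.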

Usage

Use this environment specifically for the Rectified Flow Image Generation workflow (JanusFlow). It is not required for Multimodal Understanding or Autoregressive Image Generation workflows. The SDXL VAE must be loaded separately from the main JanusFlow model.

System Requirements

Category | Requirement | Notes
OS | Linux (Ubuntu recommended) | Same as base CUDA GPU environment
Hardware | NVIDIA GPU with bfloat16 support | SDXL VAE requires bfloat16; fp16 is explicitly unsupported
VRAM | Additional ~2 GB for the SDXL VAE | On top of the JanusFlow model's VRAM requirements
Network | Internet access for initial model download | Downloads `stabilityai/sdxl-vae` from HuggingFace Hub
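The ~2 GB figure above can be put in rough perspective with a back-of-the-envelope calculation. The ~83.7M parameter count below is an assumption based on the standard SD/SDXL AutoencoderKL architecture; in bfloat16 the weights themselves are small, and decode-time activations account for most of the budget.

```python
# Rough, hedged estimate of the SDXL VAE weight footprint in bfloat16.
# Assumption: ~83.7M parameters (standard SD/SDXL AutoencoderKL size).
PARAMS = 83_700_000
BYTES_PER_PARAM = 2  # bfloat16 is 2 bytes per value

weight_mb = PARAMS * BYTES_PER_PARAM / 1e6
print(f"VAE weights: ~{weight_mb:.0f} MB")
# Decode-time activations, not weights, make up the rest of the ~2 GB.
```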

Dependencies

System Packages

None beyond the base CUDA GPU environment (NVIDIA driver and CUDA toolkit).

Python Packages

  • `diffusers[torch]` — provides `AutoencoderKL` for loading the SDXL VAE (see Quick Install below)
  • The Janus package itself, installed in editable mode via `pip install -e .`

Credentials

No additional credentials are required. The SDXL VAE model (stabilityai/sdxl-vae) is publicly available on HuggingFace Hub.

Quick Install

# Install base Janus dependencies
pip install -e .

# Install diffusers for JanusFlow
pip install "diffusers[torch]"  # quotes prevent shell glob expansion of the brackets (e.g. in zsh)

Code Evidence

SDXL VAE loading with bfloat16 requirement from `demo/app_janusflow.py:18-20`:

# remember to use bfloat16 dtype, this vae doesn't work with fp16
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")
vae = vae.to(torch.bfloat16).to(cuda_device).eval()

Diffusers import from `demo/app_janusflow.py:5`:

from diffusers.models import AutoencoderKL

VAE decode with scaling factor from `demo/app_janusflow.py:134`:

decoded_image = vae.decode(z / vae.config.scaling_factor).sample
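The division by `vae.config.scaling_factor` undoes the scaling applied when latents enter the flow space, mapping them back to the VAE's native range before decoding. A minimal sketch of the arithmetic, assuming the value 0.13025 shipped in the `stabilityai/sdxl-vae` config (real code should read it from `vae.config` rather than hard-coding it):

```python
# Hedged illustration of latent unscaling before VAE decode.
# Assumption: scaling_factor = 0.13025, as in the stabilityai/sdxl-vae
# config; production code should read vae.config.scaling_factor instead.
SCALING_FACTOR = 0.13025

def unscale(z: float) -> float:
    """Map a scaled latent value back to the VAE's native range."""
    return z / SCALING_FACTOR

print(unscale(0.13025))  # a latent equal to the scaling factor maps to 1.0
```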

README installation instructions for JanusFlow from `README.md:513-514`:

pip install -e .
pip install diffusers[torch]

Common Errors

Error Message | Cause | Solution
`ImportError: No module named 'diffusers'` | diffusers not installed | `pip install diffusers[torch]`
Model produces corrupted/black images | VAE loaded with fp16 instead of bfloat16 | Load the VAE with `torch.bfloat16`; fp16 is explicitly unsupported
`OSError: stabilityai/sdxl-vae not found` | No internet access or HuggingFace Hub unreachable | Download the model manually and pass a local path to `from_pretrained`

Compatibility Notes

  • bfloat16 only: The SDXL VAE is explicitly documented as incompatible with fp16. The comment in `demo/app_janusflow.py:18` states: "remember to use bfloat16 dtype, this vae doesn't work with fp16".
  • Separate model download: The SDXL VAE is loaded separately from the JanusFlow model using `AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")`, not bundled with the main model weights.
  • Latent space dimensions: The VAE operates on 4-channel latent tensors of size 48x48 (for 384x384 output images).
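The latent dimensions in the last bullet follow from the SDXL VAE's 8x spatial compression: a 384x384 image corresponds to a 4-channel 48x48 latent. A small sketch of the shape arithmetic (the function name is illustrative, not part of any library):

```python
# Shape arithmetic for the SDXL VAE's 8x spatial compression:
# a (3, H, W) output image corresponds to a (4, H/8, W/8) latent.
def latent_shape(height: int, width: int, channels: int = 4, factor: int = 8):
    """Return the (C, H, W) latent shape for a given output image size."""
    assert height % factor == 0 and width % factor == 0
    return (channels, height // factor, width // factor)

print(latent_shape(384, 384))  # (4, 48, 48), matching the note above
```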
