Environment:Deepseek ai Janus JanusFlow Diffusers Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Deep_Learning, Generative_Models |
| Last Updated | 2026-02-10 09:30 GMT |
Overview
Extended environment for JanusFlow rectified flow image generation, requiring the diffusers library and the Stability AI SDXL VAE model.
Description
JanusFlow generates images by integrating a rectified flow ODE, not by sampling discrete VQ tokens autoregressively as Janus and Janus-Pro do. Decoding the result requires the diffusers library, which provides the AutoencoderKL class used to load the SDXL VAE from Stability AI. The SDXL VAE works in a continuous latent space (4 channels, 48x48 for 384x384 images) instead of a discrete VQ codebook. This environment extends the base CUDA GPU environment with the additional diffusers dependency.
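To make the contrast with token-by-token sampling concrete, the following is a minimal sketch of fixed-step Euler integration of a flow ODE dz/dt = v(z, t) from t = 0 to t = 1. It is not JanusFlow's sampler: `euler_sample`, `toy_velocity`, `TARGET`, and the step count are hypothetical names and values chosen for illustration, and the toy velocity field points at a fixed target instead of calling a trained model.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(velocity: Callable[[Vector, float], Vector],
                 z0: Vector, num_steps: int = 30) -> Vector:
    """Integrate dz/dt = velocity(z, t) from t=0 to t=1 with fixed Euler steps."""
    z = list(z0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt                      # current time in [0, 1)
        v = velocity(z, t)              # a real sampler would query the model here
        z = [zi + dt * vi for zi, vi in zip(z, v)]
    return z


# Toy stand-in for a learned velocity field (assumption, not the real model):
# it always points from the current state toward a fixed target, which yields
# the straight-line paths that rectified flow aims for.
TARGET = [1.0, -2.0, 0.5]


def toy_velocity(z: Vector, t: float) -> Vector:
    return [(ti - zi) / (1.0 - t) for ti, zi in zip(TARGET, z)]


sample = euler_sample(toy_velocity, z0=[0.0, 0.0, 0.0], num_steps=30)
```

In the real workflow the state would be a 4-channel latent tensor and the final state would be handed to the SDXL VAE for decoding; here it is a plain list so the sketch runs without a GPU.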
Usage
Use this environment specifically for the Rectified Flow Image Generation workflow (JanusFlow). It is not required for Multimodal Understanding or Autoregressive Image Generation workflows. The SDXL VAE must be loaded separately from the main JanusFlow model.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (Ubuntu recommended) | Same as base CUDA GPU environment |
| Hardware | NVIDIA GPU with bfloat16 support | The demo runs the SDXL VAE in bfloat16; fp16 is explicitly unsupported |
| VRAM | Additional ~2GB for SDXL VAE | On top of JanusFlow model VRAM requirements |
| Network | Internet access for initial model download | Downloads `stabilityai/sdxl-vae` from HuggingFace Hub |
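Because the hardware requirement hinges on bfloat16, a quick pre-flight check can save a failed run. The sketch below is an assumption about how one might verify this, not part of the Janus repository; `bf16_status` and its return strings are hypothetical. It relies on `torch.cuda.is_available()` and `torch.cuda.is_bf16_supported()`, and depending on the PyTorch version the latter may report emulated support as well as native support.

```python
def bf16_status() -> str:
    """Return a short status string describing bfloat16 readiness."""
    try:
        import torch  # imported lazily so the check itself never crashes
    except ImportError:
        return "torch-missing"
    if not torch.cuda.is_available():
        return "no-cuda"
    if not torch.cuda.is_bf16_supported():
        return "no-bf16"
    return "ok"


print(bf16_status())
```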
Dependencies
System Packages
- All packages from Environment:Deepseek_ai_Janus_CUDA_GPU_Environment
Python Packages
- All packages from Environment:Deepseek_ai_Janus_CUDA_GPU_Environment
- diffusers (with torch extra): `pip install diffusers[torch]`
Credentials
No additional credentials are required. The SDXL VAE model (stabilityai/sdxl-vae) is publicly available on HuggingFace Hub.
Quick Install
# Install base Janus dependencies
pip install -e .
# Install diffusers for JanusFlow
pip install diffusers[torch]
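After installing, it can help to confirm that the packages are visible to the interpreter before launching the demo. The following is a small sketch using only the standard library; `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the module list is an assumption based on the imports quoted from `demo/app_janusflow.py`.

```python
import importlib.util

# Modules the JanusFlow demo is known to import (assumed list, extend as needed).
REQUIRED_MODULES = ("torch", "diffusers")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
print("missing:", missing or "none")
```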
Code Evidence
SDXL VAE loading with bfloat16 requirement from `demo/app_janusflow.py:18-20`:
# remember to use bfloat16 dtype, this vae doesn't work with fp16
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")
vae = vae.to(torch.bfloat16).to(cuda_device).eval()
Diffusers import from `demo/app_janusflow.py:5`:
from diffusers.models import AutoencoderKL
VAE decode with scaling factor from `demo/app_janusflow.py:134`:
decoded_image = vae.decode(z / vae.config.scaling_factor).sample
README installation instructions for JanusFlow from `README.md:513-514`:
pip install -e .
pip install diffusers[torch]
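The decode line quoted above from `demo/app_janusflow.py:134` divides the latent by `vae.config.scaling_factor` before decoding. In the usual diffusers convention, latents are multiplied by this factor after encoding, so decoding undoes it first. A minimal sketch that wraps the same expression in a helper follows; `decode_latents` is a hypothetical name and does not appear in the repository.

```python
def decode_latents(vae, z):
    """Undo the latent scaling, then decode; mirrors the quoted decode line.

    `vae` is expected to expose `.config.scaling_factor` and `.decode(...)`,
    as the AutoencoderKL instance in the demo does.
    """
    return vae.decode(z / vae.config.scaling_factor).sample
```

In practice `vae` would be the bfloat16 `AutoencoderKL` loaded earlier, and the call would typically run in an inference-only context such as `torch.no_grad()` (an assumption about usage, not a quote from the demo).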
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ModuleNotFoundError: No module named 'diffusers'` | diffusers not installed | `pip install diffusers[torch]` |
| Model produces corrupted/black images | VAE cast to fp16 instead of bfloat16 | Cast the VAE with `.to(torch.bfloat16)`; fp16 is explicitly unsupported |
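| `OSError` raised while loading `stabilityai/sdxl-vae` (exact message varies) | No internet access or HuggingFace Hub unreachable | Download the model on a machine with access, copy it over, and pass the local directory to `AutoencoderKL.from_pretrained` |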
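For the offline case in the last row, one possible approach is to fetch the VAE once and then load it from disk. The sketch below is an assumption, not code from the repository: `resolve_vae_source`, `download_vae`, and `load_vae` are hypothetical helpers. It uses `huggingface_hub.snapshot_download` for the download and relies on `AutoencoderKL.from_pretrained` accepting a local directory that contains `config.json`.

```python
from pathlib import Path

VAE_REPO_ID = "stabilityai/sdxl-vae"


def resolve_vae_source(local_dir=None):
    """Prefer a local copy (a directory containing config.json) over the Hub."""
    if local_dir is not None and (Path(local_dir) / "config.json").is_file():
        return str(local_dir)
    return VAE_REPO_ID


def download_vae(local_dir):
    """Run once on a machine with internet access to fetch the VAE files."""
    from huggingface_hub import snapshot_download  # lazy import
    return snapshot_download(repo_id=VAE_REPO_ID, local_dir=local_dir)


def load_vae(local_dir=None, device="cuda"):
    """Load the VAE in bfloat16 from a local copy if present, else the Hub."""
    import torch
    from diffusers.models import AutoencoderKL

    vae = AutoencoderKL.from_pretrained(resolve_vae_source(local_dir))
    # bfloat16, not fp16, following the comment in demo/app_janusflow.py
    return vae.to(torch.bfloat16).to(device).eval()
```

A typical sequence would be to call `download_vae("/path/to/sdxl-vae")` on a connected machine, copy that directory to the offline host, and call `load_vae("/path/to/sdxl-vae")` there; the path is a placeholder.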
Compatibility Notes
- bfloat16 only: The SDXL VAE is explicitly documented as incompatible with fp16. The comment in `demo/app_janusflow.py:18` states: "remember to use bfloat16 dtype, this vae doesn't work with fp16".
- Separate model download: The SDXL VAE is loaded separately from the JanusFlow model using `AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")`, not bundled with the main model weights.
- Latent space dimensions: The VAE operates on 4-channel latent tensors of size 48x48 for 384x384 output images, so each latent side is 1/8 of the image side.
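The latent dimensions in the last note follow from simple arithmetic, sketched below. The 8x spatial downsampling factor is an assumption stated here for illustration (it is consistent with 384 / 48 = 8), and `latent_shape` is a hypothetical helper, not a diffusers or Janus function.

```python
def latent_shape(image_size=384, downsample=8, channels=4):
    """Return (channels, height, width) of the latent for a square image."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by the downsample factor")
    side = image_size // downsample
    return (channels, side, side)


print(latent_shape())  # (4, 48, 48)
```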