
Implementation:Deepseek ai Janus JanusFlow Load Model

From Leeroopedia


Knowledge Sources
Domains Multimodal_AI, Model_Loading
Last Updated 2026-02-10 09:30 GMT

Overview

Concrete tool for loading the JanusFlow multimodal model, its processor, and the SDXL VAE, using classes provided by the Janus repository and the diffusers library.

Description

Loading a JanusFlow model requires three separate from_pretrained calls:

  1. MultiModalityCausalLM.from_pretrained(model_path) — loads the JanusFlow model with UViT encoder/decoder, linear aligners, and LLM backbone
  2. VLChatProcessor.from_pretrained(model_path) — loads the JanusFlow processor with tokenizer
  3. AutoencoderKL.from_pretrained(vae_path) — loads the SDXL VAE for pixel decoding

All components must be cast to bfloat16 and moved to CUDA.
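The three-step recipe above can be wrapped in a single helper. This is a minimal sketch: `load_janusflow` is a hypothetical name (not part of the Janus API), and the imports are deferred into the function body so it can be defined even where the janus and diffusers packages are not installed.

```python
def load_janusflow(model_path: str = "deepseek-ai/JanusFlow-1.3B",
                   vae_path: str = "stabilityai/sdxl-vae"):
    """Load the JanusFlow model, processor, and SDXL VAE in bfloat16 on CUDA."""
    # Deferred imports: only needed at load time, not at definition time.
    import torch
    from janus.janusflow.models import MultiModalityCausalLM, VLChatProcessor
    from diffusers.models import AutoencoderKL

    # 1. Processor (tokenizer plus image tags)
    vl_chat_processor = VLChatProcessor.from_pretrained(model_path)

    # 2. Model (UViT encoder/decoder, linear aligners, LLM backbone)
    vl_gpt = MultiModalityCausalLM.from_pretrained(model_path)
    vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()

    # 3. SDXL VAE for pixel decoding
    vae = AutoencoderKL.from_pretrained(vae_path)
    vae = vae.to(torch.bfloat16).cuda().eval()

    return vl_chat_processor, vl_gpt, vae
```

The defaults mirror the model IDs used throughout this page; any generation script can then obtain all three components with one call.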

Usage

Call these three loading functions at the start of any JanusFlow generation script.

Code Reference

Source Location

  • Repository: Janus
  • File: janus/janusflow/models/modeling_vlm.py:L132-169 (MultiModalityCausalLM.__init__)
  • File: janus/janusflow/models/processing_vlm.py:L72-152 (VLChatProcessor.__init__)
  • Reference: demo/app_janusflow.py:L11-20

Signature

from janus.janusflow.models import MultiModalityCausalLM, VLChatProcessor
from diffusers.models import AutoencoderKL

VLChatProcessor.from_pretrained(model_path: str) -> VLChatProcessor
MultiModalityCausalLM.from_pretrained(model_path: str) -> MultiModalityCausalLM
AutoencoderKL.from_pretrained(vae_path: str) -> AutoencoderKL

Import

from janus.janusflow.models import MultiModalityCausalLM, VLChatProcessor
from diffusers.models import AutoencoderKL

I/O Contract

Inputs

Name       | Type | Required | Description
model_path | str  | Yes      | HuggingFace model ID (e.g., "deepseek-ai/JanusFlow-1.3B")
vae_path   | str  | Yes      | VAE model ID (e.g., "stabilityai/sdxl-vae")

Outputs

Name              | Type                  | Description
vl_chat_processor | VLChatProcessor       | Processor with tokenizer and image_gen_tag
vl_gpt            | MultiModalityCausalLM | JanusFlow model with UViT components, in bfloat16 on CUDA
vae               | AutoencoderKL         | SDXL VAE decoder, in bfloat16 on CUDA
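The contract above can be verified after loading. This is a hedged sketch: `check_janusflow_components` is a hypothetical helper, not part of the Janus API; it only inspects attributes and parameter metadata that any PyTorch module exposes.

```python
def check_janusflow_components(vl_chat_processor, vl_gpt, vae):
    """Return True if the loaded components satisfy the I/O contract above."""
    import torch  # deferred so this module imports without torch installed

    ok = True
    # The processor must expose a tokenizer (used for prompt encoding).
    ok &= hasattr(vl_chat_processor, "tokenizer")
    # Model and VAE must both be in bfloat16 on a CUDA device.
    for module in (vl_gpt, vae):
        param = next(module.parameters())
        ok &= (param.dtype == torch.bfloat16)
        ok &= param.is_cuda
    return bool(ok)
```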

Usage Examples

Standard JanusFlow Loading

import torch
from janus.janusflow.models import MultiModalityCausalLM, VLChatProcessor
from diffusers.models import AutoencoderKL

model_path = "deepseek-ai/JanusFlow-1.3B"
vl_chat_processor = VLChatProcessor.from_pretrained(model_path)
tokenizer = vl_chat_processor.tokenizer

vl_gpt = MultiModalityCausalLM.from_pretrained(model_path)
vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()

# SDXL VAE — must use bfloat16 (fp16 not supported for this VAE)
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")
vae = vae.to(torch.bfloat16).cuda().eval()
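The example above loads weights in their default precision and then casts. Assuming `MultiModalityCausalLM` and `AutoencoderKL` follow the standard Hugging Face `from_pretrained` signature (which accepts a `torch_dtype` keyword), the weights can instead be materialized directly in bfloat16, avoiding a transient full-precision copy in host memory. This variant is a sketch under that assumption, and `load_janusflow_bf16` is a hypothetical name:

```python
def load_janusflow_bf16(model_path: str, vae_path: str):
    """Variant that materializes weights directly in bfloat16, assuming the
    standard Hugging Face `torch_dtype` keyword is supported."""
    import torch
    from janus.janusflow.models import MultiModalityCausalLM, VLChatProcessor
    from diffusers.models import AutoencoderKL

    vl_chat_processor = VLChatProcessor.from_pretrained(model_path)
    # torch_dtype skips the intermediate cast step used in the example above.
    vl_gpt = MultiModalityCausalLM.from_pretrained(
        model_path, torch_dtype=torch.bfloat16).cuda().eval()
    vae = AutoencoderKL.from_pretrained(
        vae_path, torch_dtype=torch.bfloat16).cuda().eval()
    return vl_chat_processor, vl_gpt, vae
```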

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
