
Environment:DeepSeek-AI Janus CUDA GPU Environment

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Deep_Learning, Computer_Vision
Last Updated 2026-02-10 09:30 GMT

Overview

An NVIDIA CUDA GPU environment with bfloat16 support is required for running the Janus multimodal understanding and image generation models.

Description

This environment provides the GPU-accelerated context required by all Janus model variants (Janus-1.3B, Janus-Pro-7B, JanusFlow-1.3B). The codebase extensively uses torch.cuda for tensor operations, model placement, and inference. All inference scripts call .cuda() directly on model weights and intermediate tensors. The code checks torch.cuda.is_available() at startup to select between CUDA (with bfloat16) and CPU (with float16) execution paths, but all primary workflows assume CUDA availability.

Usage

Use this environment for all Janus workflows: Multimodal Understanding, Autoregressive Image Generation, and Rectified Flow Image Generation. Every inference script and Gradio demo requires GPU acceleration. CPU fallback exists in some demo scripts but is not the intended execution path.

System Requirements

  • OS: Linux (Ubuntu recommended). Tested with Python >= 3.8.
  • Hardware: NVIDIA GPU with bfloat16 support. Ampere (A100) or newer recommended; bfloat16 is required for optimal precision.
  • VRAM: minimum 8 GB for the 1.3B models; 24 GB+ for the 7B models. Janus-Pro-7B requires significantly more VRAM.
  • CUDA: CUDA toolkit compatible with PyTorch >= 2.0.1. Required for torch.cuda operations.
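The VRAM minimums above can be encoded in a small pre-flight helper. This is an illustrative sketch, not part of the Janus codebase; the thresholds are taken from the requirements table.

```python
# Illustrative pre-flight VRAM check; thresholds come from the System
# Requirements table above. Not part of the Janus codebase.
MIN_VRAM_GB = {
    "deepseek-ai/Janus-1.3B": 8,
    "deepseek-ai/JanusFlow-1.3B": 8,
    "deepseek-ai/Janus-Pro-7B": 24,
}

def has_enough_vram(model_id: str, available_gb: float) -> bool:
    """Return True if available_gb meets the table's minimum for model_id."""
    return available_gb >= MIN_VRAM_GB[model_id]

if __name__ == "__main__":
    print(has_enough_vram("deepseek-ai/Janus-1.3B", 16))    # 16 GB card fits 1.3B
    print(has_enough_vram("deepseek-ai/Janus-Pro-7B", 16))  # 7B needs 24 GB+
```

On a live machine the `available_gb` value would come from `nvidia-smi` or `torch.cuda.get_device_properties(0).total_memory`.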

Dependencies

System Packages

  • NVIDIA GPU driver compatible with CUDA toolkit
  • CUDA toolkit (version compatible with torch >= 2.0.1)

Python Packages

  • torch >= 2.0.1
  • transformers >= 4.38.2
  • timm >= 0.9.16
  • accelerate
  • sentencepiece
  • attrdict
  • einops
  • numpy
  • Pillow (PIL)

Credentials

No credentials are required. Models are downloaded from the public Hugging Face Hub (deepseek-ai/Janus-1.3B, deepseek-ai/Janus-Pro-7B, deepseek-ai/JanusFlow-1.3B) without authentication.

Quick Install

# Install core dependencies
pip install -e .

# Or install manually (quote versioned specifiers so the shell
# does not treat ">" as output redirection)
pip install "torch>=2.0.1" "transformers>=4.38.2" "timm>=0.9.16" accelerate sentencepiece attrdict einops numpy Pillow
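One way to confirm the pinned versions actually landed is to compare installed package versions against the minimums above. This is an illustrative helper, not part of the repository; it tolerates missing packages and CUDA-local version suffixes such as `2.1.0+cu118`.

```python
# Illustrative post-install check: compare installed versions against the
# minimums listed above. Not part of the Janus repository.
from importlib import metadata

def version_tuple(v: str) -> tuple:
    """Parse '2.1.0+cu118' -> (2, 1, 0), ignoring local version suffixes."""
    return tuple(int(p) for p in v.split("+")[0].split(".") if p.isdigit())

def check(pkg: str, minimum: str) -> str:
    try:
        installed = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return f"{pkg}: MISSING"
    ok = version_tuple(installed) >= version_tuple(minimum)
    return f"{pkg}: {installed} ({'ok' if ok else 'need >= ' + minimum})"

for pkg, minimum in [("torch", "2.0.1"), ("transformers", "4.38.2"), ("timm", "0.9.16")]:
    print(check(pkg, minimum))
```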

Code Evidence

CUDA availability check and device selection from `demo/app.py:22`:

cuda_device = 'cuda' if torch.cuda.is_available() else 'cpu'

Model loading with bfloat16 and CUDA placement from `inference.py:34`:

vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()

Conditional dtype selection (bfloat16 on CUDA, float16 on CPU) from `demo/app_januspro.py:22-25`:

if torch.cuda.is_available():
    vl_gpt = vl_gpt.to(torch.bfloat16).cuda()
else:
    vl_gpt = vl_gpt.to(torch.float16)

Direct CUDA tensor creation from `generation_inference.py:69`:

tokens = torch.zeros((parallel_size*2, len(input_ids)), dtype=torch.int).cuda()
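The `parallel_size*2` batch dimension reflects classifier-free guidance: each sample is decoded with a conditional and an unconditional copy, and their logits are blended. A minimal sketch of the standard CFG blend, with illustrative names (`scale` plays the role of a guidance weight; this is not the repository's exact code):

```python
# Classifier-free guidance blend: a minimal sketch of why the token batch
# is duplicated (conditional + unconditional rows). Illustrative only.
def cfg_logits(cond, uncond, scale):
    """Standard CFG: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

# With scale=1.0 the blend reduces to the conditional logits alone;
# larger scales push the distribution further toward the prompt.
print(cfg_logits([2.0, 4.0], [1.0, 2.0], 1.0))  # [2.0, 4.0]
print(cfg_logits([2.0, 4.0], [1.0, 2.0], 5.0))  # [6.0, 12.0]
```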

Python 3.10+ compatibility patch from `janus/__init__.py:24-31`:

if sys.version_info >= (3, 10):
    print("Python version is above 3.10, patching the collections module.")
    import collections
    import collections.abc
    for type_name in collections.abc.__all__:
        setattr(collections, type_name, getattr(collections.abc, type_name))
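The effect of this patch can be verified in isolation: after copying the ABC names onto the top-level collections module, legacy code (such as attrdict) that references collections.MutableMapping keeps working on Python 3.10+, where those aliases were removed. A standalone reproduction:

```python
# Standalone reproduction of the patch above: re-expose the ABCs that
# Python 3.10 removed from the top-level collections module.
import collections
import collections.abc

for type_name in collections.abc.__all__:
    setattr(collections, type_name, getattr(collections.abc, type_name))

# Legacy references like collections.MutableMapping now resolve again.
print(hasattr(collections, "MutableMapping"))  # True
```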

Common Errors

  • `RuntimeError: No CUDA GPUs are available`: no NVIDIA GPU detected. Ensure NVIDIA drivers and the CUDA toolkit are installed; verify with `nvidia-smi`.
  • `RuntimeError: CUDA out of memory`: insufficient VRAM for the model size. Use a smaller model (1.3B instead of 7B), or reduce `parallel_size` in image generation.
  • `AttributeError: module 'collections' has no attribute 'MutableMapping'`: Python 3.10+ removed the collections ABC aliases. Ensure `import janus` runs before other imports (it patches collections).
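For the out-of-memory case, lowering `parallel_size` can be automated with a back-off loop that halves the batch until generation fits. This is a hypothetical helper, not part of the repository; a simulated OOM stands in for a real CUDA allocator failure so the sketch runs without a GPU.

```python
# Hypothetical back-off helper: halve parallel_size until generation fits.
# Illustrative only; simulate_generate stands in for the real GPU call.
def generate_with_backoff(generate, parallel_size, min_size=1):
    while parallel_size >= min_size:
        try:
            return generate(parallel_size), parallel_size
        except RuntimeError as e:
            if "out of memory" not in str(e):
                raise  # re-raise unrelated runtime errors
            parallel_size //= 2  # halve the batch and retry
    raise RuntimeError("CUDA out of memory even at minimum parallel_size")

def simulate_generate(parallel_size):
    """Pretend anything above 4 parallel samples exhausts VRAM."""
    if parallel_size > 4:
        raise RuntimeError("CUDA out of memory")
    return [f"image_{i}" for i in range(parallel_size)]

images, used = generate_with_backoff(simulate_generate, 16)
print(used)  # 4
```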

Compatibility Notes

  • CPU fallback: Demo scripts (app.py, app_januspro.py, fastapi_app.py) include CPU fallback with float16, but this is significantly slower and not the intended execution path.
  • bfloat16 requirement: The SDXL VAE used by JanusFlow specifically requires bfloat16 and does not work with float16 (documented in `demo/app_janusflow.py:18`).
  • Python 3.10+: Both `janus/__init__.py` and `janus/janusflow/__init__.py` include a monkey-patch for the collections module to handle Python 3.10+ deprecations.
  • Eager attention: Demo scripts explicitly set `language_config._attn_implementation = 'eager'` to avoid flash attention requirements.
