
Environment: OpenAI CLIP PyTorch CUDA Runtime

From Leeroopedia
Domains Infrastructure, Computer_Vision
Last Updated 2026-02-13 22:00 GMT

Overview

Linux or macOS environment with PyTorch >= 1.7.1, optional CUDA GPU support, and torchvision for running OpenAI CLIP models.

Description

This environment provides the core runtime for loading and running CLIP models. It requires PyTorch 1.7.1 or later with matching torchvision. When a CUDA-capable GPU is available, CLIP automatically places the model on GPU and runs inference in fp16 (half precision); on CPU, the model is cast to fp32. The CI matrix tests against PyTorch 1.7.1, 1.9.1, and 1.10.1 on Python 3.8 with CPU-only builds.
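The default placement described above reduces to a small decision: CUDA available means GPU plus fp16, otherwise CPU plus fp32. A minimal sketch of that logic (`pick_device_and_dtype` is an illustrative helper, not part of the `clip` API):

```python
def pick_device_and_dtype(cuda_available: bool):
    """Mirror CLIP's defaults: the model runs in fp16 on GPU, fp32 on CPU."""
    device = "cuda" if cuda_available else "cpu"
    dtype = "float16" if device == "cuda" else "float32"
    return device, dtype
```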

Usage

Use this environment for all CLIP workflows: zero-shot image classification, linear-probe evaluation, and prompt-engineered classification. Every Implementation page in this wiki requires this runtime as the base layer.
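A typical zero-shot classification call against this runtime looks like the following (adapted from the upstream README; assumes an installed `clip` package and a local image file `CLIP.png`):

```python
import torch
import clip
from PIL import Image

# Default device selection, matching clip.load()'s own default
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Preprocess one image and tokenize candidate labels
image = preprocess(Image.open("CLIP.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)
```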

System Requirements

| Category | Requirement | Notes |
|----------|-------------|-------|
| OS | Linux (Ubuntu recommended) or macOS | CI uses `ubuntu-latest` |
| Hardware | NVIDIA GPU (optional) | CUDA acceleration; CPU fallback supported |
| Hardware | GPU VRAM varies by model | ViT-B/32 ~338 MB weights; ViT-L/14@336px ~900 MB weights |
| Disk | ~2 GB free | For model cache in `~/.cache/clip` |
| Python | 3.8+ | CI tests on Python 3.8 |
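The disk requirement can be checked programmatically before the first model download; a small standard-library sketch (paths follow CLIP's default cache location):

```python
import os
import shutil

# CLIP's default download cache lives under ~/.cache/clip
cache_dir = os.path.expanduser(os.path.join("~", ".cache", "clip"))

# Free space on the filesystem holding the cache, in GiB
free_gib = shutil.disk_usage(os.path.expanduser("~")).free / 2**30
print(cache_dir, f"{free_gib:.1f} GiB free")
```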

Dependencies

System Packages

  • CUDA toolkit (optional, for GPU acceleration)
  • `conda` or `pip` (package manager)

Python Packages

  • `torch` >= 1.7.1 (warning emitted if older)
  • `torchvision` >= 0.8.2 (must match torch version)
  • `numpy` (transitive dependency of torch)
  • `Pillow` >= 5.3.0 (transitive dependency of torchvision)

Credentials

No credentials or API keys are required. Model weights are downloaded from public Azure CDN endpoints (`openaipublic.azureedge.net`) without authentication.
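Weights are cached locally after the first download. A sketch of the caching convention (`cache_path_for` is an illustrative helper, not part of the `clip` API, and the example URL is schematic; real URLs embed a SHA256 digest as a path component):

```python
import os
from typing import Optional
from urllib.parse import urlparse

def cache_path_for(url: str, download_root: Optional[str] = None) -> str:
    """Return where a weight file from `url` would land in the local cache."""
    root = download_root or os.path.expanduser("~/.cache/clip")
    return os.path.join(root, os.path.basename(urlparse(url).path))

# Schematic URL; the <sha256> placeholder stands in for the real digest
path = cache_path_for("https://openaipublic.azureedge.net/clip/models/<sha256>/ViT-B-32.pt")
```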

Quick Install

# GPU install (CUDA 11.0 example)
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=11.0

# CPU-only install
conda install --yes -c pytorch pytorch=1.7.1 torchvision cpuonly

# Or via pip (auto-selects CUDA if available)
pip install torch torchvision

Code Evidence

PyTorch version check from `clip/clip.py:23-24`:

if version.parse(torch.__version__) < version.parse("1.7.1"):
    warnings.warn("PyTorch version 1.7.1 or higher is recommended")
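The guard compares parsed versions rather than raw strings. The same logic can be sketched without the `packaging` dependency (crude numeric parser that ignores pre-release tags; `check_torch_version` is illustrative only, not CLIP's code):

```python
import warnings

def parse_version(v: str):
    # Numeric components only; drops local suffixes such as "+cu110"
    return tuple(int(p) for p in v.split("+")[0].split(".")[:3])

def check_torch_version(version_string: str, minimum: str = "1.7.1") -> bool:
    """Warn, as CLIP does, when the installed torch is older than the minimum."""
    ok = parse_version(version_string) >= parse_version(minimum)
    if not ok:
        warnings.warn("PyTorch version 1.7.1 or higher is recommended")
    return ok
```

Tuple comparison handles multi-digit components correctly, which naive string comparison would not (e.g. `"1.10.1" < "1.7.1"` as strings).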

Automatic CUDA/CPU device selection from `clip/clip.py:94`:

def load(name: str, device: Union[str, torch.device] = "cuda" if torch.cuda.is_available() else "cpu", jit: bool = False, download_root: str = None):

Dtype handling for CPU vs GPU from `clip/clip.py:140-141`:

if str(device) == "cpu":
    model.float()

Torch version-conditional dtype in tokenize from `clip/clip.py:231-234`:

if version.parse(torch.__version__) < version.parse("1.8.0"):
    result = torch.zeros(len(all_tokens), context_length, dtype=torch.long)
else:
    result = torch.zeros(len(all_tokens), context_length, dtype=torch.int)
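The branch exists because `index_select` on torch < 1.8.0 requires `LongTensor` indices. A pure-Python sketch of the same decision (illustrative helper, not part of the `clip` API):

```python
def token_buffer_dtype(torch_version: str) -> str:
    """CLIP allocates the token buffer as long on torch < 1.8.0, int otherwise."""
    major, minor = (int(p) for p in torch_version.split("+")[0].split(".")[:2])
    return "long" if (major, minor) < (1, 8) else "int"
```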

CI test matrix from `.github/workflows/test.yml:13-25`:

matrix:
  python-version: [3.8]
  pytorch-version: [1.7.1, 1.9.1, 1.10.1]
  include:
    - python-version: 3.8
      pytorch-version: 1.7.1
      torchvision-version: 0.8.2
    - python-version: 3.8
      pytorch-version: 1.9.1
      torchvision-version: 0.10.1
    - python-version: 3.8
      pytorch-version: 1.10.1
      torchvision-version: 0.11.2

Common Errors

| Error Message | Cause | Solution |
|---------------|-------|----------|
| `UserWarning: PyTorch version 1.7.1 or higher is recommended` | PyTorch version below 1.7.1 | Upgrade PyTorch: `pip install "torch>=1.7.1"` |
| `RuntimeError: Model {name} not found` | Invalid model name passed to `clip.load()` | Use one of `clip.available_models()`: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT-B/32, ViT-B/16, ViT-L/14, ViT-L/14@336px |
| `RuntimeError: Model has been downloaded but the SHA256 checksum does not not match` | Corrupt or incomplete download | Delete the cached file in `~/.cache/clip/` and re-download |
| `RuntimeError: {path} exists and is not a regular file` | Download target path is a directory | Remove the conflicting directory at the path |
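For the checksum failure, you can verify a cached file's digest yourself before deleting it. A minimal sketch using only the standard library (the expected digest appears as a directory component in the download URL, so it can be compared against the computed value):

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 16) -> str:
    """Stream a file through SHA256 so large weight files are not read into RAM at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```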

Compatibility Notes

  • CPU-only: Fully supported. Models auto-cast to fp32. JIT models require additional dtype patching (handled internally by `clip.load()`).
  • CUDA GPU: Models run in fp16 by default for memory efficiency. No minimum compute capability specified.
  • torchvision InterpolationMode: Older torchvision versions lack `InterpolationMode` enum; CLIP falls back to `Image.BICUBIC` from PIL (`clip/clip.py:16-20`).
  • torch < 1.8.0: Tokenizer returns `LongTensor` instead of `IntTensor` due to older `index_select` requirements (`clip/clip.py:231-234`).
  • PyTorch Hub: CLIP can be loaded via `torch.hub.load()` using `hubconf.py`, which declares dependencies: `torch`, `torchvision`, `ftfy`, `regex`, `tqdm`.
