# Implementation: OpenAI CLIP Dataset Preparation Wrapper
| Knowledge Sources | |
|---|---|
| Domains | Vision, Data_Engineering |
| Last Updated | 2026-02-13 22:00 GMT |
## Overview
Wrapper documentation for using torchvision datasets and PyTorch DataLoader with CLIP preprocessing transforms.
## Description
This wrapper documents how to use torchvision.datasets (e.g. CIFAR100, ImageNet) and torch.utils.data.DataLoader in the context of CLIP linear probe evaluation. The CLIP repository does not define its own dataset classes; instead, it relies on standard PyTorch data utilities with the CLIP preprocess transform injected as the dataset's transform parameter.
The pattern is demonstrated in the CLIP README (lines 141-191) using CIFAR-100 as the benchmark dataset.
## External Reference

## Usage
Use this wrapper whenever preparing a dataset for CLIP feature extraction. The key integration point is passing the preprocess transform from clip.load() to the dataset's transform parameter.
## Code Reference

### Source Location
- Repository: External (torchvision, PyTorch)
- Usage pattern: README.md (lines 141-191)
### Signature

```python
# torchvision dataset construction
torchvision.datasets.CIFAR100(
    root: str,
    train: bool = True,
    transform: Optional[Callable] = None,  # <- inject CLIP preprocess here
    target_transform: Optional[Callable] = None,
    download: bool = False
) -> Dataset

# PyTorch DataLoader
torch.utils.data.DataLoader(
    dataset: Dataset,
    batch_size: int = 1,
    shuffle: bool = False,
    num_workers: int = 0,
    pin_memory: bool = False
) -> DataLoader
```
### Import

```python
from torchvision.datasets import CIFAR100
from torch.utils.data import DataLoader
```
## I/O Contract

### Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| root | str | Yes | Download/cache directory for the dataset (e.g. os.path.expanduser("~/.cache")) |
| train | bool | Yes | True for training split, False for test split |
| transform | Callable | Yes | The preprocess transform returned by clip.load() |
| download | bool | No | Whether to download if not already present. Default: False |
| batch_size | int | No | Number of samples per batch for DataLoader. Default: 1 |
| num_workers | int | No | Number of parallel data loading workers. Default: 0 |
### Outputs
| Name | Type | Description |
|---|---|---|
| dataloader | DataLoader | Iterator yielding (images: torch.Tensor [B, 3, n_px, n_px], labels: torch.Tensor [B]) batches |
## Usage Examples

### CIFAR-100 for Linear Probe
```python
import os

import torch
import clip
from torchvision.datasets import CIFAR100
from torch.utils.data import DataLoader

# Load model and get preprocessing transform
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Create datasets with CLIP preprocessing
root = os.path.expanduser("~/.cache")
train_dataset = CIFAR100(root, download=True, train=True, transform=preprocess)
test_dataset = CIFAR100(root, download=True, train=False, transform=preprocess)

# Create dataloaders
train_loader = DataLoader(train_dataset, batch_size=100, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=100, num_workers=2)

# Iterate
for images, labels in train_loader:
    images = images.to(device)  # [100, 3, 224, 224]
    # ... extract features
```
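In the linear-probe workflow, the loop body typically passes each batch to model.encode_image under torch.no_grad() and stacks the results into a feature matrix. The sketch below uses a dummy encoder (StubCLIP, a hypothetical stand-in for the real CLIP model) and random tensors in place of DataLoader batches so it runs on CPU without weights; only the loop structure and the 512-dim ViT-B/32 embedding width are meant to carry over.

```python
import torch
from torch import nn

# Dummy stand-in for clip.load("ViT-B/32")[0]; in real use, call
# model.encode_image(images) on batches from the DataLoader above.
class StubCLIP(nn.Module):
    def __init__(self, embed_dim=512):  # ViT-B/32 image embedding width
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.proj = nn.Linear(3, embed_dim)

    def encode_image(self, images):
        return self.proj(self.pool(images).flatten(1))

model = StubCLIP()
features, labels_all = [], []
with torch.no_grad():  # no gradients needed for feature extraction
    for _ in range(2):  # stands in for `for images, labels in train_loader:`
        images = torch.rand(100, 3, 224, 224)
        labels = torch.randint(0, 100, (100,))
        features.append(model.encode_image(images))
        labels_all.append(labels)
features = torch.cat(features)    # [N, 512] feature matrix for the probe
labels_all = torch.cat(labels_all)
print(features.shape)  # torch.Size([200, 512])
```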