
Implementation:OpenAI CLIP Dataset Preparation Wrapper

From Leeroopedia
Domains Vision, Data_Engineering
Last Updated 2026-02-13 22:00 GMT

Overview

Wrapper documentation for using torchvision datasets and PyTorch DataLoader with CLIP preprocessing transforms.

Description

This wrapper documents how to use torchvision.datasets (e.g. CIFAR100, ImageNet) and torch.utils.data.DataLoader in the context of CLIP linear probe evaluation. The CLIP repository does not define its own dataset classes; instead, it relies on standard PyTorch data utilities with the CLIP preprocess transform injected as the dataset's transform parameter.

The pattern is demonstrated in the CLIP README (lines 141-191) using CIFAR-100 as the benchmark dataset.

External Reference

Usage

Use this wrapper whenever preparing a dataset for CLIP feature extraction. The key integration point is passing the preprocess transform from clip.load() to the dataset's transform parameter.

Code Reference

Source Location

  • Repository: External (torchvision, PyTorch)
  • Usage pattern: README.md (lines 141-191)

Signature

# torchvision dataset construction
torchvision.datasets.CIFAR100(
    root: str,
    train: bool = True,
    transform: Optional[Callable] = None,   # <- inject CLIP preprocess here
    target_transform: Optional[Callable] = None,
    download: bool = False
) -> Dataset

# PyTorch DataLoader
torch.utils.data.DataLoader(
    dataset: Dataset,
    batch_size: int = 1,
    shuffle: bool = False,
    num_workers: int = 0,
    pin_memory: bool = False
) -> DataLoader

Import

from torchvision.datasets import CIFAR100
from torch.utils.data import DataLoader

I/O Contract

Inputs

Name | Type | Required | Description
-----|------|----------|------------
root | str | Yes | Download/cache directory for the dataset (e.g. os.path.expanduser("~/.cache"))
train | bool | Yes | True for the training split, False for the test split
transform | Callable | Yes | The preprocess transform returned by clip.load()
download | bool | No | Whether to download the dataset if not already present. Default: False
batch_size | int | No | Number of samples per batch for the DataLoader. Default: 1
num_workers | int | No | Number of parallel data-loading workers. Default: 0

Outputs

Name | Type | Description
-----|------|------------
dataloader | DataLoader | Iterator yielding (images: torch.Tensor [B, 3, n_px, n_px], labels: torch.Tensor [B]) batches, where n_px is the model's input resolution (224 for ViT-B/32)

Usage Examples

CIFAR-100 for Linear Probe

import os
import torch
import clip
from torchvision.datasets import CIFAR100
from torch.utils.data import DataLoader

# Load model and get preprocessing transform
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Create datasets with CLIP preprocessing
root = os.path.expanduser("~/.cache")
train_dataset = CIFAR100(root, download=True, train=True, transform=preprocess)
test_dataset = CIFAR100(root, download=True, train=False, transform=preprocess)

# Create dataloaders
train_loader = DataLoader(train_dataset, batch_size=100, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=100, num_workers=2)

# Iterate
for images, labels in train_loader:
    images = images.to(device)  # [100, 3, 224, 224]
    # ... extract features
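The elided feature-extraction step follows the README's linear-probe recipe: call the model's encode_image under torch.no_grad() and concatenate the batches. A CPU-runnable sketch, with a hypothetical stand-in encoder so the loop structure runs without the clip package or a GPU:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

class StandInEncoder(torch.nn.Module):
    """Hypothetical stand-in exposing encode_image like the CLIP model."""
    def __init__(self, dim=512):
        super().__init__()
        self.pool = torch.nn.AdaptiveAvgPool2d(1)
        self.proj = torch.nn.Linear(3, dim)

    def encode_image(self, images):  # [B, 3, H, W] -> [B, dim]
        return self.proj(self.pool(images).flatten(1))

model = StandInEncoder()
loader = DataLoader(TensorDataset(torch.randn(10, 3, 224, 224),
                                  torch.randint(0, 100, (10,))),
                    batch_size=5)

# Same loop shape as the README's linear probe: disable gradients,
# encode each batch, and stack features for a downstream classifier.
all_features, all_labels = [], []
with torch.no_grad():
    for images, labels in loader:
        all_features.append(model.encode_image(images))
        all_labels.append(labels)

features = torch.cat(all_features)  # [10, 512]
labels = torch.cat(all_labels)      # [10]
print(features.shape, labels.shape)
```

With the real CLIP model, only the model construction changes; the iteration and concatenation pattern is identical.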

Related Pages

Implements Principle

Requires Environment
