Implementation: OpenAI CLIP Class Label Template Preparation
| Knowledge Sources | |
|---|---|
| Domains | NLP, Vision, Zero_Shot_Learning |
| Last Updated | 2026-02-13 22:00 GMT |
Overview
Pattern documentation for defining class name lists and prompt template collections for CLIP prompt-engineered zero-shot classification.
Description
This is a Pattern Doc documenting the user-defined data structures required for prompt-engineered classification. The CLIP repository provides 80 ImageNet-specific prompt templates in the Prompt Engineering notebook (cell 10) and references 3,401 lines of templates for 26 benchmarks in data/prompts.md. Users must define:
- imagenet_classes: A list of 1000 curated class names, modified from standard ImageNet labels for disambiguation (e.g., "tench" stays as "tench", but "nail" becomes "metal nail").
- imagenet_templates: A list of 80 prompt template strings, each containing a '{}' placeholder that will be filled with the class name.
These are pure Python data definitions with no third-party dependencies (the `List[str]` annotations require only `typing` from the standard library).
Usage
Define these data structures before constructing zero-shot classifier weights. The class names should match the target dataset's label semantics, and templates should provide diverse contextual framing.
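The disambiguation described above can be applied as a small override map on top of the standard labels. The two renames shown are the ones from the notebook; the map itself is illustrative:

```python
# Renames applied on top of standard ImageNet labels
# (the "nail" and "kite" examples are from the notebook; the rest of the
# override map would be filled in per dataset)
overrides = {"nail": "metal nail", "kite": "kite (bird of prey)"}

standard_labels = ["tench", "nail", "kite"]
classnames = [overrides.get(name, name) for name in standard_labels]
# classnames == ["tench", "metal nail", "kite (bird of prey)"]
```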
Code Reference
Source Location
- Repository: OpenAI CLIP
- File: notebooks/Prompt_Engineering_for_ImageNet.ipynb (cells 8 and 10)
- Additional reference: data/prompts.md (3,401 lines of templates for 26 benchmarks)
Interface Specification
from typing import List  # only needed for the annotations below

# Class name list: List[str]
# Each entry is a disambiguated class name
imagenet_classes: List[str] = [
"tench",
"goldfish",
"great white shark",
# ... 1000 classes total
"metal nail", # disambiguated from "nail"
"kite (bird of prey)", # disambiguated from "kite"
# ...
]
# Template list: List[str]
# Each entry contains {} as a placeholder for the class name
imagenet_templates: List[str] = [
"a bad photo of a {}.",
"a photo of many {}.",
"a sculpture of a {}.",
"a photo of the hard to see {}.",
"a low resolution photo of the {}.",
"a rendering of a {}.",
"graffiti of a {}.",
"a bad photo of the {}.",
"a cropped photo of the {}.",
"a tattoo of the {}.",
"the embroidered {}.",
"a photo of a hard to see {}.",
"a bright photo of a {}.",
"a photo of a clean {}.",
"a photo of a dirty {}.",
"a dark photo of the {}.",
"a drawing of a {}.",
"a photo of my {}.",
"the plastic {}.",
"a photo of the cool {}.",
# ... 80 templates total
"a photo of a {}.",
"itap of a {}.", # "I took a picture of a"
]
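When adapting or authoring a template list, a quick sanity check catches malformed entries before they reach the classifier-weight step. A minimal sketch (the helper name is illustrative, not part of the CLIP repo):

```python
def validate_templates(templates):
    """Check each template has exactly one '{}' placeholder and formats cleanly."""
    for i, t in enumerate(templates):
        if t.count("{}") != 1:
            raise ValueError(f"template {i} must contain exactly one '{{}}': {t!r}")
        t.format("test")  # raises on stray unmatched braces

validate_templates(["a photo of a {}.", "itap of a {}."])  # passes silently
```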
Import
# No third-party imports required — these are plain Python lists
# (from typing import List only if you keep the annotations)
# Typically defined inline in a notebook or script
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| dataset_classes | source data | Yes | The class label vocabulary of the target dataset (e.g., ImageNet 1000 classes, CIFAR-100 classes) |
| template_collection | source data | No | Reference templates from data/prompts.md or custom-designed templates |
Outputs
| Name | Type | Description |
|---|---|---|
| classnames | List[str] | Curated, disambiguated class name strings (len = number of classes) |
| templates | List[str] | Prompt template strings with {} placeholder (len = number of templates, e.g. 80 for ImageNet) |
Usage Examples
Defining Custom Classes and Templates
# For a custom dataset with 5 animal classes
classnames = ["cat", "dog", "goldfish", "parrot", "hamster"]
# Simple templates
templates = [
"a photo of a {}.",
"a blurry photo of a {}.",
"a photo of the large {}.",
"a photo of the small {}.",
"a photo of a {} in the wild.",
]
# Generate all combinations
for classname in classnames:
texts = [template.format(classname) for template in templates]
# e.g., ["a photo of a cat.", "a blurry photo of a cat.", ...]
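In the CLIP notebook, each class's filled templates are then encoded and mean-pooled into a single classifier weight per class (prompt ensembling). A dependency-free sketch of the pooling step, using plain lists as stand-ins for real text embeddings (`mean_pool` is a stand-in, not the CLIP API, which would use `model.encode_text` on tokenized prompts):

```python
def mean_pool(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

# Stand-in "embeddings" for 3 templates of one class; real ones come from
# encoding the tokenized prompts with the CLIP text encoder
template_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class_weight = mean_pool(template_embeddings)
# class_weight ≈ [0.667, 0.667]
```

The notebook additionally L2-normalizes the pooled vector before stacking the per-class weights into the final classifier matrix.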
Using Prompts from data/prompts.md
# The CLIP repo provides templates for 26 benchmarks in data/prompts.md
# Format in prompts.md:
# ### DatasetName
# - "template with {}."
# - "another template with {}."
# For ImageNet, 80 templates are used (defined in notebook cell 10)
# For other datasets, see data/prompts.md for dataset-specific templates
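The bullet format shown above can be parsed with a few lines of standard-library Python. This sketch assumes the exact `### Name` / `- "template"` layout described in the comments, and parses an in-memory string for illustration (file handling omitted):

```python
import re

def parse_prompts(text):
    """Parse '### Dataset' sections with '- "template"' bullets into a dict."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("### "):
            current = line[4:].strip()
            sections[current] = []
        elif current is not None:
            m = re.match(r'-\s*"(.*)"', line.strip())
            if m:
                sections[current].append(m.group(1))
    return sections

sample = '''### CIFAR100
- "a photo of a {}."
- "a blurry photo of a {}."'''
parsed = parse_prompts(sample)
# parsed == {"CIFAR100": ["a photo of a {}.", "a blurry photo of a {}."]}
```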