Implementation: OpenAI CLIP LogisticRegression Wrapper
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Evaluation, Classification |
| Last Updated | 2026-02-13 22:00 GMT |
Overview
Wrapper documentation for using scikit-learn's LogisticRegression as a linear probe classifier on CLIP features.
Description
This wrapper documents how sklearn.linear_model.LogisticRegression is used in the CLIP linear probe evaluation workflow. The CLIP repository does not define its own classifier; instead, it relies on scikit-learn with specific hyperparameters demonstrated in the README (lines 141-191).
The key CLIP-specific configuration is:
- C=0.316 — Inverse regularization strength (approximately sqrt(0.1))
- max_iter=1000 — Sufficient iterations for convergence on high-dimensional CLIP features
- random_state=0 — Fixed seed for reproducibility
- verbose=1 — Show convergence progress
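The value C=0.316 (the geometric midpoint of 0.1 and 1.0 on a log scale) is the kind of value a log-spaced hyperparameter sweep over C would produce. The sketch below illustrates such a sweep; the synthetic features and the specific grid are illustrative assumptions, not part of the CLIP workflow.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic two-class features stand in for CLIP embeddings here.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))
y_train = (X_train[:, 0] > 0).astype(int)
X_val = rng.normal(size=(100, 16))
y_val = (X_val[:, 0] > 0).astype(int)

# Log-spaced sweep over the inverse regularization strength C.
best_C, best_acc = None, -1.0
for C in np.logspace(-3, 3, 7):  # 0.001 ... 1000
    clf = LogisticRegression(random_state=0, C=C, max_iter=1000)
    clf.fit(X_train, y_train)
    acc = float(np.mean(clf.predict(X_val) == y_val))
    if acc > best_acc:
        best_C, best_acc = C, acc

print(f"best C = {best_C}, validation accuracy = {best_acc:.3f}")
```

In practice the winning C from a coarse sweep like this would be refined on a finer grid around the best value.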
External Reference
Usage
Use this wrapper after extracting L2-normalized CLIP image features (as numpy arrays) from both train and test splits. Train on the training features, predict on test features, and compute accuracy.
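A minimal sketch of that train/predict/accuracy flow, using random stand-in arrays in place of real CLIP features (the shapes and L2 normalization mirror the contract above; the data itself is synthetic):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in arrays: in practice these come from model.encode_image,
# L2-normalized and moved to CPU as numpy arrays.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(300, 512)).astype(np.float32)
train_features /= np.linalg.norm(train_features, axis=1, keepdims=True)
train_labels = rng.integers(0, 10, size=300)
test_features = train_features[:50]   # a slice reused as a fake test split
test_labels = train_labels[:50]

classifier = LogisticRegression(random_state=0, C=0.316, max_iter=1000)
classifier.fit(train_features, train_labels)
predictions = classifier.predict(test_features)
accuracy = float(np.mean(predictions == test_labels))
```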
Code Reference
Source Location
- Repository: External (scikit-learn)
- Usage pattern: README.md (lines 141-191)
Signature
# CLIP's linear-probe configuration from the README — these are the values
# passed at call time, not scikit-learn's defaults (random_state=None,
# C=1.0, max_iter=100, verbose=0)
sklearn.linear_model.LogisticRegression(
    random_state=0,
    C=0.316,
    max_iter=1000,
    verbose=1
)
# Training
classifier.fit(
X: numpy.ndarray, # shape [N_train, embed_dim]
y: numpy.ndarray # shape [N_train]
) -> LogisticRegression
# Prediction
classifier.predict(
X: numpy.ndarray # shape [N_test, embed_dim]
) -> numpy.ndarray # shape [N_test]
Import
from sklearn.linear_model import LogisticRegression
import numpy as np
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X_train | numpy.ndarray | Yes | L2-normalized training features from CLIP, shape [N_train, embed_dim] |
| y_train | numpy.ndarray | Yes | Training labels, shape [N_train] |
| X_test | numpy.ndarray | Yes | L2-normalized test features from CLIP, shape [N_test, embed_dim] |
| C | float | No | Inverse regularization strength. CLIP uses 0.316. Default: 1.0 |
| max_iter | int | No | Maximum solver iterations. CLIP uses 1000. Default: 100 |
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | numpy.ndarray | Predicted class labels for the test set, shape [N_test] |
| accuracy | float | Classification accuracy: np.mean(predictions == y_test) |
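The accuracy formula in the table is the mean of elementwise prediction/label equality, which is equivalent to sklearn.metrics.accuracy_score. A small check on toy arrays (the values are illustrative only):

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Toy predictions and labels; 4 of 5 entries match.
predictions = np.array([1, 0, 2, 2, 1])
y_test = np.array([1, 0, 2, 1, 1])

accuracy = float(np.mean(predictions == y_test))  # 4/5 = 0.8
assert accuracy == accuracy_score(y_test, predictions)
```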
Usage Examples
Complete Linear Probe Evaluation
import os
import clip
import torch
import numpy as np
from sklearn.linear_model import LogisticRegression
from torchvision.datasets import CIFAR100
from torch.utils.data import DataLoader
# 1. Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
# 2. Prepare datasets
root = os.path.expanduser("~/.cache")
train = CIFAR100(root, download=True, train=True, transform=preprocess)
test = CIFAR100(root, download=True, train=False, transform=preprocess)
# 3. Extract features
def get_features(dataset):
loader = DataLoader(dataset, batch_size=100, num_workers=2)
all_features, all_labels = [], []
with torch.no_grad():
for images, labels in loader:
features = model.encode_image(images.to(device))
features /= features.norm(dim=-1, keepdim=True)
all_features.append(features.cpu().numpy())
all_labels.append(labels.numpy())
return np.concatenate(all_features), np.concatenate(all_labels)
train_features, train_labels = get_features(train)
test_features, test_labels = get_features(test)
# 4. Train classifier and evaluate
classifier = LogisticRegression(random_state=0, C=0.316, max_iter=1000, verbose=1)
classifier.fit(train_features, train_labels)
predictions = classifier.predict(test_features)
accuracy = np.mean(predictions == test_labels)
print(f"Accuracy = {accuracy:.4f}")
# Expected: ~0.7810 for ViT-B/32 on CIFAR-100
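Since feature extraction dominates the runtime, it can be worth persisting the fitted probe and reusing it. A hedged sketch using joblib (which ships alongside scikit-learn); the tiny synthetic probe and the file name are placeholders for the CIFAR-100 classifier trained above:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic probe standing in for the trained CIFAR-100 classifier.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = (X[:, 0] > 0).astype(int)
clf = LogisticRegression(random_state=0, C=0.316, max_iter=1000).fit(X, y)

# Persist the fitted probe and reload it; predictions should be identical.
path = os.path.join(tempfile.mkdtemp(), "clip_linear_probe.joblib")
joblib.dump(clf, path)
restored = joblib.load(path)
same = bool(np.array_equal(restored.predict(X), clf.predict(X)))
```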