Implementation: OpenAI CLIP LogisticRegression Wrapper
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Evaluation, Classification |
| Last Updated | 2026-02-13 22:00 GMT |
Overview
Wrapper documentation for using scikit-learn's LogisticRegression as a linear probe classifier on CLIP features.
Description
This wrapper documents how sklearn.linear_model.LogisticRegression is used in the CLIP linear probe evaluation workflow. The CLIP repository does not define its own classifier; instead, it relies on scikit-learn with specific hyperparameters demonstrated in the README (lines 141-191).
The key CLIP-specific configuration is:
- C=0.316 — Inverse regularization strength (approximately sqrt(0.1))
- max_iter=1000 — Sufficient iterations for convergence on high-dimensional CLIP features
- random_state=0 — Fixed seed for reproducibility
- verbose=1 — Show convergence progress
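The value C=0.316 (the geometric midpoint of 0.1 and 1.0 on a log scale) is the kind of value a log-spaced hyperparameter sweep over C would produce. The sketch below illustrates such a sweep; the synthetic features and the specific grid are illustrative assumptions, not part of the CLIP workflow.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic two-class features stand in for CLIP embeddings here.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 16))
y_train = (X_train[:, 0] > 0).astype(int)
X_val = rng.normal(size=(100, 16))
y_val = (X_val[:, 0] > 0).astype(int)

# Log-spaced sweep over the inverse regularization strength C.
best_C, best_acc = None, -1.0
for C in np.logspace(-3, 3, 7):  # 0.001 ... 1000
    clf = LogisticRegression(random_state=0, C=C, max_iter=1000)
    clf.fit(X_train, y_train)
    acc = float(np.mean(clf.predict(X_val) == y_val))
    if acc > best_acc:
        best_C, best_acc = C, acc

print(f"best C = {best_C}, validation accuracy = {best_acc:.3f}")
```

In practice the winning C from a coarse sweep like this would be refined on a finer grid around the best value.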
External Reference
Usage
Use this wrapper after extracting L2-normalized CLIP image features (as numpy arrays) from both train and test splits. Train on the training features, predict on test features, and compute accuracy.
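A minimal sketch of that train/predict/accuracy flow, using random stand-in arrays in place of real CLIP features (the shapes and L2 normalization mirror the contract above; the data itself is synthetic):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in arrays: in practice these come from model.encode_image,
# L2-normalized and moved to CPU as numpy arrays.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(300, 512)).astype(np.float32)
train_features /= np.linalg.norm(train_features, axis=1, keepdims=True)
train_labels = rng.integers(0, 10, size=300)
test_features = train_features[:50]   # a slice reused as a fake test split
test_labels = train_labels[:50]

classifier = LogisticRegression(random_state=0, C=0.316, max_iter=1000)
classifier.fit(train_features, train_labels)
predictions = classifier.predict(test_features)
accuracy = float(np.mean(predictions == test_labels))
```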
Code Reference
Source Location
- Repository: External (scikit-learn)
- Usage pattern: README.md (lines 141-191)
Signature
# CLIP's linear-probe configuration from the README — these are the values
# passed at call time, not scikit-learn's defaults (random_state=None,
# C=1.0, max_iter=100, verbose=0)
sklearn.linear_model.LogisticRegression(
    random_state=0,
    C=0.316,
    max_iter=1000,
    verbose=1
)
# Training
classifier.fit(
X: numpy.ndarray, # shape [N_train, embed_dim]
y: numpy.ndarray # shape [N_train]
) -> LogisticRegression
# Prediction
classifier.predict(
X: numpy.ndarray # shape [N_test, embed_dim]
) -> numpy.ndarray # shape [N_test]
Import
from sklearn.linear_model import LogisticRegression
import numpy as np
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| X_train | numpy.ndarray | Yes | L2-normalized training features from CLIP, shape [N_train, embed_dim] |
| y_train | numpy.ndarray | Yes | Training labels, shape [N_train] |
| X_test | numpy.ndarray | Yes | L2-normalized test features from CLIP, shape [N_test, embed_dim] |
| C | float | No | Inverse regularization strength. CLIP uses 0.316. Default: 1.0 |
| max_iter | int | No | Maximum solver iterations. CLIP uses 1000. Default: 100 |
Outputs
| Name | Type | Description |
|---|---|---|
| predictions | numpy.ndarray | Predicted class labels for the test set, shape [N_test] |
| accuracy | float | Classification accuracy: np.mean(predictions == y_test) |
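The accuracy formula in the table is the mean of elementwise prediction/label equality, which is equivalent to sklearn.metrics.accuracy_score. A small check on toy arrays (the values are illustrative only):

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Toy predictions and labels; 4 of 5 entries match.
predictions = np.array([1, 0, 2, 2, 1])
y_test = np.array([1, 0, 2, 1, 1])

accuracy = float(np.mean(predictions == y_test))  # 4/5 = 0.8
assert accuracy == accuracy_score(y_test, predictions)
```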
Usage Examples
Complete Linear Probe Evaluation
import os
import clip
import torch
import numpy as np
from sklearn.linear_model import LogisticRegression
from torchvision.datasets import CIFAR100
from torch.utils.data import DataLoader
# 1. Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
# 2. Prepare datasets
root = os.path.expanduser("~/.cache")
train = CIFAR100(root, download=True, train=True, transform=preprocess)
test = CIFAR100(root, download=True, train=False, transform=preprocess)
# 3. Extract features
def get_features(dataset):
loader = DataLoader(dataset, batch_size=100, num_workers=2)
all_features, all_labels = [], []
with torch.no_grad():
for images, labels in loader:
features = model.encode_image(images.to(device))
features /= features.norm(dim=-1, keepdim=True)
all_features.append(features.cpu().numpy())
all_labels.append(labels.numpy())
return np.concatenate(all_features), np.concatenate(all_labels)
train_features, train_labels = get_features(train)
test_features, test_labels = get_features(test)
# 4. Train classifier and evaluate
classifier = LogisticRegression(random_state=0, C=0.316, max_iter=1000, verbose=1)
classifier.fit(train_features, train_labels)
predictions = classifier.predict(test_features)
accuracy = np.mean(predictions == test_labels)
print(f"Accuracy = {accuracy:.4f}")
# Expected: ~0.7810 for ViT-B/32 on CIFAR-100
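Since feature extraction dominates the runtime, it can be worth persisting the fitted probe and reusing it. A hedged sketch using joblib (which ships alongside scikit-learn); the tiny synthetic probe and the file name are placeholders for the CIFAR-100 classifier trained above:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic probe standing in for the trained CIFAR-100 classifier.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = (X[:, 0] > 0).astype(int)
clf = LogisticRegression(random_state=0, C=0.316, max_iter=1000).fit(X, y)

# Persist the fitted probe and reload it; predictions should be identical.
path = os.path.join(tempfile.mkdtemp(), "clip_linear_probe.joblib")
joblib.dump(clf, path)
restored = joblib.load(path)
same = bool(np.array_equal(restored.predict(X), clf.predict(X)))
```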