Implementation: WAInjectBench OpenCLIP Init
| Knowledge Sources | |
|---|---|
| Domains | Computer_Vision, Representation_Learning |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool for loading the ViT-B-32 CLIP model with LAION-2B pre-trained weights for image embedding, provided by the open_clip library and used by the WAInjectBench image embedding trainer.
Description
The image embedding training script calls open_clip.create_model_and_transforms to load a ViT-B-32 model with laion2b_s34b_b79k pre-trained weights. The function returns a 3-tuple (model, preprocess_train, preprocess_val); the script discards the training transform, keeping the model, which produces 512-dimensional image embeddings, and preprocess, the inference-time image transform required by the model. The model is then moved to the target device (CUDA if available, otherwise CPU).
Usage
Initialize once before processing multiple JSONL training files. The model and preprocess transform are shared across all image embedding extraction calls.
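A minimal sketch of this initialize-once pattern, assuming each JSONL line is a JSON object with an "image" path field (the actual field names in WAInjectBench's training files may differ):

```python
import json

def iter_image_paths(jsonl_path):
    """Yield image paths from one JSONL training file.

    Assumes each non-empty line holds a JSON object with an
    "image" key; the real field name may differ in WAInjectBench.
    """
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)["image"]

# The model and preprocess transform are created once (see Code
# Reference) and shared across every file:
#
#   for shard in jsonl_files:
#       for path in iter_image_paths(shard):
#           ...  # preprocess + model.encode_image
```

Creating the model once avoids re-downloading and re-loading the ~600 MB checkpoint per file.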
Code Reference
Source Location
- Repository: WAInjectBench
- File: train/embedding-i.py (L79-83)
Signature
model, _, preprocess = open_clip.create_model_and_transforms(
"ViT-B-32",
pretrained="laion2b_s34b_b79k"
)
model = model.to(device)
Import
import open_clip
import torch
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | str | Yes | Model architecture (hardcoded: "ViT-B-32") |
| pretrained | str | Yes | Pre-trained weights identifier (hardcoded: "laion2b_s34b_b79k") |
| device | str | No | Target device (default "cuda", falls back to "cpu") |
Outputs
| Name | Type | Description |
|---|---|---|
| model | open_clip.CLIP | Loaded CLIP model producing 512-dim image embeddings |
| preprocess | torchvision.transforms.Compose | Image preprocessing pipeline for the model |
Usage Examples
Initializing the Image Embedder
import open_clip
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, _, preprocess = open_clip.create_model_and_transforms(
"ViT-B-32",
pretrained="laion2b_s34b_b79k"
)
model = model.to(device)
# Verify: model.visual.output_dim == 512
print(f"Embedding dim: {model.visual.output_dim}")
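Once initialized, embeddings are extracted with the model's standard encode_image call. A minimal sketch; the L2 normalization step is a common CLIP convention, not necessarily what the WAInjectBench script itself does:

```python
import torch

def embed_image(model, preprocess, image, device):
    """Return a unit-normalized image embedding ([1, 512] for ViT-B-32)."""
    tensor = preprocess(image).unsqueeze(0).to(device)  # add batch dimension
    with torch.no_grad():  # inference only, no gradients needed
        features = model.encode_image(tensor)
    # L2-normalize so dot products equal cosine similarity
    return features / features.norm(dim=-1, keepdim=True)
```

The image argument is a PIL image; preprocess resizes, crops, and normalizes it into the tensor layout the ViT expects.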