Principle:Norrrrrrr lyn WAInjectBench Image Feature Extraction

From Leeroopedia
Knowledge Sources
Domains: Computer_Vision, Feature_Engineering
Last Updated: 2026-02-14 16:00 GMT

Overview

A per-image encoding step that transforms raw image files into L2-normalized CLIP embedding vectors suitable for classifier training.

Description

Image Feature Extraction loads each image from disk, applies the CLIP preprocessing transform, passes the result through the CLIP visual encoder, and L2-normalizes the resulting embedding. Unlike batch text encoding, image encoding is performed one at a time because source images vary in size and individual files can fail to load. A failed image is replaced with a zero vector so that the embedding rows stay aligned with the label array.
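The zero-vector fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `extract_features`, `l2_normalize`, and `fake_encode` are hypothetical names, and the encoder is passed in as a plain callable so the example runs without CLIP installed. In a real pipeline `encode_fn` would wrap the preprocess-and-encode step.

```python
import math
from typing import Callable, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0] * len(vec)
    return [x / norm for x in vec]


def extract_features(
    paths: Sequence[str],
    encode_fn: Callable[[str], Sequence[float]],
    dim: int,
) -> Tuple[List[List[float]], List[int]]:
    """Encode images one at a time, keeping one output row per input path.

    A path whose encoding raises is given a zero vector of length `dim`,
    so row i of the result always corresponds to label i.
    """
    features: List[List[float]] = []
    failed: List[int] = []
    for i, path in enumerate(paths):
        try:
            features.append(l2_normalize(encode_fn(path)))
        except Exception:
            # Hypothetical policy: any failure (I/O, decode) becomes a zero row.
            features.append([0.0] * dim)
            failed.append(i)
    return features, failed


def fake_encode(path: str) -> List[float]:
    """Stand-in for the CLIP encoder (assumption, for demonstration only)."""
    if path.endswith(".bad"):
        raise OSError("cannot read image")
    return [3.0, 4.0]


features, failed = extract_features(["a.png", "b.bad", "c.png"], fake_encode, dim=2)
```

Returning the indices of failed images alongside the features is an optional design choice: it lets a caller decide later whether to keep the zero rows or drop them together with their labels.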

Usage

Use this after loading training data and initializing the CLIP model. It bridges the data loading step and the classifier training step in the image embedding pipeline.

Theoretical Basis

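# Per-image CLIP encoding with normalization
# Assumes `model`, `preprocess`, and `path` are already defined.
import torch
from PIL import Image

image = preprocess(Image.open(path)).unsqueeze(0)  # add a batch dimension
with torch.no_grad():  # inference only, no gradients needed
    emb = model.encode_image(image)
    emb = emb / emb.norm(dim=-1, keepdim=True)  # L2 normalize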

L2 normalization places every embedding on the unit hypersphere, so the cosine similarity of two embeddings equals their dot product. It also gives the downstream classifier features of uniform magnitude, which generally helps training.
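The equivalence between cosine similarity and the dot product can be checked numerically. The sketch below uses plain Python lists and hypothetical helper names (`dot`, `norm`, `cosine`, `l2_normalize`) rather than CLIP tensors. The two example vectors are arbitrary.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def norm(a: Sequence[float]) -> float:
    return math.sqrt(dot(a, a))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    return dot(a, b) / (norm(a) * norm(b))


def l2_normalize(a: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    n = norm(a)
    return [x / n for x in a]


a = [3.0, 4.0]
b = [1.0, 2.0]
a_hat, b_hat = l2_normalize(a), l2_normalize(b)

cos_raw = cosine(a, b)       # cosine similarity of the original vectors
dot_unit = dot(a_hat, b_hat)  # plain dot product after normalization

# After normalization both vectors have unit length and the two values agree.
same = math.isclose(cos_raw, dot_unit)
```

Note that a zero vector, such as the placeholder used for a failed image, has no direction. Its dot product with any embedding is 0, and it cannot be normalized.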

Related Pages

Implemented By

Uses Heuristic
