Principle:LaurentMazare Tch rs Custom Dataset Loading
| Knowledge Sources | |
|---|---|
| Domains | Computer_Vision, Data_Loading |
| Last Updated | 2026-02-08 14:00 GMT |
Overview
Mechanism for loading custom image classification datasets organized in a directory structure with class-named subdirectories.
Description
Custom dataset loading reads images from a directory hierarchy in which each class has its own subdirectory under the train/ and val/ folders. Every image is loaded, resized to 224x224, and normalized with the ImageNet per-channel mean and standard deviation. Integer class labels are assigned automatically from the directory names in sorted (alphabetical) order. Because this preprocessing matches what pretrained ImageNet models expect, any custom image classification dataset laid out this way can be used for transfer learning.
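To make the normalization step concrete, here is a minimal sketch in plain Rust (no tch dependency). It assumes the commonly used ImageNet RGB statistics, mean (0.485, 0.456, 0.406) and standard deviation (0.229, 0.224, 0.225), and 8-bit input pixels. The function name `normalize_pixel` is hypothetical and only illustrates the arithmetic; it is not part of the library.

```rust
/// Per-channel ImageNet mean and standard deviation in RGB order.
/// Assumption: these are the conventional values for ImageNet-pretrained models.
const IMAGENET_MEAN: [f32; 3] = [0.485, 0.456, 0.406];
const IMAGENET_STD: [f32; 3] = [0.229, 0.224, 0.225];

/// Hypothetical helper: normalizes one 8-bit RGB pixel.
/// Each channel is scaled to [0, 1], then the channel mean is subtracted
/// and the result is divided by the channel standard deviation.
fn normalize_pixel(rgb: [u8; 3]) -> [f32; 3] {
    let mut out = [0.0f32; 3];
    for c in 0..3 {
        let scaled = rgb[c] as f32 / 255.0;
        out[c] = (scaled - IMAGENET_MEAN[c]) / IMAGENET_STD[c];
    }
    out
}
```

In a real loader the same operation is applied to whole image tensors at once rather than pixel by pixel.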
Usage
Use when fine-tuning pretrained models on custom datasets. Organize your images in the expected directory structure: root/train/class_name/image.jpg and root/val/class_name/image.jpg.
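Before loading, it can help to confirm that the directory layout matches what is described here. The sketch below uses only the Rust standard library and assumes that train/ and val/ should contain the same set of class subdirectories. Both `class_dirs` and `check_layout` are hypothetical helpers written for illustration, not library functions.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: names of the immediate subdirectories of `dir`.
/// A BTreeSet keeps them in sorted order.
fn class_dirs(dir: &Path) -> io::Result<BTreeSet<String>> {
    let mut names = BTreeSet::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            names.insert(entry.file_name().to_string_lossy().into_owned());
        }
    }
    Ok(names)
}

/// Hypothetical helper: returns true when `root/train` and `root/val`
/// both exist and hold the same non-empty set of class subdirectories.
/// A missing train/ or val/ folder surfaces as an io::Error.
fn check_layout(root: &Path) -> io::Result<bool> {
    let train = class_dirs(&root.join("train"))?;
    let val = class_dirs(&root.join("val"))?;
    Ok(!train.is_empty() && train == val)
}
```

Calling `check_layout(Path::new("dataset_root"))` before loading gives an early signal when a class folder is missing from one of the splits.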
Theoretical Basis
Expected Directory Structure:

    dataset_root/
        train/
            class_a/
                img001.jpg
                img002.jpg
            class_b/
                img003.jpg
        val/
            class_a/
                img004.jpg
            class_b/
                img005.jpg

Loading Process:
1. Enumerate class directories under val/
2. Sort the class names and assign integer labels 0, 1, 2, ... in that order
3. Load every image under train/ and val/, resize to 224x224, apply ImageNet normalization
4. Stack into tensors: train_images [N, 3, 224, 224], train_labels [N]; the val/ images and labels are stacked the same way
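Steps 1 and 2 above can be sketched with the Rust standard library alone. The code below lists class directories, sorts them so that a class's index is its label, and pairs each image path with that label. Image decoding, resizing, normalization and tensor stacking are deliberately left out. The function names are hypothetical, and the use of `i64` for labels is an assumption made for this sketch.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Steps 1-2: list the class directories under `val_dir` and sort them.
/// A class's position in the returned vector is its integer label.
fn sorted_classes(val_dir: &Path) -> io::Result<Vec<String>> {
    let mut classes = Vec::new();
    for entry in fs::read_dir(val_dir)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            classes.push(entry.file_name().to_string_lossy().into_owned());
        }
    }
    classes.sort();
    Ok(classes)
}

/// Pairs every entry under `split_dir/<class>` with that class's label.
/// Entries are sorted per class so the output order is deterministic.
fn labeled_files(split_dir: &Path, classes: &[String]) -> io::Result<Vec<(PathBuf, i64)>> {
    let mut items = Vec::new();
    for (label, class) in classes.iter().enumerate() {
        let mut files = fs::read_dir(split_dir.join(class))?
            .map(|entry| entry.map(|e| e.path()))
            .collect::<io::Result<Vec<PathBuf>>>()?;
        files.sort();
        for file in files {
            items.push((file, label as i64));
        }
    }
    Ok(items)
}
```

With the example tree above, `sorted_classes` on val/ yields class_a then class_b, so every image under class_a/ receives label 0 and every image under class_b/ receives label 1.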