Implementation:PeterL1n BackgroundMattingV2 ImagesDataset

From Leeroopedia


Knowledge Sources
Domains Data_Loading, Computer_Vision
Last Updated 2026-02-09 00:00 GMT

Overview

Concrete tool, provided by dataset/images.py, for loading images from a directory tree.

Description

ImagesDataset is a PyTorch Dataset that uses glob to recursively discover all .jpg and .png files under a root directory, sorts the paths alphabetically for deterministic ordering, and loads individual images on demand. Each image is opened with PIL and converted to the specified color mode (default 'RGB'); if transforms is given, it is then applied to the converted image.
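The discovery step described above can be sketched with the standard library alone. This is a minimal illustration, not the code in dataset/images.py; the helper name `discover_images` is hypothetical.

```python
import glob
import os
from typing import List


def discover_images(root: str) -> List[str]:
    """Return the sorted paths of every .jpg and .png file under root.

    Hypothetical helper that illustrates the described behaviour.
    """
    paths: List[str] = []
    for pattern in ("*.jpg", "*.png"):
        # With recursive=True, '**' matches root itself and any depth of subdirectories.
        paths.extend(glob.glob(os.path.join(root, "**", pattern), recursive=True))
    # Sorting the path strings keeps the index-to-file mapping stable across runs.
    return sorted(paths)
```

Two properties of glob are worth keeping in mind with this approach: matching is case-sensitive on case-sensitive filesystems, so a file named photo.JPG would not match `*.jpg`, and files whose names begin with a dot are skipped by default.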

Usage

Use ImagesDataset to load foreground, alpha, or background image directories in training and inference pipelines. Create one instance per directory and combine the instances with ZipDataset for paired access, for example a foreground image together with its alpha matte.
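The idea behind paired access can be sketched as follows. `PairedDataset` is a hypothetical stand-in written for this page, not the repository's ZipDataset, whose constructor and handling of unequal lengths may differ. Plain lists of filenames stand in for two ImagesDataset instances so that the sketch runs without any image files.

```python
from typing import Any, Sequence, Tuple


class PairedDataset:
    """Hypothetical zip-style wrapper: item i is a tuple of item i from each dataset."""

    def __init__(self, *datasets: Sequence[Any]):
        if not datasets:
            raise ValueError("at least one dataset is required")
        lengths = {len(d) for d in datasets}
        if len(lengths) != 1:
            # Unequal lengths usually mean the directories are not aligned.
            raise ValueError(f"datasets differ in length: {sorted(lengths)}")
        self.datasets = datasets

    def __len__(self) -> int:
        return len(self.datasets[0])

    def __getitem__(self, idx: int) -> Tuple[Any, ...]:
        return tuple(d[idx] for d in self.datasets)


# Lists of filenames stand in for a foreground and an alpha ImagesDataset.
fgr = ["0000.jpg", "0001.jpg"]
pha = ["0000.jpg", "0001.jpg"]
pairs = PairedDataset(fgr, pha)
```

Pairing by index only works if both directories sort into the same order, which is why matching file names across the foreground and alpha directories matter.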

Code Reference

Source Location

Signature

from typing import Any, Callable, Optional

from torch.utils.data import Dataset


class ImagesDataset(Dataset):
    def __init__(
        self,
        root: str,
        mode: str = 'RGB',
        transforms: Optional[Callable] = None
    ):
        """
        Args:
            root: Root directory path containing images
            mode: PIL image mode ('RGB', 'L', etc.)
            transforms: Optional transform to apply to each image
        """

    def __len__(self) -> int: ...
    # Returns a PIL.Image.Image, or whatever `transforms` returns (e.g. a Tensor).
    def __getitem__(self, idx: int) -> Any: ...

Import

from dataset import ImagesDataset

I/O Contract

Inputs

Name | Type | Required | Description
root | str | Yes | Directory path containing .jpg/.png images (supports nested directories)
mode | str | No | PIL color mode (default 'RGB')
transforms | callable | No | Transform applied to each loaded image

Outputs

Name | Type | Description
__getitem__ | PIL.Image or Tensor | Single image (or transformed result)
__len__ | int | Total number of discovered images
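The return type of `__getitem__` depends on whether a transform is supplied. The sketch below illustrates the open, convert, transform flow described on this page; `load_image` is a hypothetical helper, not a function from the repository, and it assumes Pillow is installed.

```python
from typing import Any, Callable, Optional

from PIL import Image


def load_image(
    path: str,
    mode: str = "RGB",
    transforms: Optional[Callable[[Image.Image], Any]] = None,
) -> Any:
    """Hypothetical helper: open an image, convert its mode, then apply transforms."""
    with Image.open(path) as img:
        # convert() returns a new in-memory image, so it stays valid after the file closes.
        img = img.convert(mode)
    if transforms is not None:
        # The result type is whatever the transform returns (e.g. a tensor).
        img = transforms(img)
    return img
```

Without a transform the caller receives a PIL image in the requested mode; with a transform such as torchvision's `ToTensor()` the caller receives that transform's output instead.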

Usage Examples

Basic Usage

from dataset import ImagesDataset
from torchvision import transforms as T

# Load RGB foreground images with tensor conversion
fgr_dataset = ImagesDataset('/data/videomatte240k/train/fgr', mode='RGB', transforms=T.ToTensor())

# Load grayscale alpha images
pha_dataset = ImagesDataset('/data/videomatte240k/train/pha', mode='L', transforms=T.ToTensor())

print(len(fgr_dataset))  # Number of images found
img = fgr_dataset[0]     # First image as tensor

Related Pages

Implements Principle
