Implementation:PeterL1n BackgroundMattingV2 ImagesDataset
| Knowledge Sources | |
|---|---|
| Domains | Data_Loading, Computer_Vision |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A concrete tool, provided by dataset/images.py, for loading images from a directory tree.
Description
ImagesDataset is a PyTorch Dataset that recursively discovers all .jpg and .png files under a root directory using glob, sorts them alphabetically for deterministic ordering, and returns individual images on demand. Each image is opened using PIL and converted to the specified color mode (default RGB).
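The discovery step described above can be sketched with the standard library alone. This is a minimal illustration, not the repository's code: the helper name `discover_images` is hypothetical, and the glob patterns are assumed to be exactly `*.jpg` and `*.png`.

```python
import glob
import os
from typing import List


def discover_images(root: str) -> List[str]:
    """Return every .jpg and .png path under root, sorted lexicographically.

    Hypothetical helper for illustration; not part of the repository.
    """
    paths: List[str] = []
    for pattern in ("*.jpg", "*.png"):
        # With recursive=True, '**' matches zero or more directory levels,
        # so files directly in root and in nested folders are both found.
        paths.extend(glob.glob(os.path.join(root, "**", pattern), recursive=True))
    # Sorting the full paths makes the index-to-file mapping reproducible.
    return sorted(paths)
```

Because the sort runs on full path strings, files are ordered by directory first and then by filename, so zero-padded frame names such as 0000.jpg, 0001.jpg keep their numeric order.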
Usage
Use for loading foreground, alpha, or background image directories in training and inference pipelines. Create one instance per directory and combine with ZipDataset for paired access.
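Paired access depends on each directory producing the same sorted order, so that index i refers to matching files in every dataset. The sketch below illustrates that idea with a hypothetical `PairedDataset` class and plain lists standing in for ImagesDataset instances. It is not the repository's ZipDataset, whose length and transform handling may differ.

```python
from typing import Any, Sequence, Tuple


class PairedDataset:
    """Hypothetical stand-in for a zip-style dataset (not the repo's ZipDataset).

    Index i yields a tuple holding item i of every source dataset.
    """

    def __init__(self, datasets: Sequence[Sequence[Any]]):
        if not datasets:
            raise ValueError("at least one dataset is required")
        self.datasets = list(datasets)

    def __len__(self) -> int:
        # Assumption of this sketch: stop at the shortest source so that
        # every index is valid for all of them.
        return min(len(d) for d in self.datasets)

    def __getitem__(self, idx: int) -> Tuple[Any, ...]:
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        return tuple(d[idx] for d in self.datasets)


# Plain lists stand in for two ImagesDataset instances with aligned file lists.
fgr_files = ["0000.jpg", "0001.jpg"]
pha_files = ["0000.jpg", "0001.jpg"]
pairs = PairedDataset([fgr_files, pha_files])
```

If the foreground and alpha directories contain different numbers of files or use different naming, the sorted orders will not line up and the pairs will be mismatched, so it is worth checking that the lengths agree before training.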
Code Reference
Source Location
- Repository: BackgroundMattingV2
- File: dataset/images.py
- Lines: 6-23
Signature
class ImagesDataset(Dataset):
    def __init__(
        self,
        root: str,
        mode: str = 'RGB',
        transforms: Optional[Callable] = None
    ):
        """
        Args:
            root: Root directory path containing images
            mode: PIL image mode ('RGB', 'L', etc.)
            transforms: Optional transform to apply to each image
        """
    def __len__(self) -> int: ...
    def __getitem__(self, idx: int) -> PIL.Image: ...
Import
from dataset import ImagesDataset
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| root | str | Yes | Directory path containing .jpg/.png images (supports nested directories) |
| mode | str | No | PIL color mode (default 'RGB') |
| transforms | callable | No | Transform applied to each loaded image |
Outputs
| Name | Type | Description |
|---|---|---|
| __getitem__ | PIL.Image.Image or Tensor | Single image converted to mode, or the output of transforms when one is supplied |
| __len__ | int | Total number of discovered images |
Usage Examples
Basic Usage
from dataset import ImagesDataset
from torchvision import transforms as T
# Load RGB foreground images with tensor conversion
fgr_dataset = ImagesDataset('/data/videomatte240k/train/fgr', mode='RGB', transforms=T.ToTensor())
# Load grayscale alpha images
pha_dataset = ImagesDataset('/data/videomatte240k/train/pha', mode='L', transforms=T.ToTensor())
print(len(fgr_dataset)) # Number of images found
img = fgr_dataset[0] # First image as tensor