Implementation:Fastai Fastbook DataBlock Dataloaders

From Leeroopedia


Knowledge Sources
Domains Computer_Vision, Data_Engineering
Last Updated 2026-02-09 17:00 GMT

Overview

A concrete tool, provided by fastai.data.block.DataBlock.dataloaders, for materializing a DataBlock blueprint into training and validation data streams.

Description

The dataloaders method on a DataBlock object takes a data source path and returns a DataLoaders instance containing a .train and .valid DataLoader. Each DataLoader yields mini-batches of tensors ready for model consumption. The companion methods show_batch and summary enable visual inspection and diagnostic tracing of the data pipeline.
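To make "yields mini-batches" concrete, here is a minimal pure-Python sketch of two loaders that each yield (xb, yb) pairs. It is an illustration only, not fastai's implementation: the helper name iter_batches, the toy file names, and the integer labels are all assumptions, and plain lists stand in for tensors.

```python
from typing import Iterator, List, Sequence, Tuple

# (input, label) pair; a stand-in for an (image, category) item.
Item = Tuple[str, int]


def iter_batches(items: Sequence[Item], bs: int) -> Iterator[Tuple[List[str], List[int]]]:
    """Yield (xb, yb) pairs holding at most `bs` items each (hypothetical helper)."""
    for start in range(0, len(items), bs):
        chunk = items[start:start + bs]
        xb = [x for x, _ in chunk]  # inputs of this mini-batch
        yb = [y for _, y in chunk]  # labels of this mini-batch
        yield xb, yb


# Toy data: ten items, the first eight for training, the last two for validation.
items = [(f"img_{i}.jpg", i % 3) for i in range(10)]
train_items, valid_items = items[:8], items[8:]

loaders = {
    "train": list(iter_batches(train_items, bs=4)),
    "valid": list(iter_batches(valid_items, bs=4)),
}

xb, yb = loaders["train"][0]
print(len(xb), len(yb))
```

A real DataLoader additionally applies item and batch transforms, collates items into tensors, and shuffles the training set; the sketch keeps only the batching step.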

Usage

Call .dataloaders(path) immediately after defining a DataBlock. Use show_batch to visually verify correctness before training. Use summary when the pipeline throws an error to identify which step failed.
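One way to combine the two calls described above is a small wrapper that attempts dataloaders and falls back to summary on failure. The sketch below is a hypothetical helper, not part of fastai: build_or_diagnose and StubBlock are invented names, and the stub exists only so the example runs without fastai installed. With a real DataBlock, the same function would be called as build_or_diagnose(bears, path, bs=32).

```python
def build_or_diagnose(block, source, **kwargs):
    """Try block.dataloaders(source); on failure, run block.summary(source) and re-raise.

    `block` is expected to expose `dataloaders` and `summary` the way a
    fastai DataBlock does. This helper is hypothetical, not a fastai function.
    """
    try:
        return block.dataloaders(source, **kwargs)
    except Exception:
        # summary replays the pipeline step by step and prints a trace;
        # it may itself raise once it reaches the failing step.
        block.summary(source)
        raise


class StubBlock:
    """Stand-in used only to exercise the helper without fastai installed."""

    def __init__(self):
        self.summary_calls = []

    def dataloaders(self, source, **kwargs):
        raise ValueError("simulated pipeline failure")

    def summary(self, source):
        self.summary_calls.append(source)


stub = StubBlock()
try:
    build_or_diagnose(stub, "bears", bs=32)
except ValueError as err:
    caught = str(err)

print(caught, stub.summary_calls)
```

The original exception is re-raised so that the failure is never silently swallowed; the summary trace is only extra context.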

Code Reference

Source Location

  • Repository: fastbook
  • File: translations/cn/02_production.md (lines 323-396)

Signature

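# Signatures as found in fastai v2; verify against your installed version.

# Materialize blueprint into DataLoaders
# (extra keyword arguments such as bs=64 are forwarded to the underlying loaders)
DataBlock.dataloaders(source, path='.', verbose=False, **kwargs)

# Visual inspection of a batch (DataLoaders delegates to its training DataLoader;
# nrows, ncols and figsize are accepted through **kwargs)
DataLoaders.show_batch(b=None, max_n=9, ctxs=None, show=True, unique=False, **kwargs)

# Diagnostic trace for debugging (note the smaller default batch size)
DataBlock.summary(source, bs=4, show_batch=False, **kwargs)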

Import

from fastai.vision.all import (
    DataBlock, ImageBlock, CategoryBlock,
    get_image_files, RandomSplitter, parent_label,
    Resize, aug_transforms
)

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| source | Path or str | Yes | Path to the root directory containing the image data |
| path | Path or str | No | Base path for relative paths (default: '.') |
| verbose | bool | No | If True, print pipeline details during creation (default: False) |
| bs | int | No | Batch size for both training and validation loaders (default: 64) |
| max_n | int | No | Maximum number of items to display; an argument of show_batch, not of dataloaders (default: 9) |

Outputs

| Name | Type | Description |
|---|---|---|
| dls | DataLoaders | Object with .train and .valid DataLoader attributes yielding (xb, yb) tensor batches |
| show_batch output | matplotlib figure | Grid of images with their labels for visual inspection |
| summary output | printed text | Step-by-step diagnostic trace of the pipeline for a single item |

Usage Examples

Basic Usage: Create DataLoaders and Inspect

from fastai.vision.all import *
from pathlib import Path

path = Path('bears')

bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128)
)

# Materialize the blueprint into DataLoaders
dls = bears.dataloaders(path, bs=32)

# Visually inspect a batch of training images with labels
dls.show_batch(max_n=8, nrows=2, ncols=4)

# Check dataset sizes
print(f'Training items: {len(dls.train.dataset)}')
print(f'Validation items: {len(dls.valid.dataset)}')
print(f'Categories: {dls.vocab}')
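The splitting and labelling steps in the blueprint above can be sketched in plain Python. This is an approximation under stated assumptions, not fastai's code: random_split and label_from_parent are hypothetical names, the folder names grizzly, black and teddy are invented for the example, and Python's random module will not reproduce the exact permutation that RandomSplitter produces for the same seed.

```python
import random
from pathlib import PurePosixPath


def random_split(n_items, valid_pct=0.2, seed=42):
    """Return (train_idxs, valid_idxs) from a seeded shuffle (hypothetical helper)."""
    idxs = list(range(n_items))
    random.Random(seed).shuffle(idxs)  # seeded, so the split is reproducible
    cut = int(valid_pct * n_items)     # number of validation items
    return idxs[cut:], idxs[:cut]


def label_from_parent(path):
    """Use the name of the containing folder as the label (hypothetical helper)."""
    return PurePosixPath(path).parent.name


# Invented layout: bears/<category>/<file>.jpg
files = [f"bears/{kind}/{i:03d}.jpg"
         for kind in ("grizzly", "black", "teddy")
         for i in range(5)]

train_idx, valid_idx = random_split(len(files))
labels = sorted({label_from_parent(f) for f in files})
print(len(train_idx), len(valid_idx), labels)
```

Fixing the seed matters because a validation set that changes between runs makes metrics from different runs hard to compare.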

Debugging with summary

from fastai.vision.all import *

path = untar_data(URLs.PETS) / 'images'

pets = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=using_attr(RegexLabeller(r'^(.+)_\d+\.jpg$'), 'name'),
    item_tfms=Resize(460),
    batch_tfms=aug_transforms(size=224, min_scale=0.75)
)

# If dataloaders() fails, use summary to diagnose:
pets.summary(path)
# Prints step-by-step trace:
#   Setting-up type transforms pipelines
#   Collecting items from ...
#   Found 7390 items
#   2 datasets of sizes 5912,1478
#   Setting up Pipeline: ...
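A common reason for the pipeline to fail at the labelling step is a file name that the regular expression does not match. The sketch below checks the pattern in isolation using only the standard re module; label_from_name is a hypothetical helper and the file names are illustrative. It uses an escaped dot (\.jpg) so that the pattern matches only a literal ".jpg" extension.

```python
import re

# Capture everything before the final "_<digits>.jpg" as the label.
PATTERN = re.compile(r'^(.+)_\d+\.jpg$')


def label_from_name(fname):
    """Return the label encoded in a file name, or raise if it does not match."""
    match = PATTERN.search(fname)
    if match is None:
        raise ValueError(f"file name does not match the label pattern: {fname!r}")
    return match.group(1)


print(label_from_name("great_pyrenees_173.jpg"))
print(label_from_name("Abyssinian_1.jpg"))

# An unescaped dot would also accept names such as "cat_12xjpg".
loose = re.compile(r'^(.+)_\d+.jpg$')
print(bool(loose.search("cat_12xjpg")), bool(PATTERN.search("cat_12xjpg")))
```

Running such a check over a handful of file names before building the DataBlock narrows down labelling problems without loading any images.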

Accessing Individual DataLoaders

dls = pets.dataloaders(path, bs=64)

# Access training and validation loaders separately
train_dl = dls.train
valid_dl = dls.valid

# Iterate over one batch
xb, yb = first(train_dl)
print(f'Batch shape: {xb.shape}')   # torch.Size([64, 3, 224, 224])
print(f'Labels shape: {yb.shape}')  # torch.Size([64])
print(f'Label values: {yb[:5]}')

# Show a validation batch (no augmentation applied)
dls.valid.show_batch(max_n=4)
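The dataset sizes reported by summary (5912 training and 1478 validation items) determine how many batches each loader yields per epoch. The arithmetic below is a sketch that assumes the training loader drops its last incomplete batch while the validation loader keeps it; that is believed to be fastai's default behaviour but should be checked against the installed version. n_batches is a hypothetical helper.

```python
import math


def n_batches(n_items, bs, drop_last):
    """Number of mini-batches per epoch (hypothetical helper)."""
    if drop_last:
        return n_items // bs         # incomplete final batch is discarded
    return math.ceil(n_items / bs)   # incomplete final batch is kept


n_train, n_valid = 5912, 1478  # sizes taken from the summary trace

# Assumption: drop_last=True for training, drop_last=False for validation.
train_batches = n_batches(n_train, bs=64, drop_last=True)
valid_batches = n_batches(n_valid, bs=64, drop_last=False)
print(train_batches, valid_batches)
```

Under these assumptions, every training batch has the full shape [64, 3, 224, 224], while the last validation batch holds only the remaining 1478 - 23 * 64 = 6 items.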

