Implementation:Fastai Fastbook Tensor Data Pipeline
| Knowledge Sources | |
|---|---|
| Domains | Deep Learning, Data Representation, Computer Vision |
| Last Updated | 2026-02-09 17:00 GMT |
Overview
Concrete pattern for converting raw image files into normalized rank-3 tensors suitable for neural network training, using PIL, NumPy, and PyTorch.
Description
This pipeline loads MNIST digit images from disk, converts each into a PyTorch tensor, stacks all tensors for a given class into a single rank-3 tensor, converts to float, normalizes to [0, 1], and computes the per-class mean image. The result is a pair of rank-3 tensors (one per digit class) ready for downstream classification, plus mean "ideal" images that visualize each class.
Usage
Use this pattern at the start of any image classification project when working with individual image files that need to be assembled into batched tensors. It is especially useful for small-to-medium datasets like MNIST where all images fit in memory.
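As a rough check on the "fits in memory" claim, the sketch below estimates the footprint of one stacked class tensor. The image count of 6131 is the digit-3 count shown in the Basic Usage output; the helper name `tensor_mib` is hypothetical and the tensor is a zero-filled stand-in, not real data.

```python
import torch

def tensor_mib(t: torch.Tensor) -> float:
    """Size of a tensor's elements in MiB (hypothetical helper)."""
    return t.nelement() * t.element_size() / 2**20

# Stand-in for stacked_threes: same shape and dtype, zero-filled
stand_in = torch.zeros(6131, 28, 28, dtype=torch.float32)
size_mib = tensor_mib(stand_in)
print(f"{size_mib:.1f} MiB")  # 18.3 MiB
```

At roughly 18 MiB per class in float32, holding both classes in memory at once is comfortable on ordinary hardware.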
Code Reference
Source Location
- Repository: fastbook
- File: 04_mnist_basics.ipynb (Chapter 4), lines corresponding to the "Pixel Similarity" section
Signature
The pipeline follows this sequence of function calls:
# Step 1: Load a single image
img = Image.open(path)
# Step 2: Convert to tensor
img_tensor = tensor(img) # shape: (28, 28), dtype: uint8
# Step 3: Build list of tensors per class
three_tensors = [tensor(Image.open(o)) for o in threes]
seven_tensors = [tensor(Image.open(o)) for o in sevens]
# Step 4: Stack into rank-3 tensor, convert to float, normalize
stacked_threes = torch.stack(three_tensors).float() / 255
# shape: (N_3, 28, 28), dtype: float32, range: [0, 1]
stacked_sevens = torch.stack(seven_tensors).float() / 255
# shape: (N_7, 28, 28), dtype: float32, range: [0, 1]
# Step 5: Compute mean image per class
mean3 = stacked_threes.mean(0) # shape: (28, 28)
mean7 = stacked_sevens.mean(0) # shape: (28, 28)
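To make Step 5 concrete, here is a minimal worked example of what `mean(0)` does on a rank-3 tensor. The two 2x2 "images" are made-up values chosen so the result is easy to check by hand.

```python
import torch

# Two tiny 2x2 "images" stacked into a rank-3 tensor of shape (2, 2, 2)
stack = torch.tensor([[[0., 2.], [4., 6.]],
                      [[2., 4.], [6., 8.]]])

# mean(0) averages across the image axis and keeps the pixel grid
mean_image = stack.mean(0)
print(mean_image.shape)     # torch.Size([2, 2])
print(mean_image.tolist())  # [[1.0, 3.0], [5.0, 7.0]]
```

Each output pixel is the average of the pixels at the same position across all images, which is why the result has the shape of a single image.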
Import
from PIL import Image
from numpy import array
import torch
# fastai's tensor() converts PIL images via NumPy; torch.tensor() is not used here
from fastai.vision.all import show_image, tensor
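Outside fastai, the same image-to-tensor conversion can be sketched with NumPy as the bridge between PIL and PyTorch. This is an illustration on a blank in-memory image (an assumption standing in for an MNIST file), not the fastai implementation of `tensor`.

```python
import numpy as np
import torch
from PIL import Image

# Blank 28x28 8-bit grayscale image standing in for an MNIST file
img = Image.new("L", (28, 28), color=128)

# PIL -> NumPy -> PyTorch
img_array = np.array(img)                 # shape (28, 28), dtype uint8
img_tensor = torch.from_numpy(img_array)  # shares memory with img_array

print(img_tensor.shape, img_tensor.dtype)  # torch.Size([28, 28]) torch.uint8
```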
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| path | Path or str | Yes | File path to a single MNIST digit image (28x28 grayscale PNG) |
| threes | list[Path] | Yes | List of file paths to all digit-3 images in the training set |
| sevens | list[Path] | Yes | List of file paths to all digit-7 images in the training set |
Outputs
| Name | Type | Description |
|---|---|---|
| stacked_threes | Tensor, shape (N_3, 28, 28), float32 | Normalized rank-3 tensor of all digit-3 images, values in [0, 1] |
| stacked_sevens | Tensor, shape (N_7, 28, 28), float32 | Normalized rank-3 tensor of all digit-7 images, values in [0, 1] |
| mean3 | Tensor, shape (28, 28), float32 | Per-pixel mean of all digit-3 images (the "ideal" 3) |
| mean7 | Tensor, shape (28, 28), float32 | Per-pixel mean of all digit-7 images (the "ideal" 7) |
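The output contract above can be checked programmatically. The sketch below runs the stack, float, and normalize steps on random synthetic images and asserts the documented shape, dtype, and value range; `check_stacked` is a hypothetical helper, not part of fastai or the notebook.

```python
import torch

def check_stacked(t: torch.Tensor) -> None:
    """Assert a stacked class tensor matches the output contract (hypothetical helper)."""
    assert t.ndim == 3 and t.shape[1:] == (28, 28), f"unexpected shape {tuple(t.shape)}"
    assert t.dtype == torch.float32, f"unexpected dtype {t.dtype}"
    assert t.min().item() >= 0.0 and t.max().item() <= 1.0, "values outside [0, 1]"

# Synthetic stand-in for a list of per-image uint8 tensors
fake_images = [torch.randint(0, 256, (28, 28), dtype=torch.uint8) for _ in range(5)]
stacked = torch.stack(fake_images).float() / 255
check_stacked(stacked)

# The per-class mean image has the shape of a single image
mean_image = stacked.mean(0)
assert mean_image.shape == (28, 28)
```

Running a check like this right after building the tensors catches a forgotten `.float()` or a missing division by 255 before training starts.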
Usage Examples
Basic Usage
from fastai.vision.all import *
# Get MNIST dataset paths
path = untar_data(URLs.MNIST_SAMPLE)
threes = (path/'train'/'3').ls().sorted()
sevens = (path/'train'/'7').ls().sorted()
# Build rank-3 tensors
seven_tensors = [tensor(Image.open(o)) for o in sevens]
three_tensors = [tensor(Image.open(o)) for o in threes]
stacked_threes = torch.stack(three_tensors).float() / 255
stacked_sevens = torch.stack(seven_tensors).float() / 255
# Verify shape and rank
print(stacked_threes.shape) # torch.Size([6131, 28, 28])
print(stacked_threes.ndim) # 3
# Compute mean images
mean3 = stacked_threes.mean(0)
mean7 = stacked_sevens.mean(0)
# Visualize the ideal digit
show_image(mean3)
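One way the mean images can feed downstream classification is a nearest-mean comparison, in the spirit of the chapter's "Pixel Similarity" section. The sketch below uses synthetic constant images and mean absolute difference; the function name `l1_distance` and all data are illustrative assumptions, not code taken from the notebook.

```python
import torch

def l1_distance(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Mean absolute pixel difference over the last two axes (illustrative)."""
    return (a - b).abs().mean((-1, -2))

# Synthetic "ideal" images standing in for mean3 and mean7
mean_a = torch.zeros(28, 28)
mean_b = torch.ones(28, 28)

# A sample that is much closer to mean_a than to mean_b
sample = torch.full((28, 28), 0.2)

dist_a = l1_distance(sample, mean_a)  # about 0.2
dist_b = l1_distance(sample, mean_b)  # about 0.8
closer_to_a = bool(dist_a < dist_b)
print(closer_to_a)  # True
```

Averaging over the last two axes means the same function also accepts a whole stacked tensor of shape (N, 28, 28) and returns one distance per image.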
Inspecting Individual Pixels
# View a 6x6 slice of an image as a NumPy array
im3 = Image.open(threes[0])
# repr() keeps the array(...) wrapper and dtype, as in a notebook cell output
print(repr(array(im3)[4:10, 4:10]))
# array([[  0,   0,   0,   0,   0,   0],
#        [  0,   0,   0,   0,   0,  29],
#        [  0,   0,   0,  48, 166, 224],
#        [  0,  93, 244, 249, 253, 187],
#        [  0, 107, 253, 253, 230,  48],
#        [  0,   3,  20,  20,  15,   0]], dtype=uint8)
# Same slice as a PyTorch tensor
print(tensor(im3)[4:10, 4:10])