# Implementation: Fastai Fastbook NN Sequential

| Knowledge Sources | |
|---|---|
| Domains | Deep Learning, Neural Network Architecture |
| Last Updated | 2026-02-09 17:00 GMT |
## Overview

A concrete recipe for constructing a multi-layer neural network with non-linear activations, built from torch.nn.Sequential, torch.nn.Linear, and torch.nn.ReLU.
## Description
PyTorch's nn.Sequential container chains multiple layers into a single callable module. In fastbook Chapter 4, it replaces a manually written function that performs matrix multiplications and ReLU operations with a clean, declarative definition. The canonical two-layer network for MNIST digit classification is nn.Sequential(nn.Linear(784, 30), nn.ReLU(), nn.Linear(30, 1)).
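Because the container is itself an nn.Module, the result is a single callable whose child layers remain addressable by position. A minimal sketch:

```python
import torch
import torch.nn as nn

# The canonical two-layer network from the chapter
net = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 1))

# The container is one callable module...
x = torch.randn(2, 28*28)
y = net(x)  # shape (2, 1)

# ...and its children are indexable by position
first_layer = net[0]  # the nn.Linear(784, 30)
is_linear = isinstance(first_layer, nn.Linear)
```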
## Usage
Import and use nn.Sequential when you want to:
- Define a feedforward neural network as a sequence of layers.
- Replace manually coded forward-pass functions with a composable, modular architecture.
- Leverage PyTorch's built-in parameter tracking, gradient computation, and serialization.
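The serialization point in particular falls out of nn.Sequential being a standard nn.Module: its parameters can be saved and restored via state_dict. A sketch of the usual PyTorch pattern (the file name is arbitrary):

```python
import torch
import torch.nn as nn

# Build the chapter's two-layer network
net = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 1))

# Save only the parameters (the recommended PyTorch pattern)
torch.save(net.state_dict(), "simple_net.pth")

# Restore into a fresh instance of the same architecture
net2 = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 1))
net2.load_state_dict(torch.load("simple_net.pth"))

# Both nets now produce identical outputs
x = torch.randn(4, 28*28)
same = torch.allclose(net(x), net2(x))
```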
## Code Reference

### Source Location
- Repository: fastbook
- File: 04_mnist_basics.ipynb (Chapter 4), "Adding a Nonlinearity" section
### Signature

```python
# PyTorch API
torch.nn.Sequential(*layers) -> nn.Module
torch.nn.Linear(in_features, out_features, bias=True) -> nn.Module
torch.nn.ReLU() -> nn.Module
torch.nn.functional.relu(input) -> Tensor

# The canonical two-layer network from Chapter 4
simple_net = nn.Sequential(
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 1)
)
```
### Import

```python
import torch.nn as nn
import torch.nn.functional as F
```
## I/O Contract

### Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| input | Tensor, shape (batch_size, 784) | Yes | Flattened 28x28 MNIST images as a 2D tensor |

### Outputs

| Name | Type | Description |
|---|---|---|
| output | Tensor, shape (batch_size, 1) | Raw model predictions (logits) for each image in the batch |
## Architecture Details

### Layer-by-Layer Breakdown

| Layer | Operation | Input Shape | Output Shape | Parameters |
|---|---|---|---|---|
| nn.Linear(784, 30) | x @ W1 + b1 | (B, 784) | (B, 30) | W1: (784, 30), b1: (30,) = 23,550 params |
| nn.ReLU() | max(x, 0) | (B, 30) | (B, 30) | 0 params |
| nn.Linear(30, 1) | x @ W2 + b2 | (B, 30) | (B, 1) | W2: (30, 1), b2: (1,) = 31 params |

Total parameters: 23,581
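The totals above can be double-checked programmatically by counting tensor elements across the model's parameters:

```python
import torch.nn as nn

simple_net = nn.Sequential(
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 1)
)

# Count every trainable parameter, overall and per layer
total = sum(p.numel() for p in simple_net.parameters())
per_layer = [sum(p.numel() for p in m.parameters()) for m in simple_net]
# total == 23581; per_layer == [23550, 0, 31]
```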
### Manual Equivalent

The nn.Sequential version is equivalent to this manual function:

```python
# Manual version from the book (init_params and tensor are
# defined earlier in the chapter)
w1 = init_params((28*28, 30))
b1 = init_params(30)
w2 = init_params((30, 1))
b2 = init_params(1)

def simple_net(xb):
    res = xb @ w1 + b1
    res = res.max(tensor(0.0))  # ReLU
    res = res @ w2 + b2
    return res
```
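One way to confirm the equivalence is to copy the Sequential module's weights into the manual form and compare outputs. A sketch, keeping in mind that nn.Linear stores its weight as (out_features, in_features) and computes x @ W.T + b:

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(28*28, 30), nn.ReLU(), nn.Linear(30, 1))

# Transpose the stored weights to match the book's x @ w1 + b1 convention
w1, b1 = net[0].weight.T, net[0].bias
w2, b2 = net[2].weight.T, net[2].bias

def manual_net(xb):
    res = xb @ w1 + b1
    res = res.max(torch.tensor(0.0))  # elementwise ReLU
    return res @ w2 + b2

xb = torch.randn(8, 28*28)
match = torch.allclose(net(xb), manual_net(xb), atol=1e-6)
```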
## Usage Examples

### Basic Usage

```python
import torch
import torch.nn as nn

# Define the neural network
simple_net = nn.Sequential(
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 1)
)

# Inspect parameters (nn.Linear stores weight as (out_features, in_features))
for name, param in simple_net.named_parameters():
    print(f"{name}: {param.shape}")
# 0.weight: torch.Size([30, 784])
# 0.bias: torch.Size([30])
# 2.weight: torch.Size([1, 30])
# 2.bias: torch.Size([1])

# Forward pass on a batch of 4 flattened images
batch = torch.randn(4, 784)
output = simple_net(batch)
print(output.shape)  # torch.Size([4, 1])
```
### Training with fastai Learner

```python
from fastai.vision.all import *

path = untar_data(URLs.MNIST_SAMPLE)

# Build DataLoaders (dl and valid_dl come from the chapter's
# earlier dataset-preparation steps)
dls = DataLoaders(dl, valid_dl)

# Create the neural network
simple_net = nn.Sequential(
    nn.Linear(28*28, 30),
    nn.ReLU(),
    nn.Linear(30, 1)
)

# Train using fastai Learner with the chapter's loss and metric
learn = Learner(dls, simple_net, opt_func=SGD,
                loss_func=mnist_loss, metrics=batch_accuracy)
learn.fit(40, lr=0.1)

# Check the final recorded metric (batch accuracy)
print(f"Final accuracy: {learn.recorder.values[-1][-1]:.4f}")
```
### Using F.relu Instead of nn.ReLU()

```python
import torch.nn as nn
import torch.nn.functional as F

# Functional approach (used inside a custom forward method)
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(28*28, 30)
        self.linear2 = nn.Linear(30, 1)

    def forward(self, xb):
        xb = F.relu(self.linear1(xb))
        return self.linear2(xb)
```
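A quick check that the class-based version behaves like the Sequential one, producing the same (batch_size, 1) output shape (a self-contained sketch repeating the class definition):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(28*28, 30)
        self.linear2 = nn.Linear(30, 1)

    def forward(self, xb):
        xb = F.relu(self.linear1(xb))
        return self.linear2(xb)

model = SimpleNet()
out = model(torch.randn(4, 28*28))  # same I/O contract as the Sequential net
shape_ok = tuple(out.shape) == (4, 1)
```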