
Principle:Fastai Fastbook Transfer Learning

From Leeroopedia


Knowledge Sources
Domains Computer_Vision, Deep_Learning, Transfer_Learning
Last Updated 2026-02-09 17:00 GMT

Overview

Transfer learning is the technique of reusing a model trained on a large general dataset as the starting point for a model on a different, typically smaller, task-specific dataset.

Description

Training a deep convolutional neural network from scratch requires millions of labeled images and days of compute time. Transfer learning circumvents this by starting with a model that has already learned to extract useful visual features from a large dataset (typically ImageNet, with 1.2 million images across 1,000 categories). The key insight is that early layers of a CNN learn general features (edges, textures, shapes) that are useful for virtually any visual task, while later layers learn increasingly task-specific features.

Transfer learning for image classification works in three steps:

  1. Load a pretrained model -- take a network (e.g., ResNet34) with weights trained on ImageNet.
  2. Replace the head -- remove the final classification layer (which outputs 1,000 ImageNet classes) and replace it with a new layer that outputs the number of classes in the target task.
  3. Train the head -- freeze the pretrained body and train only the new head on the target dataset, then optionally unfreeze and fine-tune the entire network.

This approach routinely achieves high accuracy with as few as a hundred images per class and trains in minutes rather than days.

Usage

Use transfer learning whenever you are training an image classifier and a pretrained model is available for your input modality. It is the default and recommended approach in fastai for all image classification tasks. The only scenario where training from scratch might be preferred is when the target domain is radically different from natural images (e.g., spectrograms, medical scans with no visual similarity to photographs), and even then transfer learning often helps.

Theoretical Basis

Feature Hierarchy in CNNs

Research by Zeiler and Fergus (2013) and Yosinski et al. (2014) demonstrated that CNN layers learn hierarchical features:

| Layer Depth | Features Learned | Transferability |
|---|---|---|
| Early layers (1-2) | Edges, gradients, colors | Highly general; transfer to almost any visual task |
| Middle layers (3-4) | Textures, patterns, parts | Moderately general; useful for most tasks |
| Late layers (5+) | Object parts, semantic concepts | Task-specific; may need retraining |
| Classification head | Category probabilities | Entirely task-specific; must be replaced |

Body-Head Architecture

In the transfer learning paradigm, the network is conceptually divided into:

  • Body (backbone): All convolutional layers from the pretrained model. These encode the feature extraction pipeline.
  • Head: One or more fully connected layers that map the body's feature vector to the target class probabilities.

The head for a new task is typically:

AdaptiveAvgPool2d -> Flatten -> BatchNorm1d -> Dropout -> Linear -> ReLU -> BatchNorm1d -> Dropout -> Linear(num_classes)
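A minimal version of this head, assuming a 512-feature body output and a hidden width of 512 (fastai's `create_head` defaults to a concat pool and configurable layer sizes, so treat this as a sketch of the pattern rather than the exact library output):

```python
import torch
import torch.nn as nn

def make_head(num_features: int, num_classes: int, p: float = 0.5) -> nn.Sequential:
    """Two-layer head following the pattern above; widths are illustrative."""
    return nn.Sequential(
        nn.AdaptiveAvgPool2d(1),           # (N, C, H, W) -> (N, C, 1, 1)
        nn.Flatten(),                      # (N, C, 1, 1) -> (N, C)
        nn.BatchNorm1d(num_features),
        nn.Dropout(p / 2),
        nn.Linear(num_features, 512),
        nn.ReLU(inplace=True),
        nn.BatchNorm1d(512),
        nn.Dropout(p),
        nn.Linear(512, num_classes),
    )

head = make_head(512, 37)                  # e.g. ResNet34 features, 37 pet breeds
x = torch.randn(2, 512, 7, 7)              # a stand-in for the body's output
assert head(x).shape == (2, 37)
```

The adaptive pool makes the head independent of the input image size: whatever spatial grid the body produces, the head sees a fixed-length feature vector.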

Why Freezing Works

When the body is frozen (gradients not computed for body parameters), only the randomly initialized head is updated. Because the body already produces meaningful feature vectors, the head can learn a good linear decision boundary in just one or two epochs. This prevents the large random gradients from the untrained head from corrupting the carefully learned body weights during the initial training phase.
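The effect of freezing can be demonstrated on a toy body and head (illustrative stand-ins for the convolutional backbone and the classification head):

```python
import torch
import torch.nn as nn

body = nn.Linear(4, 3)                 # stand-in for the pretrained backbone
head = nn.Linear(3, 2)                 # stand-in for the new head

for p in body.parameters():
    p.requires_grad = False            # freeze the body

opt = torch.optim.SGD(head.parameters(), lr=0.1)
before = body.weight.clone()

x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(head(body(x)), y)
loss.backward()                        # no gradients flow into frozen params
opt.step()

assert body.weight.grad is None        # the body received no gradient
assert torch.equal(body.weight, before)  # and its weights are unchanged
```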

Common Pretrained Architectures

| Architecture | Parameters | Top-1 Accuracy (ImageNet) | Recommended Use |
|---|---|---|---|
| ResNet18 | 11.7M | 69.8% | Quick experiments, small datasets |
| ResNet34 | 21.8M | 73.3% | Default starting point in fastai |
| ResNet50 | 25.6M | 76.1% | Better accuracy when GPU memory allows |
| ResNet101 | 44.5M | 77.4% | Large datasets, fine-grained tasks |

Related Pages

Implemented By

Uses Heuristic
