Principle: Fastai Fastbook Model Export
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, MLOps, Computer_Vision |
| Last Updated | 2026-02-09 17:00 GMT |
Overview
Model export is the process of serializing a trained model and its associated data pipeline into a self-contained artifact that can be loaded in a production environment for inference without any training code.
Description
Training a model is only half the journey. To be useful, the model must be deployed where it can make predictions on new data. Model export bridges the gap between the training environment (typically a GPU-equipped Jupyter notebook) and the production environment (a web server, mobile device, or batch processing pipeline).
A proper export must capture three components:
- Model architecture: The structure of the neural network (layers, dimensions, activation functions).
- Trained weights: The learned parameter values.
- Data pipeline: The transforms needed to convert a raw input (e.g., a JPEG file) into the tensor format the model expects, and to convert the model's output tensor back into a human-readable prediction (e.g., a class name).
Without the data pipeline, a deployment engineer would need to manually replicate every preprocessing step (resizing, normalization, etc.) exactly as it was during training -- a fragile and error-prone process.
Usage
Export the model after training and interpretation are complete and you are satisfied with its performance; in fastai this is learn.export(), which writes an export.pkl file by default. The exported artifact is the deliverable for deployment. In production, reconstruct the Learner with load_learner and call its predict method on new inputs.
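The end-to-end shape of this workflow (serialize on the training side, reload and predict in production) can be sketched with plain pickle. StubLearner and its hard-coded vocab are hypothetical stand-ins for illustration, not the fastai API:

```python
import pickle
import tempfile
from pathlib import Path

class StubLearner:
    """Hypothetical stand-in for a trained fastai Learner."""
    vocab = ['black', 'grizzly']

    def predict(self, item):
        idx = 1 if 'grizzly' in item else 0   # fake forward pass on the filename
        probs = [0.0, 0.0]
        probs[idx] = 1.0
        return self.vocab[idx], idx, probs

# Training side: serialize the whole object (analogous to learn.export('export.pkl'))
path = Path(tempfile.mkdtemp()) / 'export.pkl'
path.write_bytes(pickle.dumps(StubLearner()))

# Production side: reload the artifact and predict (analogous to load_learner('export.pkl'))
learn_inf = pickle.loads(path.read_bytes())
pred_class, pred_idx, probs = learn_inf.predict('grizzly_photo.jpg')
print(pred_class, pred_idx)   # → grizzly 1
```

Note that the "production" half never touches the class definition directly; it only needs the pickled bytes and an importable environment, which is exactly the property the real export relies on.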
Theoretical Basis
Serialization via Pickle
Python's pickle protocol serializes arbitrary Python objects into a byte stream. For deep learning models, this captures:
- The class hierarchy and method definitions (by reference to the module/class name)
- The tensor data (model weights) as raw bytes
- The transform pipeline objects and their parameters
The resulting file is self-contained: loading it reconstructs the entire Learner without needing the training script or dataset.
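These by-reference versus by-value rules can be shown in a few lines of pure Python, with no fastai involved:

```python
import pickle

# Callables and classes are pickled *by reference*: only the qualified
# name (module + attribute) is stored, not the code itself.
data = pickle.dumps(len)
assert b'builtins' in data and b'len' in data

# Plain data (e.g., weight values) is pickled *by value* as raw bytes,
# so loading reproduces it without any source code.
weights = [0.25, -1.5]
restored = pickle.loads(pickle.dumps(weights))
print(restored)   # → [0.25, -1.5]
```

This is also why the deployment environment must be able to import the same modules that defined the model: the pickle stores names like fastai.learner.Learner, not their implementations.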
The Predict Pipeline
When predict is called on a single item in production, it executes the following steps:
1. Load the raw input (e.g., open an image file or accept a PIL Image)
2. Apply the item transforms (resize, normalize) -- same as during training
3. Create a mini-batch of size 1
4. Run the forward pass through the model
5. Apply the decoding pipeline to convert output tensor to human-readable form
6. Return (predicted_class, class_index_tensor, probabilities_tensor)
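Steps 4 through 6 can be sketched in miniature with pure Python. The two bear classes are hypothetical, and real fastai returns tensors rather than lists:

```python
import math

def softmax(logits):
    """Convert raw model outputs into probabilities that sum to 1.0."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ['black', 'grizzly']            # decoding table fixed at training time
logits = [0.5, 2.5]                     # step 4: raw forward-pass output for one item
probs = softmax(logits)
pred_idx = probs.index(max(probs))      # step 5: output tensor -> class index...
pred_class = vocab[pred_idx]            # ...-> human-readable class name (step 6)
print(pred_class, pred_idx)             # → grizzly 1
```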
Prediction Output Format
The predict method returns a 3-tuple:
| Element | Type | Description |
|---|---|---|
| Predicted class | str | The human-readable class name (e.g., "grizzly") |
| Class index | tensor | A single-element tensor with the integer index of the predicted class |
| Probabilities | tensor | A 1D tensor of probabilities for all classes (sums to 1.0) |
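A consumer typically unpacks the tuple like this. The values below are plain-Python stand-ins for the tensor outputs; with real tensors, probs[pred_idx] indexes the same way:

```python
# Hypothetical values standing in for the output of predict()
pred_class, pred_idx, probs = 'grizzly', 1, [0.08, 0.92]

confidence = probs[pred_idx]             # probability of the predicted class
assert abs(sum(probs) - 1.0) < 1e-9      # the distribution covers all classes
label = pred_class if confidence >= 0.5 else 'uncertain'
print(label, confidence)                 # → grizzly 0.92
```

Keeping the full probability vector (rather than just the top class) is what makes confidence thresholds and review queues like this possible downstream.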
Deployment Considerations
| Concern | Guidance |
|---|---|
| File size | A ResNet34 export is ~85 MB; ResNet50 is ~100 MB. Consider model compression for mobile deployment. |
| CPU vs. GPU | load_learner loads to CPU by default (cpu=True), making it suitable for servers without GPUs. |
| Python version | The pickle format is sensitive to the Python version. Use the same Python minor version (e.g., 3.11) for export and deployment. |
| fastai version | The deployed environment must have the same fastai version as the training environment to ensure transform compatibility. |
| Security | Pickle files can execute arbitrary code. Only load exports from trusted sources. |
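The security row deserves emphasis: unpickling can execute arbitrary code, which a few lines demonstrate. NotAModel is a deliberately malicious toy, not anything fastai produces:

```python
import pickle

class NotAModel:
    """A hostile payload: unpickling it executes code."""
    def __reduce__(self):
        # pickle will call eval('6 * 7') at load time -- any callable
        # (os.system, subprocess.run, ...) could be substituted here.
        return (eval, ('6 * 7',))

payload = pickle.dumps(NotAModel())
result = pickle.loads(payload)   # "loading" actually executes the code
print(result)                    # → 42
```

Because the attack runs before any model object exists, there is no way to inspect a pickle safely after loading it; provenance checks must happen first.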