Environment:Microsoft DeepSpeedExamples CIFAR10 Training Environment
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Computer_Vision, Getting_Started |
| Last Updated | 2026-02-07 13:00 GMT |
Overview
Minimal Python environment with PyTorch, torchvision, and DeepSpeed for training a simple CNN on CIFAR-10, supporting both CPU and GPU execution.
Description
This environment provides the minimal dependencies needed to run the CIFAR-10 getting-started example, which demonstrates DeepSpeed integration with a basic convolutional neural network. Because it falls back to CPU-only execution when no GPU is available, it is the most lightweight environment in the repository. The example covers DeepSpeed ZeRO stages 0-3, mixed precision (fp16 or bf16), and, optionally, Mixture of Experts (MoE) and PR-MoE layers.
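As a rough illustration of how the ZeRO stage and precision options fit together, the sketch below builds a minimal DeepSpeed-style configuration dictionary. The helper name `make_ds_config` and the default batch size of 16 are assumptions made for this example and are not taken from the repository; the keys `train_batch_size`, `zero_optimization`, `fp16`, and `bf16` are standard DeepSpeed configuration fields.

```python
import json


def make_ds_config(zero_stage=0, dtype="fp16", batch_size=16):
    """Build a minimal DeepSpeed-style config dict (hypothetical helper).

    zero_stage: ZeRO optimization stage, 0 through 3.
    dtype: "fp16", "bf16", or "fp32" (no mixed precision).
    batch_size: global training batch size (16 is an assumed default).
    """
    if zero_stage not in (0, 1, 2, 3):
        raise ValueError(f"unsupported ZeRO stage: {zero_stage}")
    if dtype not in ("fp16", "bf16", "fp32"):
        raise ValueError(f"unsupported dtype: {dtype}")
    return {
        "train_batch_size": batch_size,
        "zero_optimization": {"stage": zero_stage},
        # At most one of the two mixed-precision modes is enabled.
        "fp16": {"enabled": dtype == "fp16"},
        "bf16": {"enabled": dtype == "bf16"},
    }


config = make_ds_config(zero_stage=2, dtype="bf16")
print(json.dumps(config, indent=2))
```

A dictionary of this shape can be passed to DeepSpeed as its configuration; the options the example script actually exposes may differ from this sketch.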
Usage
Use this environment for learning DeepSpeed fundamentals with the CIFAR-10 tutorial. It is the mandatory prerequisite for the Net_Tutorial, Add_Argument_CIFAR, DeepSpeed_Initialize_CIFAR, Net_DeepSpeed, and Test_Function_CIFAR implementations.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Cross-platform support |
| Hardware | Any CPU or NVIDIA GPU | GPU optional; code falls back to CPU |
| Disk | 200 MB | For CIFAR-10 dataset download |
Dependencies
Python Packages
- `torch` (PyTorch)
- `torchvision` == 0.4.0
- `pillow` >= 7.1.0
- `matplotlib`
- `deepspeed` (for DeepSpeed-enabled variant)
Credentials
No credentials required. CIFAR-10 dataset is downloaded automatically from public sources.
Quick Install
# Install all required packages
pip install torch torchvision pillow matplotlib deepspeed
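After installation, a quick way to confirm that the packages are importable is a small check script. The following is a minimal sketch using only the standard library; the function name `check_environment` is hypothetical. Note that the `pillow` distribution is imported under the module name `PIL`.

```python
import importlib.util

# Map pip distribution names to the module names they are imported as.
PACKAGES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "pillow": "PIL",  # pillow installs the PIL module
    "matplotlib": "matplotlib",
    "deepspeed": "deepspeed",  # only needed for the DeepSpeed variant
}


def check_environment(packages=PACKAGES):
    """Return {pip_name: bool} saying whether each module can be found."""
    return {
        pip_name: importlib.util.find_spec(module_name) is not None
        for pip_name, module_name in packages.items()
    }


status = check_environment()
for pip_name, found in status.items():
    print(f"{pip_name:12s} {'ok' if found else 'MISSING'}")
```

The check only locates each module and does not import it, so it says nothing about whether the installed versions are compatible with each other.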
Code Evidence
Requirements from `training/cifar/requirements.txt`:
torchvision==0.4.0
pillow>=7.1.0
matplotlib
Device detection from `training/cifar/cifar10_deepspeed.py:10,284`:
from deepspeed.accelerator import get_accelerator
# ...
get_accelerator().set_device(_local_rank)
# ...
local_device = get_accelerator().device_name(model_engine.local_rank)
CPU fallback from `training/cifar/cifar10_tutorial.py`:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
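To show the fallback line in context, the sketch below selects a device and runs one forward pass on it. The tiny linear model and the random batch are stand-ins invented for this example; they are not the tutorial's CNN or the real CIFAR-10 data.

```python
import torch
import torch.nn as nn

# Use the first CUDA device when one is visible, otherwise fall back to CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the tutorial's CNN: flatten, then one linear layer.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)

# A fake batch of four CIFAR-10-sized images (3 channels, 32x32 pixels),
# created directly on the selected device.
images = torch.randn(4, 3, 32, 32, device=device)

with torch.no_grad():
    logits = model(images)

print(device.type, tuple(logits.shape))
```

Both the model and its inputs must live on the same device, which is why the model is moved with `.to(device)` and the tensor is created with `device=device`.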
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `BrokenPipeError` | Windows DataLoader with num_workers > 0 | Set `num_workers=0` in DataLoader on Windows |
| `RuntimeError: CUDA error` | No CUDA-capable device | Run on CPU or install CUDA drivers |
| `ModuleNotFoundError: No module named 'deepspeed'` | DeepSpeed not installed | `pip install deepspeed` (for DeepSpeed variant only) |
Compatibility Notes
- CPU Training: Fully supported via PyTorch CPU fallback; useful for testing without GPU
- Windows: Set `num_workers=0` in DataLoader to avoid BrokenPipeError
- MoE/PR-MoE: Requires multi-GPU setup for proper expert parallelism
- torchvision pinned: The requirements file pins `torchvision==0.4.0`, whereas the Quick Install command installs the latest release; newer versions are likely compatible
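The Windows note above can be handled in code by choosing the worker count from the platform. This is a minimal sketch: the helper name `pick_num_workers` and the default of 2 workers are assumptions made for this example.

```python
import platform


def pick_num_workers(default=2, system=None):
    """Return 0 on Windows, otherwise the requested default (hypothetical helper).

    system: override for platform.system(), mainly useful for testing.
    """
    system = platform.system() if system is None else system
    return 0 if system == "Windows" else default


num_workers = pick_num_workers()
print(f"Using num_workers={num_workers}")

# Intended use with a PyTorch DataLoader (not executed here):
#   loader = torch.utils.data.DataLoader(dataset, batch_size=4,
#                                        shuffle=True, num_workers=num_workers)
```

With `num_workers=0` the data is loaded in the main process, which avoids the worker subprocesses that trigger the error.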
Related Pages
- Implementation:Microsoft_DeepSpeedExamples_Net_Tutorial
- Implementation:Microsoft_DeepSpeedExamples_Add_Argument_CIFAR
- Implementation:Microsoft_DeepSpeedExamples_DeepSpeed_Initialize_CIFAR
- Implementation:Microsoft_DeepSpeedExamples_Net_DeepSpeed
- Implementation:Microsoft_DeepSpeedExamples_Test_Function_CIFAR