# Environment:Deepspeedai DeepSpeed NVMe Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Storage, NVMe_Offload |
| Last Updated | 2026-02-09 00:00 GMT |
## Overview
NVMe storage environment for DeepSpeed's asynchronous I/O subsystem, enabling optimizer state and parameter offloading to fast NVMe SSDs.
## Description
This environment provides the NVMe storage backend required by DeepSpeed's async I/O (AIO) subsystem. The AIO system uses Linux's native asynchronous I/O interface (libaio) to perform high-throughput tensor reads and writes to NVMe devices. This is essential for ZeRO-Infinity and ZeRO-Offload scenarios where optimizer states, gradients, or parameters are swapped between GPU/CPU memory and NVMe storage.
The environment requires Linux with the `libaio` library installed and an NVMe SSD mounted with appropriate permissions. At initialization time, the AIO subsystem validates the storage device's capabilities, including checking for direct I/O (`O_DIRECT`) support and measuring achievable bandwidth.
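The direct-I/O check described above can be approximated with a small probe. This is a sketch, not DeepSpeed's own validation code: it opens a temporary file with `O_DIRECT` and treats `EINVAL` as "unsupported", which is how Linux filesystems typically reject the flag. `os.O_DIRECT` only exists on Linux.

```python
import errno
import os
import tempfile

def supports_o_direct(directory: str) -> bool:
    """Return True if files in `directory` can be opened with O_DIRECT."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        # O_DIRECT bypasses the page cache; filesystems that cannot honor
        # it (e.g. tmpfs) fail the open() with EINVAL.
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        os.close(fd)
        return True
    except OSError as e:
        if e.errno == errno.EINVAL:
            return False
        raise
    finally:
        os.remove(path)
```

Running `supports_o_direct("/mnt/nvme")` (path is a placeholder for your NVMe mount point) is a quick sanity check before pointing `nvme_path` at a directory.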
## Usage
Use this environment when training models with ZeRO-Infinity (Stage 3 with NVMe offload) or when using DeepSpeed's parameter/optimizer state swapping to NVMe. Required for any workflow that sets `offload_optimizer.device: "nvme"` or `offload_param.device: "nvme"` in the DeepSpeed configuration.
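A minimal ZeRO-Infinity configuration exercising both offload targets might look like the following. The `nvme_path` values are example mount points; `nvme_path` must point at a directory on the NVMe device when `device` is `"nvme"`.

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "nvme",
      "nvme_path": "/local_nvme",
      "pin_memory": true
    },
    "offload_param": {
      "device": "nvme",
      "nvme_path": "/local_nvme",
      "pin_memory": true
    }
  }
}
```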
## System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux | Windows not supported for AIO |
| Storage | NVMe SSD | Mounted with read/write permissions; direct I/O support recommended |
| Library | `libaio` (libaio-dev) | Linux asynchronous I/O library; required for kernel AIO operations |
| Filesystem | ext4, xfs recommended | Must support O_DIRECT for optimal performance |
| Permissions | Read/write access to NVMe mount point | User must have permissions on the offload directory |
## Dependencies
### System Packages
- `libaio-dev` (Debian/Ubuntu) or `libaio-devel` (RHEL/CentOS) - Linux AIO library headers
- `libaio1` (runtime library)
### Python Packages
- `torch` (with CPU or CUDA support)
- `deepspeed` (with AIO op builder compiled)
## Configuration
The following environment variables and DeepSpeed configuration settings affect NVMe I/O behavior:
- `DLTS_HOSTFILE`: Used in distributed settings to coordinate NVMe paths across nodes
- DeepSpeed config `aio` section controls: `block_size`, `queue_depth`, `thread_count`, `single_submit`, `overlap_events`
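The `aio` tuning knobs listed above live in a top-level section of the DeepSpeed config. The values below are the library's documented defaults; optimal settings are device-specific and are best found with the `ds_io` / parameter-sweep benchmarks.

```json
{
  "aio": {
    "block_size": 1048576,
    "queue_depth": 8,
    "thread_count": 1,
    "single_submit": false,
    "overlap_events": true
  }
}
```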
## Quick Install

```shell
# Install libaio development headers
sudo apt-get install libaio-dev    # Debian/Ubuntu
# sudo yum install libaio-devel    # RHEL/CentOS

# Install DeepSpeed (AIO ops are JIT compiled on first use)
pip install deepspeed

# Verify AIO support
ds_report | grep aio
```
## Code Evidence

AIO operation modes from `csrc/aio/common/deepspeed_aio_common.cpp`:

```cpp
// Sequential I/O: submit batch, wait for completion, repeat
static int _do_io_sequential(const long long int n_iocbs, struct iocb** iocbs,
                             io_context_t aio_ctxt, int n_completions) {
    // Submit all iocbs then wait for all completions
}

// Overlap I/O: maintain full queue depth by overlapping submit and complete
static int _do_io_overlap(const long long int n_iocbs, struct iocb** iocbs,
                          io_context_t aio_ctxt, int n_completions) {
    // Submit initial batch, then overlap completion tracking with new submissions
}
```
## Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `libaio not found` | libaio-dev not installed | `sudo apt-get install libaio-dev` |
| `AIO op builder failed` | Missing libaio headers or incompatible compiler | Install libaio-dev and ensure gcc/g++ is available |
| `Permission denied on NVMe path` | Insufficient permissions on offload directory | Check mount permissions and user access |
| `O_DIRECT not supported` | Filesystem does not support direct I/O | Use ext4 or xfs filesystem; check mount options |
## Related Pages
- Implementation:Deepspeedai_DeepSpeed_AIO_Common
- Implementation:Deepspeedai_DeepSpeed_AIO_Bench_Perf_Sweep
- Implementation:Deepspeedai_DeepSpeed_AIO_Utils
- Implementation:Deepspeedai_DeepSpeed_CPU_IO_Op
- Implementation:Deepspeedai_DeepSpeed_GDS_Op
- Implementation:Deepspeedai_DeepSpeed_IO_Handle
- Implementation:Deepspeedai_DeepSpeed_IO_Handle_Interface
- Implementation:Deepspeedai_DeepSpeed_Pin_Tensor
- Implementation:Deepspeedai_DeepSpeed_Py_AIO
- Implementation:Deepspeedai_DeepSpeed_Py_Copy
- Implementation:Deepspeedai_DeepSpeed_Py_DS_AIO_Module