Implementation:Deepspeedai DeepSpeed AIO Common
| Knowledge Sources | |
|---|---|
| Domains | Async_IO, NVMe_Offload |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Core asynchronous I/O engine that DeepSpeed uses to swap optimizer tensors to and from NVMe storage devices.
Description
This implementation provides the fundamental Linux AIO (Asynchronous I/O) functionality for DeepSpeed's tensor swapping system. It implements both sequential and overlapped I/O operation modes using the libaio library. The module handles submission of I/O control blocks (iocbs), completion tracking, performance measurement, file operations, and validation of I/O operations. It serves as the low-level I/O engine that other DeepSpeed AIO components build upon.
The implementation includes two main operation modes:
- Sequential mode: Submits a batch of I/O operations, waits for completion, then submits the next batch
- Overlap mode: Overlaps submission and completion to maintain a full queue depth for better performance
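The difference between the two modes can be sketched with a small scheduling model. This is not the DeepSpeed code: it issues no I/O, the helper names (`num_io_blocks`, `sequential_batches`, `overlap_in_flight`) are hypothetical, and it assumes completions arrive one at a time. It only shows how many requests each strategy would have outstanding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a DeepSpeed function): number of block-sized
// requests needed to cover num_bytes; the last block may be partial.
inline int64_t num_io_blocks(int64_t num_bytes, int64_t block_size)
{
    assert(num_bytes >= 0 && block_size > 0);
    return (num_bytes + block_size - 1) / block_size;
}

// Sequential mode model: each entry is the size of one submitted batch.
// The caller would wait for the whole batch before submitting the next one.
inline std::vector<int64_t> sequential_batches(int64_t num_bytes,
                                               int64_t block_size,
                                               int64_t queue_depth)
{
    assert(queue_depth > 0);
    std::vector<int64_t> batches;
    int64_t remaining = num_io_blocks(num_bytes, block_size);
    while (remaining > 0) {
        const int64_t n = std::min(remaining, queue_depth);
        batches.push_back(n);
        remaining -= n;
    }
    return batches;
}

// Overlap mode model: fill the queue once, then submit one new request for
// every completion. Returns the in-flight count after each completion.
inline std::vector<int64_t> overlap_in_flight(int64_t num_bytes,
                                              int64_t block_size,
                                              int64_t queue_depth)
{
    assert(queue_depth > 0);
    std::vector<int64_t> trace;
    int64_t pending = num_io_blocks(num_bytes, block_size);
    int64_t in_flight = std::min(pending, queue_depth);
    pending -= in_flight;
    while (in_flight > 0) {
        --in_flight;        // one request completes
        if (pending > 0) {  // refill the freed slot immediately
            ++in_flight;
            --pending;
        }
        trace.push_back(in_flight);
    }
    return trace;
}
```

In this model the sequential queue drains to zero between batches, while the overlapped queue stays at `queue_depth` until fewer than `queue_depth` requests remain.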
Usage
This module is used internally by DeepSpeed's I/O handle implementations when performing large-scale tensor reads and writes to NVMe devices. It is invoked when optimizer state is offloaded and reloaded in ZeRO training runs that offload to NVMe (ZeRO-Infinity).
Code Reference
Source Location
- Repository: DeepSpeed
- File: csrc/aio/common/deepspeed_aio_common.cpp
Signature
    void do_aio_operation_sequential(const bool read_op,
                                     std::unique_ptr<aio_context>& aio_ctxt,
                                     std::unique_ptr<io_xfer_ctxt>& xfer_ctxt,
                                     deepspeed_aio_config_t* config,
                                     deepspeed_aio_perf_t* perf);

    void do_aio_operation_overlap(const bool read_op,
                                  std::unique_ptr<aio_context>& aio_ctxt,
                                  std::unique_ptr<io_xfer_ctxt>& xfer_ctxt,
                                  deepspeed_aio_config_t* config,
                                  deepspeed_aio_perf_t* perf);

    int open_file(const char* filename, const bool read_op);

    bool validate_aio_operation(const bool read_op,
                                const char* filename,
                                void* aio_buffer,
                                const int64_t num_bytes);

    void report_file_error(const char* filename, const std::string file_op, const int error_code);
Import
#include "deepspeed_aio_common.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| read_op | bool | Yes | True for read operations, false for write operations |
| aio_ctxt | std::unique_ptr<aio_context>& | Yes | AIO context containing queue depth, block size, and iocb arrays |
| xfer_ctxt | std::unique_ptr<io_xfer_ctxt>& | Yes | Transfer context with file descriptor, buffer, offset, and size information |
| config | deepspeed_aio_config_t* | Yes | Configuration specifying single_submit mode and other AIO parameters |
| perf | deepspeed_aio_perf_t* | No | Optional performance statistics output structure |
| filename | const char* | Yes | File path for I/O operations |
| aio_buffer | void* | Yes | Memory buffer for validation operations |
| num_bytes | int64_t | Yes | Number of bytes to transfer or validate |
Outputs
| Name | Type | Description |
|---|---|---|
| fd | int | File descriptor from open_file(), or -1 on error |
| validated | bool | True if validation passes, false otherwise |
| perf | deepspeed_aio_perf_t* | Populated with latency statistics (min/max/avg submit, complete, e2e time and rate) |
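As a rough illustration of the statistics listed for `perf`, the sketch below reduces a set of per-batch timings to min/max/avg values and computes an end-to-end rate. The struct and function names are hypothetical and are not the fields of `deepspeed_aio_perf_t`. The rate assumes 1 GB = 10^9 bytes, which may differ from the convention the library uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical summary type for illustration only.
struct latency_summary {
    double min_usec;
    double max_usec;
    double avg_usec;
};

// Reduce per-batch latencies (microseconds) to min, max, and mean.
// An empty input yields all zeros.
inline latency_summary summarize_latencies(const std::vector<double>& usec)
{
    if (usec.empty()) { return {0.0, 0.0, 0.0}; }
    const auto mm = std::minmax_element(usec.begin(), usec.end());
    const double sum = std::accumulate(usec.begin(), usec.end(), 0.0);
    return {*mm.first, *mm.second, sum / static_cast<double>(usec.size())};
}

// End-to-end transfer rate in GB/s, assuming 1 GB = 1e9 bytes (an assumption).
inline double e2e_rate_gb_per_sec(int64_t num_bytes, double elapsed_usec)
{
    assert(elapsed_usec > 0.0);
    const double seconds = elapsed_usec / 1e6;
    return static_cast<double>(num_bytes) / 1e9 / seconds;
}
```

Submit and completion latencies would each be summarized separately, while the end-to-end time and rate describe the whole transfer.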
Usage Examples
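    #include <unistd.h>  // close()
    #include <memory>    // std::unique_ptr
    #include "deepspeed_aio_common.h"

    // block_size, queue_depth, single_submit, file_offset, num_bytes, and buffer
    // are supplied by the caller; O_DIRECT requires a suitably aligned buffer.

    // Open file with O_DIRECT for AIO (true = read); returns -1 on error
    int fd = open_file("/nvme/checkpoint.pt", true);
    if (fd != -1) {
        // Sequential AIO operation example
        std::unique_ptr<aio_context> aio_ctxt(new aio_context(block_size, queue_depth));
        std::unique_ptr<io_xfer_ctxt> xfer_ctxt(
            new io_xfer_ctxt(fd, file_offset, 0, num_bytes, buffer));
        deepspeed_aio_config_t config(block_size, queue_depth, single_submit, false, false);
        deepspeed_aio_perf_t perf;
        do_aio_operation_sequential(true, aio_ctxt, xfer_ctxt, &config, &perf);
        close(fd);

        // Validate AIO operation correctness
        bool valid = validate_aio_operation(true, "/nvme/checkpoint.pt", buffer, num_bytes);

        // Overlapped AIO operation for better performance. This is a write, so
        // it needs a descriptor opened with read_op = false.
        int out_fd = open_file("/nvme/checkpoint.pt", false);
        if (out_fd != -1) {
            std::unique_ptr<io_xfer_ctxt> write_ctxt(
                new io_xfer_ctxt(out_fd, file_offset, 0, num_bytes, buffer));
            deepspeed_aio_config_t overlap_config(block_size, queue_depth, false, true, false);
            do_aio_operation_overlap(false, aio_ctxt, write_ctxt, &overlap_config, &perf);
            close(out_fd);
        }
    }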
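One plausible way to check an AIO transfer is to re-read the file through ordinary buffered I/O and compare it byte for byte with the AIO buffer. The sketch below illustrates that idea only. `matches_file` is a hypothetical name, and the source does not confirm that `validate_aio_operation` works this way.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a validation step: returns true when the first
// num_bytes of the file equal the contents of aio_buffer.
inline bool matches_file(const std::string& filename,
                         const void* aio_buffer,
                         int64_t num_bytes)
{
    assert(aio_buffer != nullptr);
    if (num_bytes < 0) { return false; }

    std::ifstream in(filename, std::ios::binary);
    if (!in) { return false; }           // file missing or unreadable
    if (num_bytes == 0) { return true; } // nothing to compare

    // Reference copy obtained without O_DIRECT or libaio.
    std::vector<char> reference(static_cast<std::size_t>(num_bytes));
    in.read(reference.data(), static_cast<std::streamsize>(num_bytes));
    if (in.gcount() != static_cast<std::streamsize>(num_bytes)) {
        return false;                    // file shorter than expected
    }
    return std::memcmp(reference.data(), aio_buffer,
                       static_cast<std::size_t>(num_bytes)) == 0;
}
```

Because the reference read goes through the regular buffered path, a bug in the asynchronous path shows up as a mismatch rather than being reproduced in both copies.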