Implementation:Microsoft DeepSpeedExamples Fast Torch Serialization

Knowledge Sources	Microsoft_DeepSpeedExamples
Domains	Deep Learning, Checkpointing
Last Updated	2026-02-07 12:00 GMT

Overview

Patched version of PyTorch 2.6.0 serialization module with FastPersist optimizations for accelerated model checkpoint writing via DeepNVMe.

Description

serialization_fast_v2.6.0.py is a modified copy of PyTorch's torch.serialization module that enables DeepSpeed FastPersist integration for high-throughput NVMe and GDS (GPU Direct Storage) writes during model checkpointing. The module provides the standard save() and load() functions used by PyTorch for tensor serialization, along with supporting utilities for endianness control, CRC32 options, memory-mapped loading, and safe globals management.

The key difference from the original PyTorch serialization is in the save() function's storage writing path. When the file object has a save_torch_storage_object_list method (indicating a FastFileWriter handle), the module batches all storage objects together and writes them in a single call rather than writing each storage individually. File objects without that method fall back to the standard per-storage writes. The batched approach lets the DeepNVMe backend issue direct NVMe writes, with a reported speedup of 25X or more over standard filesystem writes.
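The dispatch described above can be illustrated with a minimal sketch. Only the method name save_torch_storage_object_list comes from this page. The writer classes, the write_storages helper, and the method's argument list are hypothetical stand-ins, not the module's actual code.

```python
import io


class PlainWriter:
    """Stand-in for an ordinary file object: one write call per storage."""

    def __init__(self):
        self.buffer = io.BytesIO()
        self.calls = 0

    def write(self, data):
        self.calls += 1
        return self.buffer.write(data)


class BatchingWriter(PlainWriter):
    """Hypothetical stand-in for a FastFileWriter-style handle."""

    def save_torch_storage_object_list(self, storages):
        # Assumed signature: all storage payloads arrive in a single call.
        self.calls += 1
        return self.buffer.write(b"".join(storages))


def write_storages(f, storages):
    """Pick the batched path when the handle offers it (duck typing)."""
    if hasattr(f, "save_torch_storage_object_list"):
        f.save_torch_storage_object_list(storages)
    else:
        for payload in storages:
            f.write(payload)


payloads = [b"aaaa", b"bb", b"cccccc"]
plain, fast = PlainWriter(), BatchingWriter()
write_storages(plain, payloads)
write_storages(fast, payloads)
```

Both writers end up with the same bytes. The difference is the number of calls into the backend, which is what gives a batching backend room to schedule large, contiguous writes.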

The module also provides thread-local state management via _SerializationLocal for map_location propagation, skip_data support for metadata-only saves, and fake tensor materialization. It supports both the legacy pickle-based format and the modern zipfile-based serialization format introduced in PyTorch 1.6.
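As a rough illustration of the thread-local state mentioned above, the sketch below shows how a threading.local subclass keeps a map_location setting private to the thread that set it. The class is a simplified stand-in for _SerializationLocal, and use_map_location is a hypothetical helper introduced only for this example.

```python
import threading
from contextlib import contextmanager


class SerializationLocal(threading.local):
    """Simplified stand-in: each thread gets its own copy of these fields."""

    def __init__(self):
        super().__init__()
        self.map_location = None
        self.skip_data = False


_state = SerializationLocal()


@contextmanager
def use_map_location(location):
    """Hypothetical helper: set map_location for the current thread only."""
    previous = _state.map_location
    _state.map_location = location
    try:
        yield
    finally:
        # Restore the earlier value even if the body raises.
        _state.map_location = previous


seen_in_worker = []


def worker():
    # A new thread starts from the defaults set in __init__.
    seen_in_worker.append(_state.map_location)


with use_map_location("cpu"):
    seen_in_main = _state.map_location
    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()

after = _state.map_location
```

The worker thread does not observe the value set in the main thread, so concurrent loads with different map_location values would not interfere with each other under this scheme.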

Usage

This module is used as a drop-in replacement for torch.serialization when FastPersist checkpointing is enabled. It is transparently swapped in by the DeepNVMe model checkpoint infrastructure to accelerate checkpoint saves without requiring changes to user code that calls torch.save().

Code Reference

Source Location

Repository: Microsoft_DeepSpeedExamples
File: deepnvme/model_checkpoint/torch/serialization_fast_v2.6.0.py
Lines: 1-1979

Signature

def save(
    obj: object,
    f: FILE_LIKE,
    pickle_module: Any = pickle,
    pickle_protocol: int = DEFAULT_PROTOCOL,
    _use_new_zipfile_serialization: bool = True,
    _disable_byteorder_record: bool = False,
) -> None:
    ...

def load(
    f: FILE_LIKE,
    map_location: MAP_LOCATION = None,
    pickle_module: Any = None,
    *,
    weights_only: Optional[bool] = None,
    mmap: Optional[bool] = None,
    **pickle_load_args: Any,
) -> Any:
    ...

Import

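# Assumes the file has been copied or renamed to serialization_fast_v2_6_0.py; the original file name contains dots and cannot be used in an import statement.
from deepnvme.model_checkpoint.torch.serialization_fast_v2_6_0 import save, load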
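Because the source file name contains dots, one possible way to load it without renaming is by path through importlib. The sketch below demonstrates the technique on a throwaway file. The helper load_module_from_path and the demo file are hypothetical, and whether the real module loads cleanly this way has not been verified here.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(module_name, file_path):
    """Load a Python source file under an explicit, import-safe name."""
    spec = importlib.util.spec_from_file_location(module_name, str(file_path))
    module = importlib.util.module_from_spec(spec)
    # Register before executing so code that looks the module up by name works.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    # Demo file whose name, like the real one, has dots before ".py".
    demo_file = Path(tmp) / "demo_v2.6.0.py"
    demo_file.write_text("def save(obj, f):\n    return ('saved', f)\n")
    demo = load_module_from_path("demo_v2_6_0", demo_file)
    result = demo.save({}, "ckpt.pt")
```

For the real file, the path argument would be deepnvme/model_checkpoint/torch/serialization_fast_v2.6.0.py relative to the repository root.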

I/O Contract

Inputs

Name	Type	Required	Description
obj	object	Yes	The Python object to serialize (typically a model state_dict or tensor)
f	FILE_LIKE	Yes	File path (str/PathLike) or file-like object with write capability; may be a FastFileWriter for NVMe acceleration
pickle_module	Any	No	Module used for pickling metadata (default: pickle)
pickle_protocol	int	No	Protocol version for pickle (default: 2)
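_use_new_zipfile_serialization	bool	No	Use zipfile-based format (default: True)
_disable_byteorder_record	bool	No	Skip writing the byte-order record to the checkpoint (default: False)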

Outputs

Name	Type	Description
(save)	None	Writes serialized object to the specified file
(load)	Any	The deserialized Python object (typically dict, Tensor, or Module state)

Usage Examples

Saving a Model Checkpoint with FastPersist

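import torch
# Assumes the file has been copied or renamed to serialization_fast_v2_6_0.py (dots are not valid in module names)
from deepnvme.model_checkpoint.torch import serialization_fast_v2_6_0 as fast_serial

model = torch.nn.Linear(4, 2)  # placeholder model for illustration

# A plain path uses the standard write path; the batched FastPersist path applies
# only when f is a file object that exposes save_torch_storage_object_list
model_state = model.state_dict()
fast_serial.save(model_state, "/mnt/nvme/checkpoint.pt")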

# Loading remains standard
state_dict = fast_serial.load("/mnt/nvme/checkpoint.pt", map_location="cpu")
model.load_state_dict(state_dict)
