
Implementation:Deepspeedai DeepSpeed Initialize Mesh Device

From Leeroopedia


Overview

A concrete tool, provided by the DeepSpeed library, for creating a multi-dimensional device mesh for sequence-parallel training.

Description

deepspeed.comm.initialize_mesh_device() creates a device mesh with named dimensions (typically "data_parallel" and "sequence_parallel"). It uses the communication backend to establish process groups for each dimension. The returned mesh_device object is passed to DeepSpeedConfig to correctly compute the effective world_size for the data-parallel dimension.
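To make the named dimensions concrete, the following is a minimal, library-free sketch of how a (dp_size, sp_size) mesh could partition ranks into groups. It assumes ranks are laid out in row-major order over the mesh (the last dimension varies fastest); that layout and the helper name mesh_groups are illustrative assumptions, not something this page states about DeepSpeed's internals.

```python
from itertools import product


def mesh_groups(mesh_shape, mesh_dim_names):
    """Return {dim_name: list of rank groups} for a row-major mesh of ranks.

    Hypothetical helper for illustration only; it does not call DeepSpeed.
    """
    if len(mesh_shape) != len(mesh_dim_names):
        raise ValueError("mesh_shape and mesh_dim_names must have the same length")

    # Row-major strides: the last dimension varies fastest (assumed layout).
    strides = [1] * len(mesh_shape)
    for i in range(len(mesh_shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * mesh_shape[i + 1]

    groups = {}
    for axis, name in enumerate(mesh_dim_names):
        # Fix every coordinate except `axis`, then sweep along `axis`.
        other_axes = [range(n) if a != axis else [0] for a, n in enumerate(mesh_shape)]
        groups[name] = [
            [
                sum(c * s for c, s in zip(coord, strides)) + k * strides[axis]
                for k in range(mesh_shape[axis])
            ]
            for coord in product(*other_axes)
        ]
    return groups


groups = mesh_groups((2, 4), ("data_parallel", "sequence_parallel"))
sp_groups = groups["sequence_parallel"]  # ranks that share one sequence
dp_groups = groups["data_parallel"]      # ranks that hold replicas of the same shard
dp_world_size = len(dp_groups[0])        # effective data-parallel world size
```

Under these assumptions, 8 ranks in a (2, 4) mesh give a data-parallel world size of 2 rather than 8, which is the distinction the mesh lets the configuration account for.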

The function first asserts that the DeepSpeed communication backend is initialized. It then delegates to the backend's init_device_mesh method if the backend provides one. If the backend does not support mesh device initialization, a warning is logged and None is returned. The function is typically called internally by deepspeed.initialize() when mesh_param is provided, but it can also be invoked directly.
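The control flow described above can be sketched without DeepSpeed installed. The two backend classes and the function name initialize_mesh_device_sketch are hypothetical stand-ins; only the assert, delegate, warn-and-return-None sequence is taken from the description.

```python
import logging

logger = logging.getLogger("mesh_sketch")


class BackendWithMesh:
    """Hypothetical backend that supports mesh creation."""
    name = "with_mesh"

    def is_initialized(self):
        return True

    def init_device_mesh(self, mesh_shape, mesh_dim_names):
        # A real backend would build process groups; here we echo the arguments.
        return {"shape": mesh_shape, "dim_names": mesh_dim_names}


class BackendWithoutMesh:
    """Hypothetical backend that lacks an init_device_mesh method."""
    name = "without_mesh"

    def is_initialized(self):
        return True


def initialize_mesh_device_sketch(backend, mesh_shape, mesh_dim_names):
    # Step 1: the communication backend must already be initialized.
    assert backend is not None and backend.is_initialized(), \
        "communication backend is not initialized"
    # Step 2: delegate when the backend supports mesh creation.
    if hasattr(backend, "init_device_mesh"):
        return backend.init_device_mesh(mesh_shape, mesh_dim_names)
    # Step 3: otherwise log a warning and return None.
    logger.warning("Backend %s does not support mesh device initialization", backend.name)
    return None


dims = ("data_parallel", "sequence_parallel")
supported = initialize_mesh_device_sketch(BackendWithMesh(), (2, 4), dims)
unsupported = initialize_mesh_device_sketch(BackendWithoutMesh(), (2, 4), dims)
```

Because the unsupported case returns None rather than raising, callers that invoke the function directly may want to check the result before using it.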

Code Reference

Signature

def initialize_mesh_device(mesh_shape: tuple, mesh_dim_names: tuple) -> Optional[object]

Import

from deepspeed.comm import initialize_mesh_device
# Or invoked implicitly via:
# deepspeed.initialize(mesh_param=(dp_size, sp_size))

I/O Contract

Inputs

Parameter | Type | Required | Description
mesh_shape | tuple | Yes | Shape of the mesh, e.g. (dp_size, sp_size)
mesh_dim_names | tuple | Yes | Names for each dimension, e.g. ("data_parallel", "sequence_parallel")

Outputs

Output | Type | Description
mesh_device | object or None | The mesh device object, or None if the backend does not support mesh initialization
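Since both inputs are plain tuples, it can help to check them for consistency before launching a job. The sketch below is a hypothetical pre-flight helper (check_mesh_args is not a DeepSpeed function). It assumes the mesh is meant to cover every rank exactly once, so the product of mesh_shape should equal the world size; that requirement is an assumption of this example, not part of the documented contract.

```python
import math


def check_mesh_args(mesh_shape, mesh_dim_names, world_size):
    """Hypothetical check; raises ValueError on inconsistent arguments.

    Returns a {dim_name: size} mapping when the arguments are consistent.
    """
    if len(mesh_shape) != len(mesh_dim_names):
        raise ValueError("mesh_shape and mesh_dim_names must have the same length")
    if len(set(mesh_dim_names)) != len(mesh_dim_names):
        raise ValueError("dimension names must be unique")
    if any(not isinstance(n, int) or n < 1 for n in mesh_shape):
        raise ValueError("every mesh dimension must be a positive integer")
    # Assumption: the mesh covers all ranks, so the sizes multiply to world_size.
    if math.prod(mesh_shape) != world_size:
        raise ValueError(
            f"mesh_shape {mesh_shape} covers {math.prod(mesh_shape)} ranks, "
            f"but world_size is {world_size}"
        )
    return dict(zip(mesh_dim_names, mesh_shape))


sizes = check_mesh_args((2, 4), ("data_parallel", "sequence_parallel"), world_size=8)
```

A mismatch such as mesh_shape=(2, 4) with 6 ranks raises ValueError here, before any process groups would be created.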

Usage Example

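import deepspeed

# `model` (a torch.nn.Module) and `ds_config` (a dict or a path to a JSON file)
# are assumed to be defined earlier.

# 8 GPUs: 2 data-parallel groups x 4 sequence-parallel ranks within each group
engine, _, _, _ = deepspeed.initialize(
    model=model,
    config=ds_config,
    mesh_param=(2, 4),  # (dp_size, sp_size)
)

# Or call directly. The communication backend must already be initialized,
# for example via deepspeed.init_distributed().
from deepspeed.comm import initialize_mesh_device

deepspeed.init_distributed()
mesh_device = initialize_mesh_device(
    mesh_shape=(2, 4),
    mesh_dim_names=("data_parallel", "sequence_parallel"),
)
if mesh_device is None:
    # The backend does not support mesh initialization.
    raise RuntimeError("device mesh is unavailable on this backend")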


Last updated: 2026-02-09 00:00 GMT
