
Environment:Deepspeedai DeepSpeed Multi Accelerator Environment

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Deep_Learning, Hardware_Abstraction
Last Updated 2026-02-09 00:00 GMT

Overview

Multi-accelerator support environment covering Intel XPU, Huawei NPU, Habana HPU, Cambricon MLU, Apple MPS, Tecorigin SDAA, and CPU backends.

Description

DeepSpeed provides a hardware abstraction layer through its accelerator framework (`accelerator/`), enabling the same training and inference code to run across diverse hardware backends. Each accelerator backend implements the `DeepSpeedAccelerator` abstract interface. The system auto-detects the available accelerator in a priority-ordered cascade, or the user can force a specific backend via the `DS_ACCELERATOR` environment variable. Supported backends: `cuda`, `cpu`, `xpu`, `xpu.external`, `npu`, `mps`, `hpu`, `mlu`, `sdaa`.
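The abstract-interface pattern can be illustrated with a minimal sketch. This is not DeepSpeed's actual `DeepSpeedAccelerator` class (which defines many more methods, e.g. for streams, memory stats, and op builders); it only shows the shape of the abstraction:

```python
from abc import ABC, abstractmethod

class DeepSpeedAcceleratorSketch(ABC):
    """Illustrative stand-in for DeepSpeed's DeepSpeedAccelerator interface."""

    @abstractmethod
    def device_name(self, device_index=None) -> str:
        """Return the backend device string, e.g. 'cuda:0' or 'xpu:0'."""

    @abstractmethod
    def is_available(self) -> bool:
        """Report whether the backend's hardware is usable."""

class CPUAcceleratorSketch(DeepSpeedAcceleratorSketch):
    def device_name(self, device_index=None) -> str:
        return "cpu"  # CPU backend has no per-device index

    def is_available(self) -> bool:
        return True   # CPU is always available as the fallback

acc = CPUAcceleratorSketch()
print(acc.device_name())  # cpu
```

Because every backend implements the same interface, training code written against `get_accelerator()` runs unchanged on any of the supported devices.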

Usage

Use this environment when running DeepSpeed on non-NVIDIA hardware. Each backend has specific Python package requirements that must be installed before DeepSpeed can detect and use the accelerator. The detection order is: xpu.external > xpu (IPEX) > xpu (PyTorch native) > npu > sdaa > mps > hpu > mlu > cuda > cpu (fallback).
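The cascade amounts to probing backend packages in priority order and falling back to CPU. A simplified sketch of that logic (illustrative only, not DeepSpeed's actual detection code; the torch-attribute checks for native `torch.xpu`, `mps`, and `cuda` are omitted for brevity):

```python
import importlib.util

# Priority-ordered probe list mirroring the cascade described above.
_DETECTION_ORDER = [
    ("xpu.external", "intel_extension_for_deepspeed"),
    ("xpu", "intel_extension_for_pytorch"),
    ("npu", "torch_npu"),
    ("sdaa", "torch_sdaa"),
    ("hpu", "habana_frameworks"),
    ("mlu", "torch_mlu"),
]

def detect_accelerator(order=_DETECTION_ORDER) -> str:
    for name, module in order:
        # find_spec() checks importability without actually importing
        if importlib.util.find_spec(module) is not None:
            return name
    return "cpu"  # fallback when no backend package is present

print(detect_accelerator())
```

On a machine with none of the backend packages installed, this returns `cpu`, matching DeepSpeed's fallback behavior.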

System Requirements

Category | Requirement | Notes
Intel XPU | `intel_extension_for_pytorch` (IPEX) with XPU support | Or `intel_extension_for_deepspeed` for the external XPU path; PyTorch >= 2.8 supports native `torch.xpu`
Huawei NPU | `torch_npu` package | Ascend CANN toolkit required; `ASCEND_HOME_PATH` env var
Habana HPU | `habana_frameworks.torch.hpu` | Habana SynapseAI software stack required
Cambricon MLU | `torch_mlu` package | Cambricon Neuware SDK required
Apple MPS | PyTorch with MPS support | macOS with Apple Silicon; limited functionality
Tecorigin SDAA | `torch_sdaa` package | Tecorigin SDAA hardware and drivers required
CPU | No special hardware | Fallback when no accelerator is detected

Dependencies

Intel XPU

  • `intel_extension_for_pytorch` (IPEX) with XPU support, OR
  • `intel_extension_for_deepspeed` (external XPU path)

Huawei NPU

  • `torch_npu`
  • Ascend CANN toolkit (version detected from `ascend_*_install.info`)

Habana HPU

  • `habana_frameworks` (SynapseAI)

Cambricon MLU

  • `torch_mlu`

Tecorigin SDAA

  • `torch_sdaa`

Credentials

  • `DS_ACCELERATOR`: Override accelerator auto-detection. Values: `cuda`, `cpu`, `xpu`, `xpu.external`, `npu`, `mps`, `hpu`, `mlu`, `sdaa`.
  • `ASCEND_HOME_PATH`: Path to Ascend CANN installation (NPU only).
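The `DS_ACCELERATOR` override is validated against the supported list before use. A simplified sketch of that check (the real code additionally imports and sanity-checks the chosen backend package, e.g. IPEX for `xpu` or `torch_npu` for `npu`):

```python
from typing import Optional

SUPPORTED_ACCELERATOR_LIST = ['cuda', 'cpu', 'xpu', 'xpu.external', 'npu',
                              'mps', 'hpu', 'mlu', 'sdaa']

def resolve_override(env) -> Optional[str]:
    """Return the forced accelerator name, or None to use auto-detection."""
    name = env.get("DS_ACCELERATOR")
    if name is not None and name not in SUPPORTED_ACCELERATOR_LIST:
        raise ValueError(
            f"DS_ACCELERATOR must be one of {SUPPORTED_ACCELERATOR_LIST}.")
    return name

print(resolve_override({"DS_ACCELERATOR": "xpu"}))  # xpu
```

In practice the environment passed in would be `os.environ`; an unsupported value raises a `ValueError`, as shown in the Code Evidence below.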

Quick Install

# Force a specific accelerator
DS_ACCELERATOR=xpu pip install deepspeed

# For Intel XPU (via IPEX)
pip install intel_extension_for_pytorch
pip install deepspeed

# For Huawei NPU
pip install torch_npu
pip install deepspeed

# Verify detected accelerator
python -c "from deepspeed.accelerator import get_accelerator; print(get_accelerator().device_name())"

Code Evidence

Supported accelerator list from `accelerator/real_accelerator.py:23`:

SUPPORTED_ACCELERATOR_LIST = ['cuda', 'cpu', 'xpu', 'xpu.external', 'npu', 'mps', 'hpu', 'mlu', 'sdaa']

DS_ACCELERATOR override from `accelerator/real_accelerator.py:59-111`:

if "DS_ACCELERATOR" in os.environ.keys():
    accelerator_name = os.environ["DS_ACCELERATOR"]
    if accelerator_name == "xpu":
        import intel_extension_for_pytorch as ipex
        assert ipex._C._has_xpu()
    elif accelerator_name == "npu":
        import torch_npu
    elif accelerator_name not in SUPPORTED_ACCELERATOR_LIST:
        raise ValueError(f'DS_ACCELERATOR must be one of {SUPPORTED_ACCELERATOR_LIST}.')

Auto-detection cascade from `accelerator/real_accelerator.py:114-213`:

# Detection order:
# 1. intel_extension_for_deepspeed (xpu.external)
# 2. intel_extension_for_pytorch (xpu via IPEX)
# 3. torch.xpu (native PyTorch >= 2.8, when no CUDA devices)
# 4. torch_npu (npu)
# 5. torch_sdaa (sdaa)
# 6. torch.mps (mps)
# 7. habana_frameworks.torch.hpu (hpu)
# 8. torch_mlu (mlu)
# 9. torch.cuda (cuda)
# 10. cpu (fallback)

XPU with native PyTorch >= 2.8 from `accelerator/real_accelerator.py:139-153`:

# torch.xpu will be supported in upstream pytorch-2.8.
# Currently we can run on xpu device only using pytorch,
# also reserve the old path using ipex when the torch version is old.
if hasattr(torch, 'xpu'):
    if torch.cuda.device_count() == 0:  #ignore-cuda
        if torch.xpu.device_count() > 0 and torch.xpu.is_available():
            accelerator_name = "xpu"

Common Errors

Error Message | Cause | Solution
`XPU_Accelerator requires intel_extension_for_pytorch` | IPEX not installed | `pip install intel_extension_for_pytorch`
`NPU_Accelerator requires torch_npu` | torch_npu not installed | `pip install torch_npu`
`HPU_Accelerator requires habana_frameworks.torch.hpu` | SynapseAI not installed | Install the Habana SynapseAI software stack
`MLU_Accelerator requires torch_mlu` | torch_mlu not installed | `pip install torch_mlu`
`SDAA_Accelerator requires torch_sdaa` | torch_sdaa not installed | `pip install torch_sdaa`
`MPS_Accelerator requires torch.mps` | MPS not available | Requires macOS with Apple Silicon and a compatible PyTorch build
`Setting accelerator to CPU` (warning) | No accelerator detected | Install the appropriate hardware extension package

Compatibility Notes

  • Intel XPU: Three paths exist: external (`intel_extension_for_deepspeed`), IPEX (`intel_extension_for_pytorch`), and native PyTorch >= 2.8. XPU is only auto-detected via native torch when no CUDA devices are present.
  • Triton on ROCm: Triton import is explicitly skipped on AMD ROCm due to `pytorch-triton-rocm` module breaking the device API in DeepSpeed.
  • Apple MPS: Detection uses `torch.mps.current_allocated_memory()` as a proxy since `torch.mps.is_available()` may not exist.
  • CPU Fallback: When no accelerator is detected, DeepSpeed falls back to CPU mode with a warning. This is suitable for testing but not for production training.
