Environment:Sgl project Sglang Multi Platform Accelerators

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Multi_Platform
Last Updated: 2026-02-10 00:00 GMT

Overview

SGLang supports multiple hardware accelerator platforms beyond NVIDIA CUDA: AMD ROCm (HIP), Intel XPU, Huawei Ascend NPU, Moore Threads MUSA, Habana HPU, and Intel CPU (AMX). Each platform requires its own driver stack, PyTorch variant, and platform-specific kernel library.

Description

SGLang abstracts hardware differences through a unified device detection layer in `python/sglang/srt/utils/common.py`. At startup, the runtime probes for available accelerators using platform-specific APIs (`torch.cuda`, `torch.xpu`, `torch.npu`, `torch.hpu`). Each platform has its own build configuration (`pyproject_*.toml`), attention backends, and kernel implementations. AMD ROCm uses the HIP interface through PyTorch's CUDA compatibility layer. Intel XPU requires PVC/LNL/BMG GPUs with XMX (matrix extension) support. Ascend NPU requires the CANN toolkit and `torch_npu`. Moore Threads MUSA requires `torchada`. CPU inference runs on x86 hosts with Intel AMX tile support or on ARM64 hosts.
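The probe-in-order pattern described above can be sketched as follows. This is a minimal, torch-free illustration: the probe callables are stand-ins for the real `torch.cuda`/`torch.xpu`/`torch.npu`/`torch.hpu` availability checks, and `detect_platform` is a hypothetical name, not an SGLang function.

```python
from typing import Callable, Dict

def detect_platform(probes: Dict[str, Callable[[], bool]]) -> str:
    """Return the first platform whose probe reports an available device.

    Stand-in for SGLang's per-platform is_*() checks; each probe here is a
    placeholder for a platform-specific torch availability call.
    """
    for name, probe in probes.items():
        if probe():
            return name
    return "cpu"  # fall back to the CPU engine when no accelerator is found

# Simulate a host where only an XPU is visible.
probes = {
    "cuda": lambda: False,
    "hip": lambda: False,
    "xpu": lambda: True,
    "npu": lambda: False,
    "musa": lambda: False,
}
print(detect_platform(probes))  # xpu
```

If every probe reports unavailable, the sketch falls through to `"cpu"`, mirroring the fact that the CPU engine is an explicit opt-in fallback rather than an accelerator.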

Usage

Use the appropriate platform environment when deploying SGLang on non-NVIDIA hardware. Each platform has restricted feature availability compared to CUDA; consult platform documentation for supported models and attention backends.

System Requirements

Platform      | Hardware      | Driver/Toolkit   | PyTorch Package     | Detection Tool
AMD ROCm      | MI250X/MI300X | ROCm 6.0+        | torch (ROCm build)  | `rocm-smi`
Intel XPU     | PVC/LNL/BMG   | oneAPI           | torch (XPU build)   | Intel GPU tools
Ascend NPU    | Atlas 800/910 | CANN 8.0+        | `torch_npu`         | `npu-smi`
Moore Threads | MTT S4000     | MUSA SDK         | `torchada`          | `mthreads-gmi`
Habana HPU    | Gaudi2/3      | Habana SynapseAI | `habana_frameworks` | `hl-smi`
Intel CPU     | Xeon (AMX)    | (none)           | torch (CPU)         | `SGLANG_USE_CPU_ENGINE=1`

Dependencies

AMD ROCm

  • `torch` (ROCm build from pytorch.org)
  • `sgl-kernel` (ROCm build)
  • `aiter` (AMD-specific kernel library, optional)
  • ROCm driver and `rocm-smi`

Ascend NPU

  • `torch_npu` (Ascend PyTorch adapter)
  • `sgl-kernel-npu`
  • CANN toolkit (`ASCEND_TOOLKIT_HOME` or `ASCEND_INSTALL_PATH`)
  • `torchair` (for torch.compile on NPU)
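A simple pre-flight check before an NPU launch is to confirm the CANN toolkit path is exported. The install path below is hypothetical; it varies by system and CANN version.

```shell
# Illustrative only: export the CANN toolkit location, then fail fast if unset.
export ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
echo "${ASCEND_TOOLKIT_HOME:?CANN toolkit path must be set}"
```

The `:?` parameter expansion aborts the script with an error message when the variable is empty, which surfaces a misconfigured toolkit before the server starts.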

Intel XPU

  • `torch` (XPU build via Intel Extension for PyTorch)
  • `sgl-kernel` (XPU build)
  • F64 support required (PVC/LNL/BMG only)

Intel CPU

  • `sgl-kernel` with CPU backend (`convert_weight_packed` op)
  • Intel AMX tile support (`torch._C._cpu._is_amx_tile_supported()`)
  • Set `SGLANG_USE_CPU_ENGINE=1`
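The CPU gate combines two conditions: the opt-in environment variable and hardware AMX support. A minimal sketch of that gate, using the `torch._C._cpu._is_amx_tile_supported()` probe named above (wrapped so the check degrades to `False` when torch is not installed; the helper names are illustrative):

```python
import os

def amx_supported() -> bool:
    """Best-effort AMX tile probe; returns False when torch is unavailable."""
    try:
        import torch
        return bool(torch._C._cpu._is_amx_tile_supported())
    except Exception:
        return False

def cpu_engine_enabled() -> bool:
    # Mirrors the is_cpu() gate: opt-in env var AND hardware support.
    return os.environ.get("SGLANG_USE_CPU_ENGINE", "0") == "1" and amx_supported()
```

Note the short-circuit: if `SGLANG_USE_CPU_ENGINE` is not set to `1`, the hardware probe is never evaluated.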

Moore Threads MUSA

  • `torchada` package
  • MUSA SDK and `mthreads-gmi`

Habana HPU

  • `habana_frameworks.torch.hpu`
  • `hl-smi`

Credentials

  • `ASCEND_TOOLKIT_HOME` or `ASCEND_INSTALL_PATH`: Path to CANN toolkit (NPU only)
  • `ASCEND_NPU_PHY_ID`: Physical NPU device ID (default: -1, auto-detect)
  • `SGLANG_USE_CPU_ENGINE`: Set to `1` to enable CPU backend
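For `ASCEND_NPU_PHY_ID`, the documented default of -1 means auto-detect. A hypothetical helper showing how that default plays out:

```python
import os

def resolve_npu_phy_id() -> int:
    """Illustrative helper: -1 (the documented default) means auto-detect."""
    return int(os.environ.get("ASCEND_NPU_PHY_ID", "-1"))

print(resolve_npu_phy_id())  # -1 when the variable is unset (auto-detect)
os.environ["ASCEND_NPU_PHY_ID"] = "2"
print(resolve_npu_phy_id())  # 2: pin the runtime to physical NPU 2
```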

Quick Install

# AMD ROCm
pip install sglang --find-links https://flashinfer.ai/whl/rocm/

# Ascend NPU
pip install sglang-npu

# Intel XPU
pip install sglang-xpu

# CPU only
pip install sglang-cpu
SGLANG_USE_CPU_ENGINE=1 python -m sglang.launch_server --model meta-llama/Meta-Llama-3-8B

Code Evidence

Platform detection from `python/sglang/srt/utils/common.py:108-195`:

import os
from functools import lru_cache

import torch

@lru_cache(maxsize=1)
def is_hip() -> bool:
    return torch.version.hip is not None

@lru_cache(maxsize=1)
def is_hpu() -> bool:
    return hasattr(torch, "hpu") and torch.hpu.is_available()

@lru_cache(maxsize=1)
def is_xpu() -> bool:
    return hasattr(torch, "xpu") and torch.xpu.is_available()

@lru_cache(maxsize=1)
def is_npu() -> bool:
    if not hasattr(torch, "npu"):
        return False
    if not torch.npu.is_available():
        raise RuntimeError("torch_npu detected, but NPU device is not available or visible.")
    return True

@lru_cache(maxsize=1)
def is_cpu() -> bool:
    is_host_cpu_supported = is_host_cpu_x86() or is_host_cpu_arm64()
    return os.getenv("SGLANG_USE_CPU_ENGINE", "0") == "1" and is_host_cpu_supported

@lru_cache(maxsize=1)
def is_musa() -> bool:
    try:
        import torchada  # noqa: F401 -- availability probe only
    except ImportError:
        return False
    return hasattr(torch.version, "musa") and torch.version.musa is not None
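The `is_musa()` check above uses a try/import probe for an optional vendor package. The same pattern can be written without triggering the import at all, via `importlib.util.find_spec` (a general sketch, not SGLang code):

```python
import importlib.util

def package_available(name: str) -> bool:
    # Generalizes the is_musa() probe: detect an optional vendor adapter
    # package without actually importing it.
    return importlib.util.find_spec(name) is not None

print(package_available("json"))      # True: stdlib module is always present
print(package_available("torchada"))  # False unless the MUSA adapter is installed
```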

HIP-specific FP8 handling from `python/sglang/srt/utils/common.py:113-117`:

if is_hip():
    HIP_FP8_E4M3_FNUZ_MAX = 224.0
    FP8_E4M3_MAX = HIP_FP8_E4M3_FNUZ_MAX
else:
    FP8_E4M3_MAX = torch.finfo(torch.float8_e4m3fn).max
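The platform-dependent maximum matters whenever a per-tensor quantization scale is derived from it. A torch-free sketch (the 448.0 constant is `torch.finfo(torch.float8_e4m3fn).max` on CUDA; `fp8_scale` is an illustrative helper, not an SGLang function):

```python
HIP_FP8_E4M3_FNUZ_MAX = 224.0  # ROCm E4M3_FNUZ format
CUDA_FP8_E4M3_MAX = 448.0      # torch.finfo(torch.float8_e4m3fn).max

def fp8_scale(amax: float, on_hip: bool) -> float:
    """Per-tensor scale so that the largest magnitude maps to the FP8 max."""
    fp8_max = HIP_FP8_E4M3_FNUZ_MAX if on_hip else CUDA_FP8_E4M3_MAX
    return fp8_max / amax

print(fp8_scale(2.0, on_hip=True))   # 112.0
print(fp8_scale(2.0, on_hip=False))  # 224.0
```

Using the wrong maximum on ROCm would overflow the FNUZ representable range, which is why the constant is swapped at import time.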

NPU memory query from `python/sglang/srt/utils/common.py:1782-1788`:

def get_npu_memory_capacity():
    try:
        import torch_npu  # noqa: F401 -- registers the torch.npu backend

        return torch.npu.mem_get_info()[1] // 1024 // 1024  # total device memory in MiB
    except ImportError as e:
        raise ImportError("torch_npu is required when run on npu device.") from e
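The integer arithmetic in the return statement converts the byte count from `mem_get_info()` into MiB. Isolated for clarity (helper name is illustrative):

```python
def bytes_to_mib(total_bytes: int) -> int:
    # Same integer division chain as get_npu_memory_capacity above.
    return total_bytes // 1024 // 1024

print(bytes_to_mib(64 * 1024**3))  # 65536, i.e. a 64 GiB device
```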

Common Errors

Error Message | Cause | Solution
`torch_npu detected, but NPU device is not available or visible` | NPU driver issue | Check CANN toolkit installation and `npu-smi`
`torch_npu is required when run on npu device` | torch_npu not installed | `pip install torch_npu`
`rocm-smi not found` | ROCm drivers missing | Install ROCm 6.0+ drivers
`mthreads-gmi not found` | Moore Threads drivers missing | Install MUSA SDK
`hl-smi not found` | Habana drivers missing | Install SynapseAI runtime
`aiter is AMD specific kernel library` | aiter not installed on AMD | `pip install aiter` on an AMD ROCm system
`NPU detected, but torchair package is not installed` | torchair missing for torch.compile | `pip install torchair`
`No accelerator (CUDA, XPU, HPU, NPU, MUSA) is available` | No supported hardware detected | Install the appropriate driver and PyTorch variant

Compatibility Notes

  • AMD ROCm: Uses HIP through PyTorch's CUDA compatibility layer. FP8 uses `E4M3_FNUZ` format (max 224.0) instead of CUDA's `E4M3FN`. Attention backends: `triton`, `aiter`, `wave`.
  • Intel XPU: Requires PVC/LNL/BMG GPUs with F64 support for XMX acceleration. Attention backend: `intel_xpu`.
  • Ascend NPU: Requires CANN toolkit. Supports env vars for multi-stream (`SGLANG_NPU_USE_MULTI_STREAM`) and MLAPo (`SGLANG_NPU_USE_MLAPO`). Attention backend: `ascend`.
  • Intel CPU: Requires Intel AMX tile support. Dimension constraints: output channels % 16 == 0, input channels % 32 == 0. Backend: `intel_amx`.
  • Moore Threads MUSA: Early support. Requires `torchada` package.
  • Habana HPU: Requires `habana_frameworks.torch.hpu` import.
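The `intel_amx` dimension constraints in the Intel CPU note above can be expressed as a simple predicate (the helper name is hypothetical; the modulus values are the ones stated in the note):

```python
def amx_weight_shape_ok(out_channels: int, in_channels: int) -> bool:
    # intel_amx backend constraint: output channels % 16 == 0,
    # input channels % 32 == 0.
    return out_channels % 16 == 0 and in_channels % 32 == 0

print(amx_weight_shape_ok(4096, 4096))  # True
print(amx_weight_shape_ok(1000, 4096))  # False: 1000 is not a multiple of 16
```

Weights that fail this check would need padding to the next aligned size before the CPU backend can pack them.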
