Environment:Vllm project Vllm Intel XPU

Knowledge Sources	vllm, Intel Extension for PyTorch
Domains	GPU_Computing, Intel_XPU
Last Updated	2026-02-08 00:00 GMT

Overview

This page describes the Intel XPU (GPU) runtime environment for vLLM, which uses Intel Extension for PyTorch (IPEX) to run LLM inference on Intel discrete GPUs: the Data Center GPU Max series (Ponte Vecchio) and Intel Arc-based GPUs.

Description

This environment defines the software stack required to run vLLM on Intel discrete GPUs using the XPU device abstraction. Intel's XPU platform uses the oneAPI programming model and the Level Zero runtime for GPU compute. The Intel Extension for PyTorch (IPEX) provides the bridge between PyTorch's device abstraction and Intel GPU hardware, implementing custom operators for attention, GEMM, and element-wise operations optimized for Intel's Xe GPU architecture. vLLM's IPEX operations module (ipex_ops) wraps IPEX-specific kernel calls for paged attention, rotary positional embeddings, layer normalization, and activation functions. The XPU backend registers as a vLLM platform via the platform detection system, enabling automatic selection when Intel GPUs are detected.
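
The detection step described above can be sketched roughly as follows. This is not vLLM's actual platform-detection code; it is a minimal illustration that assumes the installed PyTorch build exposes the `torch.xpu` namespace (`is_available()` and `device_count()`). The helper names `detect_xpu_device_count` and `select_device` are hypothetical, and the sketch falls back to "cpu" when PyTorch or XPU support is missing.

```python
def detect_xpu_device_count() -> int:
    """Return the number of visible Intel XPU devices, or 0 if none are usable."""
    try:
        import torch  # may be absent, or a CPU-only build
    except ImportError:
        return 0

    # Assumption: XPU-enabled builds expose torch.xpu; other builds may not.
    xpu = getattr(torch, "xpu", None)
    if xpu is None or not xpu.is_available():
        return 0
    return int(xpu.device_count())


def select_device() -> str:
    """Pick "xpu" when at least one Intel GPU is visible, otherwise "cpu"."""
    return "xpu" if detect_xpu_device_count() > 0 else "cpu"


if __name__ == "__main__":
    print(f"XPU devices: {detect_xpu_device_count()}, selected: {select_device()}")
```

On a machine without an Intel GPU or without an XPU-enabled PyTorch build, the sketch reports zero devices rather than raising, which makes it usable as a quick sanity check before launching vLLM.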

Usage

To use vLLM with Intel XPU, install the Intel GPU driver, the oneAPI Base Toolkit, and Intel Extension for PyTorch. Set VLLM_TARGET_DEVICE=xpu during vLLM installation to compile XPU-specific extensions. At runtime, vLLM auto-detects Intel GPUs through IPEX's device enumeration and routes tensor operations through the XPU backend. Multi-GPU inference on Intel hardware uses the oneCCL (oneAPI Collective Communications Library) backend for distributed communication.
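
As an illustration of the build-time setting, the sketch below prepares an environment with `VLLM_TARGET_DEVICE=xpu` and prints the install command without running it. Only the variable name and value come from this page; the helper names are hypothetical, and the use of `pip install` against a local vLLM source checkout is an assumption that may differ from the project's current installation instructions.

```python
import os
import shlex
import sys


def xpu_build_env(base_env=None):
    """Return a copy of the environment with the XPU build target set."""
    env = dict(os.environ if base_env is None else base_env)
    env["VLLM_TARGET_DEVICE"] = "xpu"  # selects the XPU-specific extensions
    return env


def install_command(source_dir="."):
    """Build the install command (assumption: pip install from a source checkout)."""
    return [sys.executable, "-m", "pip", "install", "-v", source_dir]


if __name__ == "__main__":
    env = xpu_build_env()
    cmd = install_command()
    # Dry run: show what would be executed instead of starting a build.
    print(f"VLLM_TARGET_DEVICE={env['VLLM_TARGET_DEVICE']} {shlex.join(cmd)}")
    # To actually build, one could pass the same pair to subprocess:
    #   subprocess.run(cmd, env=env, check=True)
```

Copying the environment instead of mutating `os.environ` keeps the setting scoped to the build subprocess, so other tools started from the same session are unaffected.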

Requirements

Requirement	Value
GPU Hardware	Intel Data Center GPU Max series (Ponte Vecchio) or Intel Arc series
Intel GPU Driver	Latest stable Intel GPU driver for Linux
oneAPI Base Toolkit	2024.0+ (includes Level Zero runtime, oneMKL, oneCCL)
Intel Extension for PyTorch (IPEX)	Compatible version matching PyTorch release
PyTorch	XPU-enabled build of PyTorch
Python	>= 3.10
Operating System	Linux (Ubuntu 22.04+ or SUSE Linux Enterprise)
Level Zero Runtime	Included with oneAPI toolkit
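
The Python and operating-system rows of the table can be checked before installation with a small preflight script. This is a hedged sketch: the function name `check_requirements` is hypothetical, it covers only the two rows that can be verified from the standard library, and it does not confirm the distribution, driver, oneAPI, or IPEX versions.

```python
import platform
import sys

MIN_PYTHON = (3, 10)  # from the Requirements table: Python >= 3.10


def check_requirements(python_version=None, os_name=None):
    """Return a list of human-readable problems; an empty list means both checks pass."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])
    if os_name is None:
        os_name = platform.system()

    problems = []
    if tuple(python_version) < MIN_PYTHON:
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python >= 3.10 is required, found {found}")
    if os_name != "Linux":
        problems.append(f"Linux is required, found {os_name}")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    print("OK" if not issues else "\n".join(issues))
```

The optional arguments exist only so the check can be exercised with example values; calling `check_requirements()` with no arguments inspects the running interpreter and host.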

Semantic Links

Implementation:Vllm_project_Vllm_IPEX_Ops
