Implementation:Ggml org Ggml Sycl backend api
Metadata
| Field | Value |
|---|---|
| Page Type | Implementation (API Doc) |
| Knowledge Sources | GGML |
| Domains | ML_Infrastructure, Tensor_Computing, GPU_Computing |
| Last Updated | 2025-05-15 12:00 GMT |
Overview
Public C header declaring the Intel SYCL backend interface for running tensor operations on Intel GPUs and other SYCL-compatible accelerators.
Description
ggml-sycl.h (49 lines) declares the SYCL backend's public API. It is released under the MIT license by Intel Corporation. It provides:
- Constants: GGML_SYCL_NAME = "SYCL" and GGML_SYCL_MAX_DEVICES = 48 (the highest device limit among GGML backends).
- Backend initialization: ggml_backend_sycl_init(device) creates a backend for a specific SYCL device by index.
- Type checking: ggml_backend_is_sycl(backend) verifies if a backend is SYCL-based.
- Buffer types:
  - ggml_backend_sycl_buffer_type(device) -- device-specific buffer type
  - ggml_backend_sycl_split_buffer_type(tensor_split) -- split buffer type that distributes matrices by rows across multiple devices for tensor parallelism
  - ggml_backend_sycl_host_buffer_type() -- pinned host buffer for faster CPU-GPU transfers
- Device enumeration and queries:
  - ggml_backend_sycl_print_sycl_devices -- prints all available SYCL devices
  - ggml_backend_sycl_get_gpu_list -- retrieves GPU device IDs
  - ggml_backend_sycl_get_device_description -- gets human-readable device description
  - ggml_backend_sycl_get_device_count -- returns number of available SYCL devices
  - ggml_backend_sycl_get_device_memory -- queries free and total device memory
- Registration: ggml_backend_sycl_reg() returns the registration handle.
Note: SYCL does not support host memory registration; the header keeps the corresponding declarations commented out for reference.
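The enumeration and memory queries listed above are often combined to choose a device at startup. The sketch below shows one possible selection policy (most free memory) as a pure helper. pick_device_most_free is a hypothetical name, not part of ggml-sycl.h, and the caller is assumed to have filled free_bytes[i] by calling ggml_backend_sycl_get_device_memory(i, &free, &total) for each index below ggml_backend_sycl_get_device_count().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of ggml-sycl.h): returns the index of the
 * entry with the most free memory, or -1 if there are no devices.
 * free_bytes[i] is assumed to hold the free-memory value reported for
 * SYCL device i. Ties go to the lowest index. */
static int pick_device_most_free(const size_t *free_bytes, int n_devices) {
    if (n_devices <= 0) {
        return -1;
    }
    assert(free_bytes != NULL); /* a non-empty list must be backed by storage */
    int best = 0;
    for (int i = 1; i < n_devices; ++i) {
        if (free_bytes[i] > free_bytes[best]) {
            best = i;
        }
    }
    return best;
}
```

The returned index could then be passed to ggml_backend_sycl_init. Other policies (for example, preferring a specific device description) would fit the same shape.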
Usage
Include this header to use the SYCL backend for inference on Intel GPUs via the oneAPI programming model. The backend supports multi-device tensor parallelism via split buffer types.
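To illustrate what distributing a matrix by rows means, the sketch below converts per-device ratios into contiguous row ranges. It is an illustration only: split_rows and row_range are hypothetical names, and the real backend may place boundaries differently (for example, rounding them to a block granularity).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open range [begin, end) of matrix rows assigned to one device. */
typedef struct {
    int64_t begin;
    int64_t end;
} row_range;

/* Hypothetical illustration (not the backend's actual code): divide n_rows
 * among n_devices in proportion to ratios[]. Each boundary sits at the
 * cumulative fraction of the ratio sum, truncated to a whole row; the last
 * device always ends at n_rows so no row is dropped.
 * Returns 0 on success, -1 on invalid input. */
static int split_rows(const float *ratios, int n_devices, int64_t n_rows,
                      row_range *out) {
    if (ratios == NULL || out == NULL || n_devices <= 0 || n_rows < 0) {
        return -1;
    }
    double total = 0.0;
    for (int i = 0; i < n_devices; ++i) {
        if (!isfinite(ratios[i]) || ratios[i] < 0.0f) {
            return -1; /* reject NaN, infinity, and negative ratios */
        }
        total += ratios[i];
    }
    if (total <= 0.0) {
        return -1; /* all-zero ratios give no basis for a split */
    }
    double cumulative = 0.0;
    int64_t begin = 0;
    for (int i = 0; i < n_devices; ++i) {
        cumulative += ratios[i];
        int64_t end = (i == n_devices - 1)
                          ? n_rows
                          : (int64_t) ((double) n_rows * (cumulative / total));
        assert(end >= begin && end <= n_rows);
        out[i].begin = begin;
        out[i].end = end;
        begin = end;
    }
    return 0;
}
```

With ratios { 0.5, 0.5 } and 4096 rows, this sketch assigns rows [0, 2048) to the first device and [2048, 4096) to the second.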
Code Reference
Source Location
GGML repo, file: include/ggml-sycl.h (49 lines).
Signatures
#define GGML_SYCL_NAME "SYCL"
#define GGML_SYCL_MAX_DEVICES 48
GGML_BACKEND_API ggml_backend_t ggml_backend_sycl_init(int device);
GGML_BACKEND_API bool ggml_backend_is_sycl(ggml_backend_t backend);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_sycl_buffer_type(int device);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_sycl_split_buffer_type(const float * tensor_split);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_sycl_host_buffer_type(void);
GGML_BACKEND_API void ggml_backend_sycl_print_sycl_devices(void);
GGML_BACKEND_API void ggml_backend_sycl_get_gpu_list(int * id_list, int max_len);
GGML_BACKEND_API void ggml_backend_sycl_get_device_description(int device, char * description, size_t description_size);
GGML_BACKEND_API int ggml_backend_sycl_get_device_count(void);
GGML_BACKEND_API void ggml_backend_sycl_get_device_memory(int device, size_t * free, size_t * total);
GGML_BACKEND_API ggml_backend_reg_t ggml_backend_sycl_reg(void);
Import
#include "ggml-sycl.h"
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| device | int | Yes | SYCL device index (0-based) for backend initialization, buffer type, and device queries. |
| backend | ggml_backend_t | Yes | Backend handle for type checking. |
| tensor_split | const float * | Yes | Array of split ratios across devices (for split_buffer_type). |
| id_list | int * | Yes | Output array for GPU device IDs (for get_gpu_list). |
| max_len | int | Yes | Maximum number of entries to write to id_list. |
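The tensor_split pointer carries no length, so the caller has to know how many entries the backend will read. A conservative approach, assuming the implementation may scan up to GGML_SYCL_MAX_DEVICES entries, is to pass a zero-padded array of that size. make_tensor_split below is a hypothetical helper, and the fallback #define exists only so the sketch compiles without the ggml headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fallback so the sketch compiles standalone; when ggml-sycl.h is included
 * first, the header's own definition is used instead. */
#ifndef GGML_SYCL_MAX_DEVICES
#define GGML_SYCL_MAX_DEVICES 48
#endif

/* Hypothetical helper (not part of ggml-sycl.h): copy n ratios into a
 * GGML_SYCL_MAX_DEVICES-sized array and zero the remaining entries.
 * Returns 0 on success, -1 if more ratios are given than the array holds. */
static int make_tensor_split(const float *ratios, size_t n,
                             float out[GGML_SYCL_MAX_DEVICES]) {
    assert(ratios != NULL && out != NULL); /* null pointers are caller bugs */
    if (n > (size_t) GGML_SYCL_MAX_DEVICES) {
        return -1;
    }
    /* All-bits-zero is 0.0f on IEEE 754 platforms, which is assumed here. */
    memset(out, 0, sizeof(float) * GGML_SYCL_MAX_DEVICES);
    memcpy(out, ratios, sizeof(float) * n);
    return 0;
}
```

The padded array can then be passed directly to ggml_backend_sycl_split_buffer_type.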
Outputs
| Output | Type | Description |
|---|---|---|
| Backend handle | ggml_backend_t | Initialized SYCL backend for the specified device. |
| Type check | bool | true if the backend is SYCL-based. |
| Buffer type | ggml_backend_buffer_type_t | Device, split, or host buffer type handle. |
| Device count | int | Number of available SYCL devices. |
| Device memory | via output params | Free and total memory on the specified device. |
Usage Examples
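#include <stdio.h>
#include "ggml-backend.h"
#include "ggml-sycl.h"
int main(void) {
    // List available SYCL devices
    ggml_backend_sycl_print_sycl_devices();
    int n_devices = ggml_backend_sycl_get_device_count();
    if (n_devices < 1) return 1; // nothing to run on
    // Initialize backend on device 0
    ggml_backend_t backend = ggml_backend_sycl_init(0);
    if (backend == NULL) return 1;
    // Query device memory
    size_t free_mem = 0, total_mem = 0;
    ggml_backend_sycl_get_device_memory(0, &free_mem, &total_mem);
    printf("device 0: %zu free of %zu total\n", free_mem, total_mem);
    // For multi-GPU: create split buffer type. The array is sized to GGML_SYCL_MAX_DEVICES
    // (unused entries zero) on the assumption that the backend may read that many entries.
    float tensor_split[GGML_SYCL_MAX_DEVICES] = { 0.5f, 0.5f }; // 50/50 split across 2 devices
    ggml_backend_buffer_type_t split_buft = ggml_backend_sycl_split_buffer_type(tensor_split);
    (void) split_buft; // would be used to allocate weight buffers
    ggml_backend_free(backend); // release the backend when done
    return 0;
}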