Implementation:Ggml org Ggml Rpc backend api

From Leeroopedia


Metadata

Field | Value
Page Type | Implementation (API Doc)
Knowledge Sources | GGML
Domains | ML_Infrastructure, Tensor_Computing, Distributed_Computing
Last Updated | 2025-05-15 12:00 GMT

Overview

Public C header declaring the RPC (Remote Procedure Call) backend interface for offloading tensor computation to remote servers over a network.

Description

ggml-rpc.h declares the RPC backend's public API (30 lines). It provides:

  1. Protocol version constants: RPC_PROTO_MAJOR_VERSION = 3, RPC_PROTO_MINOR_VERSION = 6, RPC_PROTO_PATCH_VERSION = 0. Client and server exchange these during the HELLO handshake so that a mismatched build can be detected before any tensor data is sent.
  2. Server limit: GGML_RPC_MAX_SERVERS = 16 -- maximum number of remote servers that can be registered simultaneously.
  3. Client functions:
    • ggml_backend_rpc_init -- connects to a remote server at a given endpoint and device index
    • ggml_backend_is_rpc -- type-checks whether a backend is RPC-based
    • ggml_backend_rpc_buffer_type -- returns the buffer type for remote memory allocation
    • ggml_backend_rpc_get_device_memory -- queries free and total memory on a remote device
  4. Server function:
    • ggml_backend_rpc_start_server -- starts an RPC server exposing local backend devices over the network, with optional data caching
  5. Registration:
    • ggml_backend_rpc_reg -- returns the backend registration handle
    • ggml_backend_rpc_add_server -- dynamically registers a new remote server endpoint
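
The version constants in item 1 can be illustrated with a small compatibility check. This is a minimal sketch, not the check ggml performs: the function rpc_check_version, the enum rpc_version_status, and the policy itself (a major mismatch is fatal, any other difference is only a warning) are assumptions made for illustration. The macro values are copied from the header so that the snippet compiles without ggml.

```c
#include <assert.h>

/* Values as documented for ggml-rpc.h, repeated here so the sketch compiles alone. */
#define RPC_PROTO_MAJOR_VERSION 3
#define RPC_PROTO_MINOR_VERSION 6
#define RPC_PROTO_PATCH_VERSION 0

/* Hypothetical result type; not part of ggml-rpc.h. */
typedef enum {
    RPC_VERSION_OK,           /* exact match */
    RPC_VERSION_WARN,         /* same major, different minor or patch */
    RPC_VERSION_INCOMPATIBLE  /* different major */
} rpc_version_status;

/* ASSUMED policy: only a major-version mismatch is treated as fatal.
 * The arguments are the version numbers reported by the peer. */
rpc_version_status rpc_check_version(int major, int minor, int patch) {
    assert(major >= 0 && minor >= 0 && patch >= 0);
    if (major != RPC_PROTO_MAJOR_VERSION) {
        return RPC_VERSION_INCOMPATIBLE;
    }
    if (minor != RPC_PROTO_MINOR_VERSION || patch != RPC_PROTO_PATCH_VERSION) {
        return RPC_VERSION_WARN;
    }
    return RPC_VERSION_OK;
}
```

A caller would feed in the three numbers received in the peer's HELLO reply and refuse the connection only on RPC_VERSION_INCOMPATIBLE. The policy ggml itself applies may be stricter.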

Usage

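Include this header to use the RPC backend for distributed inference. A client calls ggml_backend_rpc_init with a server endpoint and a device index; GGML operations issued on the returned backend are forwarded to that server over the network. A server process calls ggml_backend_rpc_start_server to expose its local backend devices.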

Code Reference

Source Location

GGML repo, file: include/ggml-rpc.h (30 lines).

Signatures

#define RPC_PROTO_MAJOR_VERSION    3
#define RPC_PROTO_MINOR_VERSION    6
#define RPC_PROTO_PATCH_VERSION    0
#define GGML_RPC_MAX_SERVERS       16

GGML_BACKEND_API ggml_backend_t ggml_backend_rpc_init(const char * endpoint, uint32_t device);
GGML_BACKEND_API bool ggml_backend_is_rpc(ggml_backend_t backend);
GGML_BACKEND_API ggml_backend_buffer_type_t ggml_backend_rpc_buffer_type(const char * endpoint, uint32_t device);
GGML_BACKEND_API void ggml_backend_rpc_get_device_memory(const char * endpoint, uint32_t device, size_t * free, size_t * total);
GGML_BACKEND_API void ggml_backend_rpc_start_server(const char * endpoint, const char * cache_dir,
                                                    size_t n_threads, size_t n_devices, ggml_backend_dev_t * devices);
GGML_BACKEND_API ggml_backend_reg_t ggml_backend_rpc_reg(void);
GGML_BACKEND_API ggml_backend_reg_t ggml_backend_rpc_add_server(const char * endpoint);
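
Because GGML_RPC_MAX_SERVERS caps the number of registered servers at 16, an application that collects endpoints (for example from a command line) may want to enforce the same bound before calling ggml_backend_rpc_add_server. The following is a minimal sketch of such client-side bookkeeping. The type endpoint_list and the function endpoint_list_add are hypothetical and not part of ggml-rpc.h, and whether ggml deduplicates endpoints internally is not stated in the header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GGML_RPC_MAX_SERVERS 16  /* value documented for ggml-rpc.h */

/* Hypothetical application-side list of endpoints; not a ggml type. */
typedef struct {
    const char * endpoints[GGML_RPC_MAX_SERVERS]; /* borrowed strings, must outlive the list */
    size_t       count;
} endpoint_list;

/* Returns true if the endpoint is in the list after the call (newly added or
 * already present), false if the list is full and the endpoint is new. */
bool endpoint_list_add(endpoint_list * list, const char * endpoint) {
    assert(list != NULL && endpoint != NULL);
    for (size_t i = 0; i < list->count; i++) {
        if (strcmp(list->endpoints[i], endpoint) == 0) {
            return true; /* duplicate: nothing to do */
        }
    }
    if (list->count >= GGML_RPC_MAX_SERVERS) {
        return false; /* would exceed the documented limit */
    }
    list->endpoints[list->count++] = endpoint;
    return true;
}
```

In real use, each endpoint accepted by endpoint_list_add would then be passed to ggml_backend_rpc_add_server.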

Import

#include "ggml-rpc.h"

I/O Contract

Inputs

Parameter | Type | Required | Description
endpoint | const char * | Yes | Network address in host:port format for the RPC server.
device | uint32_t | Yes | Device index on the remote server (0-based).
cache_dir | const char * | No | Server-side cache directory for tensor data deduplication.
n_threads | size_t | Yes | Number of server worker threads.
n_devices | size_t | Yes | Number of local backend devices to expose.
devices | ggml_backend_dev_t * | Yes | Array of local backend devices to serve remotely.
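
Since endpoint is a plain host:port string, it can be useful to validate it before handing it to the RPC functions. The sketch below shows one way to do that. The helper split_endpoint is hypothetical and not part of ggml-rpc.h; it assumes the port is a decimal number in the range 1 to 65535 and splits at the last colon, and it makes no claim about how ggml parses the string internally.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits "host:port" at the last ':'.
 * On success copies the host into `host` (NUL-terminated), stores the port
 * in `*port`, and returns true. Returns false on malformed input or if the
 * host does not fit into `host_size` bytes. */
bool split_endpoint(const char * endpoint, char * host, size_t host_size, int * port) {
    assert(endpoint != NULL && host != NULL && port != NULL);

    const char * colon = strrchr(endpoint, ':');
    if (colon == NULL || colon == endpoint) {
        return false; /* no separator, or empty host part */
    }

    size_t host_len = (size_t)(colon - endpoint);
    if (host_len >= host_size) {
        return false; /* host would not fit with its terminator */
    }

    /* Require the port to start with a digit (rejects signs and spaces). */
    if (!isdigit((unsigned char)colon[1])) {
        return false;
    }
    char * end = NULL;
    long value = strtol(colon + 1, &end, 10);
    if (*end != '\0' || value < 1 || value > 65535) {
        return false; /* trailing garbage or out-of-range port */
    }

    memcpy(host, endpoint, host_len);
    host[host_len] = '\0';
    *port = (int)value;
    return true;
}
```

For the endpoint used elsewhere on this page, "192.168.1.100:50052", the helper yields host "192.168.1.100" and port 50052.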

Outputs

Output | Type | Description
Backend handle | ggml_backend_t | RPC client backend proxying all operations to the remote server.
Type check | bool | true if the backend is RPC-based.
Buffer type | ggml_backend_buffer_type_t | Buffer type for remote memory allocation.
Device memory | via output params (size_t * free, size_t * total) | Free and total memory on the remote device.
Registration | ggml_backend_reg_t | Registration handle for the backend system.

Usage Examples

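#include <stdio.h>
#include "ggml-rpc.h"   // also pulls in ggml-backend.h

int main(void) {
    const char * endpoint = "192.168.1.100:50052";

    // Connect to device 0 on a remote server
    ggml_backend_t backend = ggml_backend_rpc_init(endpoint, 0);
    if (backend == NULL) return 1;  // defensive check

    // Query remote memory
    size_t free_mem = 0, total_mem = 0;
    ggml_backend_rpc_get_device_memory(endpoint, 0, &free_mem, &total_mem);
    printf("free: %zu, total: %zu bytes\n", free_mem, total_mem);

    // Dynamically register another server
    ggml_backend_reg_t reg = ggml_backend_rpc_add_server("192.168.1.101:50052");
    (void) reg;  // handle unused in this snippet

    ggml_backend_free(backend);
    return 0;
}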
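
When several servers are registered, the free and total values reported by ggml_backend_rpc_get_device_memory can drive a simple placement decision. The sketch below picks the device with the most free memory that can still hold a given number of bytes. The struct rpc_device_memory and the function pick_device are hypothetical; the sketch works on plain numbers so that it compiles without ggml, and in real use each record would be filled from one ggml_backend_rpc_get_device_memory call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record holding the result of one memory query. */
typedef struct {
    const char * endpoint;    /* "host:port" of the server that was queried */
    size_t       free_bytes;  /* value written to *free by the query */
    size_t       total_bytes; /* value written to *total by the query */
} rpc_device_memory;

/* Returns the index of the device with the most free memory among those
 * with at least `required_bytes` free, or -1 if none qualifies.
 * Ties are resolved in favour of the earlier entry. */
int pick_device(const rpc_device_memory * devs, size_t n, size_t required_bytes) {
    assert(devs != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].free_bytes < required_bytes) {
            continue; /* not enough room on this device */
        }
        if (best < 0 || devs[i].free_bytes > devs[(size_t)best].free_bytes) {
            best = (int)i;
        }
    }
    return best;
}
```

This is a greedy heuristic for illustration only; it ignores network latency and the compute capability of each device.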
