Implementation:Kserve Kserve RDMA Network Configuration
| Knowledge Sources | |
|---|---|
| Domains | Networking, HPC, GPU_Computing |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Concrete YAML pattern for provisioning SR-IOV RDMA network interfaces for disaggregated LLM serving with KV cache transfer.
Description
The network-roce.yaml manifest creates SriovNetworkNodePolicy and SriovNetwork resources for two physical NIC ports (p2, p13) on Mellanox/NVIDIA ConnectX adapters. Each SriovNetworkNodePolicy configures 8 SR-IOV virtual functions with RDMA enabled and jumbo frames (MTU 9000). Each SriovNetwork exposes those virtual functions as a pod network and uses Whereabouts IPAM for IP assignment. The excerpt under Signature shows the p2 pair only.
Usage
Apply this manifest on clusters with Mellanox ConnectX NICs, the SR-IOV Network Operator installed, and nodes labeled by Node Feature Discovery. Pods attach to the resulting networks through the Multus annotation k8s.v1.cni.cncf.io/networks.
Code Reference
Source Location
- Repository: kserve
- File: docs/samples/llmisvc/dp-ep/deepseek-r1-gpu-rdma-roce/network-roce.yaml, Lines 1-92
Signature
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: p2
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  isRdma: true
  linkType: eth
  mtu: 9000
  nicSelector:
    vendor: "15b3" # Mellanox/NVIDIA
    pfNames: ["ens6f0np0#0-7"]
  nodeSelector:
    feature.node.kubernetes.io/rdma.available: "true"
    feature.node.kubernetes.io/pci-15b3.sriov.capable: "true"
  numVfs: 8
  resourceName: p2rdma
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: roce-p2
  namespace: openshift-sriov-network-operator
spec:
  ipam: '{"type": "whereabouts", "range": "10.0.132.0/24"}'
  networkNamespace: default
  resourceName: p2rdma
  spoofChk: "off"
  trust: "on"
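The description mentions a second port, p13, that the excerpt does not show. A minimal sketch of what its policy and network pair might look like, mirroring the p2 pair, follows. The resource name p13rdma and both REPLACE_WITH_... placeholders are assumptions for illustration only; the real PF name and address range must be taken from network-roce.yaml itself.

```yaml
# Hypothetical p13 counterpart, modeled on the p2 pair above.
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: p13
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  isRdma: true
  linkType: eth
  mtu: 9000
  nicSelector:
    vendor: "15b3"
    # Placeholder: substitute the physical function name of the second port.
    pfNames: ["REPLACE_WITH_PF_NAME#0-7"]
  nodeSelector:
    feature.node.kubernetes.io/rdma.available: "true"
    feature.node.kubernetes.io/pci-15b3.sriov.capable: "true"
  numVfs: 8
  resourceName: p13rdma  # assumed name, by analogy with p2rdma
---
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: roce-p13
  namespace: openshift-sriov-network-operator
spec:
  # Placeholder: use a range that does not overlap the p2 range.
  ipam: '{"type": "whereabouts", "range": "REPLACE_WITH_CIDR"}'
  networkNamespace: default
  resourceName: p13rdma  # must match the policy's resourceName
  spoofChk: "off"
  trust: "on"
```

The resourceName is what ties a SriovNetwork to its SriovNetworkNodePolicy, so the two values in each pair must be identical.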
Import
kubectl apply -f network-roce.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| Mellanox NICs | Hardware | Yes | ConnectX adapters with SR-IOV capability |
| SR-IOV Operator | Operator | Yes | Manages SR-IOV network resources |
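| Node Feature Discovery | DaemonSet | Yes | Labels nodes with feature.node.kubernetes.io/rdma.available and feature.node.kubernetes.io/pci-15b3.sriov.capable, which the policy's nodeSelector matches |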
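For orientation, a node that the policy's nodeSelector would match carries labels like those in the fragment below. This is an illustrative sketch, not something to apply: the node name is hypothetical, and the labels are normally set by Node Feature Discovery rather than by hand.

```yaml
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1  # hypothetical node name
  labels:
    # Both labels must be present with the value "true" for the policy to select the node.
    feature.node.kubernetes.io/rdma.available: "true"
    feature.node.kubernetes.io/pci-15b3.sriov.capable: "true"
```

If either label is missing, the operator does not configure virtual functions on that node.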
Outputs
| Name | Type | Description |
|---|---|---|
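| rdma/roce_gdr | Extended resource | Schedulable RDMA resource that pods request in their limits. The excerpt above sets resourceName: p2rdma, which the SR-IOV Network Operator normally advertises under its resource prefix (openshift.io/p2rdma by default), so the name reported on a node may differ |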
| roce-p2 | NetworkAttachmentDefinition | Network attachment for pod Multus annotation |
| roce-p13 | NetworkAttachmentDefinition | Second NIC port network attachment |
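The SR-IOV Network Operator renders each SriovNetwork into a NetworkAttachmentDefinition in the namespace named by networkNamespace. As a rough sketch, the object generated for roce-p2 might resemble the following. The exact CNI config fields, the cniVersion, and the openshift.io resource prefix are assumptions based on the operator's usual defaults and can differ by operator version.

```yaml
# Approximate shape of the generated object; do not apply by hand.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: roce-p2
  namespace: default  # from spec.networkNamespace
  annotations:
    # Assumed default prefix; links the attachment to the device plugin resource.
    k8s.v1.cni.cncf.io/resourceName: openshift.io/p2rdma
spec:
  config: '{"cniVersion": "0.3.1", "name": "roce-p2", "type": "sriov", "spoofchk": "off", "trust": "on", "ipam": {"type": "whereabouts", "range": "10.0.132.0/24"}}'
```

Because the attachment lives in the default namespace, pods in other namespaces would need to reference it as default/roce-p2.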
Usage Examples
Pod RDMA Annotation
# In LLMInferenceService pod template:
metadata:
  annotations:
    k8s.v1.cni.cncf.io/networks: roce-p2
spec:
  containers:
    - resources:
        limits:
          rdma/roce_gdr: 1
          nvidia.com/gpu: "8"
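To attach both ports, Multus accepts a comma-separated list in the same annotation. The fragment below is a sketch that assumes both roce-p2 and roce-p13 exist in the pod's namespace; the resource limits are copied unchanged from the example above.

```yaml
# Pod template fragment attaching both RoCE networks (sketch).
metadata:
  annotations:
    # One additional interface is created per listed attachment, in order.
    k8s.v1.cni.cncf.io/networks: roce-p2,roce-p13
spec:
  containers:
    - resources:
        limits:
          rdma/roce_gdr: 1
          nvidia.com/gpu: "8"
```

Inside the pod, Multus names the additional interfaces net1 and net2 by default, following the order of the annotation.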