Implementation:Kserve Kserve ClusterServingRuntime CRD

Knowledge Sources	Kserve_Kserve KServe Docs
Domains	Kubernetes, CRD, Serving Runtime
Last Updated	2026-02-13 00:00 GMT

Overview

The CustomResourceDefinition (CRD) manifest that registers the ClusterServingRuntime custom resource in the KServe serving API.

Description

This file contains the full, auto-generated CustomResourceDefinition for the ClusterServingRuntime kind, produced by controller-gen v0.19.0. The kind belongs to the serving.kserve.io API group at version v1alpha1 and is cluster-scoped. The CRD defines serving runtime templates (e.g., TensorFlow Serving, Triton Inference Server, TorchServe) that are available to all namespaces, with OpenAPI v3 validation covering container specifications, supported model formats, and runtime configuration. It also declares four additional printer columns for `kubectl get` output: Disabled, ModelType, Containers, and Age.

Usage

Apply this CRD during KServe installation to register the ClusterServingRuntime API with the Kubernetes API server. Cluster administrators then create ClusterServingRuntime resources to define shared serving runtime templates that InferenceService resources across all namespaces can reference for model deployment.

Code Reference

Source Location

Repository: Kserve_Kserve
File: config/crd/full/serving.kserve.io_clusterservingruntimes.yaml
Lines: 1-4191

Signature

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: clusterservingruntimes.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: ClusterServingRuntime
    listKind: ClusterServingRuntimeList
    plural: clusterservingruntimes
    singular: clusterservingruntime
  scope: Cluster
  versions:
    - name: v1alpha1
      additionalPrinterColumns:
        - jsonPath: .spec.disabled
          name: Disabled
          type: boolean
        - jsonPath: .spec.supportedModelFormats[*].name
          name: ModelType
          type: string
        - jsonPath: .spec.containers[*].name
          name: Containers
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
      subresources: {}
      # Excerpt only: the schema (openAPIV3Schema), served, and storage fields of this version are omitted here.

Import

kubectl apply -f config/crd/full/serving.kserve.io_clusterservingruntimes.yaml

I/O Contract

Inputs

Name	Type	Required	Description
apiVersion	string	Yes	Must be `serving.kserve.io/v1alpha1`
kind	string	Yes	Must be `ClusterServingRuntime`
metadata	ObjectMeta	Yes	Standard Kubernetes object metadata (no namespace since cluster-scoped)
spec	ServingRuntimeSpec	Yes	Serving runtime specification defining containers, model formats, and runtime behavior

Key spec fields:

Field	Type	Required	Description
spec.containers	[]Container	Yes	One or more container definitions for the serving runtime (image, command, args, env, resources, ports, volume mounts)
spec.supportedModelFormats	[]SupportedModelFormat	No	List of model formats supported by this runtime (e.g., sklearn, tensorflow, pytorch)
spec.disabled	boolean	No	When true, this runtime is disabled and will not be selected for model serving
spec.affinity	Affinity	No	Pod scheduling affinity rules (node affinity, pod affinity, pod anti-affinity)
spec.tolerations	[]Toleration	No	Pod tolerations for scheduling on tainted nodes
spec.volumes	[]Volume	No	Additional volumes to mount into the runtime containers
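
The optional fields in the table above can be combined in a single manifest. The following is a minimal sketch rather than a runtime shipped with KServe: the resource name, image, node label value, taint key, and volume name are all hypothetical placeholders chosen for illustration.

```yaml
# Hypothetical ClusterServingRuntime exercising the optional spec fields.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: example-gpu-runtime            # hypothetical name; no namespace (cluster-scoped)
spec:
  disabled: true                       # runtime is registered but will not be selected
  supportedModelFormats:
    - name: pytorch
      version: "1"
      autoSelect: false
  containers:
    - name: kserve-container
      image: registry.example.com/runtime:latest   # hypothetical image
      volumeMounts:
        - name: cache                  # must match an entry under spec.volumes
          mountPath: /cache
  tolerations:
    - key: nvidia.com/gpu              # assumes nodes carry a taint with this key
      operator: Exists
      effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values: ["amd64"]
  volumes:
    - name: cache
      emptyDir: {}
```

Setting spec.disabled to true keeps the definition in the cluster while excluding it from selection, which can be useful when staging a runtime before making it available.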

Outputs

Name	Type	Description
status	object	Empty status object (no status fields are currently defined for this resource)

Usage Examples

Create a ClusterServingRuntime

apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: kserve-tritonserver
spec:
  supportedModelFormats:
    - name: tensorrt
      version: "8"
      autoSelect: true
    - name: onnx
      version: "1"
      autoSelect: true
  containers:
    - name: kserve-container
      image: nvcr.io/nvidia/tritonserver:23.05-py3
      args:
        - tritonserver
        - --model-store=/mnt/models
        - --grpc-port=9000
        - --http-port=8080
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "1"
          memory: 2Gi
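
Once a runtime such as kserve-tritonserver exists, an InferenceService in any namespace can reference it by name. The following is a minimal sketch assuming the v1beta1 InferenceService API; the resource name, namespace, and storageUri are hypothetical placeholders.

```yaml
# Hypothetical InferenceService that selects the cluster-scoped runtime by name.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-onnx-model             # hypothetical name
  namespace: default                   # any namespace can use a ClusterServingRuntime
spec:
  predictor:
    model:
      modelFormat:
        name: onnx                     # matches an entry in supportedModelFormats
      runtime: kserve-tritonserver     # name of the ClusterServingRuntime defined above
      storageUri: gs://example-bucket/models/onnx   # hypothetical model location
```

If the runtime field is omitted, KServe is expected to choose among enabled runtimes whose supportedModelFormats entry for the requested format has autoSelect set to true.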

List ClusterServingRuntimes

kubectl get clusterservingruntimes
# Illustrative output; exact column values depend on the cluster and Kubernetes version:
# NAME                   DISABLED   MODELTYPE            CONTAINERS         AGE
# kserve-tritonserver    false      tensorrt,onnx        kserve-container   10d

Related Pages

Kserve_Kserve_ServingRuntime_CRD -- Namespace-scoped counterpart of this cluster-scoped resource
New principle needed: Kserve_Kserve_ClusterServingRuntime_Specification
