Implementation:Kserve Kserve ClusterServingRuntime CRD

From Leeroopedia
Knowledge Sources
Domains Kubernetes, CRD, Serving Runtime
Last Updated 2026-02-13 00:00 GMT

Overview

Concrete CustomResourceDefinition (CRD) for the ClusterServingRuntime custom resource in the KServe serving API.

Description

This file contains the auto-generated full CustomResourceDefinition for the ClusterServingRuntime kind, produced by controller-gen v0.19.0. It belongs to the serving.kserve.io API group at version v1alpha1 and is a cluster-scoped resource. The CRD defines serving runtime templates (e.g., TensorFlow Serving, Triton Inference Server, TorchServe) that are available to all namespaces, with comprehensive OpenAPI v3 validation for container specifications, supported model formats, and runtime configuration. It includes printer columns for Disabled status, ModelType, Containers, and Age.

Usage

Apply this CRD during KServe installation to register the ClusterServingRuntime API with the Kubernetes API server. Cluster administrators then create ClusterServingRuntime resources to define shared serving runtime templates that InferenceService resources across all namespaces can reference for model deployment.

Code Reference

Source Location

Signature

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: clusterservingruntimes.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: ClusterServingRuntime
    listKind: ClusterServingRuntimeList
    plural: clusterservingruntimes
    singular: clusterservingruntime
  scope: Cluster
  versions:
    - name: v1alpha1
      additionalPrinterColumns:
        - jsonPath: .spec.disabled
          name: Disabled
          type: boolean
        - jsonPath: .spec.supportedModelFormats[*].name
          name: ModelType
          type: string
        - jsonPath: .spec.containers[*].name
          name: Containers
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
      subresources: {}
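
The signature above is an abbreviated excerpt. Under apiextensions.k8s.io/v1, each entry in spec.versions must also declare served, storage, and a schema. A minimal sketch of how those fields sit next to the printer columns is shown here; the schema body is a placeholder assumption, not the validation schema that controller-gen generates in the actual file.

```yaml
# Sketch of one spec.versions entry. The schema below is a placeholder,
# not the generated OpenAPI v3 schema shipped with KServe.
versions:
  - name: v1alpha1
    served: true      # the version is exposed by the API server
    storage: true     # exactly one version must be the storage version
    additionalPrinterColumns:
      - jsonPath: .spec.disabled
        name: Disabled
        type: boolean
    schema:
      openAPIV3Schema:
        type: object
        x-kubernetes-preserve-unknown-fields: true   # placeholder only
    subresources: {}
```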

Import

kubectl apply -f config/crd/full/serving.kserve.io_clusterservingruntimes.yaml

I/O Contract

Inputs

Name | Type | Required | Description
apiVersion | string | Yes | Must be serving.kserve.io/v1alpha1
kind | string | Yes | Must be ClusterServingRuntime
metadata | ObjectMeta | Yes | Standard Kubernetes object metadata (no namespace since cluster-scoped)
spec | ServingRuntimeSpec | Yes | Serving runtime specification defining containers, model formats, and runtime behavior

Key spec fields:

Field | Type | Required | Description
spec.containers | []Container | Yes | One or more container definitions for the serving runtime (image, command, args, env, resources, ports, volume mounts)
spec.supportedModelFormats | []SupportedModelFormat | No | List of model formats supported by this runtime (e.g., sklearn, tensorflow, pytorch)
spec.disabled | boolean | No | When true, this runtime is disabled and will not be selected for model serving
spec.affinity | Affinity | No | Pod scheduling affinity rules (node affinity, pod affinity, pod anti-affinity)
spec.tolerations | []Toleration | No | Pod tolerations for scheduling on tainted nodes
spec.volumes | []Volume | No | Additional volumes to mount into the runtime containers
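
As a rough illustration of how the optional fields listed above combine, the sketch below defines a runtime that is disabled, tolerates a GPU taint, and mounts a scratch volume. The resource name, image, taint key, and mount path are hypothetical placeholders chosen for the example, not values from the KServe repository.

```yaml
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: example-gpu-runtime            # hypothetical name
spec:
  disabled: true                       # not selected for serving while true
  supportedModelFormats:
    - name: pytorch
      version: "1"
      autoSelect: true
  tolerations:
    - key: nvidia.com/gpu              # assumes GPU nodes carry this taint
      operator: Exists
      effect: NoSchedule
  volumes:
    - name: scratch
      emptyDir: {}
  containers:
    - name: kserve-container
      image: example.com/serving/runtime:latest   # placeholder image
      volumeMounts:
        - name: scratch
          mountPath: /tmp/scratch      # hypothetical mount path
```

spec.affinity is omitted here; it takes the same structure as the affinity field of a standard Kubernetes pod spec.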

Outputs

Name | Type | Description
status | object | Empty status object (no status fields are currently defined for this resource)

Usage Examples

Create a ClusterServingRuntime

apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: kserve-tritonserver
spec:
  supportedModelFormats:
    - name: tensorrt
      version: "8"
      autoSelect: true
    - name: onnx
      version: "1"
      autoSelect: true
  containers:
    - name: kserve-container
      image: nvcr.io/nvidia/tritonserver:23.05-py3
      args:
        - tritonserver
        - --model-store=/mnt/models
        - --grpc-port=9000
        - --http-port=8080
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "1"
          memory: 2Gi
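
Once a runtime such as kserve-tritonserver exists, an InferenceService in any namespace can reference it by name. The following is a minimal sketch; the InferenceService name, namespace, and storageUri are hypothetical placeholders.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-onnx-model             # hypothetical
  namespace: team-a                    # hypothetical
spec:
  predictor:
    model:
      modelFormat:
        name: onnx                     # matches a supportedModelFormats entry above
      runtime: kserve-tritonserver     # explicit reference to the cluster-scoped runtime
      storageUri: gs://example-bucket/models/example-onnx   # placeholder location
```

If runtime is omitted, KServe can pick a runtime automatically from those whose matching supportedModelFormats entry sets autoSelect to true.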

List ClusterServingRuntimes

kubectl get clusterservingruntimes
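# Illustrative output; actual cell values (for example, an unset spec.disabled
# or multiple model formats) may render differently.
# NAME                   DISABLED   MODELTYPE            CONTAINERS         AGE
# kserve-tritonserver    false      tensorrt,onnx        kserve-container   10d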

Related Pages

  • Kserve_Kserve_ServingRuntime_CRD -- Namespace-scoped counterpart of this cluster-scoped resource
  • New principle needed: Kserve_Kserve_ClusterServingRuntime_Specification
