Implementation:Kserve Kserve InferenceService Full CRD

Knowledge Sources	Kserve_Kserve KServe Docs
Domains	Kubernetes, CRD, Model Serving
Last Updated	2026-02-13 00:00 GMT

Overview

Concrete CRD definition for the InferenceService custom resource in the KServe serving API, providing full OpenAPI v3 schema validation.

Description

This file contains the auto-generated full CustomResourceDefinition for the InferenceService kind (short name: isvc), produced by controller-gen v0.19.0. The resource is namespaced and belongs to the serving.kserve.io API group at version v1beta1. It is the core CRD of the KServe platform: it defines the API through which users deploy, version, and manage ML model serving endpoints, including traffic splitting, autoscaling, and canary rollouts. The CRD defines printer columns for URL, Ready status, Prev and Latest traffic percentages, the previous rolled-out and latest ready revision names, and Age. It also enables the status subresource and uses x-kubernetes-preserve-unknown-fields for extensibility.

Usage

Apply this CRD during KServe installation to register the InferenceService API with the Kubernetes API server. Once registered, users can create InferenceService resources to deploy predictor, transformer, and explainer components for serving ML models, with full schema validation enforced by the API server.

Code Reference

Source Location

Repository: Kserve_Kserve
File: config/crd/full/serving.kserve.io_inferenceservices.yaml
Lines: 1-22814

Signature

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: inferenceservices.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: InferenceService
    listKind: InferenceServiceList
    plural: inferenceservices
    shortNames:
      - isvc
    singular: inferenceservice
  scope: Namespaced
  versions:
    - name: v1beta1
      additionalPrinterColumns:
        - jsonPath: .status.url
          name: URL
          type: string
        - jsonPath: .status.conditions[?(@.type=='Ready')].status
          name: Ready
          type: string
        - jsonPath: .status.components.predictor.traffic[?(@.tag=='prev')].percent
          name: Prev
          type: integer
        - jsonPath: .status.components.predictor.traffic[?(@.latestRevision==true)].percent
          name: Latest
          type: integer
        - jsonPath: .status.components.predictor.traffic[?(@.tag=='prev')].revisionName
          name: PrevRolledoutRevision
          type: string
        - jsonPath: .status.components.predictor.traffic[?(@.latestRevision==true)].revisionName
          name: LatestReadyRevision
          type: string
        - jsonPath: .metadata.creationTimestamp
          name: Age
          type: date
      subresources:
        status: {}
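
To make the printer-column JSONPath filters above more concrete, here is a minimal Python sketch that mimics what each column selects from an object's status. The helper name `printer_columns`, the sample status, and the revision names `rev-a` and `rev-b` are hypothetical and chosen only for illustration. The real evaluation is done by the Kubernetes API server, not by this code.

```python
def printer_columns(isvc: dict) -> dict:
    """Approximate the CRD's additionalPrinterColumns for one InferenceService object."""
    status = isvc.get("status", {})
    traffic = status.get("components", {}).get("predictor", {}).get("traffic", [])

    def first(items, key):
        # A JSONPath filter may match several entries; this sketch keeps only the first.
        return next((item.get(key) for item in items), None)

    # Mirrors [?(@.type=='Ready')], [?(@.tag=='prev')] and [?(@.latestRevision==true)].
    ready = [c for c in status.get("conditions", []) if c.get("type") == "Ready"]
    prev = [t for t in traffic if t.get("tag") == "prev"]
    latest = [t for t in traffic if t.get("latestRevision") is True]

    return {
        "URL": status.get("url"),
        "Ready": first(ready, "status"),
        "Prev": first(prev, "percent"),
        "Latest": first(latest, "percent"),
        "PrevRolledoutRevision": first(prev, "revisionName"),
        "LatestReadyRevision": first(latest, "revisionName"),
    }


# Hypothetical status during a canary rollout (all values invented for illustration).
sample = {
    "status": {
        "url": "http://sklearn-iris.default.example.com",
        "conditions": [{"type": "Ready", "status": "True"}],
        "components": {"predictor": {"traffic": [
            {"tag": "prev", "latestRevision": False, "percent": 90, "revisionName": "rev-a"},
            {"latestRevision": True, "percent": 10, "revisionName": "rev-b"},
        ]}},
    }
}

print(printer_columns(sample))
```

When no traffic entry carries the `prev` tag, the Prev and PrevRolledoutRevision values come back as `None`, which corresponds to the empty cells in `kubectl get isvc` output.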

Import

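# The full CRD is large; server-side apply avoids the size limit on the client-side last-applied-configuration annotation.
kubectl apply --server-side -f config/crd/full/serving.kserve.io_inferenceservices.yaml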

I/O Contract

Inputs

Name	Type	Required	Description
apiVersion	string	Yes	Must be `serving.kserve.io/v1beta1`
kind	string	Yes	Must be `InferenceService`
metadata	ObjectMeta	Yes	Standard Kubernetes object metadata
spec	InferenceServiceSpec	Yes	Service specification defining predictor, transformer, and explainer components

Key spec fields:

Field	Type	Required	Description
spec.predictor	PredictorSpec	Yes (implicit)	Predictor component specification with full pod template support, including built-in framework support (sklearn, tensorflow, pytorch, xgboost, triton, etc.) and custom container definitions
spec.transformer	TransformerSpec	No	Optional pre/post-processing transformer component with full pod template support
spec.explainer	ExplainerSpec	No	Optional model explainability component with full pod template support

Each component (predictor, transformer, explainer) supports the full Kubernetes PodSpec including:

Container definitions with image, command, args, env, resources, ports, volume mounts
Affinity, tolerations, topology spread constraints
Init containers, service account, security context
Active deadline seconds, DNS policy, host networking
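
As an illustration of how component-level and PodSpec-level fields combine, the sketch below assembles a manifest with a model-format predictor and a custom-container transformer, then prints it as JSON, which kubectl accepts alongside YAML. The helper `build_isvc`, the transformer image, and the toleration key are hypothetical. The container name `kserve-container` is an assumed convention and is not stated on this page.

```python
import json


def build_isvc(name: str, namespace: str, storage_uri: str, transformer_image: str) -> dict:
    """Build an InferenceService manifest as a plain dict (illustrative helper)."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    "modelFormat": {"name": "sklearn"},
                    "storageUri": storage_uri,
                },
                # PodSpec-level field set directly on the component (hypothetical key).
                "tolerations": [{"key": "example.com/dedicated", "operator": "Exists"}],
            },
            "transformer": {
                # Custom container definition; the name is an assumed convention.
                "containers": [{
                    "name": "kserve-container",
                    "image": transformer_image,
                    "resources": {"requests": {"cpu": "100m", "memory": "256Mi"}},
                }],
            },
        },
    }


manifest = build_isvc(
    name="sklearn-iris",
    namespace="default",
    storage_uri="gs://kfserving-examples/models/sklearn/1.0/model",
    transformer_image="example.com/my-transformer:latest",  # hypothetical image
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`. The API server then validates it against the CRD's OpenAPI schema.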

Outputs

Name	Type	Description
status.url	string	Primary URL endpoint for the inference service
status.conditions	[]Condition	Knative-style conditions including Ready status
status.components	ComponentStatuses	Per-component status with traffic routing, revision names, and URLs for predictor, transformer, and explainer
status.modelStatus	ModelStatus	Model-level status including active/target model state (Pending, Standby, Loading, Loaded, FailedToLoad), transition status, and failure info with reason enums (ModelLoadFailed, RuntimeUnhealthy, RuntimeDisabled, NoSupportingRuntime, RuntimeNotRecognized, InvalidPredictorSpec)
status.observedGeneration	int64	The generation most recently observed by the controller
status.servingRuntimeName	string	Name of the serving runtime selected for this inference service
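
The status fields above can be condensed into a short readiness summary. The following is a minimal sketch, assuming the status has already been obtained as a dict (for example from `kubectl get isvc <name> -o json`). The nested key names `states.activeModelState` and `lastFailureInfo.reason` are assumptions about the layout of modelStatus and do not come from this page; `summarize_status` is a hypothetical helper.

```python
def summarize_status(status: dict) -> dict:
    """Condense an InferenceService status dict into a few key facts."""
    conditions = status.get("conditions", [])
    # Ready only when a condition of type Ready reports the string "True".
    ready = any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )

    model_status = status.get("modelStatus", {})
    states = model_status.get("states", {})            # assumed key name
    failure = model_status.get("lastFailureInfo", {})  # assumed key name

    return {
        "ready": ready,
        "url": status.get("url"),
        "active_model_state": states.get("activeModelState"),
        "failure_reason": failure.get("reason"),
        "observed_generation": status.get("observedGeneration"),
    }


# Hypothetical status of a service whose model failed to load.
failed = {
    "conditions": [{"type": "Ready", "status": "False"}],
    "modelStatus": {
        "states": {"activeModelState": "FailedToLoad"},
        "lastFailureInfo": {"reason": "ModelLoadFailed"},
    },
    "observedGeneration": 2,
}

print(summarize_status(failed))
```

Missing fields yield `None` instead of raising, since a newly created resource may not have a populated status yet.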

Usage Examples

Create an InferenceService

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: default
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
      resources:
        requests:
          cpu: 100m
          memory: 256Mi

Check Status with Traffic Split

kubectl get isvc sklearn-iris
# NAME           URL                                       READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION            AGE
# sklearn-iris   http://sklearn-iris.default.example.com   True           100                              sklearn-iris-predictor-00001   5m

Related Pages

Principle:Kserve_Kserve_InferenceService_Specification
Kserve_Kserve_InferenceService_CRD_Spec -- Go type definitions for the InferenceService resource
Kserve_Kserve_InferenceServiceReconciler -- Controller that reconciles InferenceService resources
Kserve_Kserve_InferenceService_Webhook_Chain -- Admission webhook validation and defaulting
Kserve_Kserve_Knative_Traffic_Reconciler -- Traffic splitting reconciliation logic
Kserve_Kserve_CanaryTrafficPercent_Spec -- Canary traffic percentage specification
Kserve_Kserve_Revision_Status_Propagation -- Revision status propagation logic
