Implementation:Kserve Kserve InferenceService Full CRD
| Knowledge Sources | |
|---|---|
| Domains | Kubernetes, CRD, Model Serving |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Concrete CRD definition for the InferenceService custom resource in the KServe serving API, providing full OpenAPI v3 schema validation.
Description
This file contains the auto-generated full CustomResourceDefinition for the InferenceService kind (short name: isvc), produced by controller-gen v0.19.0. It belongs to the serving.kserve.io API group at version v1beta1 and is a namespaced resource. InferenceService is the core CRD of the KServe platform: it defines the API through which users deploy, version, and manage ML model serving endpoints with traffic splitting, autoscaling, and canary rollouts. The CRD declares printer columns for URL, Ready status, Prev/Latest traffic percentages, the previous rolled-out and latest ready revision names, and Age. It also enables the status subresource, and the schema uses x-kubernetes-preserve-unknown-fields for extensibility.
Usage
Apply this CRD during KServe installation to register the InferenceService API with the Kubernetes API server. Once registered, users can create InferenceService resources to deploy predictor, transformer, and explainer components for serving ML models, with full schema validation enforced by the API server.
Code Reference
Source Location
- Repository: Kserve_Kserve
- File: config/crd/full/serving.kserve.io_inferenceservices.yaml
- Lines: 1-22814
Signature
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: inferenceservices.serving.kserve.io
spec:
  group: serving.kserve.io
  names:
    kind: InferenceService
    listKind: InferenceServiceList
    plural: inferenceservices
    shortNames:
    - isvc
    singular: inferenceservice
  scope: Namespaced
  versions:
  - name: v1beta1
    additionalPrinterColumns:
    - jsonPath: .status.url
      name: URL
      type: string
    - jsonPath: .status.conditions[?(@.type=='Ready')].status
      name: Ready
      type: string
    - jsonPath: .status.components.predictor.traffic[?(@.tag=='prev')].percent
      name: Prev
      type: integer
    - jsonPath: .status.components.predictor.traffic[?(@.latestRevision==true)].percent
      name: Latest
      type: integer
    - jsonPath: .status.components.predictor.traffic[?(@.tag=='prev')].revisionName
      name: PrevRolledoutRevision
      type: string
    - jsonPath: .status.components.predictor.traffic[?(@.latestRevision==true)].revisionName
      name: LatestReadyRevision
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    subresources:
      status: {}
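To make the printer-column JSONPath expressions above concrete, the following is a hand-written sketch of a status fragment as it might look during a 90/10 canary rollout. The revision names and percentages are hypothetical, not output captured from a real cluster; the comments note which column each value would feed.

```yaml
# Hypothetical status fragment; values are illustrative only.
status:
  url: http://sklearn-iris.default.example.com        # -> URL
  conditions:
  - type: Ready
    status: "True"                                     # -> Ready
  components:
    predictor:
      traffic:
      - revisionName: sklearn-iris-predictor-001       # -> PrevRolledoutRevision
        tag: prev                                      # matched by [?(@.tag=='prev')]
        percent: 90                                    # -> Prev
        latestRevision: false
      - revisionName: sklearn-iris-predictor-002       # -> LatestReadyRevision
        tag: latest
        percent: 10                                    # -> Latest
        latestRevision: true                           # matched by [?(@.latestRevision==true)]
```

When no traffic entry carries the prev tag, the Prev and PrevRolledoutRevision filters match nothing and those columns print empty.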
Import
kubectl apply -f config/crd/full/serving.kserve.io_inferenceservices.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| apiVersion | string | Yes | Must be serving.kserve.io/v1beta1 |
| kind | string | Yes | Must be InferenceService |
| metadata | ObjectMeta | Yes | Standard Kubernetes object metadata |
| spec | InferenceServiceSpec | Yes | Service specification defining predictor, transformer, and explainer components |
Key spec fields:
| Field | Type | Required | Description |
|---|---|---|---|
| spec.predictor | PredictorSpec | Yes (implicit) | Predictor component specification with full pod template support, including built-in framework support (sklearn, tensorflow, pytorch, xgboost, triton, etc.) and custom container definitions |
| spec.transformer | TransformerSpec | No | Optional pre/post-processing transformer component with full pod template support |
| spec.explainer | ExplainerSpec | No | Optional model explainability component with full pod template support |
Each component (predictor, transformer, explainer) supports the full Kubernetes PodSpec including:
- Container definitions with image, command, args, env, resources, ports, volume mounts
- Affinity, tolerations, topology spread constraints
- Init containers, service account, security context
- Active deadline seconds, DNS policy, host networking
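As an illustration of the pod-template support listed above, here is a minimal sketch that combines a predictor using pod-level fields with a custom transformer container. The resource name, service account, taint key, and transformer image are hypothetical placeholders; the field names follow the standard Kubernetes PodSpec and should be checked against the CRD schema for the installed version.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris-transformed          # hypothetical name
  namespace: default
spec:
  predictor:
    serviceAccountName: model-reader      # hypothetical service account
    tolerations:                          # standard PodSpec tolerations
    - key: example.com/dedicated          # hypothetical taint key
      operator: Equal
      value: inference
      effect: NoSchedule
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
  transformer:
    containers:
    - name: kserve-container
      image: example.com/iris-transformer:latest   # hypothetical image
      env:
      - name: LOG_LEVEL                   # hypothetical variable
        value: info
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
```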
Outputs
| Name | Type | Description |
|---|---|---|
| status.url | string | Primary URL endpoint for the inference service |
| status.conditions | []Condition | Knative-style conditions including Ready status |
| status.components | ComponentStatuses | Per-component status with traffic routing, revision names, and URLs for predictor, transformer, and explainer |
| status.modelStatus | ModelStatus | Model-level status including active/target model state (Pending, Standby, Loading, Loaded, FailedToLoad), transition status, and failure info with reason enums (ModelLoadFailed, RuntimeUnhealthy, RuntimeDisabled, NoSupportingRuntime, RuntimeNotRecognized, InvalidPredictorSpec) |
| status.observedGeneration | int64 | The generation most recently observed by the controller |
| status.servingRuntimeName | string | Name of the serving runtime selected for this inference service |
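The model-level status is easier to read with an example. The sketch below shows what a status might look like after a failed model load. The nested field names under modelStatus (states, transitionStatus, lastFailureInfo) are assumptions based on the KServe Go types and should be verified against the CRD schema; the runtime name and message text are hypothetical.

```yaml
# Hypothetical status after a failed model load; not captured from a real cluster.
status:
  observedGeneration: 2
  servingRuntimeName: kserve-sklearnserver    # hypothetical runtime name
  conditions:
  - type: Ready
    status: "False"
  modelStatus:
    transitionStatus: BlockedByFailedLoad     # assumed enum value
    states:
      activeModelState: ""                    # no model currently active
      targetModelState: FailedToLoad          # one of the states listed in the table above
    lastFailureInfo:
      reason: ModelLoadFailed                 # one of the reason enums listed in the table above
      message: "failed to load model"         # hypothetical message text
```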
Usage Examples
Create an InferenceService
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: default
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
Check Status with Traffic Split
kubectl get isvc sklearn-iris
# NAME           URL                                       READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION          AGE
# sklearn-iris   http://sklearn-iris.default.example.com   True           100                              sklearn-iris-predictor-001   5m
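The PREV and PREVROLLEDOUTREVISION columns are only populated while traffic is split between two revisions. A minimal sketch of an update that could produce such a split follows, assuming a Knative-based deployment and the canaryTrafficPercent field on the predictor (see the Related Pages entry on canary traffic). The new storageUri is a hypothetical second model version.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: default
spec:
  predictor:
    canaryTrafficPercent: 10      # send 10% of traffic to the new revision
    model:
      modelFormat:
        name: sklearn
      # hypothetical path to an updated model; changing it triggers a new revision
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model-2
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
```

Under these assumptions, once the new revision is ready, kubectl get isvc sklearn-iris would be expected to show 90 under PREV and 10 under LATEST.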
Related Pages
- Principle:Kserve_Kserve_InferenceService_Specification
- Kserve_Kserve_InferenceService_CRD_Spec -- Go type definitions for the InferenceService resource
- Kserve_Kserve_InferenceServiceReconciler -- Controller that reconciles InferenceService resources
- Kserve_Kserve_InferenceService_Webhook_Chain -- Admission webhook validation and defaulting
- Kserve_Kserve_Knative_Traffic_Reconciler -- Traffic splitting reconciliation logic
- Kserve_Kserve_CanaryTrafficPercent_Spec -- Canary traffic percentage specification
- Kserve_Kserve_Revision_Status_Propagation -- Revision status propagation logic