Implementation:Kserve Kserve Initial InferenceService Deployment
| Knowledge Sources | |
|---|---|
| Domains | MLOps, Deployment_Strategy, Model_Serving |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Concrete YAML pattern for deploying the initial baseline InferenceService that serves as the stable reference for canary rollouts.
Description
This pattern creates an InferenceService with a single predictor and no canaryTrafficPercent field. The absence of canary configuration means Knative routes 100% of traffic to the single active revision. This establishes the baseline that subsequent canary updates will be compared against.
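As a pre-flight check, a small shell helper can confirm that a manifest really omits canaryTrafficPercent before it is applied as the baseline. This is a minimal sketch: the function name `is_baseline_manifest` is hypothetical and not part of KServe, and it uses a plain text match rather than a YAML parse, so a manifest that mentions the field only in a comment is also rejected.

```shell
#!/bin/sh
# Hypothetical helper (not part of KServe): succeeds (exit 0) when the
# manifest never mentions canaryTrafficPercent, i.e. it is a plain baseline.
# Exits 1 when the field is present and 2 when the file cannot be read.
is_baseline_manifest() {
    manifest="$1"
    # The file must exist and be readable.
    [ -r "$manifest" ] || return 2
    # grep -q exits 0 on a match, so invert its result.
    ! grep -q 'canaryTrafficPercent' "$manifest"
}
```

It could be used as a guard, for example `is_baseline_manifest default.yaml && kubectl apply -f default.yaml`.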
Usage
Use this pattern as the first step in a canary rollout workflow. Apply this YAML, verify the service is ready and serving correct predictions, then proceed with canary updates.
Code Reference
Source Location
- Repository: kserve
- File: docs/samples/v1beta1/rollout/default.yaml, Lines 1-8
Signature
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model
spec:
  predictor:
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
Import
kubectl apply -f default.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| metadata.name | string | Yes | InferenceService name |
| spec.predictor.tensorflow.storageUri | string | Yes | Model artifact URI |
Outputs
| Name | Type | Description |
|---|---|---|
| Revision | Knative Revision | Single revision <name>-predictor-default-00001 |
| Traffic | Percentage | 100% of traffic routed to the single revision |
| status.url | URL | Prediction endpoint ready for requests |
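These outputs can be read back from the InferenceService status with kubectl JSONPath queries. The sketch below assumes the revision and traffic details are reported under `status.components.predictor`; the helper name `isvc_status_field` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one field of an InferenceService's status.
# $1 = InferenceService name
# $2 = JSONPath below .status, e.g. "url" or
#      "components.predictor.latestReadyRevision" (assumed status layout)
isvc_status_field() {
    name="$1"
    path="$2"
    kubectl get inferenceservice "$name" -o "jsonpath={.status.${path}}"
}
```

For example, `isvc_status_field my-model url` prints the prediction endpoint, and `isvc_status_field my-model components.predictor.latestReadyRevision` should name the single baseline revision.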
Usage Examples
Deploy Baseline Model
# 1. Deploy initial model
kubectl apply -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model
spec:
  predictor:
    tensorflow:
      storageUri: "gs://kfserving-examples/models/tensorflow/flowers"
EOF
# 2. Wait for readiness
kubectl wait inferenceservice my-model --for=condition=Ready --timeout=120s
# 3. Verify single revision
kubectl get revisions -l serving.knative.dev/service=my-model-predictor-default
# NAME READY
# my-model-predictor-default-00001 True
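The remaining check before moving on to a canary update is that the baseline actually serves predictions. The sketch below only builds the request URL, assuming the TensorFlow predictor exposes the v1 REST path `/v1/models/<name>:predict` and that the model name matches the InferenceService name. The helper `predict_url` and the payload file `input.json` are hypothetical; the payload format depends on the model's signature and is not specified here.

```shell
#!/bin/sh
# Hypothetical helper: build the v1 prediction URL for a model.
# $1 = base URL (for example the value of status.url)
# $2 = model name (assumed equal to the InferenceService name)
predict_url() {
    base_url="$1"
    model="$2"
    # Strip one trailing slash from the base URL to avoid "//" in the path.
    printf '%s/v1/models/%s:predict\n' "${base_url%/}" "$model"
}
```

A request could then look like `curl -s -H "Content-Type: application/json" -d @input.json "$(predict_url "$URL" my-model)"`, where `URL` holds the output of `kubectl get inferenceservice my-model -o jsonpath='{.status.url}'`. Depending on the cluster's ingress setup, the request may instead need to go to the ingress gateway address with a matching `Host` header.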