Environment: KServe Knative Serving
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Serverless |
| Last Updated | 2026-02-13 14:00 GMT |
Overview
Knative Serving v1.15.2 provides serverless scaling, revision management, and traffic splitting for KServe InferenceServices.
Description
Knative Serving manages the lifecycle of KServe model serving pods in Serverless deployment mode. It provides automatic scale-to-zero, revision-based traffic splitting for canary deployments, and concurrency-based autoscaling via the Knative Pod Autoscaler (KPA). KServe creates Knative Services that wrap the model serving containers.
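As a sketch of what KServe wraps in a Knative Service, a minimal Serverless-mode InferenceService might look like the following (the name and `storageUri` are illustrative; the `deploymentMode` and KPA `target` annotations follow KServe and Knative conventions):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris          # hypothetical example name
  annotations:
    serving.kserve.io/deploymentMode: Serverless
    autoscaling.knative.dev/target: "5"   # KPA concurrency target per pod
spec:
  predictor:
    minReplicas: 0            # permit scale-to-zero when idle
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```

KServe translates this into a Knative Service; each spec change produces a new Knative Revision, which is what enables canary traffic splitting.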
Usage
Use this environment when deploying KServe in Serverless mode. Knative Serving handles pod autoscaling, revision management for canary rollouts, and scale-to-zero for idle InferenceServices.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Kubernetes | >= 1.24 | Base requirement |
| Knative Operator | v1.16.0 | Manages Knative Serving installation |
| Knative Serving | 1.15.2 | Actual serving runtime |
| Istio | 1.27.1 | Required networking layer for Knative |
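To confirm a cluster meets the table above, a quick check along these lines can be used (assumes a configured `kubectl` context and, for the last line, `istioctl` on the PATH):

```shell
# Report the Knative Serving version reconciled by the operator
kubectl get knativeserving knative-serving -n knative-serving \
  -o jsonpath='{.status.version}{"\n"}'

# Core Knative Serving deployments should all be Available
kubectl get deploy -n knative-serving

# Istio control-plane and data-plane versions
istioctl version
```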
Dependencies
Helm Charts
- `knative-operator` from `oci://ghcr.io/knative/helm/knative-operator`
Credentials
No additional credentials required.
Quick Install
```shell
# Install the Knative operator via Helm
helm install knative-operator oci://ghcr.io/knative/helm/knative-operator \
  --version "${KNATIVE_OPERATOR_VERSION}" -n knative-serving --create-namespace --wait

# Create the KnativeServing resource
kubectl apply -f - <<EOF
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  version: "${KNATIVE_SERVING_VERSION}"
EOF
```
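Before installing KServe on top, it is worth verifying that the operator has finished reconciling; one way to do that (namespace names as used above):

```shell
# Block until the KnativeServing resource reports Ready
kubectl wait knativeserving/knative-serving -n knative-serving \
  --for=condition=Ready --timeout=300s

# Spot-check that activator, autoscaler, and controller pods are running
kubectl get pods -n knative-serving
```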
Code Evidence
Knative dependency versions from `kserve-deps.env:20-21`:
```
KNATIVE_OPERATOR_VERSION=v1.16.0
KNATIVE_SERVING_VERSION=1.15.2
```
Autoscaler namespace config from `pkg/constants/constants.go:42`:
```go
AutoscalerConfigmapNamespace = GetEnvOrDefault(
    "KNATIVE_CONFIG_AUTOSCALER_NAMESPACE", DefaultKnServingNamespace)
```
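The autoscaler ConfigMap referenced above is normally managed through the operator: keys placed under `spec.config.autoscaler` in the KnativeServing resource are propagated into the `config-autoscaler` ConfigMap. A hedged sketch tuning scale-to-zero behavior (values are illustrative; the grace-period default is 30s):

```yaml
apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  version: "1.15.2"
  config:
    autoscaler:
      enable-scale-to-zero: "true"
      scale-to-zero-grace-period: "60s"           # keep pods longer before scaling to zero
      container-concurrency-target-default: "100" # default KPA concurrency target
```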
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `revision not ready` | Model container fails readiness probe | Check container logs for model loading errors |
| Scale-from-zero timeout | Cold start takes too long | Increase `progressDeadline` or set `minReplicas >= 1` |
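For the `revision not ready` case, a typical diagnosis path looks like this (namespace, InferenceService name, and revision name are placeholders for your deployment):

```shell
# Find the failing revision behind the InferenceService
kubectl get inferenceservice sklearn-iris -n models
kubectl get revisions -n models

# The Ready condition message usually names the failing probe or container
kubectl describe revision <revision-name> -n models

# Model-loading errors surface in the serving container's logs
kubectl logs -n models -l serving.knative.dev/revision=<revision-name> \
  -c kserve-container
```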
Compatibility Notes
- OpenShift: Use OpenShift Serverless operator instead of upstream Knative
- RawDeployment mode: Knative is not needed; KServe uses raw K8s Deployments
- Autoscaler classes: Supports KPA (concurrency), HPA (CPU/memory), KEDA (custom), and External
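As one example of selecting an autoscaler class in Serverless mode, Knative's annotations can switch a predictor from the default KPA to CPU-based HPA scaling; a sketch using the documented annotation names (the target value is illustrative):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: hpa-example             # hypothetical example name
  annotations:
    autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
    autoscaling.knative.dev/metric: cpu
    autoscaling.knative.dev/target: "80"   # target average CPU utilization (%)
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
```

Note that HPA-class revisions cannot scale to zero; keep the KPA class when scale-to-zero is required.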