Environment:Kubeflow Kubeflow Kubernetes Cluster Environment
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Kubernetes, Platform_Deployment |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A Kubernetes cluster running version 1.25 or later, required for deploying the Kubeflow AI Reference Platform.
Description
This environment defines the Kubernetes cluster requirements for running Kubeflow. The cluster must be running Kubernetes version 1.25 or later to support the CRDs, RBAC policies, and PodSecurityStandards used by Kubeflow components. The cluster operator must have cluster-admin permissions to create namespaces, CRDs, and cluster-scoped resources. The cluster must also support LoadBalancer or NodePort services so that the Istio ingress gateway can be reached from outside the cluster.
Kubeflow versions track Kubernetes version support: v1.7 supported Kubernetes 1.25, v1.8 supported 1.25-1.26, v1.9 supported 1.29, and v1.10+ targets the latest stable Kubernetes releases.
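The version pairs listed above can be encoded as a small lookup, which may help when scripting a pre-flight check. This is a minimal sketch: the function names kubeflow_k8s_versions, k8s_minor, and k8s_listed_for are hypothetical, and the table contains only the releases named in this section (v1.10+ is left out because no specific Kubernetes version is given for it).

```shell
# Kubernetes minor versions listed for each Kubeflow release in this section.
# Releases without a specific version listed (including v1.10+) return non-zero.
kubeflow_k8s_versions() {
  case "${1#v}" in
    1.7) echo "1.25" ;;
    1.8) echo "1.25 1.26" ;;
    1.9) echo "1.29" ;;
    *) return 1 ;;
  esac
}

# k8s_minor VERSION: reduce strings such as "v1.29.3-gke.1" to "1.29".
k8s_minor() (
  v=${1#v}
  major=${v%%.*}
  rest=${v#*.}
  minor=${rest%%[!0-9]*}
  printf '%s.%s\n' "$major" "$minor"
)

# k8s_listed_for KUBEFLOW_VERSION K8S_VERSION: succeed when the cluster's
# minor version is one of those listed for the given Kubeflow release.
k8s_listed_for() (
  want=$(k8s_minor "$2")
  for v in $(kubeflow_k8s_versions "$1"); do
    [ "$v" = "$want" ] && exit 0
  done
  exit 1
)

# Example (no cluster access needed):
#   k8s_listed_for v1.8 v1.26.4 && echo "listed"
```

A failed lookup only means the pair is not listed here; the release notes remain the authority on compatibility.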
Usage
Use this environment for any Platform Deployment workflow. It is the mandatory prerequisite for deploying Istio, cert-manager, Dex, and all Kubeflow components. Both the Kubeflow Manifests path and Packaged Distributions require a functioning Kubernetes cluster meeting these specifications.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (cluster nodes) | Ubuntu 20.04+ or similar Linux distribution on cluster nodes |
| Kubernetes Version | >= 1.25 | v1.9 requires Kubernetes 1.29; check release notes for specific version compatibility |
| Hardware | Multi-node cluster recommended | Minimum 3 nodes with 4 CPU / 16GB RAM each for production |
| Network | LoadBalancer or NodePort | Required for Istio ingress gateway external access |
| Storage | Default StorageClass provisioned | PersistentVolume support required for pipelines, notebooks, and model artifacts |
| RBAC | cluster-admin | Deployer must have cluster-admin ClusterRoleBinding |
| Container Runtime | containerd or CRI-O | dockershim (built-in Docker Engine support) was removed in Kubernetes 1.24; Kubeflow 1.5+ switched to Emissary executor for containerd compatibility |
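One way to check the Container Runtime row is to inspect the runtime string each node reports. The sketch below assumes the input is one containerRuntimeVersion value per line (for example containerd://1.7.13); the helper names runtime_supported and check_runtimes are hypothetical, and the kubectl invocation in the trailing comment is one possible way to gather that input.

```shell
# runtime_supported RUNTIME: succeed when a node's containerRuntimeVersion
# string names containerd or CRI-O.
runtime_supported() {
  case "$1" in
    containerd://*|cri-o://*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_runtimes: read one runtime string per line on stdin, report every
# runtime outside the supported set on stderr, and return non-zero if any
# such runtime was seen.
check_runtimes() {
  status=0
  while IFS= read -r runtime; do
    [ -n "$runtime" ] || continue
    if ! runtime_supported "$runtime"; then
      echo "unsupported container runtime: $runtime" >&2
      status=1
    fi
  done
  return "$status"
}

# Possible usage against a live cluster (requires kubectl access):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' \
#     | check_runtimes
```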
Dependencies
System Packages
- Kubernetes >= 1.25 (control plane and kubelets)
- containerd or CRI-O (container runtime)
- CoreDNS (cluster DNS, typically bundled)
- etcd (cluster state store, typically bundled)
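Because the minimum version applies to kubelets as well as the control plane, it can be useful to compare every node's kubelet version against 1.25. The following is a rough sketch, not an official check: version_ge and kubelets_below_minimum are hypothetical helpers, and only the major and minor components are compared.

```shell
# Minimum Kubernetes version stated in this environment's requirements.
MIN_K8S_VERSION="1.25"

# version_ge A B: succeed when MAJOR.MINOR of version A >= that of version B.
# A leading "v" and any patch or vendor suffix (e.g. "v1.29.3-gke.1") are ignored.
version_ge() (
  a=${1#v}; b=${2#v}
  a_major=${a%%.*}; a_rest=${a#*.}; a_minor=${a_rest%%[!0-9]*}
  b_major=${b%%.*}; b_rest=${b#*.}; b_minor=${b_rest%%[!0-9]*}
  if [ "$a_major" -ne "$b_major" ]; then
    [ "$a_major" -gt "$b_major" ]
  else
    [ "$a_minor" -ge "$b_minor" ]
  fi
)

# kubelets_below_minimum: read one kubelet version per line on stdin and
# print each version older than MIN_K8S_VERSION. Prints nothing if all comply.
kubelets_below_minimum() {
  while IFS= read -r version; do
    [ -n "$version" ] || continue
    version_ge "$version" "$MIN_K8S_VERSION" || printf '%s\n' "$version"
  done
}

# Possible usage against a live cluster (requires kubectl access):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#     | kubelets_below_minimum
```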
Cluster Add-ons
- Default StorageClass with dynamic PV provisioning
- LoadBalancer controller (e.g., MetalLB for bare metal, or cloud provider integration)
- PodSecurityStandards support (enforced in Kubeflow 1.10+)
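Kubernetes marks a default StorageClass with the storageclass.kubernetes.io/is-default-class annotation set to "true". The sketch below filters a name/annotation listing for such classes; default_storageclasses is a hypothetical helper, and the jsonpath expression in the comment is an assumption about how the tab-separated input could be produced.

```shell
# default_storageclasses: read "NAME<TAB>IS_DEFAULT" lines on stdin and print
# the names whose is-default-class annotation value is exactly "true".
default_storageclasses() {
  awk -F '\t' '$2 == "true" { print $1 }'
}

# Possible usage against a live cluster (requires kubectl access):
#   kubectl get storageclass \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}' \
#     | default_storageclasses
```

An empty result suggests that PersistentVolumeClaims without an explicit storageClassName would stay unbound; more than one name suggests an ambiguous default worth resolving before deployment.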
Credentials
The following credentials must be available to the cluster operator:
- KUBECONFIG: Path to kubeconfig file with cluster-admin permissions (defaults to ~/.kube/config)
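The default described above can be made explicit in a script. This is a small sketch with a hypothetical helper, kubeconfig_path; it assumes that when KUBECONFIG holds a colon-separated list, reporting the first entry is sufficient for a quick sanity check (kubectl itself merges all listed files).

```shell
# kubeconfig_path: print the first kubeconfig file in kubectl's lookup order.
# Falls back to ~/.kube/config when KUBECONFIG is unset or empty.
kubeconfig_path() {
  cfg=${KUBECONFIG:-$HOME/.kube/config}
  printf '%s\n' "${cfg%%:*}"
}

# Example: warn early if the file is missing or unreadable.
#   [ -r "$(kubeconfig_path)" ] || echo "kubeconfig not readable" >&2
```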
Quick Install
# Verify cluster access and version
kubectl version
kubectl cluster-info
# Verify cluster-admin permissions
kubectl auth can-i create namespaces --all-namespaces
kubectl auth can-i create customresourcedefinitions --all-namespaces
# Verify default StorageClass
kubectl get storageclass
Code Evidence
Version requirements from README.md (referencing prerequisites):
The Kubeflow AI reference platform can be installed via Packaged Distributions
or Kubeflow Manifests.
Kubernetes version compatibility from ROADMAP.md:L49:
* Kubernetes 1.29 support
Kubernetes version compatibility from ROADMAP.md:L69:
* Kubernetes 1.25 and 1.26 support
PodSecurityStandards enforcement from ROADMAP.md:L12:
* PodSecurityStandards restricted is enforced for all system namespaces.
PodSecurityStandards baseline is enforced for user namespaces
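In Kubernetes, Pod Security Standards levels are enforced per namespace through the pod-security.kubernetes.io/enforce label. The sketch below only prints the kubectl command that would set that label; it does not show how the Kubeflow manifests apply it, which is not stated in this section. The helper name pss_label_command and the namespace names in the examples are assumptions for illustration.

```shell
# pss_label_command NAMESPACE LEVEL: print (without running) a kubectl command
# that sets the Pod Security Admission enforce label on a namespace.
# LEVEL must be one of the three Pod Security Standards levels.
pss_label_command() {
  case "$2" in
    privileged|baseline|restricted) ;;
    *) echo "unknown Pod Security Standards level: $2" >&2; return 1 ;;
  esac
  printf 'kubectl label --overwrite namespace %s pod-security.kubernetes.io/enforce=%s\n' "$1" "$2"
}

# Examples (namespace names are placeholders):
#   pss_label_command kubeflow restricted        # a system namespace
#   pss_label_command my-user-namespace baseline # a user namespace
```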
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| Unable to connect to the server | kubeconfig not set or cluster unreachable | Set KUBECONFIG or verify cluster endpoint with kubectl cluster-info |
| error: You must be logged in to the server (Unauthorized) | Expired or invalid credentials in kubeconfig | Refresh cluster credentials (e.g., aws eks update-kubeconfig or gcloud container clusters get-credentials) |
| forbidden: User cannot create resource | Insufficient RBAC permissions | Obtain cluster-admin ClusterRoleBinding from cluster owner |
| no matches for kind "PodSecurityPolicy" | Kubernetes 1.25+ removed PodSecurityPolicy | Upgrade to Kubeflow 1.10+ which uses PodSecurityStandards instead |
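When wrapping kubectl in automation, the error messages in this table can be matched by substring to pick a remediation hint. This is a sketch under the assumption that the messages appear verbatim in kubectl's output; diagnose_kubectl_error and the category names it prints are hypothetical.

```shell
# diagnose_kubectl_error MESSAGE: print a short category for a kubectl error
# message, based on the substrings listed in the Common Errors table.
diagnose_kubectl_error() {
  case "$1" in
    *"Unable to connect to the server"*) echo "connectivity" ;;
    *"You must be logged in to the server"*) echo "credentials" ;;
    *'no matches for kind "PodSecurityPolicy"'*) echo "podsecuritypolicy-removed" ;;
    *forbidden*) echo "rbac" ;;
    *) echo "unknown" ;;
  esac
}

# Example:
#   diagnose_kubectl_error "error: You must be logged in to the server (Unauthorized)"
```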
Compatibility Notes
- GKE (Google): Use gcloud container clusters get-credentials to obtain kubeconfig. Autopilot clusters have limitations with Istio sidecar injection.
- EKS (AWS): Use aws eks update-kubeconfig. Ensure the EBS CSI driver is installed for PersistentVolume support.
- AKS (Azure): Use az aks get-credentials. Enable the Azure Disk CSI driver for storage.
- On-premise: Ensure MetalLB or equivalent LoadBalancer controller is deployed. PV provisioner must be configured for the default StorageClass.
- Kind/Minikube: Suitable for development only. Minikube requires --memory=16384 --cpus=4 minimum.
Related Pages
- Implementation:Kubeflow_Kubeflow_Kubectl_Kustomize_Version_Check
- Implementation:Kubeflow_Kubeflow_Istio_Certmanager_Dex_Setup
- Implementation:Kubeflow_Kubeflow_Kustomize_Component_Apply
- Implementation:Kubeflow_Kubeflow_Profile_CRD_RBAC_Setup
- Implementation:Kubeflow_Kubeflow_Kubectl_Health_Check
- Implementation:Kubeflow_Kubeflow_Notebook_CRD_Creation
- Implementation:Kubeflow_Kubeflow_TrainJob_CRD_Creation
- Implementation:Kubeflow_Kubeflow_Katib_Experiment_CRD
- Implementation:Kubeflow_Kubeflow_KServe_InferenceService_CRD