Principle:Apache Spark K8s Prerequisites Verification
| Metadata | Value |
|---|---|
| Domains | Kubernetes, Deployment |
| Type | Principle |
| Related | Implementation:Apache_Spark_Kubectl_Auth_Check |
Overview
A pre-flight check pattern that validates cluster connectivity, RBAC permissions, and DNS resolution before deploying applications to a Kubernetes cluster.
Description
Before deploying Spark on Kubernetes, several prerequisites must be verified to ensure a successful submission. This defensive validation pattern catches configuration issues early, preventing obscure runtime failures that are difficult to diagnose in a distributed container environment.
The verification sequence covers three critical areas:
- Kubernetes API server connectivity -- The cluster must be reachable from the submission client. This is typically validated by confirming that `kubectl` can communicate with the API server at the expected host and port.
- RBAC permissions -- The service account used by the Spark driver pods must have sufficient role-based access control permissions. Specifically, the account needs the ability to create, list, edit, and delete pods, services, and configmaps. Without these permissions, the driver cannot spawn executor pods or create the driver service for executor-to-driver communication.
- Cluster DNS resolution -- Kubernetes DNS must be configured and functional. Spark uses Kubernetes services for driver-executor communication, and service discovery relies on DNS. If DNS is broken, executors cannot locate the driver pod.
The pattern follows a fail-fast approach: each check is performed in sequence, and failure at any stage halts the process with a clear diagnostic message rather than allowing the deployment to proceed into an undefined state.
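The three areas above can be expressed as an ordered list of `kubectl` invocations. The following is a minimal sketch, not a definitive implementation: the function name `preflight_commands`, the pod name `dns-check`, and the `busybox:1.28` image are assumptions chosen for illustration, and the `--as` impersonation flag only works if the caller is itself allowed to impersonate the service account.

```python
from typing import List, Tuple


def preflight_commands(namespace: str, service_account: str) -> List[Tuple[str, List[str]]]:
    """Return (check name, kubectl argv) pairs in the order the checks should run."""
    # Impersonate the driver's service account so the answer reflects its RBAC
    # permissions rather than those of whoever runs the check.
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return [
        # 1. Connectivity: exits non-zero if the API server cannot be reached.
        ("connectivity", ["kubectl", "cluster-info"]),
        # 2. RBAC: one representative permission; kubectl prints "yes" or "no".
        ("rbac", ["kubectl", "auth", "can-i", "create", "pods",
                  "--namespace", namespace, "--as", subject]),
        # 3. DNS: resolve a well-known service name from inside the cluster.
        #    The pod name and image are assumptions made for this sketch.
        ("dns", ["kubectl", "run", "dns-check", "--rm", "-i",
                 "--restart=Never", "--image=busybox:1.28",
                 "--namespace", namespace,
                 "--", "nslookup", "kubernetes.default"]),
    ]
```

Returning argument lists rather than shell strings keeps the commands safe to pass to `subprocess.run` without quoting concerns, and makes the sequence easy to inspect or log before anything is executed.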
Usage
Use this verification pattern in the following scenarios:
- First-time deployment -- When deploying Spark to a new Kubernetes cluster for the first time, run all pre-flight checks to validate the environment.
- Troubleshooting submission failures -- When `spark-submit` fails with permission-related or connectivity errors, use the individual checks to isolate the root cause.
- CI/CD pipeline gates -- Incorporate pre-flight checks as a gate in automated deployment pipelines to prevent deploying to misconfigured clusters.
- After cluster upgrades -- When the Kubernetes cluster is upgraded or RBAC policies are modified, re-run the checks to confirm that Spark-related permissions remain intact.
Theoretical Basis
The verification follows a sequential dependency chain where each step must succeed before the next is meaningful:
    verify(cluster_connectivity)
      -> verify(rbac_permissions)
        -> verify(dns_resolution)
          -> proceed_or_fail_fast
This ordering is deliberate:
- Connectivity must be verified first because RBAC checks require API server communication.
- RBAC must be verified before DNS because even if DNS works, missing permissions will prevent pod creation.
- DNS is verified last because it is a runtime dependency rather than a submission-time dependency.
The fail-fast behavior ensures that operators receive the earliest possible and most specific error, rather than a cascading chain of downstream failures.
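The sequential chain can be sketched as a small runner that stops at the first failing check. This is an illustrative sketch under stated assumptions: `run_preflight` and the `Check` type are hypothetical names, and the checks are plain callables so that real `kubectl` calls can be swapped in without changing the control flow.

```python
from typing import Callable, List, Optional, Tuple

# A check returns None on success or a diagnostic message on failure.
Check = Callable[[], Optional[str]]


def run_preflight(checks: List[Tuple[str, Check]]) -> Tuple[bool, str]:
    """Run checks in order and stop at the first failure (fail-fast)."""
    for name, check in checks:
        error = check()
        if error is not None:
            # Later checks are skipped: their results would not be meaningful.
            return False, f"pre-flight check '{name}' failed: {error}"
    return True, "all pre-flight checks passed"


if __name__ == "__main__":
    # Stub checks stand in for real kubectl calls in this example.
    passed, message = run_preflight([
        ("cluster_connectivity", lambda: None),
        ("rbac_permissions", lambda: "cannot create pods"),
        ("dns_resolution", lambda: None),  # never runs
    ])
    print(message)
```

Because the runner returns on the first error, the message names exactly one failed stage, which matches the goal of reporting the earliest and most specific problem. In a pipeline, the boolean result can be mapped to a non-zero exit status.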
Minimum Required Permissions
The following table summarizes the RBAC permissions required for Spark on Kubernetes:
| Resource | Required Verbs | Purpose |
|---|---|---|
| pods | create, list, edit, delete | Driver creates executor pods; monitors and cleans them up |
| services | create, delete | Driver service enables executor-to-driver communication |
| configmaps | create, delete | Used for Spark configuration distribution |
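The table can be expanded mechanically into one `kubectl auth can-i` query per resource and verb. The sketch below assumes hypothetical names (`REQUIRED_PERMISSIONS`, `can_i_commands`) and copies the verbs exactly as the table lists them. Note that Kubernetes RBAC rules are normally written with verbs such as `update` and `patch` rather than `edit`, so a Role derived from this table would likely need those verbs instead.

```python
from typing import Dict, List

# Mirrors the permissions table; "edit" is kept as written there.
REQUIRED_PERMISSIONS: Dict[str, List[str]] = {
    "pods": ["create", "list", "edit", "delete"],
    "services": ["create", "delete"],
    "configmaps": ["create", "delete"],
}


def can_i_commands(namespace: str, service_account: str) -> List[List[str]]:
    """Build one `kubectl auth can-i` argv per (resource, verb) pair."""
    # Query on behalf of the driver's service account, not the current user.
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return [
        ["kubectl", "auth", "can-i", verb, resource,
         "--namespace", namespace, "--as", subject]
        for resource, verbs in REQUIRED_PERMISSIONS.items()
        for verb in verbs
    ]
```

Keeping the matrix as data means that a change to the required permissions only touches the dictionary, and each generated command can be run and reported individually when isolating a missing permission.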