Implementation:Apache Spark K8s Config Properties
| Metadata | Value |
|---|---|
| Source | Doc: Running on K8s |
| Domains | Kubernetes, Configuration |
| Type | Pattern Doc |
| Related | Principle:Apache_Spark_K8s_Resource_Configuration |
Overview
Pattern documentation for spark.kubernetes.* configuration properties, pod templates, and RBAC used to configure Spark on Kubernetes.
Description
Spark on Kubernetes is configured through three mechanisms:
- `spark.kubernetes.*` properties -- Passed via `--conf` flags to `spark-submit`. These control the container image, namespace, resource requests and limits, secret mounts, volume mounts, environment variables, and labels/annotations applied to Spark pods.
- Pod template YAML files -- Referenced via `spark.kubernetes.driver.podTemplateFile` and `spark.kubernetes.executor.podTemplateFile`. These provide complete pod specifications that Spark uses as a starting point, overlaying its own required configuration on top.
- Kubernetes RBAC resources -- ServiceAccount, Role/ClusterRole, and RoleBinding/ClusterRoleBinding resources that must be applied to the cluster before Spark submission. The `spark-rbac.yaml` reference file provides a complete example.
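As a sketch of the third mechanism, the minimal RBAC setup from the upstream Spark documentation creates a service account for the driver and binds it to the `edit` cluster role so it can launch executor pods. The `default` namespace and the account name `spark` are illustrative choices, not requirements:

```shell
# Create a service account for the Spark driver (namespace "default" assumed here).
kubectl create serviceaccount spark --namespace=default

# Allow that account to create and manage executor pods in the namespace.
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=default:spark \
  --namespace=default
```

The driver is then pointed at this account via `spark.kubernetes.authenticate.driver.serviceAccountName=spark` at submission time.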
Key properties include:
- `spark.kubernetes.container.image` -- Required. The Docker image to use for driver and executor containers.
- `spark.kubernetes.namespace` -- The Kubernetes namespace to run in (default: current context namespace).
- `spark.kubernetes.driver.podTemplateFile` -- Path to a driver pod template YAML.
- `spark.kubernetes.executor.podTemplateFile` -- Path to an executor pod template YAML.
- `spark.kubernetes.driver.secrets.[SecretName]` -- Mount path for a named Kubernetes secret in the driver pod.
- `spark.kubernetes.executor.secrets.[SecretName]` -- Mount path for a named Kubernetes secret in executor pods.
- `spark.kubernetes.driver.volumes.[Type].[Name].mount.path` -- Mount path for a volume in the driver pod.
- `spark.kubernetes.executor.volumes.[Type].[Name].mount.path` -- Mount path for a volume in executor pods.
Spark is opinionated about certain pod template values and will always override them. Users should consult the pod template properties documentation to understand which values Spark manages.
Usage
Set required properties (at minimum spark.kubernetes.container.image) before submission. Use pod templates for advanced requirements such as node affinity, GPU scheduling, tolerations, or sidecar containers.
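A complete submission putting the required properties together might look like the following sketch, modeled on the upstream example; the API server address, image tag, and example jar path are placeholders to adapt to your cluster:

```shell
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<port> \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=spark:latest \
  local:///opt/spark/examples/jars/spark-examples.jar
```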
Code Reference
| Item | Reference |
|---|---|
| Configuration documentation | docs/running-on-kubernetes.md (L207-406) |
| Driver pod template example | driver-template.yml |
| Executor pod template example | executor-template.yml |
| RBAC configuration | resource-managers/kubernetes/integration-tests/dev/spark-rbac.yaml |
Key Properties
| Property | Description | Required |
|---|---|---|
| `spark.kubernetes.container.image` | Docker image for Spark containers | Yes |
| `spark.kubernetes.namespace` | Kubernetes namespace for pods | No (uses context default) |
| `spark.kubernetes.driver.podTemplateFile` | Path to driver pod template YAML | No |
| `spark.kubernetes.executor.podTemplateFile` | Path to executor pod template YAML | No |
| `spark.kubernetes.driver.secrets.[Name]` | Mount path for a secret in driver pod | No |
| `spark.kubernetes.executor.secrets.[Name]` | Mount path for a secret in executor pods | No |
| `spark.kubernetes.driver.volumes.[Type].[Name].mount.path` | Volume mount path for driver | No |
| `spark.kubernetes.executor.volumes.[Type].[Name].mount.path` | Volume mount path for executors | No |
| `spark.kubernetes.driver.request.cores` | CPU request for the driver pod | No |
| `spark.kubernetes.executor.request.cores` | CPU request for executor pods | No |
Inputs and Outputs
| Direction | Description |
|---|---|
| Inputs | spark.kubernetes.* properties, pod template YAML files, RBAC YAML resources |
| Outputs | Configured Kubernetes resources (pods, services, secrets, volumes) for Spark submission |
Examples
Minimal configuration
--conf spark.kubernetes.container.image=spark:latest
With pod templates
--conf spark.kubernetes.driver.podTemplateFile=driver-template.yml
--conf spark.kubernetes.executor.podTemplateFile=executor-template.yml
With secrets
--conf spark.kubernetes.driver.secrets.my-secret=/etc/secrets
--conf spark.kubernetes.executor.secrets.my-secret=/etc/secrets
With secret environment variables
--conf spark.kubernetes.driver.secretKeyRef.ENV_NAME=name:key
--conf spark.kubernetes.executor.secretKeyRef.ENV_NAME=name:key
With volumes
--conf spark.kubernetes.driver.volumes.hostPath.spark-logs.mount.path=/var/log/spark
--conf spark.kubernetes.driver.volumes.hostPath.spark-logs.options.path=/tmp/spark-logs
--conf spark.kubernetes.executor.volumes.emptyDir.scratch.mount.path=/tmp/scratch
With pod template for node affinity
Example driver-template.yml:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: spark-driver
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-type
            operator: In
            values:
            - spark-driver
  tolerations:
  - key: "spark-role"
    operator: "Equal"
    value: "driver"
    effect: "NoSchedule"
  containers:
  - name: spark-driver
    resources:
      requests:
        cpu: "2"
        memory: "4Gi"