Implementation:Mage ai Mage ai Custom Spark Configuration
| Knowledge Sources | |
|---|---|
| Domains | Spark, Configuration, Data_Integration |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Helm values file for deploying a custom Apache Spark cluster as part of the Mage AI integration layer.
Description
The spark.yaml file is a values file for the Bitnami Spark Helm chart (copyright VMware, Inc., Apache-2.0 licensed) that defines the deployment parameters for a custom Spark cluster integrated with Mage AI. Its 1038 lines cover the global, common, image, master, worker, and networking settings of a Kubernetes-based Spark deployment.
Key configuration sections include:
- Global parameters -- Global Docker image registry, pull secrets, and storage class settings.
- Common parameters -- Kubernetes version overrides, namespace overrides, cluster domain (cluster.local), common labels/annotations, init scripts (via dictionary, ConfigMap, or Secret), and a diagnostic mode toggle.
- Spark image parameters -- Docker image registry (docker.io), repository (templated as <DOCKERHUB>), tag (templated as <TAG>), pull policy (IfNotPresent), and debug mode.
- Spark master parameters -- Container ports for HTTP (8080), HTTPS (8480), and cluster communication, along with existing ConfigMap references.
- Spark worker parameters -- Analogous worker-specific settings for scaling and resource allocation.
- Networking -- Host network configuration with DNS policy adjustments.
The templated placeholders (<DOCKERHUB> and <TAG>) indicate this file is meant to be customized per deployment, replacing these with the actual custom Spark image coordinates.
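One way to perform that substitution is a small shell helper. This is a minimal sketch, not part of the Mage AI repository: the function name render_values and the output file name used below are hypothetical, and it assumes the placeholders appear literally as <DOCKERHUB> and <TAG>.

```shell
#!/bin/sh
# Hypothetical helper (not from the Mage AI repository).
# Reads a values file on stdin, replaces the two placeholders,
# and writes the result to stdout.
#   $1 = image repository, e.g. myregistry/mage-spark
#   $2 = image tag, e.g. 1.0.0
render_values() {
  repo="$1"
  tag="$2"
  # '|' is used as the sed delimiter because repositories contain '/'.
  # Assumes neither value contains '|' or '&'.
  sed -e "s|<DOCKERHUB>|${repo}|g" -e "s|<TAG>|${tag}|g"
}
```

For example, `render_values myregistry/mage-spark 1.0.0 < integrations/custom_spark/spark.yaml > spark.rendered.yaml` would write a filled-in copy and leave the original file untouched.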
Usage
This YAML file is used as a Helm values override when deploying a Spark cluster via the Bitnami Spark Helm chart within the Mage AI ecosystem. Assuming the Bitnami chart repository is registered under the alias bitnami, it is typically applied with:
helm install mage-spark bitnami/spark -f integrations/custom_spark/spark.yaml
Operators should replace the <DOCKERHUB> and <TAG> placeholders with their custom Spark image coordinates before deployment, or override image.repository and image.tag with --set at install time.
Code Reference
Source Location
- Repository: mage-ai
- File: integrations/custom_spark/spark.yaml
- Lines: 1-1038
Signature
# Copyright VMware, Inc.
# SPDX-License-Identifier: APACHE-2.0
global:
  imageRegistry: ""
  imagePullSecrets: []
  storageClass: ""
kubeVersion: ""
nameOverride: ""
fullnameOverride: ""
namespaceOverride: ""
commonLabels: {}
commonAnnotations: {}
clusterDomain: cluster.local
extraDeploy: []
initScripts: {}
initScriptsCM: ""
initScriptsSecret: ""
diagnosticMode:
  enabled: false
  command:
    - sleep
  args:
    - infinity
image:
  registry: docker.io
  repository: <DOCKERHUB>
  tag: <TAG>
  pullPolicy: IfNotPresent
  pullSecrets: []
  debug: false
hostNetwork: false
master:
  existingConfigmap: ""
  containerPorts:
    http: 8080
    https: 8480
    # ... cluster port and additional settings
Import
# Helm values file - not imported directly in code
# Applied via: helm install mage-spark bitnami/spark -f integrations/custom_spark/spark.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| global.imageRegistry | string | No | Global Docker image registry override |
| global.imagePullSecrets | array | No | Global Docker registry secret names |
| global.storageClass | string | No | Global StorageClass for Persistent Volumes |
| image.repository | string | Yes | Docker image repository for the custom Spark image |
| image.tag | string | Yes | Docker image tag for the custom Spark image |
| master.containerPorts.http | integer | No | HTTP port for the Spark master web UI (default: 8080) |
| master.containerPorts.https | integer | No | HTTPS port for the Spark master web UI (default: 8480) |
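Because image.repository and image.tag are required, a pre-flight check can catch a values file that still contains the placeholders before Helm is invoked. The following is a minimal sketch, assuming a hypothetical helper named check_placeholders that reads the values file on standard input. It is only relevant when the placeholders are edited in the file itself, not when they are overridden with --set.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not from the Mage AI repository).
# Reads a values file on stdin. Prints any line that still contains
# <DOCKERHUB> or <TAG> (with its line number) and returns 1 in that
# case; returns 0 when no placeholder remains.
check_placeholders() {
  if grep -n -E '<DOCKERHUB>|<TAG>'; then
    echo "error: unresolved image placeholders in values file" >&2
    return 1
  fi
  return 0
}
```

A possible use is `check_placeholders < spark.rendered.yaml && helm install ...`, where spark.rendered.yaml is an illustrative file name.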
Outputs
| Name | Type | Description |
|---|---|---|
| Kubernetes resources | YAML manifests | Spark master and worker StatefulSets, Services, ConfigMaps, and related Kubernetes objects |
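To see which resource kinds a given set of values would produce without installing anything, the chart can be rendered locally with helm template and the output summarized. This is a sketch under assumptions: count_kinds is a hypothetical helper, and it only counts top-level `kind:` lines in the rendered multi-document YAML.

```shell
#!/bin/sh
# Hypothetical helper (not from the Mage AI repository).
# Reads rendered manifests on stdin and prints one line per resource
# kind, followed by how many times it occurs, sorted by kind name.
# Only unindented "kind:" lines are counted, so nested fields such as
# roleRef.kind are ignored.
count_kinds() {
  awk '/^kind:/ { n[$2]++ } END { for (k in n) print k, n[k] }' | sort
}

# Intended use (requires helm and the bitnami repo alias):
#   helm template mage-spark bitnami/spark \
#     -f integrations/custom_spark/spark.yaml \
#     --set image.repository=myregistry/mage-spark \
#     --set image.tag=1.0.0 | count_kinds
```

Rendering first keeps the cluster untouched, which makes it a low-risk way to review the effect of a values change.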
Usage Examples
# Deploy a custom Spark cluster with Mage AI integration
helm install mage-spark bitnami/spark \
  -f integrations/custom_spark/spark.yaml \
  --set image.repository=myregistry/mage-spark \
  --set image.tag=1.0.0

# Verify the deployment
kubectl get pods -l app.kubernetes.io/name=spark

# Access the Spark master web UI
# (the port after the colon must match the HTTP port exposed by the master Service,
# which may differ from the container port)
kubectl port-forward svc/mage-spark-master-svc 8080:8080