Implementation:Apache Spark Cluster Config Pattern

From Leeroopedia


Metadata

Field | Value
Source Type | Doc
Source Name | Spark Cluster Overview
Source URL | https://spark.apache.org/docs/latest/cluster-overview.html
Domains | Configuration
Type | Pattern Doc

Overview

Configuration pattern documentation for the --master and --deploy-mode parameters used in Spark application submission.

Description

The cluster configuration pattern uses --master to specify the cluster manager URL and --deploy-mode to choose where the driver runs: client (the driver runs on the machine that submits the application) or cluster (the driver runs inside the cluster). Additional --conf parameters, together with dedicated flags such as --driver-memory and --executor-memory, fine-tune resource allocation.

This is a configuration pattern, not a single API. It spans the spark-submit CLI, SparkSession.builder, and the SparkLauncher API, all of which accept the same master URL formats and deploy mode values.

Usage

Always specify --master when submitting to a cluster. Use --deploy-mode cluster for production and --deploy-mode client for development/debugging.
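The usage rule above can be checked before anything is submitted. The following is a minimal sketch, not part of Spark: `check_submit_mode` is a hypothetical helper name, and it only encodes two constraints, that the deploy mode must be client or cluster, and that a local master cannot be combined with cluster deploy mode (spark-submit rejects that combination).

```python
def check_submit_mode(master: str, deploy_mode: str = "client") -> None:
    """Raise ValueError for combinations that spark-submit is known to reject.

    Hypothetical helper for illustration; it is not a Spark API.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    # A local master runs everything in one JVM, so there is no cluster
    # on which the driver could be placed.
    if master.startswith("local") and deploy_mode == "cluster":
        raise ValueError("cluster deploy mode is not compatible with a local master")


# Accepted: typical production and development combinations.
check_submit_mode("yarn", "cluster")
check_submit_mode("local[4]")
```

Calling this helper before building the command line surfaces a misconfiguration as a Python exception rather than as a failed submission.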

Code Reference

Source: docs/submitting-applications.md (L60-175), docs/cluster-overview.md (L25-122)

Master URL Formats

Master URL | Description
local | Run locally with 1 worker thread (no parallelism)
local[N] | Run locally with N worker threads
local[*] | Run locally with as many worker threads as logical cores
spark://host:port | Connect to a Spark Standalone cluster manager
yarn | Connect to a YARN cluster (reads config from HADOOP_CONF_DIR)
k8s://https://host:port | Connect to a Kubernetes API server
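To make the table concrete, here is a rough sketch that maps a master URL string to the kind of cluster manager it selects. `classify_master` and its return labels are hypothetical names chosen for illustration, and the function deliberately recognizes only the six forms listed above; Spark itself accepts further variants that this sketch would reject.

```python
import re

# Matches "local", "local[N]" and "local[*]" only (the forms in the table).
_LOCAL_RE = re.compile(r"^local(\[(\d+|\*)\])?$")


def classify_master(url: str) -> str:
    """Return a label for the cluster manager a master URL points to.

    Hypothetical helper for illustration; it is not a Spark API.
    """
    if _LOCAL_RE.match(url):
        return "local"
    if url.startswith("spark://"):
        return "standalone"
    if url == "yarn":
        return "yarn"
    if url.startswith("k8s://"):
        return "kubernetes"
    raise ValueError(f"unrecognized master URL: {url!r}")


for example in ("local[4]", "spark://master:7077", "yarn", "k8s://https://apiserver:6443"):
    print(example, "->", classify_master(example))
```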

I/O

Inputs

Parameter | Type | Required | Default | Description
--master | String | Yes | (none) | Cluster manager URL
--deploy-mode | String | No | client | Driver execution location: client or cluster
--conf | key=value pairs | No | (none) | Spark configuration properties
--driver-memory | String | No | 1g | Memory allocated to the driver process
--executor-memory | String | No | 1g | Memory allocated to each executor process
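The inputs above can be gathered into one object and rendered as a spark-submit argument list. This is a sketch under stated assumptions: `SubmitConfig` and `to_argv` are hypothetical names, the defaults mirror the table (client, 1g, 1g), and the example property `spark.executor.instances` is only an illustration of a --conf pair.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubmitConfig:
    """Hypothetical container for the submission inputs; not a Spark API."""

    master: str                      # required: cluster manager URL
    app: str                         # application JAR or script to submit
    deploy_mode: str = "client"      # default from the inputs table
    driver_memory: str = "1g"        # default from the inputs table
    executor_memory: str = "1g"      # default from the inputs table
    conf: Dict[str, str] = field(default_factory=dict)

    def to_argv(self) -> List[str]:
        """Render the configuration as an argument list for spark-submit."""
        argv = [
            "spark-submit",
            "--master", self.master,
            "--deploy-mode", self.deploy_mode,
            "--driver-memory", self.driver_memory,
            "--executor-memory", self.executor_memory,
        ]
        # Sort keys so the generated command line is deterministic.
        for key, value in sorted(self.conf.items()):
            argv += ["--conf", f"{key}={value}"]
        argv.append(self.app)
        return argv


cfg = SubmitConfig(
    master="yarn",
    app="app.jar",
    deploy_mode="cluster",
    conf={"spark.executor.instances": "4"},
)
print(" ".join(cfg.to_argv()))
```

On a machine where spark-submit is on the PATH, the resulting list could be passed to `subprocess.run(cfg.to_argv(), check=True)`; passing a list rather than a single string avoids shell quoting problems with values such as local[*].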

Outputs

Output | Type | Description
Configured submission parameters | Configuration | Complete set of parameters passed to the cluster manager

Examples

Local Mode

spark-submit --master local[4] app.jar

Standalone Cluster

spark-submit --master spark://master:7077 --deploy-mode cluster app.jar

YARN Cluster

spark-submit --master yarn --deploy-mode cluster app.jar

Kubernetes Cluster

spark-submit --master k8s://https://apiserver:6443 --deploy-mode cluster app.jar
