Workflow:Apache Spark Standalone Cluster Deployment

From Leeroopedia


Knowledge Sources
Domains Cluster_Management, Infrastructure, Operations
Last Updated 2026-02-08 22:00 GMT

Overview

End-to-end process for deploying and managing a Spark standalone cluster, from initial installation through master/worker startup to cluster shutdown.

Description

This workflow covers deploying Spark in its built-in standalone cluster mode, which provides a simple way to run Spark without requiring external cluster managers like YARN or Kubernetes. The standalone mode includes a master process that coordinates resource allocation and one or more worker processes that execute application tasks. The deployment uses shell scripts in sbin/ for lifecycle management and supports both manual single-node setup and automated multi-node deployment via SSH.

Usage

Execute this workflow when you need to set up a dedicated Spark cluster without the overhead of YARN or Kubernetes. It is suitable for teams that want full control over their Spark deployment, for small-to-medium clusters, for development and testing environments, and for scenarios where Spark is the primary workload.

Execution Steps

Step 1: Installation

Place a compiled Spark distribution on each node in the cluster. This can be a pre-built release downloaded from the Spark website or a custom build created via dev/make-distribution.sh. All nodes must have the same Spark version and compatible Java installations.

Key considerations:

  • All nodes need identical Spark distributions
  • Java 17 or 21 must be installed on every node
  • The distribution should be placed at the same path on all machines
  • Network connectivity between all nodes is required
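
A minimal per-node install sketch, assuming the pre-built 4.0.0 release and an install path of /opt/spark (the version, mirror URL, and path are illustrative choices, not requirements):

    # Fetch and unpack a pre-built release to the same path on every node.
    curl -fLO https://downloads.apache.org/spark/spark-4.0.0/spark-4.0.0-bin-hadoop3.tgz
    tar -xzf spark-4.0.0-bin-hadoop3.tgz
    sudo mv spark-4.0.0-bin-hadoop3 /opt/spark

    # Confirm the Java requirement (17 or 21) on each node.
    java -version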

Step 2: Cluster Configuration

Configure the cluster by setting environment variables in conf/spark-env.sh and listing worker hostnames in conf/workers. The environment file controls master/worker resource limits, ports, and directories. The workers file enables automated multi-node deployment via the cluster launch scripts.

Key considerations:

  • Create conf/spark-env.sh from the provided template
  • Set SPARK_MASTER_HOST to bind the master to a specific address
  • Configure SPARK_WORKER_CORES and SPARK_WORKER_MEMORY per worker
  • List all worker hostnames (one per line) in conf/workers
  • Copy spark-env.sh to all worker machines
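
A minimal configuration sketch, assuming the /opt/spark path from Step 1, a master at 10.0.0.1, and two worker hosts named worker1 and worker2 (all hypothetical values; size the resource limits to your own hardware):

    # conf/spark-env.sh (created by copying conf/spark-env.sh.template)
    export SPARK_MASTER_HOST=10.0.0.1    # address the master binds to
    export SPARK_WORKER_CORES=16         # cores each worker offers
    export SPARK_WORKER_MEMORY=56g       # memory each worker offers

    # conf/workers -- one worker hostname per line
    worker1
    worker2

    # Push identical configuration to every worker machine.
    scp conf/spark-env.sh conf/workers worker1:/opt/spark/conf/
    scp conf/spark-env.sh conf/workers worker2:/opt/spark/conf/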

Step 3: Starting the Master

Launch the Spark master daemon using sbin/start-master.sh. The master process manages cluster resources and serves the Web UI for monitoring. Once started, it logs its spark:// URL, which workers use to register.

Key considerations:

  • Default master port is 7077; Web UI is on port 8080
  • The master URL format is spark://HOST:PORT
  • The Web UI displays registered workers, running applications, and resource usage
  • For high availability, multiple masters can be configured with ZooKeeper
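
A sketch of bringing the master up and finding its URL, reusing the hypothetical 10.0.0.1 address from the configuration above:

    # Launch the master daemon on the designated machine.
    /opt/spark/sbin/start-master.sh

    # The daemon log under /opt/spark/logs/ contains a line such as
    #   Starting Spark master at spark://10.0.0.1:7077
    # and the Web UI is served at http://10.0.0.1:8080 by default.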

Step 4: Starting Workers

Launch worker daemons that register with the master. Workers can be started individually on each machine or collectively via sbin/start-workers.sh, which reads conf/workers and connects to each host over SSH. Each worker offers its CPU and memory resources to the cluster.

Key considerations:

  • Use sbin/start-worker.sh MASTER_URL for individual workers
  • Use sbin/start-workers.sh for automated multi-node startup via SSH
  • Password-less SSH is required for the cluster launch scripts
  • Workers report their resources to the master (total RAM minus 1 GiB by default)
  • Use sbin/start-all.sh to start both master and all workers
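
Both startup styles, sketched against the hypothetical master above:

    # Individually: run on each worker machine, pointing at the master URL.
    /opt/spark/sbin/start-worker.sh spark://10.0.0.1:7077

    # Collectively: run on the master machine; reads conf/workers and
    # starts a worker on each listed host over password-less SSH.
    /opt/spark/sbin/start-workers.sh

    # Or launch the master and every listed worker in one step.
    /opt/spark/sbin/start-all.sh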

Step 5: Application Execution

Submit applications to the running cluster using spark-submit with the master URL. The standalone cluster supports both client and cluster deploy modes. Resource allocation is managed by the master, which assigns executors on available workers.

Key considerations:

  • Use --master spark://HOST:7077 to target the standalone cluster
  • In cluster mode, --supervise enables automatic driver restart on failure
  • Resource limits are enforced per-application based on configuration
  • The History Server (sbin/start-history-server.sh) serves the Web UIs of completed applications, provided event logging is enabled
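
A submission sketch using the SparkPi example that ships with Spark (the exact jar name varies with the Spark and Scala versions; resource sizes are illustrative):

    # Client mode: the driver runs in this shell, executors on the workers.
    /opt/spark/bin/spark-submit \
      --master spark://10.0.0.1:7077 \
      --executor-memory 4g \
      --total-executor-cores 8 \
      --class org.apache.spark.examples.SparkPi \
      /opt/spark/examples/jars/spark-examples_2.13-4.0.0.jar 1000

    # Cluster mode: the driver runs on a worker and is restarted on failure.
    /opt/spark/bin/spark-submit \
      --master spark://10.0.0.1:7077 \
      --deploy-mode cluster --supervise \
      --class org.apache.spark.examples.SparkPi \
      /opt/spark/examples/jars/spark-examples_2.13-4.0.0.jar 1000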

Step 6: Cluster Shutdown

Stop the cluster daemons using the sbin/ shutdown scripts. Workers can be stopped individually or collectively, and the master has its own stop script. Use sbin/stop-all.sh for a complete cluster shutdown.

Key considerations:

  • sbin/stop-workers.sh stops workers on all machines listed in conf/workers
  • sbin/stop-master.sh stops the master on the current machine
  • sbin/stop-all.sh stops both master and all workers
  • Graceful decommissioning is available via sbin/decommission-worker.sh
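
A shutdown sketch mirroring the startup scripts (run from the master machine, except where noted):

    # Stop every worker listed in conf/workers, then the master.
    /opt/spark/sbin/stop-workers.sh
    /opt/spark/sbin/stop-master.sh

    # Or stop the whole cluster in one step.
    /opt/spark/sbin/stop-all.sh

    # Alternatively, drain a worker gracefully by running this
    # on the worker's own machine before taking it out of service.
    /opt/spark/sbin/decommission-worker.sh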

Execution Diagram

Installation → Cluster Configuration → Starting the Master → Starting Workers → Application Execution → Cluster Shutdown

GitHub URL

Workflow Repository