Implementation:Apache Spark Start Workers

From Leeroopedia


Field | Value
Source Repo | Apache Spark
Domains | Deployment
Type | API Doc
Related | Principle:Apache_Spark_Worker_Fleet_Management

Overview

Shell scripts that start Spark worker daemons on all configured cluster machines via SSH.

Description

Three scripts work together to start the worker fleet:

  • sbin/start-workers.sh -- entry point that builds the master URL from SPARK_MASTER_HOST and SPARK_MASTER_PORT and delegates to sbin/workers.sh
  • sbin/workers.sh -- iterates over hosts in conf/workers and SSHs to each one to execute the worker startup command
  • sbin/start-worker.sh -- runs on each worker machine to start the local Worker JVM process, supporting multiple instances per machine via SPARK_WORKER_INSTANCES

Workers automatically register with the master upon startup, making their resources (CPU cores, memory) available for application execution.
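
The delegation chain described above can be pictured with a small dry-run sketch. It does not reproduce the actual Spark scripts: list_hosts and plan_worker_start are hypothetical helper names, /opt/spark is an assumed install path, and the fallback to localhost when no host file exists is an assumption about sbin/workers.sh. The sketch only prints the command each host would receive; it never opens an SSH connection.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the start-workers.sh -> workers.sh -> start-worker.sh chain.
# list_hosts and plan_worker_start are hypothetical names, not part of Spark.

# Print one host per line, dropping comments, surrounding whitespace, and blank lines.
list_hosts() {
  local workers_file="$1"
  if [ -f "$workers_file" ]; then
    sed 's/#.*$//; s/^[[:space:]]*//; s/[[:space:]]*$//; /^$/d' "$workers_file"
  else
    echo "localhost"   # assumed fallback when no host list exists
  fi
}

# Print the remote command that would be sent to every host in the list.
plan_worker_start() {
  local workers_file="$1" master_url="$2" host
  local ssh_opts="${SPARK_SSH_OPTS:--o StrictHostKeyChecking=no}"
  local spark_home="${SPARK_HOME:-/opt/spark}"   # assumed install path
  list_hosts "$workers_file" | while IFS= read -r host; do
    echo "ssh $ssh_opts $host cd $spark_home ; $spark_home/sbin/start-worker.sh $master_url"
  done
}
```

Calling plan_worker_start conf/workers spark://master:7077 prints one line per host, which makes it easy to review the host list before running the real script.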

Usage

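Run sbin/start-workers.sh on the master node after sbin/start-master.sh. Workers register with the master automatically. No additional configuration is needed as long as conf/workers lists the worker hosts and the master node can reach each of them over SSH without a password prompt.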
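
Whether the SSH keys are set up properly can be checked before starting the fleet. The following is a minimal sketch, not part of Spark: preflight_ssh is a hypothetical helper, and the SSH_CMD variable exists only so the ssh binary can be swapped out for testing. It relies on the OpenSSH BatchMode option, which makes ssh fail instead of prompting for a password.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: verify that each host accepts key-based SSH.
# SSH_CMD is an illustration-only override; it defaults to the real ssh client.

preflight_ssh() {
  local failed=0 host
  for host in "$@"; do
    # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
    if ! ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "cannot reach $host over SSH without a password" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For example, preflight_ssh worker1 worker2 returns a non-zero status and names the failing hosts on stderr if either one cannot be reached non-interactively.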

Code Reference

Source: sbin/start-workers.sh (L1-47), sbin/start-worker.sh (L1-93), sbin/workers.sh (L1-121)

Signatures:

# Start all workers (run on master node, no arguments needed)
sbin/start-workers.sh

# Start a single worker on the local machine
sbin/start-worker.sh <master-url>

Key environment variables:

Variable | Default | Purpose
SPARK_WORKER_INSTANCES | 1 | Number of worker JVMs to start per machine
SPARK_WORKER_PORT | (random) | RPC port for the worker process
SPARK_WORKER_WEBUI_PORT | 8081 | HTTP port for the worker monitoring Web UI
SPARK_SSH_OPTS | -o StrictHostKeyChecking=no | SSH options for remote execution
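
When several instances run on one machine, each needs its own ports. The sketch below shows one way the per-instance ports can be derived from the variables above, assuming each instance offsets the base port by its instance number minus one. worker_ports is a hypothetical helper written for illustration; check sbin/start-worker.sh in your Spark version for the exact behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ports each worker instance would use,
# assuming instance i uses base port + i - 1.

worker_ports() {
  local instances="${SPARK_WORKER_INSTANCES:-1}"
  local webui_base="${SPARK_WORKER_WEBUI_PORT:-8081}"
  local i=1 rpc
  while [ "$i" -le "$instances" ]; do
    if [ -n "${SPARK_WORKER_PORT:-}" ]; then
      rpc=$(( SPARK_WORKER_PORT + i - 1 ))
    else
      rpc="random"   # no fixed RPC port configured
    fi
    echo "instance=$i rpc=$rpc webui=$(( webui_base + i - 1 ))"
    i=$(( i + 1 ))
  done
}
```

With SPARK_WORKER_INSTANCES=2 and SPARK_WORKER_PORT=7078, the second instance is reported with RPC port 7079 and Web UI port 8082.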

I/O

Direction | Description
Inputs | Master URL (auto-detected from SPARK_MASTER_HOST:SPARK_MASTER_PORT), conf/workers host list, SSH keys
Outputs | Running Worker JVM processes on all hosts, Web UIs at http://<worker>:8081 by default (additional instances on the same machine use the following ports)
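
The master URL auto-detection can be sketched as follows. This is a simplified illustration rather than the script's actual code: master_url is a hypothetical function name, and the fallbacks to the local fully qualified hostname and to port 7077 are assumptions based on Spark's standalone defaults.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the spark:// URL that is passed to each worker.

master_url() {
  # Fall back to the local hostname and port 7077 when the variables are unset.
  local host="${SPARK_MASTER_HOST:-$(hostname -f)}"
  local port="${SPARK_MASTER_PORT:-7077}"
  echo "spark://${host}:${port}"
}
```

Setting SPARK_MASTER_HOST explicitly is the safer choice on machines where the local hostname does not resolve from the worker hosts.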

Examples

Start all workers from the master node:

./sbin/start-workers.sh

Start a single worker on the local machine:

./sbin/start-worker.sh spark://master:7077

Start two worker instances on the local machine:

SPARK_WORKER_INSTANCES=2 ./sbin/start-worker.sh spark://master:7077
