Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Heuristic:Apache Spark Memory Tuning Tips

From Leeroopedia





Knowledge Sources
Domains Optimization, Memory_Management
Last Updated 2026-02-08 22:00 GMT

Overview

Memory management heuristics for Spark's unified memory model: 60% of (heap - 300MB) for execution+storage, GC tuning proportional to object count, and JVM limits at 200GB per executor.

Description

Spark uses a unified memory manager that divides JVM heap into regions for execution (shuffles, joins, sorts) and storage (cached RDDs). The key insight is that Spark reserves 300MB of heap for internal metadata and safety, then allocates 60% of the remainder as unified memory. Within this unified region, execution and storage can borrow from each other asymmetrically: execution can evict storage but storage cannot evict execution. Understanding this model is critical for tuning memory-intensive workloads.

Usage

Use these heuristics when you encounter OutOfMemoryError during Spark jobs, when tuning spark.memory.fraction and spark.memory.storageFraction, or when GC pauses are degrading performance. Also apply when deciding executor memory sizing for cluster configuration.

The Insight (Rule of Thumb)

  • Action: Spark reserves 300MB off the top of each executor's heap, then allocates 60% of the remainder as unified memory.
  • Value: `spark.memory.fraction` = 0.6 (default); `spark.memory.storageFraction` = 0.5 (default).
  • Trade-off: Increasing `spark.memory.fraction` above 0.6 reduces the safety buffer for user data structures and internal metadata, risking OOM on sparse or unusually large records.
  • Rule: For a 10GB executor, usable unified memory = (10GB - 300MB) * 0.6 = ~5.8GB. Of that, ~2.9GB defaults to storage.
  • JVM Limit: Never allocate more than 200GB to a single JVM. Instead, run multiple executors per node.
  • GC Strategy: Since Spark 4.0, G1GC is the default (was ParallelGC). With large heaps, increase `-XX:G1HeapRegionSize`.
  • Eden Sizing: For HDFS workloads with 128MB blocks and 3-4 concurrent tasks: set Eden to ~1.5GB (4 * 3 * 128MB).

Reasoning

Java object overhead is substantial: a 10-character String consumes ~60 bytes (not 10), HashMap entries wrap each key-value in an object with 16-byte headers plus 8-byte pointers. This 2-5x memory amplification means GC pressure is proportional to object count, not data size. Serialized storage (using Kryo or compressed formats) dramatically reduces both memory footprint and GC overhead by storing data as byte arrays rather than deserialized objects.

The asymmetric borrowing rule (execution can evict storage, but not vice versa) is an intentional design trade-off: execution memory is harder to spill cleanly mid-operation, while cached blocks can be recomputed from their lineage. This means cache operations may silently fail if execution has consumed the storage region.

The 200GB JVM limit reflects empirical observations that JVM garbage collectors behave poorly with very large heaps, producing long pause times that can cause executor timeouts and task failures.

Code Evidence

From `docs/tuning.md:133-142`:

spark.memory.fraction = 0.6 (default)
  - Size of M as fraction of (JVM heap - 300MB)
  - Rest 40% reserved for user data structures, internal metadata, OOM safeguarding

spark.memory.storageFraction = 0.5 (default)
  - Size of R as fraction of M
  - Storage space where cached blocks immune to eviction by execution

Java object overhead from `docs/tuning.md:92-106`:

- Java object header: ~16 bytes per object
- String overhead: ~40 bytes + (2 bytes per character) for UTF-16 encoding
- HashMap/LinkedList: Wrapper object per entry with header + pointers (8 bytes each)
- Primitive boxing: Creates java.lang.Integer objects instead of raw ints

JVM size limit from `docs/hardware-provisioning.md:52-68`:

Recommendation: 8 GB to hundreds of GB per machine
Allocate at most 75% of memory to Spark (rest for OS/buffer cache)

Java VM behavior > 200 GB RAM:
  - Launch multiple executors per node instead of single large executor

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment