Heuristic: Apache Spark Memory Tuning Tips
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Memory_Management |
| Last Updated | 2026-02-08 22:00 GMT |
Overview
Memory management heuristics for Spark's unified memory model: 60% of (heap - 300MB) for execution+storage, GC pressure proportional to object count rather than data size, and a practical JVM ceiling of ~200GB per executor.
Description
Spark uses a unified memory manager that divides JVM heap into regions for execution (shuffles, joins, sorts) and storage (cached RDDs). The key insight is that Spark reserves 300MB of heap for internal metadata and safety, then allocates 60% of the remainder as unified memory. Within this unified region, execution and storage can borrow from each other asymmetrically: execution can evict storage but storage cannot evict execution. Understanding this model is critical for tuning memory-intensive workloads.
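The split described above is simple arithmetic, so it can be sketched directly. This is an illustrative calculation using the documented defaults, not Spark's actual `UnifiedMemoryManager` code; the function name is invented for the example.

```python
# Sketch of Spark's unified-memory arithmetic (defaults from docs/tuning.md).
# Illustrative only -- Spark computes this internally in UnifiedMemoryManager.
RESERVED_MB = 300  # fixed reservation for internal metadata and safety

def unified_memory_mb(heap_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Return (unified, storage_region, execution_region) sizes in MB."""
    usable = heap_mb - RESERVED_MB
    unified = usable * memory_fraction        # spark.memory.fraction
    storage = unified * storage_fraction      # spark.memory.storageFraction
    execution = unified - storage             # execution can also borrow from storage
    return unified, storage, execution

# A 10GB executor, matching the rule of thumb in this card:
u, s, e = unified_memory_mb(10 * 1024)
print(round(u), round(s), round(e))  # -> 5964 2982 2982 (~5.8GB unified, ~2.9GB storage)
```

Note that the storage region is only *eviction-immune* up to `storage_fraction`; storage may still grow beyond it by borrowing free execution memory.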
Usage
Use these heuristics when you encounter OutOfMemoryError during Spark jobs, when tuning spark.memory.fraction and spark.memory.storageFraction, or when GC pauses are degrading performance. Also apply when deciding executor memory sizing for cluster configuration.
The Insight (Rule of Thumb)
- Action: Spark reserves 300MB off the top of each executor's heap, then allocates 60% of the remainder as unified memory.
- Value: `spark.memory.fraction` = 0.6 (default); `spark.memory.storageFraction` = 0.5 (default).
- Trade-off: Increasing `spark.memory.fraction` above 0.6 reduces the safety buffer for user data structures and internal metadata, risking OOM on sparse or unusually large records.
- Rule: For a 10GB executor, usable unified memory = (10GB - 300MB) * 0.6 = ~5.8GB. Of that, ~2.9GB defaults to storage.
- JVM Limit: Never allocate more than 200GB to a single JVM. Instead, run multiple executors per node.
- GC Strategy: Since Spark 4.0, G1GC is the default (was ParallelGC). With large heaps, increase `-XX:G1HeapRegionSize`.
- Eden Sizing: For HDFS workloads reading 128MB blocks, a decompressed block needs roughly 3x its size; with 4 concurrent tasks per executor, set Eden to ~1.5GB (4 tasks * 3x decompression * 128MB).
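The heuristics above can be combined into a launch configuration. The values below (10g executor, 32m G1 regions, a 2g young generation sized at 4/3 of the ~1.5GB Eden estimate to leave room for survivor spaces) are example assumptions for one workload, not universal recommendations; in particular, fixing the young generation with `-Xmn` overrides G1's adaptive sizing, so treat it as a starting point to measure against.

```python
# Example spark-submit configuration derived from the heuristics in this card.
# All concrete values are illustrative assumptions, not measured recommendations.
conf = {
    "spark.executor.memory": "10g",
    "spark.memory.fraction": "0.6",         # default; raise above 0.6 only with care
    "spark.memory.storageFraction": "0.5",  # default
    "spark.executor.extraJavaOptions":
        "-XX:+UseG1GC -XX:G1HeapRegionSize=32m -Xmn2g",
}

args = " ".join(f"--conf {k}={v}" for k, v in sorted(conf.items()))
print("spark-submit " + args + " app.py")
```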
Reasoning
Java object overhead is substantial: a 10-character String consumes ~60 bytes (not 10), HashMap entries wrap each key-value in an object with 16-byte headers plus 8-byte pointers. This 2-5x memory amplification means GC pressure is proportional to object count, not data size. Serialized storage (using Kryo or compressed formats) dramatically reduces both memory footprint and GC overhead by storing data as byte arrays rather than deserialized objects.
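The amplification quoted above is simple to reproduce with the per-object costs from `docs/tuning.md` (16-byte headers, 8-byte references, ~40 bytes plus 2 bytes per character for Strings). This is back-of-envelope arithmetic, not a JVM measurement, and the helper names are invented for the example:

```python
# Rough estimator of deserialized Java heap footprint vs. raw data size.
# Per-object costs are the figures quoted in docs/tuning.md; illustrative only.
HEADER, REF = 16, 8  # bytes: object header, reference pointer

def java_string_bytes(n_chars):
    # String object overhead (~40B) plus UTF-16 characters (2B each)
    return 40 + 2 * n_chars

def hashmap_entry_bytes(key_chars, value_chars):
    # Entry wrapper (header + key/value/next references) plus boxed String key/value
    entry = HEADER + 3 * REF
    return entry + java_string_bytes(key_chars) + java_string_bytes(value_chars)

# A 10-character ASCII string is 10 bytes of raw data but ~60 bytes on the heap:
print(java_string_bytes(10))  # -> 60, a 6x amplification
```

This is exactly why serialized (byte-array) storage cuts both footprint and GC cost: the garbage collector sees one array instead of thousands of small objects.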
The asymmetric borrowing rule (execution can evict storage, but not vice versa) is an intentional design trade-off: execution memory is harder to spill cleanly mid-operation, while cached blocks can be recomputed from their lineage. This means cache operations may silently fail if execution has consumed the storage region.
The 200GB JVM limit reflects empirical observations that JVM garbage collectors behave poorly with very large heaps, producing long pause times that can cause executor timeouts and task failures.
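A minimal sizing sketch follows from the two numbers in this section: cap each JVM at ~200GB and give Spark at most 75% of node RAM. The function is a hypothetical helper for illustration; real deployments must also account for cores per executor and memory overhead.

```python
# Sketch: split a large node into multiple executors so no single JVM
# exceeds the ~200GB heap ceiling. Assumes the 75% Spark share from
# docs/hardware-provisioning.md; illustrative arithmetic only.
import math

MAX_HEAP_GB = 200   # empirical JVM GC ceiling per executor
SPARK_SHARE = 0.75  # leave the rest for OS and buffer cache

def executors_for_node(node_ram_gb):
    """Return (executor count, heap per executor in GB) for one node."""
    spark_ram = node_ram_gb * SPARK_SHARE
    n = max(1, math.ceil(spark_ram / MAX_HEAP_GB))
    return n, spark_ram / n

# A 512GB node: 384GB for Spark, split into two 192GB executors
print(executors_for_node(512))  # -> (2, 192.0)
```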
Code Evidence
From `docs/tuning.md:133-142`:
- `spark.memory.fraction` = 0.6 (default): size of unified memory M as a fraction of (JVM heap - 300MB). The remaining 40% is reserved for user data structures, internal metadata, and OOM safeguarding.
- `spark.memory.storageFraction` = 0.5 (default): size of R as a fraction of M. R is the storage region where cached blocks are immune to eviction by execution.
Java object overhead from `docs/tuning.md:92-106`:
- Java object header: ~16 bytes per object
- String overhead: ~40 bytes + (2 bytes per character) for UTF-16 encoding
- HashMap/LinkedList: Wrapper object per entry with header + pointers (8 bytes each)
- Primitive boxing: Creates java.lang.Integer objects instead of raw ints
JVM size limit from `docs/hardware-provisioning.md:52-68`:
- Recommendation: 8 GB to hundreds of GB of memory per machine
- Allocate at most 75% of memory to Spark; leave the rest for the OS and buffer cache
- For machines with more than 200 GB of RAM: launch multiple executors per node instead of a single large executor