Heuristic: Apache Hudi Compaction Scheduling Safety
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Data_Integrity |
| Last Updated | 2026-02-08 20:00 GMT |
Overview
Compaction plans should be scheduled by the Hudi writer job, not by standalone compaction jobs, to avoid the risk of data loss.
Description
Apache Hudi MOR (Merge-on-Read) tables require periodic compaction to merge delta log files into base Parquet files. The compaction plan (which log files to compact) can be scheduled either inline within the streaming writer job or externally by a standalone compaction job. The Hudi codebase explicitly warns that scheduling compaction outside the writer job carries a risk of data loss, because the external scheduler may not have a consistent view of in-flight writes. The recommended pattern is to let the writer job schedule compaction plans and use the standalone compaction job only for execution of those plans.
Usage
Apply this heuristic when configuring MOR table compaction in Flink. If you are running a standalone `HoodieFlinkCompactor`, keep `--schedule false` (the default) and rely on the writer job to schedule compaction plans. Only enable `--schedule true` if you understand and accept the data loss risk.
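As a concrete sketch of the recommended split (writer schedules, offline job executes), the writer-side table options might look like the following. The option keys `compaction.schedule.enabled`, `compaction.async.enabled`, and `compaction.delta_commits` are taken from hudi-flink's `FlinkOptions`; verify them against your Hudi version before use.

```java
import java.util.HashMap;
import java.util.Map;

public class MorCompactionOptions {
    // Recommended writer-side options for a Flink Hudi MOR sink:
    // the writer job generates compaction plans, while plan execution is
    // left to a standalone HoodieFlinkCompactor run with the default
    // --schedule false.
    static Map<String, String> writerSideOptions() {
        Map<String, String> opts = new HashMap<>();
        opts.put("table.type", "MERGE_ON_READ");
        opts.put("compaction.schedule.enabled", "true"); // writer schedules plans (default)
        opts.put("compaction.async.enabled", "false");   // don't execute inline; offline job runs the plan
        opts.put("compaction.delta_commits", "5");       // plan every 5 delta commits (default)
        return opts;
    }

    public static void main(String[] args) {
        System.out.println(writerSideOptions());
    }
}
```

Passing these options to the `CREATE TABLE ... WITH (...)` clause (or the table options map) keeps scheduling co-located with the writer, per this heuristic.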
The Insight (Rule of Thumb)
- Action: Keep `--schedule false` (the default) on standalone compaction jobs. Let the writer job handle scheduling.
- Value: The writer job uses `compaction.delta_commits=5` (default) to schedule compaction after every 5 delta commits.
- Trade-off: Inline scheduling adds slight overhead to the writer job but guarantees consistency with in-flight writes.
- Exception: If you must schedule externally, also set `--job-max-processing-time-ms` for the retry mechanism to function (otherwise `--retry-last-failed-job` is silently ineffective).
Reasoning
The writer job has exclusive knowledge of which commits are in-flight and which log files are being actively written. A standalone compaction scheduler lacks this view and may schedule compaction of files that are still being written, leading to data corruption or loss. By co-locating scheduling with the writer, the compaction plan is generated atomically with the write commit, ensuring consistency.
Additionally, both the compaction and clustering standalone jobs have a --retry-last-failed-job flag that silently does nothing unless --job-max-processing-time-ms is set to a positive value, creating a configuration trap where retries appear enabled but are inactive.
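The shape of that trap can be sketched as a predicate. This is an illustrative stand-in for the parsed CLI flags, not the actual Hudi validation code:

```java
public class RetryConfigCheck {
    // Mirrors the behavior warned about in HoodieFlinkCompactor:
    // --retry-last-failed-job only takes effect when
    // --job-max-processing-time-ms is set to a positive value.
    // Parameter names are illustrative stand-ins for the parsed flags.
    static boolean retryEffective(boolean retryLastFailedJob, long jobMaxProcessingTimeMs) {
        return retryLastFailedJob && jobMaxProcessingTimeMs > 0;
    }

    public static void main(String[] args) {
        // The trap: retry flag enabled, timeout unset -> retry silently inactive.
        System.out.println(retryEffective(true, 0));     // false
        System.out.println(retryEffective(true, 60000)); // true
    }
}
```

In other words, treat the two flags as a pair: enabling retry without the processing-time bound leaves retries inactive with only a warning in the logs.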
Code Evidence
Compaction scheduling warning from FlinkCompactionConfig.java:122-126:
```java
@Parameter(names = {"--schedule", "-sc"}, description = "Not recommended. Schedule the compaction plan in this job.\n"
    + "There is a risk of losing data when scheduling compaction outside the writer job.\n"
    + "Scheduling compaction in the writer job and only let this job do the compaction execution is recommended.\n"
    + "Default is false")
public Boolean schedule = false;
```
Retry mechanism warning from HoodieFlinkCompactor.java:80-82:
```java
LOG.warn("--retry-last-failed-job is enabled but --job-max-processing-time-ms is not set or <= 0. "
    + "The retry-last-failed feature will have no effect.");
```
Compaction trigger defaults from FlinkOptions.java:937-941:
```java
public static final ConfigOption<Integer> COMPACTION_DELTA_COMMITS = ConfigOptions
    .key("compaction.delta_commits")
    .intType()
    .defaultValue(5)
    .withDescription("Max delta commits needed to trigger compaction, default 5 commits");
```