Implementation:Apache Flink FileWriter PrepareCommit
| Knowledge Sources | |
|---|---|
| Domains | Stream_Processing, Fault_Tolerance |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool for collecting pending file committables across all active buckets as part of the two-phase commit protocol provided by the Apache Flink connector-files module.
Description
The FileWriter.prepareCommit method iterates over all active FileWriterBucket instances, removes inactive ones, and collects FileSinkCommittable objects from each active bucket. Each committable wraps either a pending file recoverable (a file ready to be finalized) or an in-progress file recoverable (for cleanup on recovery). When endOfInput is true, all in-progress files are forcefully closed.
Usage
This is an internal method invoked by the Flink runtime during checkpoint preparation. Users do not call it directly.
Code Reference
Source Location
- Repository: Apache Flink
- File: flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/writer/FileWriter.java
- Lines: L208-226
Signature
@Override
public Collection<FileSinkCommittable> prepareCommit() throws IOException {
List<FileSinkCommittable> committables = new ArrayList<>();
Iterator<Map.Entry<String, FileWriterBucket<IN>>> activeBucketIt =
activeBuckets.entrySet().iterator();
while (activeBucketIt.hasNext()) {
Map.Entry<String, FileWriterBucket<IN>> entry = activeBucketIt.next();
if (!entry.getValue().isActive()) {
activeBucketIt.remove();
} else {
committables.addAll(entry.getValue().prepareCommit(endOfInput));
}
}
return committables;
}
Import
import org.apache.flink.connector.file.sink.writer.FileWriter;
import org.apache.flink.connector.file.sink.FileSinkCommittable;
// Internal class
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| activeBuckets | Map<String, FileWriterBucket<IN>> | Yes | Currently active bucket writers |
| endOfInput | boolean | Yes | Whether this is the final flush |
Outputs
| Name | Type | Description |
|---|---|---|
| committables | Collection<FileSinkCommittable> | Pending and in-progress file recoverables for commit |
Usage Examples
Checkpoint Flow
// During checkpoint, the Flink runtime calls:
// 1. fileWriter.flush(endOfInput) -- sets endOfInput flag
// 2. fileWriter.prepareCommit() -- collects committables
// 3. committables sent to FileCommitter.commit() after checkpoint succeeds
//
// Each FileSinkCommittable contains:
// - bucketId (String)
// - pendingFile (PendingFileRecoverable) -- file ready to finalize
// - inProgressFileToCleanup (InProgressFileRecoverable) -- for recovery cleanup