Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Apache Flink FileWriter PrepareCommit

From Leeroopedia


Knowledge Sources
Domains Stream_Processing, Fault_Tolerance
Last Updated 2026-02-09 00:00 GMT

Overview

Concrete tool for collecting pending file committables across all active buckets as part of the two-phase commit protocol provided by the Apache Flink connector-files module.

Description

The FileWriter.prepareCommit method iterates over all active FileWriterBucket instances, removes inactive ones, and collects FileSinkCommittable objects from each active bucket. Each committable wraps either a pending file recoverable (a file ready to be finalized) or an in-progress file recoverable (for cleanup on recovery). When endOfInput is true, all in-progress files are forcefully closed.

Usage

This is an internal method invoked by the Flink runtime during checkpoint preparation. Users do not call it directly.

Code Reference

Source Location

  • Repository: Apache Flink
  • File: flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/sink/writer/FileWriter.java
  • Lines: L208-226

Signature

@Override
public Collection<FileSinkCommittable> prepareCommit() throws IOException {
    List<FileSinkCommittable> committables = new ArrayList<>();
    Iterator<Map.Entry<String, FileWriterBucket<IN>>> activeBucketIt =
            activeBuckets.entrySet().iterator();
    while (activeBucketIt.hasNext()) {
        Map.Entry<String, FileWriterBucket<IN>> entry = activeBucketIt.next();
        if (!entry.getValue().isActive()) {
            activeBucketIt.remove();
        } else {
            committables.addAll(entry.getValue().prepareCommit(endOfInput));
        }
    }
    return committables;
}

Import

import org.apache.flink.connector.file.sink.writer.FileWriter;
import org.apache.flink.connector.file.sink.FileSinkCommittable;
// Internal class

I/O Contract

Inputs

Name Type Required Description
activeBuckets Map<String, FileWriterBucket<IN>> Yes Currently active bucket writers
endOfInput boolean Yes Whether this is the final flush

Outputs

Name Type Description
committables Collection<FileSinkCommittable> Pending and in-progress file recoverables for commit

Usage Examples

Checkpoint Flow

// During checkpoint, the Flink runtime calls:
// 1. fileWriter.flush(endOfInput)     -- sets endOfInput flag
// 2. fileWriter.prepareCommit()       -- collects committables
// 3. committables sent to FileCommitter.commit() after checkpoint succeeds
//
// Each FileSinkCommittable contains:
// - bucketId (String)
// - pendingFile (PendingFileRecoverable) -- file ready to finalize
// - inProgressFileToCleanup (InProgressFileRecoverable) -- for recovery cleanup

Related Pages

Implements Principle

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment