
Principle:Apache Flink Committable Collection Coordination

From Leeroopedia


Knowledge Sources
Domains: Stream_Processing, Coordination
Last Updated: 2026-02-09 00:00 GMT

Overview

A single-parallelism coordination operator that collects file committables from all writers and groups them into per-bucket compaction requests.

Description

Committable Collection Coordination runs at parallelism 1 so that a single operator instance has a global view of all pending files across all writer subtasks. It receives CommittableMessage&lt;FileSinkCommittable&gt; elements from the writers and sorts them into two per-bucket groups:

  • Files to compact: Pending files that have been finalized by the writer
  • Files to pass through: In-progress file cleanups and compacted file cleanups

An internal CompactTrigger tracks the checkpoint count and accumulated file size per bucket. When either threshold is reached, the trigger fires and the accumulated files for that bucket are emitted as a CompactorRequest.
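The dual-condition trigger described above can be sketched as a small self-contained model. Class and method names here are illustrative stand-ins, not Flink's internal API:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of a per-bucket compaction trigger: it fires when either the
// accumulated pending-file size or the number of checkpoints since the last
// fire crosses its threshold.
public class CompactTriggerModel {
    private final long sizeThreshold;
    private final int checkpointThreshold;
    private final Map<String, Long> pendingSize = new HashMap<>();
    private final Map<String, Integer> checkpointsSeen = new HashMap<>();

    public CompactTriggerModel(long sizeThreshold, int checkpointThreshold) {
        this.sizeThreshold = sizeThreshold;
        this.checkpointThreshold = checkpointThreshold;
    }

    // Record a finalized pending file of the given size for a bucket.
    public void addToCompact(String bucketId, long fileSize) {
        pendingSize.merge(bucketId, fileSize, Long::sum);
    }

    // Called once per checkpoint for each bucket with pending files.
    public void onCheckpoint(String bucketId) {
        checkpointsSeen.merge(bucketId, 1, Integer::sum);
    }

    // The trigger fires when either threshold is reached.
    public boolean shouldFire(String bucketId) {
        return pendingSize.getOrDefault(bucketId, 0L) >= sizeThreshold
            || checkpointsSeen.getOrDefault(bucketId, 0) >= checkpointThreshold;
    }

    // Reset accounting for a bucket once its request has been emitted.
    public void reset(String bucketId) {
        pendingSize.remove(bucketId);
        checkpointsSeen.remove(bucketId);
    }

    public static void main(String[] args) {
        CompactTriggerModel t = new CompactTriggerModel(100, 3);
        t.addToCompact("b1", 40);
        System.out.println(t.shouldFire("b1")); // false: below both thresholds
        t.addToCompact("b1", 70);
        System.out.println(t.shouldFire("b1")); // true: 110 >= 100 bytes
        t.reset("b1");
        t.onCheckpoint("b1");
        t.onCheckpoint("b1");
        t.onCheckpoint("b1");
        System.out.println(t.shouldFire("b1")); // true: 3 checkpoints reached
    }
}
```

Tracking both conditions lets small buckets compact on a time-like schedule (checkpoints) while busy buckets compact as soon as enough data accumulates.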

Usage

This principle is internal to the compaction pipeline. It is automatically inserted into the sink topology when compaction is enabled via FileSink.RowFormatBuilder.enableCompact().
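Enabling compaction looks roughly like the following sketch, assuming the FileSink compaction API introduced in Flink 1.15 (the path and threshold values are illustrative):

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.configuration.MemorySize;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.connector.file.sink.compactor.ConcatFileCompactor;
import org.apache.flink.connector.file.sink.compactor.FileCompactStrategy;
import org.apache.flink.core.fs.Path;

FileSink<String> sink = FileSink
    .forRowFormat(new Path("/tmp/output"), new SimpleStringEncoder<String>("UTF-8"))
    .enableCompact(
        FileCompactStrategy.Builder.newBuilder()
            // Fire after this many checkpoints, even if the size is small.
            .enableCompactionOnCheckpoint(3)
            // Or fire once accumulated pending files reach this size.
            .setSizeThreshold(MemorySize.ofMebiBytes(64).getBytes())
            .build(),
        new ConcatFileCompactor())
    .build();
```

Calling enableCompact() is what inserts the parallelism-1 coordination operator into the sink topology; the FileCompactStrategy supplies the thresholds that the internal CompactTrigger evaluates per bucket.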

Theoretical Basis

// Abstract coordination
function processElement(committableMessage):
    committable = committableMessage.getCommittable()
    bucketId = committable.getBucketId()
    if committable.hasPendingFile():
        trigger.addToCompact(bucketId, committable)
    else:
        trigger.addToPassthrough(bucketId, committable)

    if trigger.shouldFire(bucketId):
        request = trigger.createRequest(bucketId)
        emit(request)
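The abstract routine above can be rendered as a self-contained Java model of the coordinator. Committable and CompactorRequest here are simplified stand-ins for Flink's FileSinkCommittable and CompactorRequest, and the trigger is reduced to a single size threshold for brevity:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Model of the parallelism-1 coordinator: committables from all writers are
// routed per bucket into compact / passthrough groups, and a size-based
// trigger decides when to emit a per-bucket CompactorRequest.
public class CoordinatorModel {
    record Committable(String bucketId, boolean hasPendingFile, long size) {}
    record CompactorRequest(String bucketId, List<Committable> toCompact,
                            List<Committable> toPassthrough) {}

    private final long sizeThreshold;
    private final Map<String, List<Committable>> toCompact = new HashMap<>();
    private final Map<String, List<Committable>> toPassthrough = new HashMap<>();
    private final Map<String, Long> pendingSize = new HashMap<>();
    final List<CompactorRequest> emitted = new ArrayList<>();

    CoordinatorModel(long sizeThreshold) {
        this.sizeThreshold = sizeThreshold;
    }

    void processElement(Committable c) {
        String bucket = c.bucketId();
        if (c.hasPendingFile()) {
            // Finalized pending files are candidates for compaction.
            toCompact.computeIfAbsent(bucket, k -> new ArrayList<>()).add(c);
            pendingSize.merge(bucket, c.size(), Long::sum);
        } else {
            // Cleanup committables are passed through untouched.
            toPassthrough.computeIfAbsent(bucket, k -> new ArrayList<>()).add(c);
        }
        if (pendingSize.getOrDefault(bucket, 0L) >= sizeThreshold) {
            // Trigger fired: package this bucket's files into one request.
            emitted.add(new CompactorRequest(
                bucket,
                toCompact.remove(bucket),
                toPassthrough.getOrDefault(bucket, new ArrayList<>())));
            toPassthrough.remove(bucket);
            pendingSize.remove(bucket);
        }
    }

    public static void main(String[] args) {
        CoordinatorModel coord = new CoordinatorModel(100);
        coord.processElement(new Committable("b1", true, 60));
        coord.processElement(new Committable("b2", true, 10));
        coord.processElement(new Committable("b1", true, 50)); // b1 reaches 110
        System.out.println(coord.emitted.size());            // 1
        System.out.println(coord.emitted.get(0).bucketId()); // b1
    }
}
```

Because the operator runs at parallelism 1, the per-bucket maps see every writer subtask's output, which is what makes a globally consistent per-bucket grouping possible.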

Related Pages

Implemented By

Uses Heuristic
