Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Principle:Apache Hudi Clustering Plan Generation

From Leeroopedia


Knowledge Sources
Domains Data_Lake, Data_Layout_Optimization
Last Updated 2026-02-08 00:00 GMT

Overview

Generating a concrete clustering execution plan by reading pending clustering instants from the timeline and decomposing them into per-group execution events for downstream operators.

Description

Once clustering has been validated and a clustering plan has been scheduled onto the Hudi timeline as a REQUESTED instant, the next step is to read that plan and convert it into actionable work units. This is the plan generation phase, which bridges the gap between a timeline-level scheduling decision and the actual parallel execution of clustering tasks.

A clustering plan (represented by HoodieClusteringPlan) contains one or more input groups (HoodieClusteringGroup), each describing a set of file slices that should be rewritten together. The plan generation phase performs the following steps:

  1. Discover pending plans: Query the Hudi active timeline for REQUESTED clustering instants. The first pending instant (FIFO order) is selected for execution.
  2. Read the plan: Deserialize the HoodieClusteringPlan from the timeline instant metadata.
  3. Transition to inflight: Mark the instant as INFLIGHT on the timeline to prevent concurrent execution.
  4. Decompose into events: For each HoodieClusteringGroup in the plan, create a ClusteringPlanEvent containing the group info, strategy parameters, and instant time.
  5. Emit events downstream: Send each ClusteringPlanEvent to the next operator in the Flink pipeline for parallel execution.

This decomposition enables Flink's parallelism model: each clustering group can be processed independently by a different task slot.

Usage

This principle applies in two distinct execution modes:

  1. Online (streaming) mode: The ClusteringPlanOperator triggers plan generation on every completed Flink checkpoint. It acts as a singleton operator that checks the timeline for pending plans and emits events.
  2. Batch (offline) mode: The ClusteringPlanSourceFunction is used as a Flink source that reads a pre-scheduled clustering plan and emits events. This is used by HoodieFlinkClusteringJob.

In both modes, the operator must be a single-parallelism task to avoid race conditions on timeline transitions.

Theoretical Basis

The plan generation step implements a work decomposition pattern common in distributed data processing systems. A centralized coordinator (the plan operator) reads a global state (the timeline) and partitions work into independent units that can be executed in parallel.

FUNCTION generateClusteringEvents(timeline):
    pendingInstants = timeline.getRequestedClusteringInstants()
    IF pendingInstants.isEmpty():
        RETURN  // nothing to do

    instant = pendingInstants.first()  // FIFO ordering
    plan = timeline.getClusteringPlan(instant)

    IF plan IS NULL OR plan.inputGroups IS EMPTY:
        RETURN  // empty plan

    timeline.transitionToInflight(instant)

    FOR EACH group IN plan.inputGroups:
        groupInfo = ClusteringGroupInfo.create(group)
        event = new ClusteringPlanEvent(instant.time, groupInfo, plan.strategyParams)
        EMIT event downstream

The transition to INFLIGHT state before emitting events provides exactly-once scheduling semantics: if the operator fails after transition but before emitting all events, the inflight instant is rolled back on restart (via ClusteringUtil.rollbackClustering), allowing the plan to be re-read and re-emitted.

The FIFO ordering of pending instants ensures that older plans are executed before newer ones, which is important for maintaining timeline consistency.

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment