Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Apache Flink LocalityAwareSplitAssigner GetNext

From Leeroopedia
Revision as of 14:17, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Apache_Flink_LocalityAwareSplitAssigner_GetNext.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)


Knowledge Sources
Domains Distributed_Computing, Performance_Optimization
Last Updated 2026-02-09 00:00 GMT

Overview

Concrete tool for assigning file splits to readers with preference for data-local assignments provided by the Apache Flink connector-files module.

Description

The LocalityAwareSplitAssigner implements FileSplitAssigner and maintains a pool of unassigned splits. When getNext is called with a hostname, it checks if any splits have data blocks on that host. Local splits are selected using a least-local-count strategy (preferring splits with fewer local assignments) to avoid starvation. If no local split is available, a remote split is chosen via round-robin. The assigner tracks local vs. remote assignment metrics.

Usage

This is the default split assigner for FileSource. No user configuration is needed.

Code Reference

Source Location

  • Repository: Apache Flink
  • File: flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/assigners/LocalityAwareSplitAssigner.java
  • Lines: L54-362

Signature

@PublicEvolving
public class LocalityAwareSplitAssigner implements FileSplitAssigner {

    public LocalityAwareSplitAssigner(Collection<FileSourceSplit> splits);

    @Override
    public Optional<FileSourceSplit> getNext(@Nullable String host);

    @Override
    public void addSplits(Collection<FileSourceSplit> splits);
}

Import

import org.apache.flink.connector.file.src.assigners.LocalityAwareSplitAssigner;

I/O Contract

Inputs

Name Type Required Description
host String (nullable) No Requesting reader's hostname for locality matching

Outputs

Name Type Description
split Optional<FileSourceSplit> Next split to process (local-preferred) or empty if none available

Usage Examples

Split Assignment Flow

// The enumerator calls the assigner when a reader requests work:
// 1. Reader on host "node1" requests a split
// 2. Assigner checks for splits with blocks on "node1"
// 3. If found -> assigns local split (increments localAssignments counter)
// 4. If not found -> assigns any available split (increments remoteAssignments counter)

Related Pages

Implements Principle

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment