Implementation:Apache Flink SplitReader

From Leeroopedia


Knowledge Sources
Domains: Connectors, Table_API
Last Updated: 2026-02-09 00:00 GMT

Overview

An interface for reading records from source splits in Flink's connector base framework.

Description

SplitReader is a generic interface in the flink-connector-base module that defines how records are fetched from source splits. It takes two type parameters: E, the element type, and SplitT, the split type, which must extend SourceSplit. The interface extends AutoCloseable and is the low-level reading abstraction in Flink's SourceReaderBase architecture. An implementation may read from a single split or from multiple splits.

The interface declares four operations of its own: fetch() retrieves the next batch of records and may block; handleSplitsChanges() applies split assignment updates without blocking; wakeUp() interrupts a fetch() call that is currently blocked; and pauseOrResumeSplits() pauses or resumes individual splits for watermark alignment. close() is inherited from AutoCloseable. The @PublicEvolving annotation marks the API as intended for public use, with signatures that may still change between releases.
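The interplay between a blocking fetch() and wakeUp() can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Flink code: WakeableReaderSketch, its offer() method, the String record type, and the 10 ms poll interval are all hypothetical stand-ins chosen so the example runs without Flink on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in reader; NOT Flink's SplitReader interface. */
class WakeableReaderSketch {
    private final BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
    private final AtomicBoolean wakeUpRequested = new AtomicBoolean(false);

    /** Simulates the external system delivering a record (hypothetical helper). */
    void offer(String record) {
        incoming.add(record);
    }

    /** Blocks until at least one record arrives or wakeUp() has been called. */
    List<String> fetch() {
        List<String> batch = new ArrayList<>();
        while (batch.isEmpty()) {
            // Consume the flag so that the next fetch() call blocks again.
            if (wakeUpRequested.getAndSet(false)) {
                return batch; // empty batch: the caller can process pending work
            }
            try {
                String first = incoming.poll(10, TimeUnit.MILLISECONDS);
                if (first != null) {
                    batch.add(first);
                    incoming.drainTo(batch); // take whatever else is already queued
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return batch;
            }
        }
        return batch;
    }

    /** Non-blocking; intended to be called from a thread other than the fetching one. */
    void wakeUp() {
        wakeUpRequested.set(true);
    }
}
```

In this sketch a wake-up is signalled through a flag that fetch() checks between short polls. A real connector would more likely use whatever wake-up mechanism its client library provides.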

Usage

Connector developers implement this interface when building a custom Flink source connector on the SourceReaderBase framework. The implementation performs the actual reads from the external system (e.g., a message queue partition or a file split). It is typically created by a Supplier<SplitReader> and used inside a SplitFetcherManager, which manages the fetcher threads.
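The supplier-plus-fetcher-thread arrangement described above can be illustrated with a small, self-contained sketch. It does not use Flink's SplitFetcherManager. BatchReaderSketch, FetcherSketch, FetcherDemo, and the convention that an empty batch ends the loop are assumptions made only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;

/** Simplified stand-in for a split reader; NOT Flink's SplitReader interface. */
interface BatchReaderSketch {
    /** Sketch-only convention: an empty list means the reader is exhausted. */
    List<String> fetch();
}

/** Hypothetical fetcher task: creates its reader from a factory and drains it. */
class FetcherSketch implements Runnable {
    private final BatchReaderSketch reader;
    private final BlockingQueue<List<String>> handover;

    FetcherSketch(Supplier<BatchReaderSketch> readerFactory,
                  BlockingQueue<List<String>> handover) {
        this.reader = readerFactory.get(); // one reader instance per fetcher
        this.handover = handover;
    }

    @Override
    public void run() {
        while (true) {
            List<String> batch = reader.fetch();
            if (batch.isEmpty()) {
                return; // reader exhausted, stop the fetcher
            }
            handover.add(batch); // hand the batch over to the consuming thread
        }
    }
}

class FetcherDemo {
    /** Runs one fetcher thread over the given batches and returns what it handed over. */
    static List<List<String>> fetchAll(List<List<String>> batches) {
        Iterator<List<String>> it = batches.iterator();
        Supplier<BatchReaderSketch> factory =
                () -> () -> it.hasNext() ? it.next() : Collections.<String>emptyList();
        BlockingQueue<List<String>> handover = new LinkedBlockingQueue<>();
        Thread fetcherThread = new Thread(new FetcherSketch(factory, handover), "fetcher-sketch");
        fetcherThread.start();
        try {
            fetcherThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(handover);
    }
}
```

Passing a factory instead of a reader instance lets each fetcher own a fresh reader, which is the reason a Supplier appears in this kind of design.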

Code Reference

Source Location

  • Repository: Apache_Flink
  • File: flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/splitreader/SplitReader.java
  • Lines: 1-101

Signature

@PublicEvolving
public interface SplitReader<E, SplitT extends SourceSplit> extends AutoCloseable

Import

import org.apache.flink.connector.base.source.reader.splitreader.SplitReader;

I/O Contract

Inputs

Name | Type | Required | Description
splitsChanges | SplitsChange<SplitT> | Yes | Split changes (additions or removals) to be handled by the reader via handleSplitsChanges().
splitsToPause | Collection<SplitT> | No | Collection of splits to pause reading from, used in pauseOrResumeSplits() for watermark alignment.
splitsToResume | Collection<SplitT> | No | Collection of splits to resume reading from, used in pauseOrResumeSplits() for watermark alignment.
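One way a reader might honor splitsToPause and splitsToResume is to keep a set of paused split IDs and skip those splits during fetch(). The sketch below is self-contained and makes several assumptions: PausableReaderSketch and addSplit() are hypothetical, splits are identified by plain String IDs rather than SplitT instances, and fetch() returns at most one record per split.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical reader that tracks paused splits; NOT Flink's SplitReader interface. */
class PausableReaderSketch {
    // Pending records per split ID, in assignment order.
    private final Map<String, Deque<String>> pendingBySplit = new LinkedHashMap<>();
    private final Set<String> pausedSplitIds = new HashSet<>();

    /** Assigns a split together with the records it will yield (hypothetical helper). */
    void addSplit(String splitId, List<String> records) {
        pendingBySplit.put(splitId, new ArrayDeque<>(records));
    }

    /** Mirrors the shape of pauseOrResumeSplits(), using split IDs instead of split objects. */
    void pauseOrResumeSplits(Collection<String> splitsToPause, Collection<String> splitsToResume) {
        pausedSplitIds.addAll(splitsToPause);
        pausedSplitIds.removeAll(splitsToResume);
    }

    /** Returns at most one record per split, skipping paused and exhausted splits. */
    Map<String, String> fetch() {
        Map<String, String> batch = new LinkedHashMap<>();
        for (Map.Entry<String, Deque<String>> entry : pendingBySplit.entrySet()) {
            Deque<String> pending = entry.getValue();
            if (pausedSplitIds.contains(entry.getKey()) || pending.isEmpty()) {
                continue;
            }
            batch.put(entry.getKey(), pending.poll());
        }
        return batch;
    }
}
```

Pausing a split that is ahead of the others keeps it from emitting further records until they catch up, which is the effect watermark alignment relies on.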

Outputs

Name | Type | Description
fetch() return | RecordsWithSplitIds<E> | A batch of records along with their associated split IDs, including IDs of any finished splits.

Usage Examples

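// Implementing a custom SplitReader for a hypothetical source.
// MyRecord and MySourceSplit are placeholder types, not part of Flink.
import org.apache.flink.connector.base.source.reader.RecordsBySplits;
import org.apache.flink.connector.base.source.reader.RecordsWithSplitIds;
import org.apache.flink.connector.base.source.reader.splitreader.SplitReader;
import org.apache.flink.connector.base.source.reader.splitreader.SplitsAddition;
import org.apache.flink.connector.base.source.reader.splitreader.SplitsChange;

import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
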
public class MySourceSplitReader implements SplitReader<MyRecord, MySourceSplit> {

    private final Queue<MySourceSplit> assignedSplits = new ArrayDeque<>();

    @Override
    public RecordsWithSplitIds<MyRecord> fetch() throws IOException {
        // Read records from assigned splits
        // This method may block until records are available or wakeUp() is called
        MySourceSplit currentSplit = assignedSplits.peek();
        if (currentSplit == null) {
            return new RecordsBySplits<>(Collections.emptyMap(), Collections.emptySet());
        }
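        // readNextBatch() is a hypothetical helper on the placeholder split type
        List<MyRecord> records = currentSplit.readNextBatch();
        // Return the records keyed by the ID of the split they came from
        RecordsBySplits.Builder<MyRecord> builder = new RecordsBySplits.Builder<>();
        builder.addAll(currentSplit.splitId(), records);
        return builder.build();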
    }

    @Override
    public void handleSplitsChanges(SplitsChange<MySourceSplit> splitsChanges) {
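        if (splitsChanges instanceof SplitsAddition) {
            assignedSplits.addAll(splitsChanges.splits());
        } else {
            // Other change types are not handled by this example
            throw new UnsupportedOperationException(
                    "Unsupported split change: " + splitsChanges.getClass());
        }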
    }

    @Override
    public void wakeUp() {
        // Interrupt any blocking operation in fetch()
    }

    @Override
    public void close() throws Exception {
        // Clean up resources
        assignedSplits.clear();
    }
}
