Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Apache Flink RecordsWithSplitIds

From Leeroopedia


Knowledge Sources
Domains Connectors, Source_Framework
Last Updated 2026-02-09 00:00 GMT

Overview

An interface defining the contract for batches of records passed from split fetcher threads to the source reader, organized by split identifiers.

Description

RecordsWithSplitIds is a key interface in Flink's source connector framework that defines how fetched records are transferred from the I/O fetcher threads to the main source reader thread. It provides an iterator-like API where the consumer first advances to the next split via nextSplit(), then iterates through records within that split via nextRecordFromSplit(). The interface also reports which splits have been fully consumed via finishedSplits().

The interface includes an optional recycle() default method that is called after all records from a batch have been emitted. This gives implementations an opportunity to recycle or reuse the object, which is an important performance optimization when record objects are large or expensive to allocate. The interface is annotated with @PublicEvolving, indicating it is part of Flink's public API but may evolve across minor versions.

Usage

Connector developers encounter this interface as the return type of SplitReader.fetch(). When implementing a SplitReader, the fetch() method must return a RecordsWithSplitIds instance containing the records read from the external system. The standard implementation RecordsBySplits is typically used via its builder. Advanced connector developers may provide custom implementations for performance optimization, for example to support object pooling or zero-copy record transfer via the recycle() method.

Code Reference

Source Location

  • Repository: Apache_Flink
  • File: flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/RecordsWithSplitIds.java
  • Lines: 1-61

Signature

@PublicEvolving
public interface RecordsWithSplitIds<E> {

    /**
     * Moves to the next split. This method is also called initially to move to the first split.
     * Returns null, if no splits are left.
     */
    @Nullable
    String nextSplit();

    /**
     * Gets the next record from the current split.
     * Returns null if no more records are left in this split.
     */
    @Nullable
    E nextRecordFromSplit();

    /**
     * Get the finished splits.
     *
     * @return the finished splits after this RecordsWithSplitIds is returned.
     */
    Set<String> finishedSplits();

    /**
     * Called when all records from this batch have been emitted.
     * Gives implementations the opportunity to recycle/reuse this object.
     */
    default void recycle() {}
}

Import

import org.apache.flink.connector.base.source.reader.RecordsWithSplitIds;

I/O Contract

Inputs

Name Type Required Description
(none) - - This interface is a data container. It is populated by the SplitReader and consumed by the SourceReaderBase.

Outputs

Name Type Description
nextSplit() String (nullable) Returns the ID of the next split containing records, or null when all splits have been iterated.
nextRecordFromSplit() E (nullable) Returns the next record from the current split, or null when no more records remain in the current split.
finishedSplits() Set<String> Returns the set of split IDs that have been fully consumed and should be reported as finished.

Usage Examples

// Example: Consuming records from a RecordsWithSplitIds instance
// (This pattern is used internally by SourceReaderBase)
RecordsWithSplitIds<RawRecord> fetchResult = splitReader.fetch();

String splitId;
while ((splitId = fetchResult.nextSplit()) != null) {
    RawRecord record;
    while ((record = fetchResult.nextRecordFromSplit()) != null) {
        // Process each record: transform and emit downstream
        recordEmitter.emitRecord(record, output, splitStates.get(splitId));
    }
}

// Handle finished splits
Set<String> finished = fetchResult.finishedSplits();
if (!finished.isEmpty()) {
    onSplitFinished(finished);
}

// Allow the batch to be recycled
fetchResult.recycle();

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment