Implementation:Apache Flink SplitReader
| Knowledge Sources | |
|---|---|
| Domains | Connectors, Table_API |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
An interface for reading records from source splits in Flink's connector base framework.
Description
SplitReader is a generic interface in the flink-connector-base module that defines how records are fetched from source splits. It is parameterized by an element type E and a split type SplitT (which must extend SourceSplit). The interface extends AutoCloseable and serves as the low-level reading abstraction within Flink's SourceReaderBase architecture. Implementations may read from a single split or from multiple splits concurrently.
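The interface defines four key operations: fetch() for blocking retrieval of records, handleSplitsChanges() for non-blocking split assignment updates, wakeUp() for interrupting a blocking fetch() call, and pauseOrResumeSplits() for watermark alignment across splits. pauseOrResumeSplits() is a default method that throws UnsupportedOperationException unless overridden, so implementations that do not support watermark alignment can leave it out. The @PublicEvolving annotation marks the interface as intended for public use, but its signatures are not yet guaranteed to be stable and may change between releases.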
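The interplay between a blocking fetch() and wakeUp() can be sketched as follows. This is a minimal, self-contained illustration and not Flink code: WakeableReader, its offer() method, the in-memory queue, and the 50 ms poll interval are all hypothetical stand-ins for a real reader and its external system. It shows one common way to honor the contract, namely a flag that wakeUp() sets and fetch() consumes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a SplitReader-style class; it does not implement the Flink interface.
class WakeableReader {
    private final BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
    private final AtomicBoolean wakeUpRequested = new AtomicBoolean(false);

    /** Simulates data arriving from an external system (assumption of this sketch). */
    void offer(String record) {
        incoming.add(record);
    }

    /** Blocks until at least one record arrives or wakeUp() has been called. */
    List<String> fetch() {
        List<String> batch = new ArrayList<>();
        while (batch.isEmpty()) {
            // Consume a pending wake-up request and return early with an empty batch.
            if (wakeUpRequested.compareAndSet(true, false)) {
                return batch;
            }
            String first;
            try {
                // Poll with a short timeout so the wake-up flag is re-checked regularly.
                first = incoming.poll(50, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return batch;
            }
            if (first != null) {
                batch.add(first);
                incoming.drainTo(batch); // take whatever else is already queued
            }
        }
        return batch;
    }

    /** Safe to call from another thread; makes a blocked fetch() return promptly. */
    void wakeUp() {
        wakeUpRequested.set(true);
    }
}
```

Returning an empty batch on wake-up, rather than throwing, lets the calling thread go on to its next task (for example applying a split change) and then call fetch() again.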
Usage
Connector developers implement this interface when building a custom Flink source connector on top of the SourceReaderBase framework. It is the component that performs the actual reads from the external system (e.g., a message queue partition or a file split). An instance is typically obtained from a Supplier<SplitReader<E, SplitT>> and runs inside a fetcher thread that a SplitFetcherManager creates and manages.
Code Reference
Source Location
- Repository: Apache_Flink
- File: flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/splitreader/SplitReader.java
- Lines: 1-101
Signature
@PublicEvolving
public interface SplitReader<E, SplitT extends SourceSplit> extends AutoCloseable
Import
import org.apache.flink.connector.base.source.reader.splitreader.SplitReader;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| splitsChanges | SplitsChange<SplitT> | Yes | Split changes (additions or removals) to be handled by the reader via handleSplitsChanges(). |
| splitsToPause | Collection<SplitT> | No | Collection of splits to pause reading from, used in pauseOrResumeSplits() for watermark alignment. |
| splitsToResume | Collection<SplitT> | No | Collection of splits to resume reading from, used in pauseOrResumeSplits() for watermark alignment. |
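One way to picture the pause and resume inputs is a reader that keeps a set of paused splits and skips them during fetch. The sketch below is a simplified stand-in, not the Flink interface: PausableSplits is a hypothetical class, splits are identified by plain String IDs instead of SplitT objects, and the records live in an in-memory map.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: tracks paused splits by ID and skips them when fetching.
class PausableSplits {
    // Split ID -> records not yet emitted for that split (in-memory assumption).
    private final Map<String, List<String>> pending = new LinkedHashMap<>();
    private final Set<String> paused = new HashSet<>();

    void addSplit(String splitId, List<String> records) {
        pending.put(splitId, new ArrayList<>(records));
    }

    /** Mirrors the two-collection shape of pauseOrResumeSplits(), using split IDs. */
    void pauseOrResumeSplits(Collection<String> splitsToPause, Collection<String> splitsToResume) {
        paused.addAll(splitsToPause);
        paused.removeAll(splitsToResume);
    }

    /** Returns all pending records of every split that is not currently paused. */
    Map<String, List<String>> fetch() {
        Map<String, List<String>> batch = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : pending.entrySet()) {
            if (!paused.contains(entry.getKey()) && !entry.getValue().isEmpty()) {
                batch.put(entry.getKey(), new ArrayList<>(entry.getValue()));
                entry.getValue().clear(); // records are handed out exactly once
            }
        }
        return batch;
    }
}
```

Paused splits keep their pending records, so nothing is lost while a split waits for slower splits to catch up.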
Outputs
| Name | Type | Description |
|---|---|---|
| fetch() return | RecordsWithSplitIds<E> | A batch of records along with their associated split IDs, including IDs of any finished splits. |
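A caller consumes such a batch split by split: it advances with nextSplit(), drains the current split with nextRecordFromSplit() until that returns null, and finally inspects finishedSplits(). The sketch below imitates that iteration protocol with local stand-ins so that it runs without Flink on the classpath. SplitBatch, MapBackedBatch, and BatchDrainer are hypothetical names; only the three method names are taken from RecordsWithSplitIds.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Local stand-in mirroring the three core methods of RecordsWithSplitIds<E>.
interface SplitBatch<E> {
    String nextSplit();           // moves to the next split; null when no splits remain
    E nextRecordFromSplit();      // next record of the current split; null when it is drained
    Set<String> finishedSplits(); // IDs of splits that finished with this batch
}

// Hypothetical map-backed implementation used only for this illustration.
class MapBackedBatch<E> implements SplitBatch<E> {
    private final Iterator<Map.Entry<String, Collection<E>>> splits;
    private final Set<String> finished;
    private Iterator<E> current;

    MapBackedBatch(Map<String, Collection<E>> recordsBySplit, Set<String> finished) {
        this.splits = recordsBySplit.entrySet().iterator();
        this.finished = finished;
    }

    @Override
    public String nextSplit() {
        if (!splits.hasNext()) {
            current = null;
            return null;
        }
        Map.Entry<String, Collection<E>> entry = splits.next();
        current = entry.getValue().iterator();
        return entry.getKey();
    }

    @Override
    public E nextRecordFromSplit() {
        if (current == null) {
            throw new IllegalStateException("call nextSplit() first");
        }
        return current.hasNext() ? current.next() : null;
    }

    @Override
    public Set<String> finishedSplits() {
        return finished;
    }
}

class BatchDrainer {
    /** Walks the batch in protocol order and labels each record with its split ID. */
    static <E> List<String> drain(SplitBatch<E> batch) {
        List<String> out = new ArrayList<>();
        String splitId;
        while ((splitId = batch.nextSplit()) != null) {
            E record;
            while ((record = batch.nextRecordFromSplit()) != null) {
                out.add(splitId + ":" + record);
            }
        }
        return out;
    }
}
```

Because null signals the end of a split, this protocol assumes that records themselves are never null.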
Usage Examples
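// Implementing a custom SplitReader for a hypothetical source.
// MyRecord, MySourceSplit, and MySourceSplit#readNextBatch() are hypothetical placeholders, not Flink classes.
import org.apache.flink.connector.base.source.reader.RecordsBySplits;
import org.apache.flink.connector.base.source.reader.RecordsWithSplitIds;
import org.apache.flink.connector.base.source.reader.splitreader.SplitReader;
import org.apache.flink.connector.base.source.reader.splitreader.SplitsAddition;
import org.apache.flink.connector.base.source.reader.splitreader.SplitsChange;

import java.io.IOException;
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

public class MySourceSplitReader implements SplitReader<MyRecord, MySourceSplit> {
    // Splits assigned to this reader, processed one at a time in arrival order.
    private final Queue<MySourceSplit> assignedSplits = new ArrayDeque<>();

    @Override
    public RecordsWithSplitIds<MyRecord> fetch() throws IOException {
        // This method may block until records are available or wakeUp() is called.
        RecordsBySplits.Builder<MyRecord> builder = new RecordsBySplits.Builder<>();
        MySourceSplit currentSplit = assignedSplits.peek();
        if (currentSplit == null) {
            // Nothing assigned yet: return an empty batch.
            return builder.build();
        }
        List<MyRecord> records = currentSplit.readNextBatch();
        if (records.isEmpty()) {
            // Assumption of this sketch: an empty batch means the split is exhausted.
            assignedSplits.poll();
            builder.addFinishedSplit(currentSplit.splitId());
        } else {
            // Associate the records with the ID of the split they came from.
            builder.addAll(currentSplit.splitId(), records);
        }
        return builder.build();
    }

    @Override
    public void handleSplitsChanges(SplitsChange<MySourceSplit> splitsChanges) {
        // Only additions are handled here; other change types are ignored in this sketch.
        if (splitsChanges instanceof SplitsAddition) {
            assignedSplits.addAll(splitsChanges.splits());
        }
    }

    @Override
    public void wakeUp() {
        // Interrupt any blocking operation in fetch(); nothing blocks in this sketch.
    }

    @Override
    public void close() throws Exception {
        // Clean up resources.
        assignedSplits.clear();
    }
}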
Related Pages
- Principle:Apache_Flink_Source_Connector_Framework
- Apache_Flink_SplitsChange - Base class for split change events handled by this reader
- Apache_Flink_SplitsAddition - Concrete split addition event
- Apache_Flink_SplitsRemoval - Concrete split removal event