
Implementation:Apache Flink FileSourceReader Constructor

From Leeroopedia


Knowledge Sources
Domains Stream_Processing, File_IO
Last Updated 2026-02-09 00:00 GMT

Overview

A concrete tool for reading records from file splits using the single-threaded multiplexed reader pattern provided by Apache Flink's flink-connector-files module.

Description

The FileSourceReader extends SingleThreadMultiplexSourceReaderBase to read file splits. Its constructor wires together a FileSourceSplitReader (which performs file I/O through a BulkFormat.Reader) and a FileSourceRecordEmitter (which emits records and tracks read offsets). On startup, if no splits are assigned, the reader immediately requests one; when a split finishes, it requests the next. The reader is generic over split types via SplitT extends FileSourceSplit.

Usage

This is an internal class created by FileSource.createReader(). Users interact with it indirectly through the DataStream API.
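Since the class is internal, the typical entry point is the FileSource builder; building a source and passing it to the environment is what ultimately causes Flink to call FileSource.createReader() on the task managers. The sketch below illustrates this indirect usage with the standard DataStream API; the input path is illustrative.

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Building the source; Flink later calls FileSource.createReader(),
        // which instantiates FileSourceReader on each reader subtask.
        FileSource<String> source =
                FileSource.forRecordStreamFormat(new TextLineInputFormat(), new Path("/tmp/input"))
                        .build();

        // Users only ever see the resulting DataStream, never the reader itself.
        DataStream<String> lines =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source");

        lines.print();
        env.execute("FileSource example");
    }
}
```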

Code Reference

Source Location

  • Repository: Apache Flink
  • File: flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/impl/FileSourceReader.java
  • Lines: L35-72

Signature

@Internal
public final class FileSourceReader<T, SplitT extends FileSourceSplit>
        extends SingleThreadMultiplexSourceReaderBase<
                RecordAndPosition<T>, T, SplitT, FileSourceSplitState<SplitT>> {

    public FileSourceReader(
            SourceReaderContext readerContext,
            BulkFormat<T, SplitT> readerFormat,
            Configuration config) {
        super(
                () -> new FileSourceSplitReader<>(config, readerFormat),
                new FileSourceRecordEmitter<>(),
                config,
                readerContext);
    }

    @Override
    public void start() {
        if (getNumberOfCurrentlyAssignedSplits() == 0) {
            context.sendSplitRequest();
        }
    }

    @Override
    protected void onSplitFinished(Map<String, FileSourceSplitState<SplitT>> finishedSplitIds) {
        context.sendSplitRequest();
    }

    @Override
    protected FileSourceSplitState<SplitT> initializedState(SplitT split) {
        return new FileSourceSplitState<>(split);
    }

    @Override
    protected SplitT toSplitType(String splitId, FileSourceSplitState<SplitT> splitState) {
        return splitState.toFileSourceSplit();
    }
}

Import

import org.apache.flink.connector.file.src.impl.FileSourceReader;
// Internal class

I/O Contract

Inputs

Name          | Type                  | Required | Description
readerContext | SourceReaderContext   | Yes      | Runtime context for split requests
readerFormat  | BulkFormat<T, SplitT> | Yes      | Format for reading records from splits
config        | Configuration         | Yes      | Reader configuration

Outputs

Name           | Type        | Description
records        | T           | Individual records emitted downstream
split requests | SourceEvent | Requests for new splits sent to the enumerator
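The per-record position tracking behind the records output can be illustrated with a plain-Java sketch. The types below are simplified stand-ins, not the real FileSourceSplitState and RecordAndPosition classes: the emitter pairs each record with a position and writes that position into mutable split state, so a checkpoint snapshot knows where reading should resume.

```java
// Illustrative stand-ins for the emitter's offset bookkeeping.
public class OffsetSketch {

    /** A record paired with the position reached after reading it. */
    record Emitted(String record, long offset, long recordsToSkip) {}

    /** Mutable per-split state, updated on every emitted record. */
    static class SplitState {
        private long offset;
        private long recordsToSkip;

        SplitState(long offset, long recordsToSkip) {
            this.offset = offset;
            this.recordsToSkip = recordsToSkip;
        }

        /** Mirrors what the record emitter does after emitting downstream. */
        void setPosition(Emitted e) {
            this.offset = e.offset();
            this.recordsToSkip = e.recordsToSkip();
        }

        long offset() { return offset; }
        long recordsToSkip() { return recordsToSkip; }
    }

    public static void main(String[] args) {
        SplitState state = new SplitState(0L, 0L);
        // Three records emitted from a batch starting at byte offset 0:
        // the offset stays at the batch start, the skip count advances.
        state.setPosition(new Emitted("r1", 0L, 1L));
        state.setPosition(new Emitted("r2", 0L, 2L));
        state.setPosition(new Emitted("r3", 0L, 3L));
        System.out.println(state.offset() + "/" + state.recordsToSkip()); // 0/3
    }
}
```

On recovery, a split restored from this state would re-open at the stored offset and skip the already-emitted records, which is the intent of the offset tracking described above.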

Usage Examples

Reader Lifecycle

// FileSourceReader lifecycle:
// 1. start() -> sends split request if no splits assigned
// 2. Receives split from enumerator
// 3. FileSourceSplitReader.fetch() reads batches from the split
// 4. FileSourceRecordEmitter emits records and updates offset state
// 5. onSplitFinished() -> requests next split
// 6. Repeat until no more splits
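The request/assign/read/finish handshake above can be simulated in plain Java. None of the types below are Flink classes; they only mirror the control flow: the reader requests a split on start, reads it to completion, and requests the next until the enumerator has nothing left to hand out.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative simulation of the FileSourceReader lifecycle (not Flink code).
public class LifecycleSketch {

    /** Stands in for the split enumerator: hands out split ids on request. */
    static class Enumerator {
        private final Queue<String> splits = new ArrayDeque<>();
        int requestsReceived = 0;

        Enumerator(String... splitIds) {
            for (String id : splitIds) splits.add(id);
        }

        /** Returns the next split id, or null when no splits remain. */
        String onSplitRequest() {
            requestsReceived++;
            return splits.poll();
        }
    }

    /** Stands in for the reader: requests splits until the enumerator is drained. */
    static class Reader {
        int recordsEmitted = 0;

        void run(Enumerator enumerator, int recordsPerSplit) {
            // start(): no splits assigned yet, so request one immediately.
            String split = enumerator.onSplitRequest();
            while (split != null) {
                // fetch() + emitter: read and emit every record in the split.
                recordsEmitted += recordsPerSplit;
                // onSplitFinished(): request the next split.
                split = enumerator.onSplitRequest();
            }
        }
    }

    public static void main(String[] args) {
        Enumerator enumerator = new Enumerator("file-a", "file-b", "file-c");
        Reader reader = new Reader();
        reader.run(enumerator, 10);
        System.out.println(reader.recordsEmitted);        // 30
        System.out.println(enumerator.requestsReceived);  // 4 (one per split, plus the final empty reply)
    }
}
```

The final request that comes back empty is what ends the loop; in the real reader this corresponds to the enumerator signaling that no more splits are available.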

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
