
Implementation:Apache Flink FileSource ForRecordStreamFormat

From Leeroopedia


Knowledge Sources
Domains Stream_Processing, File_IO
Last Updated 2026-02-09 00:00 GMT

Overview

A concrete tool for constructing file source connectors with record-wise (stream) or bulk-format readers, provided by the Apache Flink connector-files module.

Description

The FileSource class provides two static factory methods. forRecordStreamFormat accepts a StreamFormat<T> and input paths, internally adapting it to a BulkFormat via StreamFormatAdapter. forBulkFileFormat accepts a BulkFormat<T, FileSourceSplit> directly. Both return a FileSourceBuilder that auto-selects the file enumerator based on format splittability and defaults to LocalityAwareSplitAssigner.
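The delegation described above means forRecordStreamFormat is just forBulkFileFormat plus an adapter. The sketch below illustrates the adapter idea with hypothetical, heavily simplified interfaces (RecordReader, BatchReader, adapt are stand-ins invented here, not Flink's actual StreamFormat/BulkFormat/StreamFormatAdapter API):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified sketch of the adapter idea behind forRecordStreamFormat.
// These interfaces are hypothetical stand-ins, not Flink's real API.
public class StreamToBulkAdapterSketch {

    /** Reads one record at a time, like a StreamFormat reader. */
    interface RecordReader<T> {
        T read(); // returns null when exhausted
    }

    /** Reads a whole batch at once, like a BulkFormat reader. */
    interface BatchReader<T> {
        List<T> readBatch(int maxRecords);
    }

    /** Adapter: presents a record-wise reader through the batch interface. */
    static <T> BatchReader<T> adapt(RecordReader<T> records) {
        return maxRecords -> {
            List<T> batch = new ArrayList<>();
            T next;
            while (batch.size() < maxRecords && (next = records.read()) != null) {
                batch.add(next);
            }
            return batch;
        };
    }

    public static void main(String[] args) {
        Iterator<String> lines = List.of("a", "b", "c").iterator();
        RecordReader<String> recordWise = () -> lines.hasNext() ? lines.next() : null;
        BatchReader<String> bulk = adapt(recordWise);
        System.out.println(bulk.readBatch(2)); // [a, b]
        System.out.println(bulk.readBatch(2)); // [c]
    }
}
```

In Flink itself, StreamFormatAdapter plays the role of adapt, so both factory methods funnel into the same builder and enumerator machinery.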

Usage

Import these factory methods to create file sources for Flink pipelines. Use forRecordStreamFormat for record-wise formats (e.g., text lines) and forBulkFileFormat for batch-oriented formats (e.g., vectorized Parquet/ORC readers).

Code Reference

Source Location

  • Repository: Apache Flink
  • File: flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/src/FileSource.java
  • Lines: L181-199

Signature

@PublicEvolving
public final class FileSource<T> extends AbstractFileSource<T, FileSourceSplit>
        implements DynamicParallelismInference {

    public static <T> FileSourceBuilder<T> forRecordStreamFormat(
            final StreamFormat<T> streamFormat, final Path... paths) {
        return forBulkFileFormat(new StreamFormatAdapter<>(streamFormat), paths);
    }

    public static <T> FileSourceBuilder<T> forBulkFileFormat(
            final BulkFormat<T, FileSourceSplit> bulkFormat, final Path... paths) {
        checkNotNull(bulkFormat, "reader");
        checkNotNull(paths, "paths");
        checkArgument(paths.length > 0, "paths must not be empty");
        return new FileSourceBuilder<>(paths, bulkFormat);
    }
}
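The two checks in forBulkFileFormat follow the usual Flink Preconditions contract (which mirrors Guava): checkNotNull throws NullPointerException with the given message, and checkArgument throws IllegalArgumentException. The sketch below uses hand-rolled stand-ins for org.apache.flink.util.Preconditions to show what a caller sees for each invalid input (validate is a hypothetical helper mimicking the validation, with Object/String in place of BulkFormat/Path):

```java
// Hand-rolled stand-ins for org.apache.flink.util.Preconditions (simplified sketch).
public class PreconditionSketch {
    static <T> T checkNotNull(T ref, String msg) {
        if (ref == null) throw new NullPointerException(msg);
        return ref;
    }

    static void checkArgument(boolean cond, String msg) {
        if (!cond) throw new IllegalArgumentException(msg);
    }

    // Mimics forBulkFileFormat's validation, with plain types standing in
    // for BulkFormat and Path.
    static void validate(Object bulkFormat, String... paths) {
        checkNotNull(bulkFormat, "reader");
        checkNotNull(paths, "paths");
        checkArgument(paths.length > 0, "paths must not be empty");
    }

    public static void main(String[] args) {
        try {
            validate(new Object()); // no paths given
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        try {
            validate(null, "/input"); // null format
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        validate(new Object(), "/input"); // valid call passes silently
        System.out.println("accepted");
    }
}
```

Because the checks run in the factory method, misconfiguration fails fast at job-graph construction time rather than at runtime on the cluster.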

Import

import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.StreamFormat;
import org.apache.flink.connector.file.src.reader.BulkFormat;
import org.apache.flink.core.fs.Path;

I/O Contract

Inputs

Name Type Required Description
streamFormat StreamFormat<T> Yes (stream) Record-by-record reader
bulkFormat BulkFormat<T, FileSourceSplit> Yes (bulk) Batch reader (e.g., Parquet)
paths Path... Yes Input file or directory paths

Outputs

Name Type Description
builder FileSourceBuilder<T> Configurable builder supporting .monitorContinuously(), .processStaticFileSet(), .build()
FileSource<T> FileSource<T> Final source instance (after .build())

Usage Examples

Reading Text Files

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;

// Watch the directory and pick up new files every 30 seconds (unbounded source).
FileSource<String> source = FileSource
    .forRecordStreamFormat(new TextLineInputFormat(), new Path("/input/logs/"))
    .monitorContinuously(Duration.ofSeconds(30))
    .build();

// env is a StreamExecutionEnvironment obtained elsewhere.
DataStream<String> stream = env.fromSource(source, WatermarkStrategy.noWatermarks(), "file-source");

Reading Parquet Files

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;

// AvroParquetReaders.forGenericRecord returns a StreamFormat, so it is passed
// to forRecordStreamFormat (not forBulkFileFormat). schema is an Avro Schema
// describing the Parquet records. Bounded source: reads the files once.
FileSource<GenericRecord> source = FileSource
    .forRecordStreamFormat(AvroParquetReaders.forGenericRecord(schema), new Path("/input/parquet/"))
    .processStaticFileSet()
    .build();

Related Pages

Implements Principle

Requires Environment
