Principle:Apache Flink Source Switching
| Knowledge Sources | |
|---|---|
| Domains | Stream_Processing, Source_Architecture |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A coordinated source transition mechanism that waits for all readers to finish the current source before atomically switching all readers to the next source in the chain.
Description
Source Switching handles the coordination challenge of transitioning multiple parallel readers between sources. The enumerator collects SourceReaderFinishedEvent from all readers. Once all readers have finished the current source, it:
- Closes the current sources enumerator
- Creates the next source (optionally via SourceFactory with the previous enumerator for position handoff)
- Sends SwitchSourceEvent to all readers with the new source and whether it is the final source
- Readers create new underlying readers from the received source
The SourceSwitchContext enables position-based handoff: for example, a file source can pass its final offset to a Kafka source to start reading from the corresponding position.
Usage
This principle operates automatically. Users configure position handoff by providing a SourceFactory that receives the previous enumerator when adding sources to the chain.
Theoretical Basis
// Abstract source switching
function handleSourceEvent(subtaskId, event):
if event is SourceReaderFinishedEvent:
finishedReaders.add(subtaskId)
if finishedReaders.size == totalReaders:
switchEnumerator()
function switchEnumerator():
closeCurrentEnumerator()
nextSource = sourceFactory.create(SourceSwitchContext(previousEnumerator))
newEnumerator = nextSource.createEnumerator(contextProxy)
for each reader:
send SwitchSourceEvent(nextIndex, nextSource, isFinal)
newEnumerator.start()