Principle:Mage ai Parallel Sink Draining

From Leeroopedia


Knowledge Sources
Domains Data_Integration, Parallel_Processing, Performance
Last Updated 2026-02-09 00:00 GMT

Overview

A batch-processing mechanism that drains multiple Singer target sinks concurrently via thread-based parallelism, improving throughput for destination connectors that handle many streams.

Description

Parallel Sink Draining enables destination connectors built on the Singer SDK Target architecture to process multiple stream sinks concurrently. When a drain is triggered (batch full, end of pipe, or max record age exceeded), cleared sinks (left over from schema changes) are first drained sequentially, because schema changes require ordered processing; all active sinks are then drained in parallel using Python's joblib with its threading backend, up to max_parallelism threads (default 8). After draining, state is written, and cleanup runs at end of pipe.
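The drain-trigger condition described above can be sketched as a single predicate. This is a hypothetical illustration; the function name and parameters are assumptions for clarity, not Mage ai's actual API.

```python
import time


def should_drain(batch_size, max_batch_size, last_drain_ts,
                 max_record_age_s, is_end_of_pipe):
    """Drain when the batch is full, the pipe has ended, or records are stale."""
    return (
        batch_size >= max_batch_size
        or is_end_of_pipe
        or time.monotonic() - last_drain_ts >= max_record_age_s
    )
```

Any one of the three conditions is sufficient, so a long-idle stream still flushes once its oldest record exceeds the age limit even if the batch never fills.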

Usage

Use this principle when building destination connectors that need to handle multiple streams concurrently. Applicable to the Singer SDK Target architecture (not the Mage-native Destination architecture which handles batching differently).

Theoretical Basis

The parallel drain algorithm:

  1. Deep copy current state
  2. Drain cleared sinks sequentially (schema changes require ordered processing)
  3. If end-of-pipe: clean up cleared sinks
  4. Drain active sinks in parallel (up to max_parallelism threads via joblib)
  5. If end-of-pipe: clean up active sinks
  6. Write state message
  7. Reset max record age timer

Each sink drain: start_drain() -> process_batch(context) -> mark_drained()
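The steps above can be sketched as follows. This is a simplified stand-in, not Mage ai's implementation: the Sink class, drain_one, and drain_all are hypothetical names, and where the source uses joblib's threading backend, this sketch substitutes the stdlib ThreadPoolExecutor so it runs without extra dependencies.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class Sink:
    """Minimal stand-in for a Singer SDK sink (names are illustrative)."""

    def __init__(self, name):
        self.name = name
        self.records = []

    def start_drain(self):
        # Freeze the pending batch into a context and reset the buffer.
        context = {"records": list(self.records)}
        self.records.clear()
        return context

    def process_batch(self, context):
        # Load the batch into the destination (no-op in this sketch).
        pass

    def mark_drained(self):
        pass


def drain_one(sink):
    # Per-sink lifecycle: start_drain() -> process_batch(context) -> mark_drained()
    context = sink.start_drain()
    sink.process_batch(context)
    sink.mark_drained()


def drain_all(state, cleared_sinks, active_sinks,
              is_end_of_pipe=False, max_parallelism=8):
    state_copy = copy.deepcopy(state)      # 1. deep copy current state
    for sink in cleared_sinks:             # 2. sequential: schema changes need order
        drain_one(sink)
    if is_end_of_pipe:                     # 3. clean up cleared sinks
        cleared_sinks.clear()
    with ThreadPoolExecutor(max_workers=max_parallelism) as pool:
        list(pool.map(drain_one, active_sinks))  # 4. parallel drain of active sinks
    if is_end_of_pipe:                     # 5. clean up active sinks
        active_sinks.clear()
    return state_copy                      # 6. caller emits this as the state message
                                           # 7. (max-record-age timer reset omitted)
```

Threads suit this workload because each drain is dominated by I/O to the destination, so the GIL is released while batches load and up to max_parallelism sinks make progress at once.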

Related Pages

Implemented By
