Principle:Mage ai Mage ai Catalog Selection


Knowledge Sources
Domains: Data_Integration, ETL
Last Updated: 2026-02-09 00:00 GMT

Overview

Catalog Selection is a catalog configuration mechanism that uses metadata-driven selection flags to mark which streams and columns are included in, or excluded from, a Singer sync operation.

Description

Catalog Selection bridges the gap between schema discovery and data extraction. After a tap discovers all available streams, users choose which streams to sync, which columns to include, which replication method to use (FULL_TABLE or INCREMENTAL), and which columns serve as primary keys or bookmark properties. Selection is encoded in Singer metadata entries through two fields: "selected", a boolean flag, and "inclusion", which takes one of the values "automatic", "available", or "unsupported".
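As a rough illustration of the settings described above, the sketch below builds one catalog entry as a plain Python dict and summarizes its selection-related fields. The key names (`tap_stream_id`, `replication_method`, `key_properties`, `bookmark_properties`) follow common Singer catalog conventions and are assumptions here, not a confirmed Mage schema; `describe_stream` is a hypothetical helper written for this example.

```python
# Hypothetical catalog entry for a "users" stream. Key names follow common
# Singer conventions and are assumptions, not a confirmed Mage layout.
stream_entry = {
    "tap_stream_id": "users",
    "replication_method": "INCREMENTAL",
    "key_properties": ["id"],                # primary key columns
    "bookmark_properties": ["updated_at"],   # columns used as bookmarks
    "schema": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "email": {"type": ["null", "string"]},
            "updated_at": {"type": "string", "format": "date-time"},
        },
    },
    "metadata": [
        # Empty breadcrumb: metadata for the stream as a whole.
        {"breadcrumb": [], "metadata": {"selected": True}},
        {"breadcrumb": ["properties", "id"],
         "metadata": {"inclusion": "automatic"}},
        {"breadcrumb": ["properties", "email"],
         "metadata": {"inclusion": "available", "selected": False}},
        {"breadcrumb": ["properties", "updated_at"],
         "metadata": {"inclusion": "automatic"}},
    ],
}


def describe_stream(entry):
    """Summarize the selection-related settings of one catalog entry."""
    stream_meta = next(
        (m["metadata"] for m in entry["metadata"] if not m["breadcrumb"]), {}
    )
    return {
        "stream": entry["tap_stream_id"],
        "selected": bool(stream_meta.get("selected", False)),
        "replication_method": entry.get("replication_method"),
        "primary_keys": entry.get("key_properties", []),
        "bookmarks": entry.get("bookmark_properties", []),
    }


summary = describe_stream(stream_entry)
print(summary)
```

Here the stream is selected for incremental replication, with `id` as the primary key and `updated_at` as the bookmark property, while the `email` column is available but deselected.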

Usage

Apply this principle after schema discovery and before starting a sync. It is essential whenever users need to customize which data is extracted rather than syncing everything discovered.
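The step between discovery and sync can be sketched as a small transformation of the discovered catalog. The example below assumes the catalog is a dict with a top-level `streams` list whose entries carry `tap_stream_id` and `metadata`, which is the usual Singer layout; `select_streams` is a hypothetical helper, not a function from Mage or the Singer libraries.

```python
import json


def select_streams(catalog, wanted):
    """Return a copy of the catalog with only the `wanted` streams selected."""
    updated = json.loads(json.dumps(catalog))  # deep copy via JSON round trip
    for stream in updated.get("streams", []):
        is_wanted = stream.get("tap_stream_id") in wanted
        metadata = stream.setdefault("metadata", [])
        # Find the stream-level entry (empty breadcrumb), or create it.
        stream_level = next(
            (m for m in metadata if m.get("breadcrumb") == []), None
        )
        if stream_level is None:
            stream_level = {"breadcrumb": [], "metadata": {}}
            metadata.append(stream_level)
        stream_level.setdefault("metadata", {})["selected"] = is_wanted
    return updated


# Made-up output of a discovery run with two streams.
discovered = {
    "streams": [
        {"tap_stream_id": "users",
         "metadata": [{"breadcrumb": [], "metadata": {}}]},
        {"tap_stream_id": "orders", "metadata": []},
    ]
}

catalog_for_sync = select_streams(discovered, {"users"})
print(json.dumps(catalog_for_sync, indent=2))
```

The helper returns a modified copy rather than editing in place, so the original discovery output stays available if the selection needs to be redone.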

Theoretical Basis

Singer metadata uses a breadcrumb-based system:

  • Stream-level metadata (breadcrumb=[]) contains the "selected" flag for the entire stream
  • Column-level metadata (breadcrumb=["properties", "column_name"], where column_name is the name of the column) contains the per-column "selected" flag and "inclusion" rule
  • Inclusion rules:
    • "automatic" - column is always included (key properties, replication keys)
    • "available" - column can be selected or deselected
    • "unsupported" - column cannot be synced
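The breadcrumb and inclusion rules above can be combined into a single resolution step. The following is a minimal sketch, assuming each metadata entry is a dict with `breadcrumb` and `metadata` keys; `selected_columns` is a hypothetical helper, and treating a missing "selected" flag as not selected is a simplifying assumption of this example.

```python
def selected_columns(stream):
    """Return the column names that would be synced for one stream entry."""
    stream_selected = False
    columns = []
    for entry in stream.get("metadata", []):
        breadcrumb = list(entry.get("breadcrumb", []))
        meta = entry.get("metadata", {})
        if not breadcrumb:
            # Stream-level entry: controls whether the stream syncs at all.
            stream_selected = bool(meta.get("selected", False))
        elif len(breadcrumb) == 2 and breadcrumb[0] == "properties":
            inclusion = meta.get("inclusion", "available")
            if inclusion == "automatic":
                # Always included, regardless of the "selected" flag.
                columns.append(breadcrumb[1])
            elif inclusion == "available" and meta.get("selected", False):
                columns.append(breadcrumb[1])
            # "unsupported" columns are never added, even if flagged selected.
    return columns if stream_selected else []


# Made-up stream entry covering all three inclusion values.
example_stream = {
    "tap_stream_id": "users",
    "metadata": [
        {"breadcrumb": [], "metadata": {"selected": True}},
        {"breadcrumb": ["properties", "id"],
         "metadata": {"inclusion": "automatic"}},
        {"breadcrumb": ["properties", "email"],
         "metadata": {"inclusion": "available", "selected": True}},
        {"breadcrumb": ["properties", "notes"],
         "metadata": {"inclusion": "available", "selected": False}},
        {"breadcrumb": ["properties", "blob"],
         "metadata": {"inclusion": "unsupported", "selected": True}},
    ],
}

print(selected_columns(example_stream))  # ['id', 'email']
```

In this example `id` is kept because it is automatic, `email` because it is available and selected, `notes` is dropped because it is deselected, and `blob` is dropped because it is unsupported.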

Related Pages

Implemented By
