Principle:Mage ai Mage ai API Stream Discovery

Knowledge Sources	Singer Discovery Mode Mage Integrations
Domains	Data_Integration, API, Schema_Management
Last Updated	2026-02-09 00:00 GMT

Overview

A file-based schema discovery mechanism that loads JSON Schema definitions from a local schemas directory to build a Singer catalog for API source connectors.

Description

API Stream Discovery provides catalog generation for non-SQL sources where schemas cannot be introspected from a database. Instead, each stream's schema is defined as a JSON Schema Draft 4 file in a schemas/ directory within the connector package. The discovery process scans this directory, parses each JSON file into a Singer Schema object, and builds CatalogEntry instances with metadata including key_properties, replication_method, and valid_replication_keys derived from the connector's method overrides.
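To make the file format concrete, the following minimal sketch shows what one stream's schema file might contain and how it could be parsed. The stream name `users`, its fields, and the helper `load_schema_text` are hypothetical and only serve as illustration; the check that the top-level type is exactly "object" is a simplifying assumption.

```python
import json

# Hypothetical contents of schemas/users.json (stream name and fields are made up).
USERS_SCHEMA_JSON = """
{
  "type": "object",
  "properties": {
    "id": {"type": "integer"},
    "email": {"type": ["null", "string"]},
    "updated_at": {"type": ["null", "string"], "format": "date-time"}
  }
}
"""


def load_schema_text(text: str) -> dict:
    """Parse a JSON Schema document and check that it describes an object."""
    schema = json.loads(text)
    # Assumption: stream schemas declare a plain "object" type at the top level.
    if schema.get("type") != "object":
        raise ValueError("a stream schema is expected to describe an object")
    return schema


users_schema = load_schema_text(USERS_SCHEMA_JSON)
property_names = sorted(users_schema["properties"])
print(property_names)
```

Nullable fields are written as a list of types, a common convention in Singer schemas for columns that may be absent from an API response.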

Usage

Use this principle for API-based, file-based, or other non-SQL source connectors. Schemas are defined statically (committed to the repository) rather than discovered dynamically, although some connectors override discover() to add dynamic schema discovery on top of the static schemas.
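A connector that layers dynamic discovery on top of static schemas could look roughly like the sketch below. All class names, the `fetch_custom_objects` helper, and the rule that static definitions win on a name collision are assumptions made for illustration; streams are represented as plain dictionaries rather than Singer catalog objects.

```python
from typing import Dict


class StaticSource:
    """Hypothetical connector whose discover() returns only file-defined streams."""

    def discover(self) -> Dict[str, dict]:
        # Stand-in for schemas that would be loaded from the schemas/ directory.
        return {
            "users": {"type": "object", "properties": {"id": {"type": "integer"}}},
        }


class DynamicSource(StaticSource):
    """Adds streams described by the remote API on top of the static ones."""

    def fetch_custom_objects(self) -> Dict[str, Dict[str, str]]:
        # Hypothetical API call, hard-coded here so the sketch runs offline.
        return {"custom_invoice": {"amount": "number", "memo": "string"}}

    def discover(self) -> Dict[str, dict]:
        streams = dict(super().discover())
        for name, fields in self.fetch_custom_objects().items():
            # Assumption: a static definition takes precedence on a name collision.
            streams.setdefault(name, {
                "type": "object",
                "properties": {
                    field: {"type": ["null", json_type]}
                    for field, json_type in fields.items()
                },
            })
        return streams


streams = DynamicSource().discover()
print(sorted(streams))
```

Calling super().discover() first keeps the committed schemas as the baseline, so the dynamic step can only add streams and never silently changes a reviewed definition.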

Theoretical Basis

The file-based discovery algorithm:

1. Locate the schemas/ directory relative to the connector's __init__.py.
2. For each .json file in the directory, load it as a JSON Schema.
3. For each schema, call build_catalog_entry(), which:
   - Queries get_table_key_properties(stream_id) for primary keys
   - Queries get_forced_replication_method(stream_id) for the replication strategy
   - Queries get_valid_replication_keys(stream_id) for bookmark columns
   - Generates standard Singer metadata via get_standard_metadata()
4. Return a Catalog containing all CatalogEntry objects.
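The steps above can be sketched with the standard library alone. This is not the Mage or Singer implementation: `CatalogEntry` is a simplified stand-in for Singer's class, `FileBasedSource` and `UsersSource` are hypothetical, the hook defaults and the metadata layout are assumptions, and deriving the stream id from the file name is likewise assumed. The catalog is returned as a plain list.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class CatalogEntry:
    """Simplified stand-in for Singer's CatalogEntry (not the real class)."""
    tap_stream_id: str
    schema: dict
    key_properties: List[str]
    replication_method: Optional[str]
    valid_replication_keys: List[str]
    metadata: List[dict]


class FileBasedSource:
    """Hypothetical base connector that builds its catalog from schema files."""

    def __init__(self, schemas_dir: Path):
        # A real connector would resolve this relative to its own __init__.py,
        # for example Path(__file__).parent / "schemas".
        self.schemas_dir = schemas_dir

    # Hooks named in the steps above; the default return values are assumptions.
    def get_table_key_properties(self, stream_id: str) -> List[str]:
        return []

    def get_forced_replication_method(self, stream_id: str) -> Optional[str]:
        return None

    def get_valid_replication_keys(self, stream_id: str) -> List[str]:
        return []

    def get_standard_metadata(self, schema, keys, method, replication_keys) -> List[dict]:
        # Simplified layout: one table-level entry plus one entry per property.
        table = {"breadcrumb": [], "metadata": {
            "table-key-properties": keys,
            "forced-replication-method": method,
            "valid-replication-keys": replication_keys,
        }}
        columns = [
            {"breadcrumb": ["properties", name], "metadata": {
                "inclusion": "automatic" if name in keys else "available",
            }}
            for name in schema.get("properties", {})
        ]
        return [table] + columns

    def build_catalog_entry(self, stream_id: str, schema: dict) -> CatalogEntry:
        keys = self.get_table_key_properties(stream_id)
        method = self.get_forced_replication_method(stream_id)
        replication_keys = self.get_valid_replication_keys(stream_id)
        metadata = self.get_standard_metadata(schema, keys, method, replication_keys)
        return CatalogEntry(stream_id, schema, keys, method, replication_keys, metadata)

    def discover(self) -> List[CatalogEntry]:
        entries = []
        for path in sorted(self.schemas_dir.glob("*.json")):
            schema = json.loads(path.read_text(encoding="utf-8"))
            # Assumption: the file name without its extension is the stream id.
            entries.append(self.build_catalog_entry(path.stem, schema))
        return entries


class UsersSource(FileBasedSource):
    """Example connector overriding the hooks for a made-up `users` stream."""

    def get_table_key_properties(self, stream_id):
        return ["id"]

    def get_forced_replication_method(self, stream_id):
        return "INCREMENTAL"

    def get_valid_replication_keys(self, stream_id):
        return ["updated_at"]


with tempfile.TemporaryDirectory() as tmp:
    schema = {"type": "object", "properties": {
        "id": {"type": "integer"},
        "updated_at": {"type": "string", "format": "date-time"},
    }}
    (Path(tmp) / "users.json").write_text(json.dumps(schema), encoding="utf-8")
    catalog = UsersSource(Path(tmp)).discover()

entry = catalog[0]
print(entry.tap_stream_id, entry.key_properties, entry.replication_method)
```

Keeping the three hooks as overridable methods means a new connector only has to add schema files and, where needed, override the hooks; the directory scan and entry construction stay in the base class.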

Related Pages

Implemented By

Implementation:Mage_ai_Mage_ai_Source_Discover
