Implementation:Mage ai Mage ai Dremio Source

From Leeroopedia


Knowledge Sources
Domains Data_Integration, Dremio, Source_Connector, SQL_Based
Last Updated 2026-02-09 00:00 GMT

Overview

A concrete tool for extracting data from a Dremio data lakehouse, provided by the Mage integrations source connector framework.

Description

The Dremio source connector extends the SQL-based Source class (mage_integrations.sources.sql.base.Source) to extract data from Dremio. It connects through a DremioConnection wrapper that uses Apache Arrow Flight for high-performance data transfer. The test_connection() method executes a simple SELECT 1 query via the Flight client.

Discovery queries Dremio's information_schema.columns to enumerate the tables within the configured schema, together with their column metadata (name, data type, nullability). Bytes values in column metadata are automatically decoded to UTF-8. Table names in queries are prefixed with the configured schema name.

The connector maps SQL data types to Singer schema types: boolean types map to boolean, integer types to integer, numeric, decimal, and float types to number, datetime and timestamp types to string with a datetime format, and JSON and variant types to object. UUID types are handled explicitly, and all other types default to string. Three source backends are supported (PostgreSQL, MySQL, MSSQL), each with a dedicated column type mapping function.

The replication method is full-table.
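The type mapping and bytes decoding described above can be illustrated with a minimal standalone sketch. This is not the connector's code: map_column_type, decode_if_bytes, and the type-name sets are hypothetical names chosen for illustration, the real connector may recognize a different set of type names, and the "uuid" format shown for UUID columns is an assumption.

```python
# Illustrative only: these sets are assumptions, not the connector's actual lists.
BOOLEAN_TYPES = {"bool", "boolean"}
INTEGER_TYPES = {"int", "integer", "bigint", "smallint", "tinyint"}
NUMERIC_PREFIXES = ("numeric", "decimal", "float", "double", "real")
DATETIME_PREFIXES = ("timestamp", "datetime")
OBJECT_PREFIXES = ("json", "variant")


def map_column_type(sql_type: str) -> dict:
    """Map a SQL type name to a Singer-style JSON schema fragment."""
    # Normalize case and drop any precision/scale suffix such as "(10, 2)".
    base = sql_type.strip().lower().split("(")[0].strip()
    if base in BOOLEAN_TYPES:
        return {"type": "boolean"}
    if base in INTEGER_TYPES:
        return {"type": "integer"}
    if base.startswith(NUMERIC_PREFIXES):
        return {"type": "number"}
    if base.startswith(DATETIME_PREFIXES):
        return {"type": "string", "format": "date-time"}
    if base.startswith(OBJECT_PREFIXES):
        return {"type": "object"}
    if base == "uuid":
        # Assumption: UUIDs are represented as strings with a "uuid" format.
        return {"type": "string", "format": "uuid"}
    # Everything else falls back to string.
    return {"type": "string"}


def decode_if_bytes(value):
    """Decode bytes metadata values to UTF-8 text; pass other values through."""
    return value.decode("utf-8") if isinstance(value, bytes) else value
```

Matching on a normalized base name keeps parameterized types such as DECIMAL(10, 2) and multi-word types such as TIMESTAMP WITH TIME ZONE in the intended branch.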

Usage

Use this source connector when building a Mage data pipeline that needs to extract data from Dremio. Configure it with the Dremio connection parameters and a schema; only tables in that schema are discovered. Optionally set source_backend to postgresql, mysql, or mssql to use backend-specific column type mappings.

Code Reference

Source Location

  • Repository: mage-ai
  • File: mage_integrations/mage_integrations/sources/dremio/__init__.py
  • Lines: 1-174

Signature

class Dremio(Source):
    def test_connection(self):
        ...
    def build_connection(self) -> DremioConnection:
        ...
    def build_discover_query(self, streams: List[str] = None):
        ...
    def column_type_mapping(self, column_type: str, column_format: str = None) -> str:
        ...
    def discover(self, streams: List[str] = None) -> Catalog:
        ...
    def build_table_name(self, stream) -> str:
        ...

Import

from mage_integrations.sources.dremio import Dremio

I/O Contract

Inputs

Name     Type     Required  Description
config   dict     Yes       Configuration dictionary with Dremio connection settings (passed directly to DremioConnection)
catalog  Catalog  No        Singer catalog specifying streams to extract
state    dict     No        Previous sync state for incremental extraction

Configuration Parameters

Name            Type     Required  Description
schema          str      Yes       Dremio schema name to discover tables from
source_backend  str      No        Backend database type for column type mapping: postgresql, mysql, or mssql
(additional)    various  Yes       All other config keys are passed directly to DremioConnection (host, port, username, password, etc.)
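A pipeline author may want to check a configuration dictionary before handing it to the connector. The sketch below is a hypothetical pre-flight helper, not part of mage_integrations: split_config and SUPPORTED_BACKENDS are invented names, and the only rules it encodes are the ones in the table above (schema is required, source_backend is optional and limited to three values, everything else is a connection setting).

```python
from typing import Any, Dict, Optional, Tuple

# Backend names taken from the configuration table above.
SUPPORTED_BACKENDS = {"postgresql", "mysql", "mssql"}


def split_config(config: Dict[str, Any]) -> Tuple[str, Optional[str], Dict[str, Any]]:
    """Validate a config dict and separate connector keys from connection keys."""
    schema = config.get("schema")
    if not schema:
        raise ValueError("'schema' is required")

    backend = config.get("source_backend")
    if backend is not None and backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported source_backend: {backend!r}")

    # Remaining keys (host, port, username, password, ...) are connection settings.
    connection_settings = {
        key: value
        for key, value in config.items()
        if key not in ("schema", "source_backend")
    }
    return schema, backend, connection_settings
```

Failing early on a missing schema or an unknown backend gives a clearer error than one raised later during discovery.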

Outputs

Name     Type                   Description
catalog  Catalog                Discovered tables with schemas from information_schema.columns (from discover())
records  Generator[List[Dict]]  Batches of extracted records via Arrow Flight queries (inherited from SQL base Source)

Usage Examples

from mage_integrations.sources.dremio import Dremio

config = {
    "host": "dremio.example.com",
    "port": 32010,
    "username": "admin",
    "password": "password123",
    "schema": "my_schema",
    "source_backend": "postgresql",
}

source = Dremio(config=config)

# Discover available streams
catalog = source.discover()

# Test connection
source.test_connection()

Related Pages

Implements Principle

Requires Environment
