Implementation:Mage ai Mage ai Google Sheets Source

Knowledge Sources	Mage Integrations Google Sheets API
Domains	Data_Integration, Google_Sheets, API
Last Updated	2026-02-09 00:00 GMT

Overview

Source connector implementation for extracting data from Google Sheets spreadsheets within the Mage data integration framework.

Description

The GoogleSheets class extends the Mage Source base class to provide full-table replication from Google Sheets worksheets. It implements the three core source connector lifecycle methods: discover (introspects spreadsheet metadata to build a Singer catalog with one stream per worksheet, inferring JSON schemas from header rows and cell values), load_data (extracts worksheet data in configurable batches of up to 10,000 rows, fetching both formatted and unformatted value representations for type-accurate transformation), and test_connection (validates credentials by connecting to the Google Sheets API). Each worksheet becomes a stream with __google_sheet_row as the key property and FULL_TABLE replication method.

Usage

Instantiated by the Mage integration framework with a config containing path_to_credentials_json_file, spreadsheet_id, and optionally selected_sheet_names to filter which worksheets to sync.

Code Reference

Source Location

Repository: mage-ai
File: mage_integrations/mage_integrations/sources/google_sheets/__init__.py
Lines: 1-460

Signature

class GoogleSheets(Source):
    def __init__(self, **kwargs):
        ...
    def discover(self, streams: List[str] = None) -> Catalog:
        ...
    def load_data(self, stream, bookmarks: Dict = None, query: Dict = None, **kwargs) -> Generator[List[Dict], None, None]:
        ...
    def test_connection(self) -> None:
        ...

Import

from mage_integrations.sources.google_sheets import GoogleSheets

I/O Contract

Inputs

Name	Type	Required	Description
config.path_to_credentials_json_file	str	Yes	Path to Google service account credentials JSON file
config.spreadsheet_id	str	Yes	Google Sheets spreadsheet ID from the URL
config.selected_sheet_names	list[str]	No	Optional list of worksheet names to sync; syncs all if omitted

Outputs

Name	Type	Description
catalog	Catalog	Singer catalog with one CatalogEntry per worksheet, including inferred JSON schema
records	Generator[List[Dict]]	Batches of transformed row dictionaries with column names as keys

Key Behaviors

Discovery

Fetches spreadsheet metadata via the Google Sheets API to enumerate worksheets.
For each worksheet, reads header rows and first data rows to infer column types and build a JSON schema.
Creates a CatalogEntry with FULL_TABLE replication method, __google_sheet_row as key property, and UNIQUE_CONFLICT_METHOD_UPDATE for conflict resolution.

Data Loading

Paginates through worksheet rows in batches of 10,000 rows using A1 notation ranges (e.g., A2:Z10001).
Fetches both FORMATTED_VALUE and UNFORMATTED_VALUE representations for each batch to enable type-aware transformation.
Delegates row transformation to internal_transform.transform_sheet_data, which converts raw cell values to typed dictionaries.
Yields each batch as a list of dictionaries, stopping when a completely blank page is encountered.

Connection Testing

Calls self.connection.connect() to verify that the provided credentials are valid and can reach the Google Sheets API.

Usage Examples

from mage_integrations.sources.google_sheets import GoogleSheets

config = {
    "path_to_credentials_json_file": "/path/to/service_account.json",
    "spreadsheet_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
    "selected_sheet_names": ["Sheet1", "Revenue"],
}
source = GoogleSheets(config=config)

# Discover available streams
catalog = source.discover()

# Load data from a stream
for batch in source.load_data(catalog.streams[0]):
    for record in batch:
        print(record)

Related Pages

Implements Principle

Principle:Mage_ai_Mage_ai_Source_Lifecycle_Orchestration

Requires Environment

Environment:Mage_ai_Mage_ai_Python_3_9_Runtime

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment