
Environment:TobikoData Sqlmesh BigQuery Connection

From Leeroopedia


Knowledge Sources
Domains: BigQuery, Google Cloud, Data Warehouse
Last Updated: 2026-02-07 21:00 GMT

Overview

A BigQuery engine adapter connection environment that enables SQLMesh to execute transformations on Google Cloud's BigQuery data warehouse.

Description

The BigQuery connection environment integrates SQLMesh with Google Cloud BigQuery through the official Python client libraries. It supports multiple authentication methods including OAuth, service accounts, and service account JSON. The environment includes pandas integration for data transfers, BigQuery Storage API for optimized reads, and optional BigFrames support for DataFrame operations. Configuration includes job timeout controls and retry logic for robust execution.

Usage

This environment is required when using BigQuery as the execution engine for SQLMesh models, for state synchronization stored in BigQuery, or when running engine-specific tests against BigQuery. It also supports service account impersonation and fine-grained job execution controls.

System Requirements

Category       | Requirement                     | Notes
Network        | Access to Google Cloud APIs     | HTTPS connectivity to googleapis.com
Authentication | Google Cloud credentials        | Service account or OAuth
GCP Project    | Active GCP project with billing | BigQuery API must be enabled
Permissions    | BigQuery Data Editor + Job User | Minimum required IAM roles

Dependencies

System Packages

  • System CA certificates for HTTPS
  • gRPC libraries (installed via Python packages)

Python Packages

  • google-cloud-bigquery[pandas] - Official BigQuery client with pandas integration
  • google-cloud-bigquery-storage - BigQuery Storage API for optimized data transfer
  • bigframes>=1.32.0 - Optional BigFrames support (separate install due to SQLGlot version conflicts)
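Because BigFrames ships as a separate extra, code that wants its DataFrame features should probe for the package rather than assume it is installed. A minimal standard-library sketch (the `is_installed` helper is illustrative, not part of SQLMesh):

```python
import importlib.util

def is_installed(package: str) -> bool:
    """Return True if a top-level package can be imported."""
    return importlib.util.find_spec(package) is not None

# Gate the optional BigFrames features on the extra install:
if is_installed("bigframes"):
    print("BigFrames available")
else:
    print('BigFrames not installed; run: pip install "bigframes>=1.32.0"')
```

This avoids a hard `import bigframes` at module scope, so the base `sqlmesh[bigquery]` install keeps working without the extra.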

Credentials

Primary authentication methods (choose one):

  • GOOGLE_APPLICATION_CREDENTIALS - Path to service account JSON key file
  • BIGQUERY_KEYFILE - Alternative name for service account key file path

Authentication types supported in BigQueryConnectionConfig:

  • OAUTH - Interactive OAuth flow
  • SERVICE_ACCOUNT - Service account email + key file
  • SERVICE_ACCOUNT_JSON - Service account credentials as JSON string
  • OAUTH_SECRETS - OAuth client secrets
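The four methods differ mainly in which configuration fields they require. A small illustrative mapping (the method strings and the `keyfile`/`keyfile_json` field names mirror the code evidence below; the `oauth-secrets` field names and the helper itself are assumptions, not SQLMesh API):

```python
# Illustrative sketch: which config fields each authentication
# method requires before a connection can be built.
REQUIRED_FIELDS = {
    "oauth": set(),
    "service-account": {"keyfile"},
    "service-account-json": {"keyfile_json"},
    "oauth-secrets": {"client_id", "client_secret"},  # assumed names
}

def missing_fields(config: dict) -> set:
    """Return the fields a connection config is missing for its method."""
    method = config.get("method", "oauth")
    return REQUIRED_FIELDS.get(method, set()) - set(config)
```

For example, `missing_fields({"method": "service-account"})` flags the absent `keyfile` before any network call is attempted.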

Optional configuration:

  • BIGQUERY_PROJECT - Default GCP project ID
  • BIGQUERY_LOCATION - BigQuery location/region (e.g., US, EU)
  • BIGQUERY_IMPERSONATED_SERVICE_ACCOUNT - Service account to impersonate
  • BIGQUERY_JOB_CREATION_TIMEOUT - Timeout for job creation (seconds)
  • BIGQUERY_JOB_EXECUTION_TIMEOUT - Timeout for job execution (seconds)
  • BIGQUERY_JOB_RETRIES - Number of retry attempts for failed jobs
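These variables can be folded into a connection config before handing it to SQLMesh. A hedged sketch of that collection and type coercion (the helper name is hypothetical; only the variable names come from this page):

```python
import os

def bigquery_env_config(env=os.environ) -> dict:
    """Collect the optional BIGQUERY_* variables into a plain dict,
    coercing the timeout and retry values to int."""
    def as_int(name):
        raw = env.get(name)
        return int(raw) if raw is not None else None

    config = {
        "project": env.get("BIGQUERY_PROJECT"),
        "location": env.get("BIGQUERY_LOCATION"),
        "impersonated_service_account": env.get("BIGQUERY_IMPERSONATED_SERVICE_ACCOUNT"),
        "job_creation_timeout_seconds": as_int("BIGQUERY_JOB_CREATION_TIMEOUT"),
        "job_execution_timeout_seconds": as_int("BIGQUERY_JOB_EXECUTION_TIMEOUT"),
        "job_retries": as_int("BIGQUERY_JOB_RETRIES"),
    }
    # Drop unset keys so defaults elsewhere still apply.
    return {k: v for k, v in config.items() if v is not None}
```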

Quick Install

# Install SQLMesh with BigQuery support
pip install "sqlmesh[bigquery]"

# Set up authentication with service account
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

# Or use alternative variable name
export BIGQUERY_KEYFILE="/path/to/service-account.json"

# Configure in config.yaml
cat > config.yaml << EOF
gateways:
  bigquery:
    connection:
      type: bigquery
      method: service-account
      project: your-gcp-project
      location: US
      keyfile: /path/to/service-account.json
      job_execution_timeout_seconds: 3600
      job_retries: 3
EOF

# Optional: Install BigFrames support
pip install "bigframes>=1.32.0"

Code Evidence

# File: pyproject.toml:46-48
bigquery = [
    "google-cloud-bigquery[pandas]",
    "google-cloud-bigquery-storage",
]

# File: pyproject.toml:52-53
# Note: bigframes is separate due to version pin conflicts with sqlglot
bigframes = ["bigframes>=1.32.0"]

# File: sqlmesh/core/config/connection.py:1117-1137
class BigQueryConnectionMethod(str, Enum):
    """BigQuery authentication methods."""
    OAUTH = "oauth"
    SERVICE_ACCOUNT = "service-account"
    SERVICE_ACCOUNT_JSON = "service-account-json"
    OAUTH_SECRETS = "oauth-secrets"

class BigQueryConnectionConfig(ConnectionConfig):
    """BigQuery connection configuration."""
    method: BigQueryConnectionMethod
    project: str
    location: t.Optional[str] = None
    keyfile: t.Optional[str] = None
    keyfile_json: t.Optional[t.Dict[str, t.Any]] = None
    impersonated_service_account: t.Optional[str] = None
    job_creation_timeout_seconds: t.Optional[int] = None
    job_execution_timeout_seconds: t.Optional[int] = None
    job_retries: t.Optional[int] = None
    # ... additional configuration

# File: sqlmesh/core/engine_adapter/bigquery.py
class BigQueryEngineAdapter(EngineAdapter):
    """BigQuery-specific engine adapter implementation."""

    def _execute_query(self, query: str) -> t.Any:
        """Execute query with timeout and retry logic."""
        job_config = QueryJobConfig(
            use_query_cache=True,
            timeout_ms=self.job_execution_timeout_seconds * 1000,
        )
        # ... implementation with retry logic

Common Errors

Error Message                               | Cause                                | Solution
403: BigQuery API has not been enabled      | BigQuery API disabled in the project | Enable the BigQuery API in the GCP Console
403: Access Denied                          | Insufficient permissions             | Grant the BigQuery Data Editor and BigQuery Job User roles
DefaultCredentialsError                     | Missing or invalid credentials       | Point GOOGLE_APPLICATION_CREDENTIALS at a valid service account key
404: Not found: Dataset                     | Target dataset doesn't exist         | Create the dataset or verify the project/location configuration
Deadline Exceeded                           | Query exceeded its timeout           | Increase job_execution_timeout_seconds
ImportError: cannot import name 'bigframes' | BigFrames not installed separately   | Install with pip install "bigframes>=1.32.0"
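Several of these failures (Deadline Exceeded, transient server errors) are exactly what the job_retries setting is meant to absorb. A generic retry-with-exponential-backoff sketch, assuming nothing about SQLMesh's internals; the actual retry logic lives in the engine adapter and the Google client libraries:

```python
import time

def run_with_retries(fn, retries=3, base_delay=1.0, transient=(TimeoutError,)):
    """Re-run fn on transient errors, doubling the delay each attempt.
    Raises the last error once the retry budget is exhausted."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except transient:
            if attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Non-transient errors (permissions, missing datasets) are deliberately not caught: retrying them only wastes job quota.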

Compatibility Notes

  • BigFrames (>=1.32.0) requires separate installation due to SQLGlot version conflicts
  • BigQuery Storage API can provide substantially faster reads for large result sets
  • Service account impersonation supported for delegation scenarios
  • Supports both US and EU multi-region locations plus single regions
  • Job retries handle transient BigQuery failures automatically
  • Query cache enabled by default for cost optimization
  • Pandas integration optimized for data frame operations
  • OAuth flow requires interactive browser for initial authentication
  • Service account JSON can be passed as string or file path
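The last note, service account JSON as either a raw string or a file path, can be sketched as a small normalizing helper (the function name and exact behavior are assumptions for illustration, not SQLMesh's implementation):

```python
import json
import os

def load_service_account_info(value: str) -> dict:
    """Accept either a path to a key file or a raw JSON string and
    return the parsed credentials dict."""
    if os.path.exists(value):
        with open(value) as f:
            return json.load(f)
    return json.loads(value)
```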
