Environment: TobikoData SQLMesh BigQuery Connection
| Knowledge Sources | |
|---|---|
| Domains | BigQuery, Google Cloud, Data Warehouse |
| Last Updated | 2026-02-07 21:00 GMT |
Overview
A BigQuery engine adapter connection environment that enables SQLMesh to execute transformations on Google Cloud's BigQuery data warehouse.
Description
The BigQuery connection environment integrates SQLMesh with Google Cloud BigQuery through the official Python client libraries. It supports multiple authentication methods including OAuth, service accounts, and service account JSON. The environment includes pandas integration for data transfers, BigQuery Storage API for optimized reads, and optional BigFrames support for DataFrame operations. Configuration includes job timeout controls and retry logic for robust execution.
Usage
This environment is required when using BigQuery as the execution engine for SQLMesh models, when SQLMesh state is synchronized to BigQuery, or when running engine-specific tests against BigQuery. It also supports service account impersonation and fine-grained job execution controls.
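As a sketch of the state-synchronization scenario, a gateway can point both its model execution connection and its state store at BigQuery. The values below (`your-gcp-project`, the keyfile path) are placeholders; consult the SQLMesh documentation for the authoritative schema:

```yaml
gateways:
  bigquery:
    connection:
      type: bigquery
      method: service-account
      project: your-gcp-project
      keyfile: /path/to/service-account.json
    state_connection:
      type: bigquery
      method: service-account
      project: your-gcp-project
      keyfile: /path/to/service-account.json

default_gateway: bigquery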
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| Network | Access to Google Cloud APIs | HTTPS connectivity to googleapis.com |
| Authentication | Google Cloud credentials | Service account or OAuth |
| GCP Project | Active GCP project with billing | BigQuery API must be enabled |
| Permissions | BigQuery Data Editor + Job User | Minimum required permissions |
Dependencies
System Packages
- System CA certificates for HTTPS
- gRPC libraries (installed via Python packages)
Python Packages
- google-cloud-bigquery[pandas] - Official BigQuery client with pandas integration
- google-cloud-bigquery-storage - BigQuery Storage API for optimized data transfer
- bigframes>=1.32.0 - Optional BigFrames support (separate install due to SQLGlot version conflicts)
Credentials
Primary authentication methods (choose one):
- GOOGLE_APPLICATION_CREDENTIALS - Path to service account JSON key file
- BIGQUERY_KEYFILE - Alternative name for service account key file path
Authentication types supported in BigQueryConnectionConfig:
- OAUTH - Interactive OAuth flow
- SERVICE_ACCOUNT - Service account email + key file
- SERVICE_ACCOUNT_JSON - Service account credentials as JSON string
- OAUTH_SECRETS - OAuth client secrets
Optional configuration:
- BIGQUERY_PROJECT - Default GCP project ID
- BIGQUERY_LOCATION - BigQuery location/region (e.g., US, EU)
- BIGQUERY_IMPERSONATED_SERVICE_ACCOUNT - Service account to impersonate
- BIGQUERY_JOB_CREATION_TIMEOUT - Timeout for job creation (seconds)
- BIGQUERY_JOB_EXECUTION_TIMEOUT - Timeout for job execution (seconds)
- BIGQUERY_JOB_RETRIES - Number of retry attempts for failed jobs
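The optional variables above are typically resolved into a single settings object at startup. The sketch below is illustrative only: the dataclass, the default of three retries, and the precedence of GOOGLE_APPLICATION_CREDENTIALS over BIGQUERY_KEYFILE are assumptions, not SQLMesh's actual resolution logic.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class BigQueryEnvSettings:
    """Illustrative container for the optional environment variables."""

    project: Optional[str]
    location: Optional[str]
    keyfile: Optional[str]
    job_execution_timeout: Optional[int]
    job_retries: int


def settings_from_env(env: Mapping[str, str] = os.environ) -> BigQueryEnvSettings:
    # Assumed ordering: GOOGLE_APPLICATION_CREDENTIALS wins over BIGQUERY_KEYFILE.
    timeout = env.get("BIGQUERY_JOB_EXECUTION_TIMEOUT")
    return BigQueryEnvSettings(
        project=env.get("BIGQUERY_PROJECT"),
        location=env.get("BIGQUERY_LOCATION"),
        keyfile=env.get("GOOGLE_APPLICATION_CREDENTIALS") or env.get("BIGQUERY_KEYFILE"),
        job_execution_timeout=int(timeout) if timeout else None,
        job_retries=int(env.get("BIGQUERY_JOB_RETRIES", "3")),
    )
```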
Quick Install
```shell
# Install SQLMesh with BigQuery support
pip install "sqlmesh[bigquery]"

# Set up authentication with a service account
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
# Or use the alternative variable name
export BIGQUERY_KEYFILE="/path/to/service-account.json"

# Configure in config.yaml
cat > config.yaml << EOF
gateways:
  bigquery:
    connection:
      type: bigquery
      method: service-account
      project: your-gcp-project
      location: US
      keyfile: /path/to/service-account.json
      job_execution_timeout_seconds: 3600
      job_retries: 3
EOF

# Optional: install BigFrames support
pip install "bigframes>=1.32.0"
```
Code Evidence
```toml
# File: pyproject.toml:46-48
bigquery = [
    "google-cloud-bigquery[pandas]",
    "google-cloud-bigquery-storage",
]

# File: pyproject.toml:52-53
# Note: bigframes is separate due to version pin conflicts with sqlglot
bigframes = ["bigframes>=1.32.0"]
```
```python
# File: sqlmesh/core/config/connection.py:1117-1137
class BigQueryConnectionMethod(str, Enum):
    """BigQuery authentication methods."""

    OAUTH = "oauth"
    SERVICE_ACCOUNT = "service-account"
    SERVICE_ACCOUNT_JSON = "service-account-json"
    OAUTH_SECRETS = "oauth-secrets"


class BigQueryConnectionConfig(ConnectionConfig):
    """BigQuery connection configuration."""

    method: BigQueryConnectionMethod
    project: str
    location: t.Optional[str] = None
    keyfile: t.Optional[str] = None
    keyfile_json: t.Optional[t.Dict[str, t.Any]] = None
    impersonated_service_account: t.Optional[str] = None
    job_creation_timeout_seconds: t.Optional[int] = None
    job_execution_timeout_seconds: t.Optional[int] = None
    job_retries: t.Optional[int] = None
    # ... additional configuration


# File: sqlmesh/core/engine_adapter/bigquery.py
class BigQueryEngineAdapter(EngineAdapter):
    """BigQuery-specific engine adapter implementation."""

    def _execute_query(self, query: str) -> t.Any:
        """Execute query with timeout and retry logic."""
        job_config = QueryJobConfig(
            use_query_cache=True,
            timeout_ms=self.job_execution_timeout_seconds * 1000,
        )
        # ... implementation with retry logic
```
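The retry logic referenced above can be sketched as a generic jittered exponential-backoff loop. This is a self-contained illustration of the pattern that a `job_retries` setting controls, not the adapter's actual implementation, and the set of retriable exceptions is an assumption:

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retries(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 0.5,
    retriable: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
) -> T:
    """Run fn, retrying transient failures with jittered exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except retriable:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            # Back off exponentially, with jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))
    raise AssertionError("unreachable")
```

Under this sketch, `job_retries: 3` corresponds to one initial attempt plus up to three retries.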
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| 403: BigQuery API has not been enabled | BigQuery API disabled in GCP project | Enable BigQuery API in GCP Console |
| 403: Access Denied | Insufficient permissions | Grant BigQuery Data Editor and BigQuery Job User roles |
| DefaultCredentialsError | Missing or invalid credentials | Set GOOGLE_APPLICATION_CREDENTIALS to valid service account key |
| 404: Not found: Dataset | Target dataset doesn't exist | Create dataset or verify project/location configuration |
| Deadline Exceeded | Query timeout | Increase job_execution_timeout_seconds value |
| ImportError: cannot import name 'bigframes' | BigFrames not installed separately | Install with `pip install "bigframes>=1.32.0"` |
Compatibility Notes
- BigFrames (>=1.32.0) requires separate installation due to SQLGlot version conflicts
- BigQuery Storage API provides substantially faster reads for large datasets than the standard REST API
- Service account impersonation supported for delegation scenarios
- Supports both US and EU multi-region locations plus single regions
- Job retries handle transient BigQuery failures automatically
- Query cache enabled by default for cost optimization
- Pandas integration optimized for data frame operations
- OAuth flow requires interactive browser for initial authentication
- Service account JSON can be passed as string or file path
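The last note above, accepting service-account JSON as either an inline string or a file path, can be handled with a small normalization helper. The function name and the leading-brace heuristic are hypothetical, for illustration only:

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_service_account_info(value: str) -> Dict[str, Any]:
    """Return parsed service-account info from either a JSON string or a
    file path. Heuristic: inline JSON payloads start with '{'."""
    text = value.strip()
    if not text.startswith("{"):
        # Treat the value as a path to a keyfile and read it.
        text = Path(text).read_text()
    return json.loads(text)
```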