Jump to content

Connect Leeroopedia MCP: Equip your AI agents to search best practices, build plans, verify code, diagnose failures, and look up hyperparameter defaults.

Environment:Apache Airflow Database Backend Environment

From Leeroopedia


Knowledge Sources
Domains Infrastructure, Database
Last Updated 2026-02-08 20:00 GMT

Overview

Metadata database backend supporting PostgreSQL 13-18, MySQL 8.0/8.4, or SQLite 3.15+ with SQLAlchemy 2.0.36+ async support.

Description

Apache Airflow requires a metadata database to store DAG definitions, task instance states, connection information, variables, and scheduling metadata. The database backend is configured via the `sql_alchemy_conn` configuration option. PostgreSQL is the recommended production backend (default version 14), while SQLite is suitable only for development and testing with the `airflow standalone` command. MySQL 8.0/8.4 and MariaDB 10.11 LTS are also supported. The database layer uses SQLAlchemy 2.0+ with async support and Alembic for schema migrations.

Usage

Use this environment for all Airflow deployments that persist state. Every component that reads or writes metadata (scheduler, API server, workers, triggerer) requires access to this database backend. The database migration system tracks schema versions from Airflow 2.6.2 through 3.2.0.

System Requirements

Category Requirement Notes
PostgreSQL 13, 14, 15, 16, 17, or 18 Default version: 14. Recommended for production.
MySQL 8.0 or 8.4 UTF-8 charset required in connection string
MariaDB 10.11 LTS Recommended over Oracle MySQL client
SQLite >= 3.15.0 Development only; not for production
SQLAlchemy >= 2.0.36 Required for Python 3.13 compatibility
Alembic >= 1.13.1, < 2.0 Database migration tool

Dependencies

System Packages

  • `libpq-dev` (PostgreSQL client development libraries)
  • `postgresql-client` (PostgreSQL CLI tools)
  • `default-libmysqlclient-dev` (MySQL client development libraries)
  • `sqlite3` (SQLite CLI)
  • `freetds-dev` (MSSQL/SQL Server client)

Python Packages

  • `sqlalchemy[asyncio]` >= 2.0.36
  • `alembic` >= 1.13.1, < 2.0
  • `aiosqlite` >= 0.20.0, < 0.22.0
  • `psycopg2-binary` (for PostgreSQL)
  • `mysqlclient` (for MySQL/MariaDB)

Credentials

  • `AIRFLOW__DATABASE__SQL_ALCHEMY_CONN`: Primary database connection string.
    • PostgreSQL: `postgresql+psycopg2://user:pass@host:5432/airflow`
    • MySQL: `mysql+mysqldb://user:pass@host:3306/airflow?charset=utf8`
    • SQLite: `sqlite:////path/to/airflow.db`
  • `AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_ASYNC`: Async database connection string.

Quick Install

# PostgreSQL backend (recommended)
pip install "apache-airflow[postgres]"

# MySQL backend
pip install "apache-airflow[mysql]"

# Initialize the database
airflow db migrate

Code Evidence

SQLite minimum version from `airflow-core/src/airflow/configuration.py:480`:

min_sqlite_version = (3, 15, 0)

Database dialect handling from `airflow-core/src/airflow/utils/sqlalchemy.py:28-29`:

# Special handling for mysql and postgresql dialects

Database revision heads mapping from `airflow-core/src/airflow/utils/db.py:103-116`:

_REVISION_HEADS_MAP: dict[str, str] = {
    "2.6.2": "4bc4d934e2bc",
    "2.7.0": "405de8318b3a",
    "2.8.0": "10b52ebd31f7",
    ...
    "3.0.0": "29ce7909c52b",
    "3.2.0": "53ff648b8a26",
}

MySQL metadata lock workaround from `airflow-core/src/airflow/utils/db.py:639`:

# This is needed to work around MySQL 8.4 metadata lock issues with SQLAlchemy 2.0.

Advisory lock implementation from `airflow-core/src/airflow/utils/db.py:1485-1543`:

# PostgreSQL uses int64 lock IDs
# MySQL uses GET_LOCK with session-level (not transaction-level) locks
# MySQL requires AUTOCOMMIT for DDL operations

Common Errors

Error Message Cause Solution
`Server has gone away` (MySQL) Connection pool keeps expired connections Set `database__sql_alchemy_pool_recycle` to invalidate expired connections
`UnicodeDecodeError: 'ascii' codec` (MySQL) Missing charset in connection string Add `?charset=utf8` to MySQL connection URL
Metadata lock timeout (MySQL 8.4) SQLAlchemy 2.0 metadata lock issue Commit session before DDL operations (handled automatically)
Migration revision mismatch Database not migrated to current version Run `airflow db migrate`

Compatibility Notes

  • PostgreSQL: Recommended for production. Supports advisory locking via `pg_advisory_lock` with int64 IDs. Supports JSONB column type for efficient JSON storage.
  • MySQL: Requires `charset=utf8` in connection string for Unicode support. Uses `GET_LOCK` for advisory locking (session-level, not transaction-level). MySQL 8.4 has metadata lock issues requiring special handling.
  • MariaDB 10.11 LTS: Recommended over Oracle MySQL client due to GPG key expiration issues with the Oracle repository.
  • SQLite: Development only. Does not support concurrent access from multiple processes. Cannot run scheduler and webserver simultaneously.
  • PostgreSQL JSONB: Special handling for JSON columns uses PostgreSQL-specific `JSONB` type (falls back to `JSON` for other dialects).

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment