
Heuristic:Apache Airflow Database Lock Handling

From Leeroopedia




Knowledge Sources
Domains Database, Debugging, Operations
Last Updated 2026-02-08 20:00 GMT

Overview

Handle cross-database advisory locking differences: PostgreSQL uses transactional `pg_advisory_lock` with int64 IDs, while MySQL uses session-level `GET_LOCK()`, which requires AUTOCOMMIT mode for DDL operations.

Description

Apache Airflow uses advisory locks to coordinate database migrations and prevent concurrent schema modifications. The locking implementation differs significantly between the PostgreSQL and MySQL backends. PostgreSQL uses transactional `pg_advisory_lock` with int64 IDs. MySQL uses `GET_LOCK()`, which operates at the session level (not the transaction level) and requires AUTOCOMMIT mode for DDL operations. The combination of MySQL 8.4 and SQLAlchemy 2.0 introduces additional metadata lock issues that require an explicit session commit before DDL statements.
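As a sketch of the backend difference, the lock statements each dialect needs can be modeled as a small helper. The lock-ID scheme and lock names below are illustrative, not the actual values from Airflow's `DBLocks` enum:

```python
def advisory_lock_sql(dialect: str, lock_id: int, timeout: int = 10):
    """Return (acquire_sql, release_sql) for a backend's advisory lock.

    Illustrative sketch: lock IDs and names are hypothetical, not
    Airflow's real DBLocks values.
    """
    if dialect == "postgresql":
        # int64 lock ID; transactional, released automatically when the
        # surrounding transaction commits or rolls back.
        return (f"SELECT pg_advisory_lock({lock_id})", None)
    if dialect == "mysql":
        # String-named and session-level: the lock persists until
        # RELEASE_LOCK() is called or the session ends, so an explicit
        # release statement is required.
        name = f"airflow_lock_{lock_id}"
        return (
            f"SELECT GET_LOCK('{name}', {timeout})",
            f"SELECT RELEASE_LOCK('{name}')",
        )
    # Other backends (e.g. SQLite) have no advisory locking.
    return (None, None)
```

Note the asymmetry: the PostgreSQL variant has no release statement at all, because ending the transaction releases the lock; the MySQL variant must pair every acquire with a release.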

Usage

Apply this heuristic when running database migrations (`airflow db migrate`), encountering lock timeouts, or debugging deadlocks during concurrent Airflow component startups. This is especially relevant in Kubernetes deployments where multiple pods may attempt to run migrations simultaneously.
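In a multi-pod startup, only one pod should run migrations while the others back off. A minimal sketch of that pattern, with the lock calls abstracted behind callables — on MySQL, `try_lock` and `release` would map to `GET_LOCK(name, 0)` (zero timeout as a try-lock) and `RELEASE_LOCK(name)`; the retry policy here is illustrative, not Airflow's:

```python
import time


def run_exclusively(try_lock, release, action, attempts=5, wait=1.0):
    """Sketch: run `action` under an advisory lock, retrying acquisition.

    try_lock/release stand in for backend calls such as MySQL's
    GET_LOCK(name, 0) and RELEASE_LOCK(name).
    """
    for _ in range(attempts):
        if try_lock():
            try:
                action()
                return True
            finally:
                # Session-level locks must be released explicitly.
                release()
        # Another pod holds the lock; back off and retry.
        time.sleep(wait)
    return False
```

A pod that never wins the lock can still verify the schema version afterwards rather than failing outright, since the winning pod has presumably completed the migration.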

The Insight (Rule of Thumb)

  • Action 1 (PostgreSQL): Advisory locks use int64 IDs and are transactional — they release automatically when the transaction ends. No special handling needed.
  • Action 2 (MySQL): Advisory locks via `GET_LOCK()` are session-level, not transaction-level. They persist until the session ends or `RELEASE_LOCK()` is called. Always use AUTOCOMMIT for DDL operations.
  • Action 3 (MySQL 8.4): Commit the session before executing DDL statements to release metadata locks. A behavior change in SQLAlchemy 2.0 triggers metadata lock issues on MySQL 8.4.
  • Action 4 (Debugging): Enable faulthandler (`import faulthandler; faulthandler.enable()`) for debugging long-running threads and deadlocks during migration.
  • Trade-off: Session-level locks in MySQL mean a crashed process may hold locks until the connection timeout. PostgreSQL's transactional locks are more predictable but require the transaction to remain open.
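The MySQL 8.4 workaround in Action 3 reduces to "commit, then run the DDL". A sketch against a plain DB-API connection (Airflow does this through its SQLAlchemy session; the helper name here is hypothetical):

```python
def run_ddl_after_commit(conn, ddl: str) -> None:
    """Sketch of the MySQL 8.4 workaround: commit the open transaction so
    its metadata locks are released, then issue the DDL statement.
    """
    # Any open transaction holds metadata locks on every table it has
    # touched; committing releases them before the DDL runs.
    conn.commit()
    cur = conn.cursor()
    try:
        # The DDL no longer waits on our own session's metadata locks.
        cur.execute(ddl)
    finally:
        cur.close()
```

Without the commit, the session can deadlock against itself: the DDL waits on metadata locks that the session's own open transaction is holding.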

Reasoning

Evidence from source code:

From `airflow-core/src/airflow/utils/db.py:1485-1543` (DBLocks enum and implementation):

# PostgreSQL uses int64 lock IDs via pg_advisory_lock
# MySQL uses GET_LOCK with session-level (not transaction-level) locks
# MySQL requires AUTOCOMMIT for DDL operations

From `airflow-core/src/airflow/utils/db.py:639`:

# This is needed to work around MySQL 8.4 metadata lock issues with SQLAlchemy 2.0.

From `airflow-core/src/airflow/utils/db.py:691-692`:

# MySQL: Commit session to release metadata locks before DDL
log.info("MySQL: Committing session to release metadata locks")

From `airflow-core/src/airflow/utils/db.py:756`:

# Enable faulthandler for debugging long-running threads and deadlocks
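Enabling faulthandler, plus a periodic traceback dump, makes a hung migration observable without attaching a debugger. A sketch using the stdlib module (the 60-second interval is an arbitrary choice, not Airflow's):

```python
import faulthandler
import sys

# Dump Python tracebacks on hard crashes (SIGSEGV, SIGABRT, etc.).
faulthandler.enable()

# Additionally dump all thread stacks every 60 seconds while the process
# runs, revealing where a migration is blocked (e.g. waiting on a lock).
faulthandler.dump_traceback_later(timeout=60, repeat=True, file=sys.stderr)

# ... run the migration here ...

# Cancel the periodic dump once the work completes normally.
faulthandler.cancel_dump_traceback_later()
```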

From `airflow-core/src/airflow/utils/db.py:987`:

# NOTE: SELECT queries in this function are INTENTIONALLY written with the
# SQL builder style, not the ORM query API. This avoids configuring the ORM
# unless we need to insert something, speeding up CLI in general.

The last comment reveals a performance heuristic: using raw SQL builder style instead of ORM queries for CLI operations avoids ORM configuration overhead, making commands like `airflow db check` faster.
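The same pattern in miniature: define a Core `Table` and query it with `select()`, so nothing on the read path configures an ORM mapper registry. The table below is a simplified stand-in, not Airflow's real `dag_run` schema:

```python
from sqlalchemy import (
    Column, Integer, MetaData, String, Table, func, select,
)

metadata = MetaData()

# Simplified stand-in for an Airflow table; not the real dag_run schema.
dag_run = Table(
    "dag_run",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("state", String(50)),
)


def count_running(engine) -> int:
    """SQL-builder-style read: no ORM session or mapper configuration is
    needed, which keeps short-lived CLI invocations fast.
    """
    stmt = (
        select(func.count())
        .select_from(dag_run)
        .where(dag_run.c.state == "running")
    )
    with engine.connect() as conn:
        return conn.execute(stmt).scalar_one()
```

An ORM equivalent would first have to configure mapped classes and a `Session`; for a command like `airflow db check` that runs one query and exits, that setup cost dominates.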
