Principle:Apache Airflow Database Backend Configuration

Knowledge Sources	Airflow Database Setup
Domains	Database, Infrastructure
Last Updated	2026-02-08 00:00 GMT

Overview

A configuration pattern for setting up Airflow's metadata database and Celery result backend connections in Kubernetes deployments.

Description

Database Backend Configuration covers how Airflow connects to its metadata database (PostgreSQL) and its optional Celery result backend (Redis or PostgreSQL) in a Kubernetes environment. The Helm chart supports both an embedded database (via the Bitnami PostgreSQL subchart) and an external database. PgBouncer can be deployed between Airflow and PostgreSQL for connection pooling; in the official chart it runs as its own deployment rather than as a sidecar in each pod. Database migrations are handled by a migration job that runs airflow db migrate, and Airflow components wait for the schema to be current before they start.

Usage

Configure database connections when deploying Airflow on Kubernetes. Use the embedded PostgreSQL for development and an external managed database (such as Amazon RDS or Cloud SQL) for production. Enable PgBouncer when the number of connections opened by schedulers, workers, and webservers approaches the database's connection limit.
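
The two usage modes above can be sketched as a values override built in Python. This is a minimal illustration, not the chart's definitive schema: the key names (postgresql.enabled, pgbouncer.enabled, data.metadataConnection) follow the official Airflow Helm chart as best understood here and should be checked against the chart version in use. The host name, credentials, and the helm_values helper are hypothetical.

```python
import json


def helm_values(use_external_db: bool, enable_pgbouncer: bool = False) -> dict:
    """Build a values override for either the embedded or an external database.

    Key names are assumed to match the official chart's values.yaml.
    """
    values = {
        # The embedded PostgreSQL subchart is only enabled when no external DB is used.
        "postgresql": {"enabled": not use_external_db},
        "pgbouncer": {"enabled": enable_pgbouncer},
    }
    if use_external_db:
        values["data"] = {
            "metadataConnection": {
                "protocol": "postgresql",
                "host": "airflow-db.example.internal",  # hypothetical host
                "port": 5432,
                "db": "airflow",
                "user": "airflow",
                "pass": "change-me",  # placeholder; do not commit real credentials
                "sslmode": "require",
            }
        }
    return values


dev = helm_values(use_external_db=False)
prod = helm_values(use_external_db=True, enable_pgbouncer=True)
print(json.dumps(prod, indent=2))
```

In a real production deployment the credentials would normally come from a Kubernetes Secret rather than from inline values.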

Theoretical Basis

Connection Architecture:

Metadata DB: Stores all Airflow state (DAGs, runs, tasks, connections, variables)
Result Backend: Stores Celery task results (only needed with CeleryExecutor)
Connection Pooling: PgBouncer multiplexes many short-lived client connections onto a smaller pool of PostgreSQL server connections, reducing connection overhead
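
As a rough sketch of how the two connections above relate, the snippet below builds a SQLAlchemy-style URI for the metadata database and derives a Celery result backend URI from it using Celery's db+ prefix for SQLAlchemy-backed result stores. The environment variable names are Airflow's standard AIRFLOW__SECTION__KEY overrides; the host, credentials, and helper function names are hypothetical, and reusing the metadata database as the result backend is only one possible choice.

```python
from urllib.parse import quote


def metadata_uri(user: str, password: str, host: str, port: int, db: str) -> str:
    """Build a PostgreSQL URI, percent-encoding the credentials."""
    # Characters such as '@' or '/' in a password would otherwise break URI parsing.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )


def celery_result_backend(sqlalchemy_uri: str) -> str:
    """Celery selects its database result backend via a 'db+' prefix."""
    return "db+" + sqlalchemy_uri


# Hypothetical values for illustration only.
meta = metadata_uri("airflow", "p@ss/word", "db.example.internal", 5432, "airflow")

env = {
    "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN": meta,
    # Only needed with CeleryExecutor.
    "AIRFLOW__CELERY__RESULT_BACKEND": celery_result_backend(meta),
}

for name, value in env.items():
    print(f"{name}={value}")
```

When PgBouncer is enabled, the host and port in these URIs would point at the pooler instead of at PostgreSQL directly.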

Migration Strategy:

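A dedicated migration job runs airflow db migrate; in the official chart, init containers in the component pods wait until the schema is current before startup
Migrations are idempotent and safe to re-run: a database already at the latest revision is left unchanged
Schema is versioned via the Alembic revision chain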
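
The "wait until the schema is current" step can be sketched as a polling loop with a timeout. This is an illustrative stand-in, not the chart's actual implementation (the Airflow CLI provides airflow db check-migrations for this purpose). The wait_for_migrations function and the fake_check callback are hypothetical; in a real deployment the check would compare the database's Alembic revision against the revision expected by the installed Airflow version.

```python
import time
from typing import Callable


def wait_for_migrations(
    is_current: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_current() until it returns True or the timeout elapses.

    Returns True if the schema became current, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if is_current():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated check: the schema becomes current on the third poll.
attempts = {"n": 0}


def fake_check() -> bool:
    attempts["n"] += 1
    return attempts["n"] >= 3


ok = wait_for_migrations(fake_check, timeout=5.0, interval=0.0)
print("schema current:", ok, "after", attempts["n"], "checks")
```

Because the migration itself is safe to re-run, a component that times out here can simply be restarted by Kubernetes and wait again.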

Related Pages

Implemented By

Implementation:Apache_Airflow_Database_Connection_Config

Uses Heuristic

Heuristic:Apache_Airflow_Database_Lock_Handling
