

Principle:Apache Airflow DAG Distribution Strategy

From Leeroopedia


Knowledge Sources
Domains: Deployment, Kubernetes
Last Updated: 2026-02-08 00:00 GMT

Overview

A strategy pattern for distributing DAG files to Airflow components running in Kubernetes pods.

Description

DAG Distribution Strategy addresses how DAG Python files are made available to all Airflow pods (scheduler, workers, dag-processor) in a Kubernetes cluster. Three primary strategies exist: git-sync (sidecar container that syncs from a Git repository), PVC (shared Persistent Volume Claim), and baked-in (DAGs embedded in the Docker image). Each strategy has different trade-offs for update speed, reliability, and operational complexity.
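
As an illustration of how the three strategies differ at the configuration level, the sketch below builds one values fragment per strategy. It assumes a deployment through the official Apache Airflow Helm chart, which this page does not state; the key names (dags.gitSync, dags.persistence, images.airflow) follow that chart as far as I know and should be checked against the chart version in use. The repository URL, claim name, image name, and tag are hypothetical placeholders.

```python
import json


def helm_values(strategy: str) -> dict:
    """Return a minimal Helm values fragment for one DAG distribution strategy.

    Assumption: key names follow the official Apache Airflow Helm chart.
    All URLs, claim names, and image names are hypothetical placeholders.
    """
    if strategy == "git-sync":
        # Sidecar pulls DAGs from a Git repository.
        return {
            "dags": {
                "gitSync": {
                    "enabled": True,
                    "repo": "https://git.example.com/team/dags.git",
                    "branch": "main",
                    "subPath": "dags",
                }
            }
        }
    if strategy == "pvc":
        # All pods mount the same pre-existing Persistent Volume Claim.
        return {
            "dags": {
                "persistence": {
                    "enabled": True,
                    "existingClaim": "airflow-dags",
                }
            }
        }
    if strategy == "baked-in":
        # DAGs ship inside a custom image; updating them means a new tag.
        return {
            "images": {
                "airflow": {
                    "repository": "registry.example.com/airflow-with-dags",
                    "tag": "2026-02-08",
                }
            }
        }
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    for name in ("git-sync", "pvc", "baked-in"):
        # JSON is also valid YAML, so this output can be saved as a values file.
        print(name, json.dumps(helm_values(name), indent=2))
```

Note that the baked-in variant touches only the image reference, which is why its update path is an image rebuild rather than a file sync.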

Usage

Choose git-sync for dynamic DAG updates from version control. Use PVC when DAGs are managed outside Git. Use baked-in images for air-gapped environments or when DAG immutability is required.
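
The guidance above can be written down as a small decision helper. This is only a sketch: the function name, its parameters, and the order in which the checks are applied are assumptions made for illustration, not part of Airflow.

```python
def choose_strategy(dags_in_git: bool, air_gapped: bool, immutable: bool) -> str:
    """Pick a DAG distribution strategy from three yes/no facts.

    Hypothetical helper; the precedence (baked-in first) is an assumption.
    """
    # Air-gapped clusters or a requirement for immutable DAGs
    # point to DAGs embedded in the image.
    if air_gapped or immutable:
        return "baked-in"
    # DAGs kept in version control can be synced dynamically.
    if dags_in_git:
        return "git-sync"
    # DAGs managed outside Git fall back to a shared volume.
    return "pvc"


if __name__ == "__main__":
    print(choose_strategy(dags_in_git=True, air_gapped=False, immutable=False))
    print(choose_strategy(dags_in_git=False, air_gapped=False, immutable=False))
    print(choose_strategy(dags_in_git=True, air_gapped=True, immutable=False))
```

Checking the baked-in conditions first reflects that an air-gapped cluster generally cannot reach a Git server even when the DAGs live in one.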

Theoretical Basis

Strategy Comparison:

Strategy    Update Speed      Complexity   Best For
Git-sync    Fast (seconds)    Medium       CI/CD workflows
PVC         Medium            Low          Shared filesystem setups
Baked-in    Slow (rebuild)    Low          Air-gapped, immutable

Git-sync Architecture:

  • A sidecar container in each pod polls the Git repository
  • DAGs are available to the Airflow container at a shared volume mount (/opt/airflow/dags)
  • Authentication is supported via SSH keys or token secrets
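
The sidecar layout described above can be sketched as a pod manifest built in Python. This is a simplified illustration, not the manifest any particular chart renders: the function name, image references, and the /git sync root are assumptions, the --repo, --ref, --period and --root arguments assume git-sync v4 flag names, and authentication is left out.

```python
import json

DAGS_PATH = "/opt/airflow/dags"  # mount path named on this page


def pod_with_git_sync(component: str, repo_url: str,
                      ref: str = "main", period: str = "30s") -> dict:
    """Build a pod manifest where a git-sync sidecar feeds an Airflow container.

    Hypothetical helper; image names and sidecar arguments are assumptions.
    """
    # One volume shared by both containers; emptyDir lives as long as the pod.
    dags_volume = {"name": "dags", "emptyDir": {}}

    airflow_container = {
        "name": component,
        "image": "apache/airflow",
        # Airflow only reads DAG files, so the mount is read-only.
        "volumeMounts": [
            {"name": "dags", "mountPath": DAGS_PATH, "readOnly": True}
        ],
    }

    sidecar = {
        "name": "git-sync",
        "image": "registry.k8s.io/git-sync/git-sync",
        "args": [
            f"--repo={repo_url}",
            f"--ref={ref}",
            f"--period={period}",  # polling interval
            "--root=/git",         # where the sidecar writes the checkout
        ],
        "volumeMounts": [{"name": "dags", "mountPath": "/git"}],
    }

    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"airflow-{component}"},
        "spec": {
            "containers": [airflow_container, sidecar],
            "volumes": [dags_volume],
        },
    }


if __name__ == "__main__":
    manifest = pod_with_git_sync(
        "scheduler", "https://git.example.com/team/dags.git"
    )
    print(json.dumps(manifest, indent=2))
```

Both containers mount the same volume under different paths, which is what lets the sidecar update files that the Airflow process then reads without any network call of its own.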
