Principle:Apache Airflow Monitoring Observability
| Knowledge Sources | |
|---|---|
| Domains | Observability, Monitoring |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
A multi-backend observability framework for collecting metrics and traces from Airflow components via StatsD, OpenTelemetry, or Datadog.
Description
Airflow's monitoring and observability layer emits metrics (counters, gauges, timers) and distributed traces from all components. The Stats class uses a metaclass to initialize the appropriate metrics backend based on configuration. Supported backends are StatsD (typically scraped by Prometheus through a StatsD exporter), OpenTelemetry (OTLP exporter), and Datadog. The Helm chart supports deploying a StatsD exporter sidecar and a Prometheus ServiceMonitor for Kubernetes-native monitoring.
Usage
Enable metrics in the [metrics] section of airflow.cfg, through the matching environment variables, or via Helm chart values. In Kubernetes environments, pair StatsD with a Prometheus exporter. Use OpenTelemetry when you also need distributed tracing across components.
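As a rough illustration of the configuration step, the sketch below builds the environment variables that correspond to StatsD settings, using Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention. The option names (`statsd_on`, `statsd_host`, `statsd_port`, `statsd_prefix`) and the values are assumptions here; check them against the configuration reference for your Airflow version. The helper `airflow_env_var` is hypothetical and not part of Airflow.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name for a config option (hypothetical helper)."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Assumed [metrics] option names and example values; verify for your Airflow version.
statsd_settings = {
    "statsd_on": "True",
    "statsd_host": "localhost",
    "statsd_port": "8125",
    "statsd_prefix": "airflow",
}

# Map each option to the environment variable that would override airflow.cfg.
env = {airflow_env_var("metrics", key): value for key, value in statsd_settings.items()}

for name, value in sorted(env.items()):
    print(f"{name}={value}")
```

The same key/value pairs could instead be written under a `[metrics]` heading in airflow.cfg; environment variables are often the more convenient form in containerized deployments.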
Theoretical Basis
Metrics Collection Model:
- Instrumentation: Code emits metrics through the Stats class, which acts as a facade over the backend
- Backend: The configured backend (StatsD, OTel, or Datadog) handles export
- Aggregation: An external system (Prometheus or the OTel Collector) aggregates the exported metrics
- Visualization: Grafana or a similar tool provides dashboards and alerting
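The instrumentation and backend layers above can be sketched as a small facade with pluggable backends. This is a minimal illustration and not Airflow's implementation: `MetricsBackend`, `NoOpBackend`, `InMemoryBackend`, `make_backend`, `Metrics`, the `metrics_enabled` key, and the metric names are all hypothetical, and the in-memory backend stands in for a real StatsD, OTel, or Datadog exporter.

```python
from typing import Protocol


class MetricsBackend(Protocol):
    """Interface every backend in this sketch implements (hypothetical)."""

    def incr(self, name: str, count: int = 1) -> None: ...
    def gauge(self, name: str, value: float) -> None: ...
    def timing(self, name: str, millis: float) -> None: ...


class NoOpBackend:
    """Used when metrics are disabled: every call is silently dropped."""

    def incr(self, name: str, count: int = 1) -> None:
        pass

    def gauge(self, name: str, value: float) -> None:
        pass

    def timing(self, name: str, millis: float) -> None:
        pass


class InMemoryBackend:
    """Stand-in for a real exporter; records what would have been sent."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str, float]] = []

    def incr(self, name: str, count: int = 1) -> None:
        self.sent.append(("counter", name, count))

    def gauge(self, name: str, value: float) -> None:
        self.sent.append(("gauge", name, value))

    def timing(self, name: str, millis: float) -> None:
        self.sent.append(("timer", name, millis))


def make_backend(config: dict[str, bool]) -> MetricsBackend:
    """Pick a backend from configuration (the key name is hypothetical)."""
    if config.get("metrics_enabled"):
        return InMemoryBackend()
    return NoOpBackend()


class Metrics:
    """Facade: instrumented code calls this and never touches the backend directly."""

    def __init__(self, backend: MetricsBackend) -> None:
        self._backend = backend

    def incr(self, name: str, count: int = 1) -> None:
        self._backend.incr(name, count)

    def gauge(self, name: str, value: float) -> None:
        self._backend.gauge(name, value)

    def timing(self, name: str, millis: float) -> None:
        self._backend.timing(name, millis)


backend = make_backend({"metrics_enabled": True})
metrics = Metrics(backend)
metrics.incr("task.completed")
metrics.gauge("pool.open_slots", 12)
metrics.timing("scheduler.loop", 35.5)
print(backend.sent)
```

Because callers depend only on the facade, swapping the export target is a configuration change and requires no edits to instrumented code.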
Metric Types:
- Counter: Incremental counts (task completions, failures)
- Gauge: Point-in-time values (pool slots, running tasks)
- Timer: Duration measurements (task duration, scheduler loop time)
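To make the three metric types concrete, the sketch below formats each one in the plain-text StatsD line protocol, where the type suffix is `c` for counters, `g` for gauges, and `ms` for timers. The function `format_statsd` and the metric names are hypothetical and chosen for illustration; a real client library would also handle sampling, tags, and sending the datagram over UDP.

```python
# StatsD type suffixes for the three metric types described above.
STATSD_SUFFIX = {"counter": "c", "gauge": "g", "timer": "ms"}


def format_statsd(name: str, value: float, kind: str) -> str:
    """Render one metric as a StatsD line: <name>:<value>|<type> (hypothetical helper)."""
    return f"{name}:{value}|{STATSD_SUFFIX[kind]}"


# Illustrative metric names, not taken from Airflow's metric catalogue.
lines = [
    format_statsd("task.completed", 1, "counter"),    # incremental count
    format_statsd("pool.open_slots", 12, "gauge"),    # point-in-time value
    format_statsd("scheduler.loop", 35.5, "timer"),   # duration in milliseconds
]

for line in lines:
    print(line)
```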