Implementation:Apache Airflow Stats Singleton

From Leeroopedia


Knowledge Sources
Domains: Observability, Metrics
Last Updated: 2026-02-08 21:00 GMT

Overview

Global Stats singleton that lazy-loads the appropriate metrics logger (StatsD, Datadog, OpenTelemetry, or no-op) and serves as the unified entry point for all metrics emission in Airflow.

Description

The Stats class is a module-level singleton implemented via a custom metaclass (_Stats). Stats is never instantiated and defines no methods of its own; the metaclass forwards every attribute lookup that Stats cannot resolve itself to an underlying StatsLogger or NoStatsLogger instance. On the first such access, the metaclass lazily invokes a factory function to create the backend logger. The factory is selected by calling Stats.initialize() with boolean flags indicating which metrics backend is active. If no factory is set, or if creating the logger fails (e.g., with a DNS resolution or import error), the system falls back to NoStatsLogger, which silently discards all metrics.
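To make the fallback behaviour concrete, here is a minimal sketch of what a logger that "silently discards all metrics" can look like. The class name NoOpStatsLogger and its method signatures are assumptions made for illustration; they are not copied from Airflow's NoStatsLogger.

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class NoOpStatsLogger:
    """Hypothetical stand-in for NoStatsLogger: accepts every call, records nothing."""

    def incr(self, stat: str, count: int = 1) -> None:
        return None

    def decr(self, stat: str, count: int = 1) -> None:
        return None

    def gauge(self, stat: str, value: float) -> None:
        return None

    def timing(self, stat: str, dt: float) -> None:
        return None

    @contextmanager
    def timer(self, stat: Optional[str] = None) -> Iterator[None]:
        # Run the wrapped block unchanged; nothing is measured or sent.
        yield


logger = NoOpStatsLogger()
logger.incr("task_instance.completed")
with logger.timer("dag_processing.duration"):
    total = sum(range(10))  # the timed work still runs normally
```

Because the no-op object exposes the same method names as a real backend, calling code does not need to check whether metrics are enabled.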

The module also provides normalize_name_for_stats(), a utility that sanitizes metric names by replacing invalid characters (anything outside ASCII alphanumerics, underscores, dots, and dashes) with underscores.
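The replacement rule can be sketched with a single regular expression. This is an illustration of the rule as described above, not the library's implementation: the helper name sanitize_metric_name is hypothetical, and the log_warning behaviour of the real function is not modelled.

```python
import re

# Assumption: the allowed set matches the description above
# (ASCII letters, digits, underscore, dot, dash).
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_.\-]")


def sanitize_metric_name(name: str) -> str:
    """Replace every character outside the allowed set with an underscore."""
    return _INVALID_CHARS.sub("_", name)


print(sanitize_metric_name("my dag/task#1"))  # my_dag_task_1
```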

Usage

Stats is initialized automatically by the Airflow runtime during startup. All Airflow components emit metrics by calling class-level methods on the Stats facade (e.g., Stats.incr(), Stats.gauge()). The actual backend is determined by the [metrics] section in airflow.cfg.
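As a rough illustration of how the [metrics] section could map onto the three initialize() flags, the sketch below parses a sample configuration with the standard-library configparser. The option names statsd_on, statsd_datadog_enabled, and otel_on are assumed here, the helper read_metrics_flags is hypothetical, and Airflow reads its settings through its own configuration object rather than through this code.

```python
import configparser

# Sample configuration text; option names are assumptions for this sketch.
SAMPLE_CFG = """
[metrics]
statsd_on = True
statsd_datadog_enabled = False
otel_on = False
"""


def read_metrics_flags(cfg_text: str) -> dict:
    """Derive the three boolean backend flags from a [metrics] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        "is_statsd_datadog_enabled": parser.getboolean(
            "metrics", "statsd_datadog_enabled", fallback=False
        ),
        "is_statsd_on": parser.getboolean("metrics", "statsd_on", fallback=False),
        "is_otel_on": parser.getboolean("metrics", "otel_on", fallback=False),
    }


flags = read_metrics_flags(SAMPLE_CFG)
print(flags)
```

The resulting dictionary has the same keys as the keyword arguments of Stats.initialize(), so it could be passed along with ** unpacking.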

Code Reference

Source Location

  • Repository: Apache Airflow
  • File: shared/observability/src/airflow_shared/observability/metrics/stats.py (123 lines)

Signature

class _Stats(type):
    """Metaclass that lazily instantiates the configured metrics logger."""
    factory: Callable[[], StatsLogger | NoStatsLogger] | None = None
    instance: StatsLogger | NoStatsLogger | None = None

    def __getattr__(cls, name: str) -> str:
        """Lazy-load the metrics backend on first attribute access."""
        ...

    def initialize(
        cls,
        *,
        is_statsd_datadog_enabled: bool,
        is_statsd_on: bool,
        is_otel_on: bool,
    ) -> None:
        """Select and configure the metrics backend factory."""
        ...

    @classmethod
    def get_constant_tags(cls, *, tags_in_string: str | None) -> list[str]:
        """Parse comma-separated constant DataDog tags."""
        ...

class Stats(metaclass=_Stats):
    """Empty class for Stats - metaclass injects the right backend."""

def normalize_name_for_stats(name: str, log_warning: bool = True) -> str:
    """Normalize a name for stats reporting by replacing invalid characters."""
    ...

Import

from airflow_shared.observability.metrics.stats import Stats
from airflow_shared.observability.metrics.stats import normalize_name_for_stats

I/O Contract

Inputs

Name | Type | Required | Description
is_statsd_on | bool | Yes | Enable StatsD metrics backend
is_otel_on | bool | Yes | Enable OpenTelemetry metrics backend
is_statsd_datadog_enabled | bool | Yes | Enable Datadog StatsD backend

Outputs

Name | Type | Description
Stats singleton | StatsLogger / NoStatsLogger | Configured metrics facade for incr, decr, gauge, timing, timer

Backend Selection Logic

The initialize() method evaluates flags in priority order:

Priority | Condition | Backend Selected
1 | is_statsd_datadog_enabled is True | Datadog DogStatsD logger
2 | is_statsd_on is True | StatsD logger (SafeStatsdLogger)
3 | is_otel_on is True | OpenTelemetry logger
4 | All flags False | NoStatsLogger (no-op)
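The priority order in the table amounts to a chain of early returns. The sketch below only returns a label for the chosen backend; the function name select_backend and the label strings are hypothetical, and the real initialize() installs a factory instead of returning a value.

```python
def select_backend(
    *,
    is_statsd_datadog_enabled: bool,
    is_statsd_on: bool,
    is_otel_on: bool,
) -> str:
    """Return a label for the backend that wins under the priority table."""
    if is_statsd_datadog_enabled:  # priority 1
        return "datadog"
    if is_statsd_on:  # priority 2
        return "statsd"
    if is_otel_on:  # priority 3
        return "otel"
    return "noop"  # priority 4: every flag is False


# Datadog wins even when the other flags are also set.
print(select_backend(is_statsd_datadog_enabled=True, is_statsd_on=True, is_otel_on=True))
```

One consequence of this ordering is that enabling several flags at once never produces more than one active backend.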

Usage Examples

Emitting Metrics via Stats

from airflow_shared.observability.metrics.stats import Stats

# Increment a counter
Stats.incr("task_instance.completed", 1)

# Record a gauge value
Stats.gauge("scheduler.open_slots", 42)

# Time a block of code
# (process_dag_files is a placeholder for whatever work is being measured)
with Stats.timer("dag_processing.duration"):
    process_dag_files()

Initializing the Backend

# Called internally by Airflow runtime during startup
Stats.initialize(
    is_statsd_datadog_enabled=False,
    is_statsd_on=True,
    is_otel_on=False,
)

Normalizing Metric Names

from airflow_shared.observability.metrics.stats import normalize_name_for_stats

safe_name = normalize_name_for_stats("my dag/task#1")
# Returns: "my_dag_task_1"

Internal Mechanics

Lazy Initialization via Metaclass

The _Stats metaclass overrides __getattr__ so that any attribute access on the Stats class (e.g., Stats.incr) triggers lazy initialization:

  1. If instance is None, invoke factory() to create the backend logger.
  2. If factory is also None, default to NoStatsLogger.
  3. If factory invocation raises socket.gaierror or ImportError, fall back to NoStatsLogger and log the error.
  4. Cache the resulting instance for all subsequent accesses.

This pattern keeps the overhead of metrics calls close to zero when no backend is configured, because every call resolves to a no-op method. It also defers the import of optional dependencies (statsd, opentelemetry, datadog) until they are actually needed.
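The four steps above can be sketched with a small self-contained metaclass. Every name here (_LazyStats, LazyStats, NoOpLogger, RecordingLogger, broken_factory) is a stand-in invented for illustration, and the factory is assigned directly on the class, which is a simplification rather than a copy of how Airflow wires it up.

```python
import logging
import socket
from typing import Callable, Optional

log = logging.getLogger(__name__)


class NoOpLogger:
    """Stand-in for the no-op backend."""

    def incr(self, stat: str, count: int = 1) -> None:
        return None


class RecordingLogger:
    """Stand-in for a real backend; keeps counters in memory."""

    def __init__(self) -> None:
        self.counts: dict = {}

    def incr(self, stat: str, count: int = 1) -> None:
        self.counts[stat] = self.counts.get(stat, 0) + count


class _LazyStats(type):
    factory: Optional[Callable[[], object]] = None
    instance: Optional[object] = None

    def __getattr__(cls, name: str):
        # Only reached when normal lookup on the class fails, e.g. LazyStats.incr.
        if cls.instance is None:
            factory = cls.factory or NoOpLogger  # step 2: no factory means no-op
            try:
                cls.instance = factory()  # step 1: build the backend lazily
            except (socket.gaierror, ImportError) as exc:
                # Step 3: fall back and log instead of crashing the caller.
                log.error("Could not configure stats backend: %s", exc)
                cls.instance = NoOpLogger()
        # Step 4: the cached instance serves this and every later access.
        return getattr(cls.instance, name)


class LazyStats(metaclass=_LazyStats):
    """Empty facade; the metaclass forwards attribute access."""


def broken_factory() -> object:
    raise ImportError("optional metrics dependency is not installed")


LazyStats.factory = RecordingLogger
LazyStats.incr("task_instance.completed")
LazyStats.incr("task_instance.completed")  # reuses the cached backend
recorded = dict(LazyStats.instance.counts)

LazyStats.instance = None  # reset so the failure path can be shown
LazyStats.factory = broken_factory
LazyStats.incr("task_instance.completed")  # logged, then silently ignored
fell_back = isinstance(LazyStats.instance, NoOpLogger)
```

The key point is that a metaclass __getattr__ runs only for names the class itself does not define, so the facade class can stay empty while still appearing to have the logger's methods.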
