Principle:Apache Airflow Structlog Logging
| Knowledge Sources | |
|---|---|
| Domains | Logging, Core_Infrastructure |
| Last Updated | 2026-02-08 21:00 GMT |
Overview
A structured logging configuration pattern built on the structlog library. It replaces traditional format-string logging with key-value pairs, so that log output from every Airflow component is both machine-parseable and human-readable.
Description
Airflow uses structlog as its structured logging foundation. Traditional Python logging relies on format strings (e.g., "Processing DAG %s"), whereas structlog produces each log event as a dictionary of key-value pairs. This approach provides several advantages:
- Machine-parseable output: Log events can be rendered as JSON for ingestion by log aggregation systems (Elasticsearch, Splunk, Loki).
- Human-readable output: The same events can be rendered as colorized, aligned console output for local development.
- Consistent context: Bound loggers carry contextual fields (dag_id, task_id, run_id) automatically attached to every log event without manual inclusion in format strings.
- Composable processing: Log events flow through a configurable chain of processors that can add fields, redact secrets, format output, or filter events.
Usage
This principle applies when configuring logging for any Airflow component, developing custom operators or hooks that emit log messages, or integrating Airflow logs with external log aggregation systems. Structured logging is the default in Airflow 3.x and replaces the legacy logging.config-based approach.
Theoretical Basis
Processor Pipeline Pattern:
Structlog operates on a processor pipeline model where each log event is a mutable dictionary that passes through an ordered chain of processors:
- Context binding: The logger binds contextual key-value pairs (e.g., dag_id, task_id) that persist across log calls.
- Processor chain: Each processor receives the event dict, optionally transforms it, and passes it to the next processor. Processors can:
  - Add fields (timestamps, log level, caller information)
  - Redact sensitive values (passwords, tokens)
  - Filter events by level or content
  - Format the final output (JSON, key-value text, colored console)
- Rendering: The final processor converts the event dict into the output format: JSON for production or human-friendly text for development.
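The pipeline described above can be sketched in plain Python. This is a minimal stand-in, not structlog's implementation or Airflow's actual configuration: it reuses structlog's processor signature (logger, method_name, event_dict), but run_pipeline, SENSITIVE_KEYS, and the local DropEvent class are hypothetical names introduced for illustration (structlog provides its own structlog.DropEvent for discarding events).

```python
import json
from typing import Any, Callable, Optional

EventDict = dict[str, Any]
Processor = Callable[[Any, str, EventDict], Any]

# Hypothetical set of keys to redact; a real deployment would choose its own.
SENSITIVE_KEYS = {"password", "token"}


class DropEvent(Exception):
    """Local stand-in for structlog.DropEvent: raised to discard an event."""


def drop_debug(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    # Filter: discard everything logged at debug level.
    if method_name == "debug":
        raise DropEvent
    return event_dict


def add_log_level(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    # Add a field: the method name ("info", "warning", ...) is the level.
    event_dict["level"] = method_name
    return event_dict


def redact_secrets(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    # Redact: overwrite the values of sensitive keys in place.
    for key in event_dict:
        if key in SENSITIVE_KEYS:
            event_dict[key] = "***"
    return event_dict


def render_json(logger: Any, method_name: str, event_dict: EventDict) -> str:
    # Render: the last processor turns the dict into a string.
    return json.dumps(event_dict, sort_keys=True)


def run_pipeline(
    processors: list[Processor], method_name: str, event: str, **context: Any
) -> Optional[str]:
    """Pass one event through the ordered chain; return None if it is dropped."""
    result: Any = {"event": event, **context}
    try:
        for processor in processors:
            result = processor(None, method_name, result)
    except DropEvent:
        return None
    return result


PIPELINE = [drop_debug, add_log_level, redact_secrets, render_json]

line = run_pipeline(PIPELINE, "info", "Connecting", dag_id="example", password="hunter2")
print(line)
```

Order matters in this sketch: the filter runs first so that dropped events cost no further work, and the renderer runs last because every processor after it would receive a string rather than a dict.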
Key Structlog Concepts:
- Bound logger: A logger instance with pre-bound context fields. Created via structlog.get_logger().bind(dag_id="x").
- Processor: A callable that takes (logger, method_name, event_dict) and returns the modified event dict.
- Renderer: The final processor that converts the event dict to a string. Common renderers include JSONRenderer, ConsoleRenderer, and custom percent-format renderers.
- Context variables: Thread-local or contextvars-based storage for logger context that automatically propagates through the call stack.
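To make the bound-logger and context-variable concepts concrete, here is a small model using only the standard library. It is an illustration under stated assumptions, not structlog's code: SketchLogger and the module-level bind_contextvars helper are hypothetical (the helper is loosely modeled on structlog.contextvars.bind_contextvars), and the merge order shown is a choice made for this sketch.

```python
import contextvars
from typing import Any, Optional

# Ambient context for the current thread or asyncio task.
_CONTEXT = contextvars.ContextVar("log_context")


def bind_contextvars(**values: Any) -> None:
    """Merge values into the ambient context without mutating the old dict."""
    _CONTEXT.set({**_CONTEXT.get({}), **values})


class SketchLogger:
    """Hypothetical bound logger: carries context and returns event dicts."""

    def __init__(self, context: Optional[dict] = None) -> None:
        self._context = dict(context or {})

    def bind(self, **values: Any) -> "SketchLogger":
        # bind() returns a new logger; the original keeps its own context.
        return SketchLogger({**self._context, **values})

    def info(self, event: str, **kwargs: Any) -> dict:
        # Precedence in this sketch: ambient context < bound context < call kwargs.
        return {
            **_CONTEXT.get({}),
            **self._context,
            "event": event,
            **kwargs,
            "level": "info",
        }


bind_contextvars(run_id="manual__2024-01-15")
base = SketchLogger()
log = base.bind(dag_id="example", task_id="extract")

event = log.info("Task started", try_number=1)
print(event)
```

Because bind() returns a new object, a task-specific logger can be handed to helper functions without leaking dag_id or task_id into unrelated loggers, while the ambient run_id still reaches every event emitted in the same context.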
Output Formats:
| Format | Use Case | Example |
|---|---|---|
| JSON | Production, log aggregation | {"event": "Processing DAG", "dag_id": "example", "level": "info", "timestamp": "2024-01-15T12:00:00Z"} |
| Console | Local development | [info] Processing DAG dag_id=example |
| Percent-format | Legacy compatibility | INFO - Processing DAG example |
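The three formats in the table can be reproduced from a single event dict, which shows that only the renderer differs. The functions below are simplified, hypothetical stand-ins written for this example; structlog's own renderers (such as structlog.processors.JSONRenderer and structlog.dev.ConsoleRenderer) add alignment, colors, and other options that are not modeled here.

```python
import json
from typing import Any

event_dict: dict[str, Any] = {
    "event": "Processing DAG",
    "dag_id": "example",
    "level": "info",
    "timestamp": "2024-01-15T12:00:00Z",
}


def render_json(ed: dict[str, Any]) -> str:
    # Every key is serialized, in insertion order.
    return json.dumps(ed)


def render_console(ed: dict[str, Any]) -> str:
    # "[level] event key=value ..."; the timestamp is left out of this sketch.
    extras = {k: v for k, v in ed.items() if k not in ("event", "level", "timestamp")}
    pairs = " ".join(f"{k}={v}" for k, v in extras.items())
    return f"[{ed['level']}] {ed['event']} {pairs}".rstrip()


def render_percent(ed: dict[str, Any]) -> str:
    # Legacy-style percent formatting with an upper-cased level name.
    return "%(level)s - %(event)s %(dag_id)s" % {**ed, "level": ed["level"].upper()}


json_line = render_json(event_dict)
console_line = render_console(event_dict)
percent_line = render_percent(event_dict)

for rendered in (json_line, console_line, percent_line):
    print(rendered)
```

Note that the percent-format line discards the key names, so a log aggregator can no longer tell that "example" was a dag_id; the JSON line keeps that structure.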