Principle:Apache Airflow Secret Redaction

Knowledge Sources	Apache_Airflow Airflow Docs
Domains	Security, Logging
Last Updated	2026-02-08 21:00 GMT

Overview

Pattern-based secret detection and masking that replaces sensitive values with *** in logs, stdout, and data structures to prevent accidental credential exposure.

Description

Airflow implements a secret redaction system that automatically detects and masks sensitive values before they appear in logs, the web UI, API responses, or captured task output. The system operates at multiple levels:

Field-name detection: Keys and attribute names that match sensitive patterns (e.g., "password", "secret", "token", "api_key", "authorization") trigger automatic masking of their corresponding values.
Value-pattern matching: Known credential patterns (e.g., connection URIs containing passwords, bearer tokens) are detected and masked regardless of the field name.
Recursive traversal: The masker recursively walks complex data structures -- dicts, lists, tuples, sets, and custom objects -- to find and redact secrets at any depth.
Log integration: The SecretsMasker is installed as a logging filter that processes each log record before it is emitted, so that known secrets are masked before they reach the log output.
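
The log-integration level can be illustrated with the standard library alone. The following is a minimal sketch, not Airflow's SecretsMasker: the class name SketchMaskingFilter, the logger name, and the hard-coded secret are assumptions made for the example. It shows how a logging.Filter can rewrite a record before any handler formats it.

```python
import io
import logging
import re


class SketchMaskingFilter(logging.Filter):
    """Hypothetical filter that masks text matching a pre-compiled pattern."""

    def __init__(self, pattern: "re.Pattern[str]") -> None:
        super().__init__()
        self._pattern = pattern

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so secrets passed as %-style args are covered.
        record.msg = self._pattern.sub("***", record.getMessage())
        record.args = None
        return True  # never drop the record, only rewrite it


# Assumed secret value, used only for this demonstration.
secret_pattern = re.compile(re.escape("s3cret"))

stream = io.StringIO()
logger = logging.getLogger("redaction_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addFilter(SketchMaskingFilter(secret_pattern))
logger.addHandler(logging.StreamHandler(stream))

logger.info("connecting with password %s", "s3cret")
output = stream.getvalue()
```

Because the filter runs before the handler formats the record, the text collected in output contains the mask rather than the original value.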

Usage

Secret redaction is enabled by default in Airflow (it is controlled by the hide_sensitive_var_conn_fields option in the [core] section). It applies automatically to:

All log output from every Airflow component
Connection objects displayed in the UI
Variable values in the UI (when the variable key matches a sensitive name)
API responses containing connection or variable data
Stdout/stderr capture from task execution

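No explicit configuration is needed to enable redaction. Additional sensitive field names can be listed in the sensitive_var_conn_names option in the [core] section, and individual secret values can be registered programmatically with mask_secret.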

Theoretical Basis

Core Mechanism -- Regex-Compiled Pattern Matching:

The redaction system compiles a set of sensitive field-name patterns into a single regex. When processing data:

Compile patterns: Field names known to contain secrets (password, secret, token, api_key, private_key, authorization, etc.) are compiled into a case-insensitive regex pattern.
Recursive traversal: The masker walks the data structure depth-first. For each key-value pair, it tests the key against the compiled pattern.
Value replacement: If the key matches a sensitive pattern, the entire value is replaced with the redaction string (default: ***).
Connection URI handling: For connection strings, only the password portion is redacted, preserving the URI structure for debugging.
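
The four steps above can be sketched as follows. This is an illustrative approximation under stated assumptions, not Airflow's implementation: the name list is copied from the examples in this section rather than from Airflow's defaults, and redact, SENSITIVE_KEY_RE, and URI_PASSWORD_RE are hypothetical names.

```python
import re

# Assumption: names taken from the examples in this section, not Airflow's default list.
SENSITIVE_NAMES = ("password", "secret", "token", "api_key", "private_key", "authorization")

# Step 1: compile all names into one case-insensitive pattern.
SENSITIVE_KEY_RE = re.compile("|".join(re.escape(n) for n in SENSITIVE_NAMES), re.IGNORECASE)

# Matches the password between "scheme://user:" and "@".
URI_PASSWORD_RE = re.compile(
    r"(?P<prefix>[A-Za-z][A-Za-z0-9+.\-]*://[^:/@\s]*:)(?P<password>[^@/\s]+)(?=@)"
)
REDACTED = "***"


def redact(value, key=None):
    """Return a redacted copy of value; key is the name it was stored under."""
    # Step 3: a sensitive key hides the entire value, whatever its type.
    if key is not None and SENSITIVE_KEY_RE.search(str(key)):
        return REDACTED
    # Step 2: walk containers depth-first, passing dict keys down.
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, tuple):
        return tuple(redact(item) for item in value)
    # Step 4: in strings, mask only the password portion of a URI.
    if isinstance(value, str):
        return URI_PASSWORD_RE.sub(r"\g<prefix>" + REDACTED, value)
    return value


record = {
    "config": {"db": {"password": "x", "host": "db.internal"}},
    "dsn": "postgres://user:s3cret@host",
}
masked = redact(record)
```

The sketch returns a new structure instead of mutating its input, so the caller's original data stays intact; only the copy destined for output is masked.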

Detection Strategies:

Strategy	Trigger	Example
Field name match	Key matches sensitive regex pattern	`{"password": "s3cret"}` becomes `{"password": "***"}`
Connection URI	Password embedded in the URI userinfo	`postgres://user:s3cret@host` becomes `postgres://user:***@host`
Registered secrets	Values matching known secret values	Any occurrence of a known secret value is replaced
Nested structure	Recursive traversal of dicts/lists/objects	`{"config": {"db": {"password": "x"}}}` is fully traversed

Performance Considerations:

Patterns are pre-compiled to minimize per-record regex overhead.
The masker uses short-circuit evaluation: if no sensitive patterns are registered, the record passes through unmodified.
Large data structures are traversed lazily where possible to avoid excessive memory allocation during redaction.
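
The first two points can be illustrated with a small registry of known secret values. This is a sketch under assumptions, not Airflow's code: SecretValueRegistry and its methods are hypothetical names. The pattern is recompiled only when a value is registered, and text passes through untouched while the registry is empty.

```python
import re


class SecretValueRegistry:
    """Hypothetical registry that compiles known secret values into one pattern."""

    def __init__(self):
        self._values = set()
        self._pattern = None  # stays None until something is registered

    def add(self, value):
        if not value or value in self._values:
            return
        self._values.add(value)
        # Recompile once per registration, not once per log record.
        # Longest values first, so a longer secret wins over a shorter one
        # that happens to be its prefix.
        ordered = sorted(self._values, key=len, reverse=True)
        self._pattern = re.compile("|".join(re.escape(v) for v in ordered))

    def redact_text(self, text):
        if self._pattern is None:
            return text  # short circuit: nothing registered, nothing to scan
        return self._pattern.sub("***", text)


registry = SecretValueRegistry()
line = "token=abc123 and again abc123"
untouched = registry.redact_text(line)  # returned as-is, no regex work

registry.add("abc123")
masked_line = registry.redact_text(line)
```

Escaping each value with re.escape keeps secrets that contain regex metacharacters from being interpreted as patterns.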

Security Boundary:

Redaction is a defense-in-depth measure. It does not replace proper secret management, such as using a secrets backend or an encrypted metastore. Its purpose is to prevent accidental exposure through logs and UI output, not to serve as a primary security control.

Related Pages

Implemented By

Implementation:Apache_Airflow_SecretsMasker
