Principle:Astronomer Astronomer cosmos Airflow Provider Integration

From Leeroopedia


Knowledge Sources
Domains: Provider, Integration
Last Updated: 2026-02-07 17:00 GMT

Overview

A registration protocol that exposes a library's operators, hooks, and configuration to an orchestration platform through declarative entry points.

Description

Airflow Provider Integration defines how Cosmos presents itself to the Airflow ecosystem as a fully recognised provider package. Airflow discovers providers at startup by scanning installed Python packages for specific entry points declared in the package metadata. Cosmos registers three such entry points, each serving a distinct integration purpose.

apache_airflow_provider is the primary discovery entry point. When Airflow finds this key, it calls the referenced function to obtain a provider info dictionary. This dictionary declares the package name, current version string, human-readable description, available config sections (which appear in airflow.cfg and can be overridden via environment variables), and supported connection types (which appear in the Airflow Connections UI). By supplying this metadata, Cosmos ensures that its configuration knobs are documented, validated, and editable through standard Airflow tooling rather than requiring out-of-band setup.
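As a rough illustration of the shape such a function might take, the sketch below returns a provider info dictionary. The key names are believed to follow Airflow's provider info schema; the package name, config section, option, and hook class path are hypothetical placeholders and are not Cosmos's actual values. The helper env_var_for shows Airflow's AIRFLOW__SECTION__KEY naming convention for environment-variable overrides.

```python
def get_provider_info() -> dict:
    """Return provider metadata; every value here is an illustrative placeholder."""
    return {
        "package-name": "example-provider",  # hypothetical distribution name
        "name": "Example Provider",
        "description": "Runs example workloads as Airflow tasks.",
        "versions": ["0.1.0"],  # placeholder version string
        "config": {
            "example": {  # hypothetical section name in airflow.cfg
                "description": "Settings for the example provider.",
                "options": {
                    "watcher_queue": {  # hypothetical option
                        "description": "Queue used for watcher sensors.",
                        "type": "string",
                        "default": "default",
                        "example": None,
                        "version_added": "0.1.0",
                    },
                },
            },
        },
        "connection-types": [
            {
                "connection-type": "example",  # hypothetical connection type
                "hook-class-name": "example_provider.hooks.ExampleHook",
            },
        ],
    }


def env_var_for(section: str, option: str) -> str:
    """Name of the environment variable that overrides a config option."""
    return f"AIRFLOW__{section.upper()}__{option.upper()}"
```

Under these assumptions, env_var_for("example", "watcher_queue") yields AIRFLOW__EXAMPLE__WATCHER_QUEUE, the variable an administrator would set to override the option without editing airflow.cfg.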

airflow.policy registers a cluster policy function. Cluster policies are hooks through which Airflow lets installed code inspect or mutate DAGs, tasks, and task instances; the task-instance hook is applied before a task instance runs. Cosmos uses this mechanism to implement watcher sensor queue routing. When a task instance belongs to a certain class of sensor (specifically the external-task or dataset watcher sensors that Cosmos creates for cross-DAG dependencies), the policy function overrides its queue attribute to route it to a lightweight, long-running worker pool. This keeps watcher sensors from occupying slots in the main execution pool and starving the tasks that actually process data.
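The routing rule can be sketched as a small function. This is a minimal sketch, not the Cosmos implementation: it assumes the policy is written as a task_instance_mutation_hook, and the sensor class name, queue name, and FakeTaskInstance stand-in are all hypothetical. In a real package the function would typically be decorated with hookimpl from airflow.policy and receive an Airflow TaskInstance, which exposes the operator class name and the queue as attributes.

```python
from dataclasses import dataclass

# Hypothetical names, used only for this illustration.
WATCHER_QUEUE = "watcher"
WATCHER_SENSOR_CLASSES = {"ExampleWatcherSensor"}


@dataclass
class FakeTaskInstance:
    """Stand-in for an Airflow TaskInstance so the sketch runs without Airflow."""
    task_id: str
    operator: str  # class name of the operator behind this task instance
    queue: str = "default"


def task_instance_mutation_hook(task_instance) -> None:
    """Send watcher sensors to a dedicated queue; leave other tasks untouched."""
    if task_instance.operator in WATCHER_SENSOR_CLASSES:
        task_instance.queue = WATCHER_QUEUE


# Usage: only the sensor is rerouted.
sensor = FakeTaskInstance(task_id="wait_for_model", operator="ExampleWatcherSensor")
worker = FakeTaskInstance(task_id="run_model", operator="ExampleRunOperator")
for ti in (sensor, worker):
    task_instance_mutation_hook(ti)
```

Matching on a set of class names keeps the rule easy to extend: a new watcher type only needs its class name added to the set.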

airflow.plugins registers a UI plugin. Airflow's plugin system allows providers to inject custom views, menu items, and static assets into the web interface. Cosmos uses this entry point to surface dbt-specific information, such as rendered project structures or documentation links, directly within the Airflow UI, so users do not need to switch between tools.

Together, these three entry points ensure that installing Cosmos is sufficient to activate all of its integrations; no manual configuration file editing, DAG-level imports of setup code, or admin intervention is required beyond pip install.

Usage

Apply this principle whenever Cosmos must be deployed into a new Airflow environment. Simply installing the package activates provider metadata, cluster policy routing, and UI enhancements. When extending Cosmos with new configuration sections or connection types, add them to the provider info dictionary so that Airflow exposes them through its standard interfaces. When adding new background sensors or watchers, ensure the cluster policy function accounts for their task class so they are routed to the appropriate queue.

Theoretical Basis

The provider model is an instance of the Plugin architecture pattern, where a host application defines extension points and third-party packages register implementations against those points. Python's entry points mechanism serves as the service locator: packages declare entry points in their metadata (for example under [project.entry-points] in pyproject.toml, as standardised by PEP 621), and the host uses the importlib.metadata module to enumerate all installed packages that advertise a given entry point group and load their contributions at startup.
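The lookup side of this mechanism can be sketched with the standard library alone. The helper name discover and the demo entry point are hypothetical; the group names passed to it are the ones listed in the Description. The result depends on which packages are installed, so no particular output is assumed.

```python
import json
from importlib.metadata import EntryPoint, entry_points


def discover(group: str) -> list:
    """List (name, target) pairs for every installed entry point in a group."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10 and later
        selected = eps.select(group=group)
    else:  # older interpreters return a plain dict keyed by group name
        selected = eps.get(group, [])
    return [(ep.name, ep.value) for ep in selected]


# Which providers, policies, and plugins are advertised in this environment?
for group in ("apache_airflow_provider", "airflow.policy", "airflow.plugins"):
    print(group, discover(group))

# Loading an entry point resolves its "module:attribute" target to the object.
# A hand-built entry point pointing at json.dumps shows this step in isolation.
demo = EntryPoint(name="demo", value="json:dumps", group="demo.group")
loaded = demo.load()
```

Here loaded is the json.dumps function itself, which mirrors how a host obtains the callable referenced by a provider entry point before invoking it.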

The cluster policy hook exemplifies the Interceptor pattern. Rather than requiring each operator to contain routing logic, a single cross-cutting function inspects every task instance before execution and applies routing rules declaratively. This keeps operator code focused on business logic while centralising infrastructure concerns in one auditable location.
