Principle: Astronomer Cosmos Local Operator Execution
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Execution, Orchestration |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
An execution principle for running dbt commands as local subprocesses or in-process dbt runners within the Airflow worker environment.
Description
Local execution mode runs dbt commands directly on the Airflow worker node. This is the simplest and most common execution mode in astronomer-cosmos, and it serves as the default when no explicit ExecutionMode is specified. It supports two invocation strategies controlled by the InvocationMode enum:
- SUBPROCESS -- shells out to the dbt CLI binary (e.g., dbt run --select model_name) via the Airflow subprocess hook. This is the traditional approach and works with any dbt version installed on the system PATH.
- DBT_RUNNER -- uses dbt's programmatic Python API (dbtRunner) to invoke dbt commands in-process. This avoids subprocess overhead, enables richer result parsing (including OpenLineage event collection), and is the preferred mode for dbt-core >= 1.5.
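The two strategies can be sketched as follows. This is a minimal illustration, not Cosmos's actual operator code; the run_via_* and build_dbt_command names are hypothetical, and the in-process path assumes dbt-core >= 1.5, which exposes dbtRunner in dbt.cli.main:

```python
import subprocess

def build_dbt_command(select: str, project_dir: str, profiles_dir: str) -> list:
    # The CLI arguments shared by both invocation strategies.
    return ["run", "--select", select,
            "--project-dir", project_dir, "--profiles-dir", profiles_dir]

def run_via_subprocess(select: str, project_dir: str, profiles_dir: str) -> int:
    # SUBPROCESS: shell out to the dbt binary found on the system PATH.
    args = ["dbt"] + build_dbt_command(select, project_dir, profiles_dir)
    return subprocess.run(args, capture_output=True, text=True).returncode

def run_via_dbt_runner(select: str, project_dir: str, profiles_dir: str) -> bool:
    # DBT_RUNNER: invoke dbt in-process via its programmatic API (dbt-core >= 1.5).
    from dbt.cli.main import dbtRunner
    result = dbtRunner().invoke(build_dbt_command(select, project_dir, profiles_dir))
    return result.success
```

Both paths build the same argument list; only the transport differs, which is why Cosmos can expose the choice as a single InvocationMode enum.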
The fundamental trade-off is simplicity vs. isolation. Local mode requires dbt and all its adapter dependencies (e.g., dbt-postgres, dbt-snowflake) to be installed directly in the Airflow worker environment. This means dbt dependency versions must be compatible with the Airflow installation. For scenarios where isolation is required, other execution modes (Kubernetes, Docker, VirtualEnv) should be considered.
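Because local mode shares one Python environment between Airflow and dbt, it can help to verify at deploy time which dbt packages that environment actually contains. A minimal sketch using only the standard library; the function name and the specific adapter packages probed are illustrative assumptions, not a Cosmos API:

```python
from importlib import metadata

def installed_dbt_packages() -> dict:
    # Report dbt-core and example adapter packages visible in this environment.
    # Package names here are illustrative; probe whichever adapters you deploy.
    versions = {}
    for pkg in ("dbt-core", "dbt-postgres", "dbt-snowflake"):
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "not installed"
    return versions
```

Running this inside the Airflow worker image (e.g., in a CI smoke test) surfaces dependency mismatches before a DAG fails at runtime.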
Usage
Use local execution mode for:
- Development and testing -- simplest setup, no additional infrastructure needed
- Simple deployments -- when Airflow workers have direct access to the data warehouse
- Environments where dbt is installed alongside Airflow -- common in Astro Runtime and custom Docker images
- Low-latency requirements -- avoids pod startup or container creation overhead
Local execution mode is configured via ExecutionConfig:
from cosmos import ExecutionConfig, ExecutionMode, InvocationMode
execution_config = ExecutionConfig(
execution_mode=ExecutionMode.LOCAL,
invocation_mode=InvocationMode.DBT_RUNNER, # or InvocationMode.SUBPROCESS
)
Theoretical Basis
Each dbt node in the Airflow DAG is executed as an individual dbt command. For example, a model named stg_customers triggers:
dbt run --select stg_customers --project-dir /tmp/cosmos_project --profiles-dir /tmp/cosmos_profiles
The operator manages several concerns for each invocation:
- Temporary profile generation -- a profiles.yml is dynamically generated from the Airflow connection (via ProfileConfig) and written to a temporary directory
- Environment variable injection -- Airflow variables, connection secrets, and user-specified env vars are merged into the dbt process environment
- Working directory setup -- the dbt project is cloned to a temporary directory to avoid filesystem conflicts between concurrent tasks
- Partial parse caching -- the partial_parse.msgpack file is cached and restored between runs to speed up dbt parsing
- Dependency installation -- when install_deps=True, dbt deps is run before the main command
- Result parsing -- dbt output is parsed for success/failure status, compiled SQL is extracted, and OpenLineage events are collected (DBT_RUNNER mode only)
This per-node execution model aligns with Airflow's task-level retry, logging, and observability semantics -- each dbt model gets its own task instance with independent success/failure tracking.