
Principle:Astronomer cosmos Local Operator Execution

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Execution, Orchestration
Last Updated 2026-02-07 00:00 GMT

Overview

An execution principle for running dbt commands either as local subprocesses or through the in-process dbt runner, inside the Airflow worker environment.

Description

Local execution mode runs dbt commands directly on the Airflow worker node. This is the simplest and most common execution mode in astronomer-cosmos, and it serves as the default when no explicit ExecutionMode is specified. It supports two invocation strategies controlled by the InvocationMode enum:

  • SUBPROCESS -- shells out to the dbt CLI binary (e.g., dbt run --select model_name) via the Airflow subprocess hook. This is the traditional approach and works with any dbt version installed on the system PATH.
  • DBT_RUNNER -- uses dbt's programmatic Python API (dbtRunner) to invoke dbt commands in-process. This avoids subprocess overhead, enables richer result parsing (including OpenLineage event collection), and is the preferred mode for dbt-core >= 1.5.
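Both strategies ultimately issue the same logical dbt invocation; only the transport differs. The sketch below is illustrative, not cosmos internals: `build_cli_args` is a hypothetical helper that assembles the argument list, and the commented lines show how each strategy would consume it (the `dbtRunner` import path applies to dbt-core >= 1.5).

```python
# Hypothetical sketch: one logical dbt invocation, two transports.
def build_cli_args(command: str, model: str, project_dir: str, profiles_dir: str) -> list[str]:
    """Argument list shared by SUBPROCESS and DBT_RUNNER modes."""
    return [
        command,
        "--select", model,
        "--project-dir", project_dir,
        "--profiles-dir", profiles_dir,
    ]

args = build_cli_args("run", "stg_customers", "/tmp/proj", "/tmp/profiles")

# SUBPROCESS mode shells out to the CLI binary on PATH:
#     subprocess.run(["dbt", *args], check=True)
# DBT_RUNNER mode stays in-process (dbt-core >= 1.5):
#     from dbt.cli.main import dbtRunner
#     result = dbtRunner().invoke(args)
```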

The fundamental trade-off is simplicity vs. isolation. Local mode requires dbt and all its adapter dependencies (e.g., dbt-postgres, dbt-snowflake) to be installed directly in the Airflow worker environment. This means dbt dependency versions must be compatible with the Airflow installation. For scenarios where isolation is required, other execution modes (Kubernetes, Docker, VirtualEnv) should be considered.

Usage

Use local execution mode for:

  • Development and testing -- simplest setup, no additional infrastructure needed
  • Simple deployments -- when Airflow workers have direct access to the data warehouse
  • Environments where dbt is installed alongside Airflow -- common in Astro Runtime and custom Docker images
  • Low-latency requirements -- avoids pod startup or container creation overhead

Local execution mode is configured via ExecutionConfig:

from cosmos import ExecutionConfig, ExecutionMode, InvocationMode

execution_config = ExecutionConfig(
    execution_mode=ExecutionMode.LOCAL,
    invocation_mode=InvocationMode.DBT_RUNNER,  # or InvocationMode.SUBPROCESS
)
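In practice the ExecutionConfig is passed to a DbtDag (or DbtTaskGroup) together with ProjectConfig and ProfileConfig. A minimal configuration sketch, where the dag_id, paths, and profile/target names are placeholders for your own project:

```python
from datetime import datetime

from cosmos import (
    DbtDag, ProjectConfig, ProfileConfig,
    ExecutionConfig, ExecutionMode,
)

dag = DbtDag(
    dag_id="jaffle_shop_local",  # placeholder name
    project_config=ProjectConfig("/usr/local/airflow/dbt/jaffle_shop"),
    profile_config=ProfileConfig(
        profile_name="jaffle_shop",
        target_name="dev",
        profiles_yml_filepath="/usr/local/airflow/dbt/profiles.yml",
    ),
    execution_config=ExecutionConfig(execution_mode=ExecutionMode.LOCAL),
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
)
```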

Theoretical Basis

Each dbt node in the Airflow DAG is executed as an individual dbt command. For example, a model named stg_customers triggers:

dbt run --select stg_customers --project-dir /tmp/cosmos_project --profiles-dir /tmp/cosmos_profiles

The operator manages several concerns for each invocation:

  • Temporary profile generation -- a profiles.yml is dynamically generated from the Airflow connection (via ProfileConfig) and written to a temporary directory
  • Environment variable injection -- Airflow variables, connection secrets, and user-specified env vars are merged into the dbt process environment
  • Working directory setup -- the dbt project is cloned to a temporary directory to avoid filesystem conflicts between concurrent tasks
  • Partial parse caching -- the partial_parse.msgpack file is cached and restored between runs to speed up dbt parsing
  • Dependency installation -- when install_deps=True, dbt deps is run before the main command
  • Result parsing -- dbt output is parsed for success/failure status, compiled SQL is extracted, and OpenLineage events are collected (DBT_RUNNER mode only)
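The profile-generation and environment-merging steps above can be sketched with the standard library alone. This is a conceptual illustration, not cosmos source code: `prepare_invocation` and the hard-coded profiles template are hypothetical, standing in for what ProfileConfig derives from an Airflow connection.

```python
import os
import shutil
import tempfile

# Illustrative stand-in for a profiles.yml rendered from an Airflow connection.
PROFILES_TEMPLATE = """\
default:
  target: dev
  outputs:
    dev:
      type: postgres
      host: {host}
      user: {user}
      dbname: {dbname}
      schema: public
      port: 5432
"""

def prepare_invocation(conn: dict, user_env: dict) -> tuple[str, dict]:
    """Create a per-task temp dir with a generated profiles.yml,
    and merge user-specified env vars over the inherited environment."""
    tmp_dir = tempfile.mkdtemp(prefix="cosmos_")
    with open(os.path.join(tmp_dir, "profiles.yml"), "w") as f:
        f.write(PROFILES_TEMPLATE.format(**conn))
    env = {**os.environ, **user_env}  # user-specified vars take precedence
    return tmp_dir, env

tmp, env = prepare_invocation(
    {"host": "localhost", "user": "airflow", "dbname": "analytics"},
    {"DBT_TARGET": "dev"},
)
assert os.path.exists(os.path.join(tmp, "profiles.yml"))
shutil.rmtree(tmp)  # a real operator would clean up after the dbt command runs
```

Using a fresh temporary directory per task is what lets concurrent task instances of the same DAG run without clobbering each other's profiles or partial-parse artifacts.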

This per-node execution model aligns with Airflow's task-level retry, logging, and observability semantics -- each dbt model gets its own task instance with independent success/failure tracking.

Related Pages

Implemented By

Uses Heuristic
