Workflow: Astronomer Cosmos Kubernetes dbt execution
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, dbt, Airflow, Kubernetes, Orchestration |
| Last Updated | 2026-02-07 17:00 GMT |
Overview
End-to-end process for running dbt models in isolated Kubernetes pods using Cosmos Kubernetes execution mode, providing resource isolation and environment independence from the Airflow worker.
Description
This workflow covers the procedure for executing dbt commands inside Kubernetes pods using ExecutionMode.KUBERNETES. Each dbt node (model, seed, test, snapshot) runs as a separate KubernetesPodOperator-based task, launching a containerized dbt environment. This provides complete isolation between dbt and Airflow environments, allowing different dbt versions, dependencies, and resource allocations per task. The dbt project and its dependencies are packaged into a Docker image, while database credentials are managed through Kubernetes Secrets.
The graph parsing still happens on the Airflow controller using the local dbt project files (via RenderConfig.dbt_project_path), while execution happens in Kubernetes pods using the containerized project path (via ExecutionConfig.dbt_project_path). This dual-path design separates the parsing environment from the execution environment.
Usage
Execute this workflow when dbt and Airflow have conflicting Python dependencies, when dbt tasks need more resources (CPU/memory) than the Airflow worker provides, when you need strict environment isolation between dbt versions, or when running in a Kubernetes-native environment like Astronomer or GKE. This mode is suitable for production deployments where stability and resource control are critical.
Execution Steps
Step 1: Build the dbt Docker image
Create a Docker image containing the dbt project files, dbt executable, and all required database adapters. The image should include the dbt project at a known path (e.g., dags/dbt/jaffle_shop) and have dbt properly installed with the necessary adapter packages. A profiles.yml can be embedded in the image or generated at runtime from environment variables.
Key considerations:
- The Docker image must include both dbt-core and the appropriate database adapter
- Project files in the image should match the structure expected by the dbt commands
- Use multi-stage builds to keep the image size manageable
- Tag images with version numbers for reproducibility
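A minimal multi-stage Dockerfile for Step 1 might look like the sketch below. The base image tags, the jaffle_shop project name, and the /usr/app path are illustrative assumptions, not requirements of Cosmos; the adapter (dbt-postgres here) should match your warehouse.

```Dockerfile
# Build stage: install dbt-core plus the warehouse adapter.
FROM python:3.11-slim AS build
RUN pip install --no-cache-dir dbt-core dbt-postgres

# Runtime stage: copy the installed packages and the dbt project files.
FROM python:3.11-slim
COPY --from=build /usr/local /usr/local
# Project files land at the path ExecutionConfig.dbt_project_path will reference.
COPY dags/dbt/jaffle_shop /usr/app/dbt/jaffle_shop
# profiles.yml can be baked in here, or generated at container start
# from environment variables injected via Kubernetes Secrets.
COPY profiles.yml /usr/app/dbt/jaffle_shop/profiles.yml
WORKDIR /usr/app/dbt/jaffle_shop
```

Tagging the pushed image (e.g., my-registry/dbt-jaffle-shop:1.0) rather than using latest keeps pod runs reproducible.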
Step 2: Configure Kubernetes Secrets
Define Kubernetes Secret objects for sensitive configuration values such as database passwords and host addresses. Secrets are injected into the pod as environment variables at runtime, keeping credentials out of DAG code and Docker images. Each secret maps a Kubernetes secret key to an environment variable name.
Key considerations:
- Secrets must exist in the Kubernetes namespace where pods will run
- Use deploy_type="env" to inject secrets as environment variables
- The deploy_target name must match what the dbt profiles.yml expects
- Multiple secrets can be combined for different credential components
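In DAG code, each credential component becomes a Secret object from the Kubernetes provider. The secret name postgres-secrets, its keys, and the target env var names below are illustrative; they just have to line up with what the Kubernetes Secret contains and what profiles.yml reads.

```python
# Sketch of Secret wiring, assuming the apache-airflow-providers-cncf-kubernetes
# package; the secret/key/env-var names are placeholders.
from airflow.providers.cncf.kubernetes.secret import Secret

postgres_password = Secret(
    deploy_type="env",                  # inject as an environment variable
    deploy_target="POSTGRES_PASSWORD",  # env var name the profiles.yml expects
    secret="postgres-secrets",          # K8s Secret name (must exist in the namespace)
    key="password",                     # key within that Secret
)

postgres_host = Secret(
    deploy_type="env",
    deploy_target="POSTGRES_HOST",
    secret="postgres-secrets",
    key="host",
)
```

The backing Secret object itself is created out-of-band (e.g., with kubectl create secret generic postgres-secrets in the target namespace), so credentials never appear in DAG code or the Docker image.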
Step 3: Configure dual project paths
Set up the RenderConfig with the local Airflow controller path to the dbt project (for DAG parsing) and the ExecutionConfig with the container-internal path (for runtime execution). The render path is used by Cosmos to discover dbt nodes at parse time, while the execution path tells the Kubernetes pod where to find the project files.
Key considerations:
- RenderConfig.dbt_project_path points to the local filesystem path accessible during DAG parsing
- ExecutionConfig.dbt_project_path points to the path inside the Docker container
- ExecutionConfig.execution_mode must be set to ExecutionMode.KUBERNETES
- The project structure must be consistent between both paths
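The dual-path split described above can be sketched as two config objects; both filesystem locations are illustrative (an Astronomer-style layout on the controller, an arbitrary /usr/app prefix inside the image).

```python
# Parse-time vs run-time project paths; paths are assumptions for illustration.
from cosmos import ExecutionConfig, RenderConfig
from cosmos.constants import ExecutionMode

# Path on the Airflow controller's filesystem, used only to discover
# dbt nodes when the DAG is parsed.
render_config = RenderConfig(
    dbt_project_path="/usr/local/airflow/dags/dbt/jaffle_shop",
)

# Path inside the Docker image, where the Kubernetes pod actually runs dbt.
execution_config = ExecutionConfig(
    execution_mode=ExecutionMode.KUBERNETES,
    dbt_project_path="/usr/app/dbt/jaffle_shop",
)
```

If the two copies of the project drift apart (e.g., a model exists locally but not in the image), parsing will succeed while execution fails, so keeping them in sync is essential.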
Step 4: Configure profile for Kubernetes context
Create a ProfileConfig that works in both contexts. For DAG parsing, a profile mapping (e.g., PostgresUserPasswordProfileMapping) resolves the Airflow connection. For Kubernetes execution, the pod uses the profiles.yml baked into the Docker image or one generated from environment variables injected via Secrets.
Key considerations:
- Profile mapping is used for DAG parsing but may not be exposed inside the K8s pod
- The pod relies on environment variables (from Secrets) and the baked-in profiles.yml
- The profile_name and target_name must match in both contexts
- Database connection details flow from Kubernetes Secrets to dbt via environment variables
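A ProfileConfig for this dual-context setup might look like the following; the profile/target names and the postgres_default connection id are assumptions for illustration.

```python
# Sketch of a ProfileConfig that works for parse-time rendering;
# names and conn_id are placeholders.
from cosmos import ProfileConfig
from cosmos.profiles import PostgresUserPasswordProfileMapping

profile_config = ProfileConfig(
    profile_name="jaffle_shop",  # must match the profile in the image's profiles.yml
    target_name="dev",           # must match the target used inside the pod
    # Resolved against the Airflow connection at parse time; inside the K8s pod,
    # dbt instead reads the baked-in profiles.yml plus Secret-injected env vars.
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="postgres_default",
        profile_args={"schema": "public"},
    ),
)
```

Inside the image, the profiles.yml can reference the injected variables with dbt's env_var function (e.g., password: "{{ env_var('POSTGRES_PASSWORD') }}") so the same profile name and target resolve correctly in the pod.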
Step 5: Seed data loading
Use DbtSeedKubernetesOperator to load seed data into the database before running models. This operator runs dbt seed inside a Kubernetes pod with the same image and secrets. Seed loading is typically a separate upstream task that runs before the main model TaskGroup.
Key considerations:
- Seeds must be loaded before dependent models run
- The seed operator uses the same Docker image and Kubernetes secrets as the model operators
- is_delete_operator_pod controls whether pods are cleaned up after execution
- get_logs=True streams pod logs back to the Airflow task log
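Put together, a seed-loading task might be declared as below. This is a sketch: the image tag, secret, and project path are placeholders, and the exact operator arguments can vary across Cosmos versions.

```python
# Sketch of a standalone seed task in Kubernetes execution mode;
# image, secret, conn_id, and paths are illustrative assumptions.
from airflow.providers.cncf.kubernetes.secret import Secret
from cosmos import ProfileConfig
from cosmos.operators import DbtSeedKubernetesOperator
from cosmos.profiles import PostgresUserPasswordProfileMapping

postgres_password = Secret("env", "POSTGRES_PASSWORD", "postgres-secrets", "password")

profile_config = ProfileConfig(
    profile_name="jaffle_shop",
    target_name="dev",
    profile_mapping=PostgresUserPasswordProfileMapping(conn_id="postgres_default"),
)

load_seeds = DbtSeedKubernetesOperator(
    task_id="load_seeds",
    project_dir="/usr/app/dbt/jaffle_shop",   # path inside the Docker image
    image="my-registry/dbt-jaffle-shop:1.0",
    secrets=[postgres_password],
    profile_config=profile_config,
    get_logs=True,                 # stream pod logs into the Airflow task log
    is_delete_operator_pod=True,   # remove finished pods automatically
)
```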
Step 6: Run models via DbtTaskGroup
Create a DbtTaskGroup with ExecutionMode.KUBERNETES to run all dbt models. Each model becomes a separate Kubernetes pod. Pass operator_args including the Docker image name, secrets, environment variables, and pod lifecycle settings. Cosmos maps each dbt node to a DbtRunKubernetesOperator, DbtTestKubernetesOperator, etc.
Key considerations:
- Each dbt node spawns a separate Kubernetes pod
- Pod resource requests and limits can be set via operator_args
- is_delete_operator_pod=False keeps pods around for debugging
- Dependencies between models are preserved as Airflow task dependencies
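A DbtTaskGroup for this step could be configured roughly as follows. Everything named here (image, secret, conn_id, paths) is a placeholder; here the render path comes from ProjectConfig while ExecutionConfig carries the container-internal path, and exact validation rules differ between Cosmos versions.

```python
# Sketch of a Kubernetes-mode DbtTaskGroup; all names are illustrative.
from airflow.providers.cncf.kubernetes.secret import Secret
from cosmos import (
    DbtTaskGroup,
    ExecutionConfig,
    ProfileConfig,
    ProjectConfig,
)
from cosmos.constants import ExecutionMode
from cosmos.profiles import PostgresUserPasswordProfileMapping

postgres_password = Secret("env", "POSTGRES_PASSWORD", "postgres-secrets", "password")

run_models = DbtTaskGroup(
    group_id="run_models",
    # Local path, used for parsing the dbt graph on the controller.
    project_config=ProjectConfig("/usr/local/airflow/dags/dbt/jaffle_shop"),
    profile_config=ProfileConfig(
        profile_name="jaffle_shop",
        target_name="dev",
        profile_mapping=PostgresUserPasswordProfileMapping(conn_id="postgres_default"),
    ),
    execution_config=ExecutionConfig(
        execution_mode=ExecutionMode.KUBERNETES,
        # Container-internal path, used when each pod runs dbt.
        dbt_project_path="/usr/app/dbt/jaffle_shop",
    ),
    # Forwarded to every generated Dbt*KubernetesOperator.
    operator_args={
        "image": "my-registry/dbt-jaffle-shop:1.0",
        "secrets": [postgres_password],
        "get_logs": True,
        "is_delete_operator_pod": False,  # keep finished pods for debugging
    },
)
```

Pod resource requests/limits can also go into operator_args via the provider's container-resources settings when individual models need more CPU or memory.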
Step 7: Wire seed and model tasks
Establish the dependency chain with seed loading running first, followed by the model TaskGroup. This ensures that seed data is available in the database before models that depend on it are executed.
Key considerations:
- Use the >> operator to set load_seeds >> run_models
- Additional pre-processing or post-processing tasks can be added to the chain
- The TaskGroup encapsulates all model-level dependencies internally
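The final wiring is a single dependency between the two steps. In this skeleton, EmptyOperator stands in for the DbtSeedKubernetesOperator and DbtTaskGroup built in Steps 5 and 6, and the DAG id and schedule are illustrative.

```python
# Skeleton of the final DAG wiring; EmptyOperator is a stand-in so the
# dependency structure is visible without the full operator definitions.
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator

with DAG(
    dag_id="jaffle_shop_kubernetes",   # illustrative DAG id
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    load_seeds = EmptyOperator(task_id="load_seeds")  # Step 5 seed task
    run_models = EmptyOperator(task_id="run_models")  # Step 6 TaskGroup

    # Seed data must land before any model pod starts.
    load_seeds >> run_models
```

Additional pre- or post-processing tasks slot into the same chain (e.g., pre_check >> load_seeds >> run_models >> notify), while model-to-model ordering stays inside the TaskGroup.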