Implementation:Astronomer Astronomer cosmos Get Dataset Alias Name
| Knowledge Sources | |
|---|---|
| Domains | Scheduling, Data_Aware |
| Last Updated | 2026-02-07 17:00 GMT |
Overview
get_dataset_alias_name generates deterministic Airflow DatasetAlias names for dbt tasks, enabling data-aware scheduling across DAGs in the Cosmos framework.
Description
The get_dataset_alias_name function constructs a unique, human-readable alias string that Airflow uses to track dataset production and consumption relationships between tasks. The alias name is derived from the combination of the parent DAG identifier, the TaskGroup path (if the task is nested within a group), and the task_id of the individual dbt task.
This naming scheme ensures that dataset aliases are globally unique within an Airflow installation while remaining predictable and traceable back to the originating dbt model. When a dbt task produces a dataset, Airflow's data-aware scheduler can use the alias to trigger downstream DAGs that have declared a dependency on that dataset.
The function is defined starting at line 16 of cosmos/dataset.py, a focused 42-line module, which reflects its single-purpose design. Both dag and task_group may be None; the function still returns a usable alias in such minimal configurations.
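The composition described above can be sketched in plain Python. This is a hypothetical stand-in, not the code in cosmos/dataset.py: the name sketch_alias_name, the string-typed arguments, and the "__" separator are assumptions made for illustration, and the real function takes DAG and TaskGroup objects rather than their ids.

```python
from __future__ import annotations


def sketch_alias_name(dag_id: str | None, group_id: str | None, task_id: str) -> str:
    """Hypothetical stand-in: join whichever identifiers are present with "__"."""
    # Skip missing or empty segments so a bare task_id still yields a valid name.
    parts = [part for part in (dag_id, group_id, task_id) if part]
    return "__".join(parts)  # the "__" separator is an assumption of this sketch


print(sketch_alias_name("dbt_pipeline", "staging", "run_stg_customers"))
# dbt_pipeline__staging__run_stg_customers
print(sketch_alias_name("dbt_pipeline", None, "run_stg_orders"))
# dbt_pipeline__run_stg_orders
print(sketch_alias_name(None, None, "standalone_task"))
# standalone_task
```

Dropping absent segments instead of substituting a placeholder keeps the name short and means the None cases need no special branch.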
Usage
Use get_dataset_alias_name when you need to programmatically determine or reference the DatasetAlias that a Cosmos-rendered dbt task will produce. This is most commonly needed when configuring cross-DAG data-aware scheduling, writing custom rendering logic, or building tests that assert on dataset wiring.
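For the testing use case, one possible shape for such a check is sketched below. The helper find_alias_problems and the stand-in fake_alias are hypothetical names introduced here so the sketch runs without Airflow installed; in a real test suite you would presumably pass get_dataset_alias_name itself along with real DAG and TaskGroup objects.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Iterable


def find_alias_problems(
    alias_fn: Callable[[object, object, str], str],
    cases: Iterable[tuple[object, object, str]],
) -> list[str]:
    """Return human-readable problems: unstable or colliding alias names."""
    problems: list[str] = []
    names: list[str] = []
    for dag, task_group, task_id in cases:
        # Call twice with identical arguments to check the name is deterministic.
        first = alias_fn(dag, task_group, task_id)
        second = alias_fn(dag, task_group, task_id)
        if first != second:
            problems.append(f"non-deterministic alias for task {task_id!r}")
        names.append(first)
    # Any name produced by more than one case is a collision.
    for name, count in Counter(names).items():
        if count > 1:
            problems.append(f"alias {name!r} produced by {count} cases")
    return problems


def fake_alias(dag: object, task_group: object, task_id: str) -> str:
    """Stand-in naming function (an assumption, not the Cosmos implementation)."""
    return "__".join(str(part) for part in (dag, task_group, task_id) if part)


cases = [
    ("dbt_pipeline", None, "run_stg_orders"),
    ("dbt_pipeline", "staging", "run_stg_customers"),
]
assert find_alias_problems(fake_alias, cases) == []
```

Taking the naming function as a parameter keeps the check independent of the Airflow runtime, so the same helper can be pointed at the real function in an integration test.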
Code Reference
Source Location
- Repository: Astronomer_Astronomer_cosmos
- File: cosmos/dataset.py
- Lines: 16 onward (module is 42 lines total)
Signature
def get_dataset_alias_name(
    dag: DAG | None,
    task_group: TaskGroup | None,
    task_id: str,
) -> str:
    ...
Import
from cosmos.dataset import get_dataset_alias_name
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| dag | DAG or None | No (pass None to omit) | The Airflow DAG instance; its dag_id is used as a prefix in the alias name |
| task_group | TaskGroup or None | No (pass None to omit) | An optional TaskGroup whose group_id is incorporated into the alias for uniqueness |
| task_id | str | Yes | The identifier of the individual dbt task, forming the final segment of the alias |
Outputs
| Name | Type | Description |
|---|---|---|
| alias_name | str | A deterministic, globally unique DatasetAlias name string suitable for Airflow's data-aware scheduler |
Usage Examples
# Task defined directly in a DAG, with no TaskGroup
from airflow import DAG
from cosmos.dataset import get_dataset_alias_name

with DAG(dag_id="dbt_pipeline") as dag:
    alias = get_dataset_alias_name(dag=dag, task_group=None, task_id="run_stg_orders")
    # alias might be: "dbt_pipeline__run_stg_orders"
    print(alias)

# Task nested inside a TaskGroup
from airflow import DAG
from airflow.utils.task_group import TaskGroup
from cosmos.dataset import get_dataset_alias_name

with DAG(dag_id="dbt_pipeline") as dag:
    with TaskGroup(group_id="staging") as tg:
        alias = get_dataset_alias_name(dag=dag, task_group=tg, task_id="run_stg_customers")
        # alias incorporates the task group path for uniqueness
        print(alias)

# Minimal invocation without DAG or TaskGroup context
from cosmos.dataset import get_dataset_alias_name

alias = get_dataset_alias_name(dag=None, task_group=None, task_id="standalone_task")
print(alias)
Related Pages
- Environment:Astronomer_Astronomer_cosmos_Python_Airflow_Runtime
- Astronomer_Astronomer_cosmos_Get_Airflow_Task -- rendering function that may use dataset aliases when wiring operator outlets