Implementation:Astronomer Astronomer cosmos Get Dataset Alias Name

Knowledge Sources
Domains: Scheduling, Data_Aware
Last Updated: 2026-02-07 17:00 GMT

Overview

get_dataset_alias_name generates deterministic Airflow DatasetAlias names for dbt tasks, enabling data-aware scheduling across DAGs in the Cosmos framework.

Description

The get_dataset_alias_name function constructs a unique, human-readable alias string that Airflow uses to track dataset production and consumption relationships between tasks. The alias name is derived from the combination of the parent DAG identifier, the TaskGroup path (if the task is nested within a group), and the task_id of the individual dbt task.
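The composition described above can be sketched in plain Python. This is a minimal illustration, not the Cosmos source: sketch_alias_name is a hypothetical stand-in, it takes plain strings rather than DAG and TaskGroup objects, and the "__" separator is an assumption taken from the example alias shown in the usage examples on this page. Nested TaskGroup paths are not modeled.

```python
from typing import Optional

# Assumption: the separator is inferred from the example alias on this page
# ("dbt_pipeline__run_stg_orders"); the real function may differ.
SEPARATOR = "__"


def sketch_alias_name(dag_id: Optional[str], group_id: Optional[str], task_id: str) -> str:
    """Hypothetical stand-in: join whichever identifiers are present."""
    # Keep only the optional prefixes that were actually supplied.
    segments = [part for part in (dag_id, group_id) if part]
    # The task_id is required and always forms the last segment.
    segments.append(task_id)
    return SEPARATOR.join(segments)


print(sketch_alias_name("dbt_pipeline", "staging", "run_stg_customers"))
print(sketch_alias_name(None, None, "standalone_task"))
```

Filtering out missing prefixes before joining keeps the result free of leading or doubled separators when dag or task_group is absent.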

This naming scheme ensures that dataset aliases are globally unique within an Airflow installation while remaining predictable and traceable back to the originating dbt model. When a dbt task produces a dataset, Airflow's data-aware scheduler can use the alias to trigger downstream DAGs that have declared a dependency on that dataset.

The function lives in a small, single-purpose module of 42 lines, where its definition begins at line 16. It accepts None for dag, for task_group, or for both, and still returns a usable alias in such minimal configurations.

Usage

Use get_dataset_alias_name when you need to programmatically determine or reference the DatasetAlias that a Cosmos-rendered dbt task will produce. This is most commonly needed when configuring cross-DAG data-aware scheduling, writing custom rendering logic, or building tests that assert on dataset wiring.
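For the testing use case, one possible pattern is a helper that checks a naming function for stable and collision-free output. The sketch below is an assumption-laden illustration: collect_aliases and stand_in_name_fn are hypothetical names, the stand-in is not the Cosmos implementation, and SimpleNamespace objects replace real DAG and TaskGroup instances so the block runs without Airflow installed. In a real test suite the stand-in would presumably be swapped for get_dataset_alias_name.

```python
from types import SimpleNamespace
from typing import Callable, Iterable, List, Optional, Tuple

# (dag, task_group, task_id) triples, mirroring the documented argument order.
Task = Tuple[Optional[object], Optional[object], str]


def collect_aliases(name_fn: Callable[..., str], tasks: Iterable[Task]) -> List[str]:
    """Call the naming function twice per task and check the result is stable."""
    aliases = []
    for dag, task_group, task_id in tasks:
        first = name_fn(dag=dag, task_group=task_group, task_id=task_id)
        second = name_fn(dag=dag, task_group=task_group, task_id=task_id)
        assert first == second, f"alias for {task_id!r} is not deterministic"
        aliases.append(first)
    return aliases


def stand_in_name_fn(dag, task_group, task_id):
    """Hypothetical stand-in; NOT the Cosmos implementation."""
    parts = [getattr(dag, "dag_id", None), getattr(task_group, "group_id", None), task_id]
    # Drop identifiers that are missing (dag or task_group was None).
    return "__".join(p for p in parts if p)


dag = SimpleNamespace(dag_id="dbt_pipeline")
staging = SimpleNamespace(group_id="staging")
tasks = [
    (dag, None, "run_stg_orders"),
    (dag, staging, "run_stg_orders"),
    (None, None, "standalone_task"),
]

aliases = collect_aliases(stand_in_name_fn, tasks)
# No two tasks in the sample should share an alias.
assert len(set(aliases)) == len(aliases)
print(aliases)
```

Passing the naming function as an argument keeps the checks independent of any particular implementation, so the same helper can exercise the stand-in here and the real function elsewhere.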

Code Reference

Source Location

Signature

def get_dataset_alias_name(
    dag: DAG | None,
    task_group: TaskGroup | None,
    task_id: str,
) -> str:
    ...

Import

from cosmos.dataset import get_dataset_alias_name

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| dag | DAG or None | No | The Airflow DAG instance; its dag_id is used as a prefix in the alias name |
| task_group | TaskGroup or None | No | An optional TaskGroup whose group_id is incorporated into the alias for uniqueness |
| task_id | str | Yes | The identifier of the individual dbt task, forming the final segment of the alias |

Outputs

| Name | Type | Description |
|------|------|-------------|
| alias_name | str | A deterministic, globally unique DatasetAlias name string suitable for Airflow's data-aware scheduler |

Usage Examples

from airflow import DAG
from cosmos.dataset import get_dataset_alias_name

with DAG(dag_id="dbt_pipeline") as dag:
    alias = get_dataset_alias_name(dag=dag, task_group=None, task_id="run_stg_orders")
    # alias might be: "dbt_pipeline__run_stg_orders"
    print(alias)

from airflow import DAG
from airflow.utils.task_group import TaskGroup
from cosmos.dataset import get_dataset_alias_name

with DAG(dag_id="dbt_pipeline") as dag:
    with TaskGroup(group_id="staging") as tg:
        alias = get_dataset_alias_name(dag=dag, task_group=tg, task_id="run_stg_customers")
        # alias incorporates the task group path for uniqueness
        print(alias)

# Minimal invocation without DAG or TaskGroup context
from cosmos.dataset import get_dataset_alias_name

alias = get_dataset_alias_name(dag=None, task_group=None, task_id="standalone_task")
print(alias)
