
Implementation:Astronomer cosmos DbtDocsCloudLocalOperator Init

From Leeroopedia


Metadata

Field Value
Page Type Implementation
Knowledge Sources Repo (astronomer-cosmos), Doc (Cosmos Generating Docs)
Domains Data_Engineering, Documentation, Cloud_Storage
Last Updated 2026-02-07 14:00 GMT

Overview

A concrete tool, provided by the astronomer-cosmos library, for generating dbt docs and uploading them to cloud storage (S3, GCS, or Azure Blob Storage). The DbtDocsCloudLocalOperator abstract base class and its provider-specific subclasses combine documentation generation and cloud upload in a single operator execution.

Description

The cloud documentation operators form a class hierarchy:

  • DbtDocsCloudLocalOperator (abstract): Extends DbtDocsLocalOperator to add cloud upload capabilities. Defines the abstract method upload_to_cloud_storage() and orchestrates the generate-then-upload workflow in its execute() method.
  • DbtDocsS3LocalOperator: Implements S3 upload using the Airflow S3Hook. Uploads each documentation artifact to the specified S3 bucket and optional folder path.
  • DbtDocsGCSLocalOperator: Implements GCS upload using the Airflow GCSHook. Uploads each documentation artifact to the specified GCS bucket and optional folder path.
  • DbtDocsAzureStorageLocalOperator: Implements Azure Blob Storage upload using the Airflow WasbHook. Uses a container name (equivalent to bucket) and optional folder path.

Each subclass overrides upload_to_cloud_storage() to implement provider-specific upload logic. The base class execute() method first calls the parent DbtDocsLocalOperator.execute() to generate the docs, then calls upload_to_cloud_storage() to transfer the artifacts.

Code Reference

Source Location

Source File Lines
astronomer-cosmos repo cosmos/operators/local.py L1182-1367

Signatures

class DbtDocsCloudLocalOperator(DbtDocsLocalOperator, ABC):
    """Abstract base class for uploading dbt docs to cloud storage."""

    def __init__(
        self,
        connection_id: str,
        bucket_name: str,
        folder_dir: str | None = None,
        **kwargs: Any,
    ) -> None:
        self.connection_id = connection_id
        self.bucket_name = bucket_name
        self.folder_dir = folder_dir
        super().__init__(**kwargs)

    @abstractmethod
    def upload_to_cloud_storage(self, project_dir: str, **kwargs: Any) -> None:
        ...


class DbtDocsS3LocalOperator(DbtDocsCloudLocalOperator):
    """Upload dbt docs to S3."""

    def __init__(
        self, *args: Any, aws_conn_id: str | None = None, **kwargs: Any
    ) -> None:
        ...

    def upload_to_cloud_storage(self, project_dir: str, **kwargs: Any) -> None:
        ...


class DbtDocsAzureStorageLocalOperator(DbtDocsCloudLocalOperator):
    """Upload dbt docs to Azure Blob Storage."""

    def __init__(
        self,
        *args: Any,
        azure_conn_id: str | None = None,
        container_name: str | None = None,
        **kwargs: Any,
    ) -> None:
        ...


class DbtDocsGCSLocalOperator(DbtDocsCloudLocalOperator):
    """Upload dbt docs to Google Cloud Storage."""

    def upload_to_cloud_storage(self, project_dir: str, **kwargs: Any) -> None:
        ...

Import

from cosmos.operators.local import (
    DbtDocsS3LocalOperator,
    DbtDocsGCSLocalOperator,
    DbtDocsAzureStorageLocalOperator,
)

I/O Contract

Inputs (DbtDocsCloudLocalOperator base)

Parameter Type Required Description
connection_id str Yes Airflow connection ID for authenticating with the cloud storage provider.
bucket_name str Yes Name of the cloud storage bucket (S3 bucket, GCS bucket, or Azure container) to upload artifacts to.
folder_dir str or None No Optional subdirectory path within the bucket for organizing uploaded files. Defaults to None (files uploaded to bucket root).
project_dir str Yes Path to the dbt project directory (inherited from DbtDocsLocalOperator).
profile_config ProfileConfig Yes Database connection profile configuration (inherited from DbtLocalBaseOperator).

Provider-Specific Inputs

Operator Parameter Type Description
DbtDocsS3LocalOperator aws_conn_id str or None Overrides connection_id specifically for AWS. Falls back to connection_id if not provided.
DbtDocsAzureStorageLocalOperator azure_conn_id str or None Overrides connection_id specifically for Azure. Falls back to connection_id if not provided.
DbtDocsAzureStorageLocalOperator container_name str or None Overrides bucket_name specifically for Azure containers. Falls back to bucket_name if not provided.

Outputs

Output Type Description
Generated documentation artifacts Files (local) index.html, manifest.json, and catalog.json in the dbt target directory (generated first).
Uploaded documentation artifacts Cloud objects The same artifacts uploaded to the specified cloud storage bucket and optional folder path.

Usage Examples

Upload dbt Docs to S3

from cosmos.operators.local import DbtDocsS3LocalOperator

upload_docs_s3 = DbtDocsS3LocalOperator(
    task_id="upload_dbt_docs_to_s3",
    project_dir="/usr/local/airflow/dags/dbt/my_project",
    profile_config=profile_config,
    connection_id="my_aws_conn",
    bucket_name="my-dbt-docs-bucket",
    folder_dir="dbt_docs/prod",
)

Upload dbt Docs to GCS

from cosmos.operators.local import DbtDocsGCSLocalOperator

upload_docs_gcs = DbtDocsGCSLocalOperator(
    task_id="upload_dbt_docs_to_gcs",
    project_dir="/usr/local/airflow/dags/dbt/my_project",
    profile_config=profile_config,
    connection_id="my_gcp_conn",
    bucket_name="my-dbt-docs-gcs-bucket",
    folder_dir="dbt_docs/prod",
)

Upload dbt Docs to Azure Blob Storage

from cosmos.operators.local import DbtDocsAzureStorageLocalOperator

upload_docs_azure = DbtDocsAzureStorageLocalOperator(
    task_id="upload_dbt_docs_to_azure",
    project_dir="/usr/local/airflow/dags/dbt/my_project",
    profile_config=profile_config,
    connection_id="my_wasb_conn",
    bucket_name="my-dbt-docs-container",  # the Azure Blob Storage container name
    folder_dir="dbt_docs/prod",
)

Full DAG Example: Generate and Upload

from airflow import DAG
from datetime import datetime
from cosmos.operators.local import DbtDocsS3LocalOperator

with DAG(
    dag_id="dbt_docs_s3_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
) as dag:
    # DbtDocsS3LocalOperator generates AND uploads in one step
    generate_and_upload = DbtDocsS3LocalOperator(
        task_id="generate_and_upload_docs",
        project_dir="/usr/local/airflow/dags/dbt/my_project",
        profile_config=profile_config,
        connection_id="aws_default",
        bucket_name="company-dbt-docs",
        folder_dir="my_project",
    )

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
