Principle:Astronomer Astronomer cosmos Profile Configuration

From Leeroopedia


Metadata

Field | Value
Page Type | Principle
Repository | astronomer-cosmos
Domains | Data_Engineering, Configuration, Security
Related Implementation | Implementation:Astronomer_Astronomer_cosmos_ProfileConfig_Init
Knowledge Sources | dbt Profiles, astronomer-cosmos

Overview

Profile Configuration is a configuration principle for mapping database connection credentials to dbt profile format within an orchestration context. It addresses the fundamental challenge of bridging two credential management systems: the orchestration platform's connection store (e.g., Airflow Connections) and dbt's profile-based connection specification.

dbt requires a profiles.yml file to know how to connect to the data warehouse. This principle abstracts the two strategies for providing that configuration: (1) supplying a pre-existing profiles.yml file, or (2) dynamically generating one from the orchestration system's connection management.

Description

dbt uses a YAML-based profile configuration to establish database connections. A profile specifies a profile name, one or more targets (named connection configurations), and the connection type with its associated credentials (host, port, user, password, database, schema, etc.).

In a standalone dbt workflow, developers maintain a profiles.yml file (typically at ~/.dbt/profiles.yml). In an orchestrated environment, however, credentials are usually managed by the orchestration platform itself. This creates a credential-bridging problem: the orchestrator stores credentials in its own format, but dbt expects them in the profiles.yml format.

This principle abstracts two strategies for resolving this problem:

Strategy 1: File-Based Profile

A pre-existing profiles.yml file is provided at a known filesystem path. This approach is straightforward and works well when:

  • The profile is managed externally (e.g., checked into version control, mounted as a secret volume).
  • The credentials are static or managed by an external secrets manager.
  • The team prefers explicit control over the profile format.
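
As a rough illustration of the file-based strategy, the sketch below checks that a profiles.yml exists at a given path and returns its parent directory, since dbt searches a profiles directory for a file with that name. The helper name locate_profiles_dir is hypothetical and is not part of astronomer-cosmos; only the Python standard library is used.

```python
from pathlib import Path


def locate_profiles_dir(profiles_yml_path: str) -> Path:
    """Return the directory that holds an existing profiles.yml.

    Hypothetical helper for illustration; not an astronomer-cosmos API.
    """
    path = Path(profiles_yml_path).expanduser()
    if path.name != "profiles.yml":
        raise ValueError(f"expected a file named profiles.yml, got {path.name!r}")
    if not path.is_file():
        raise FileNotFoundError(f"no profiles.yml found at {path}")
    # dbt is pointed at a directory, not at the file itself.
    return path.parent
```

Failing early with a clear error keeps a missing or misnamed mount (for example, a secret volume that was not attached) from surfacing later as an opaque dbt connection failure.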

Strategy 2: Mapping-Based Profile

The profile is dynamically generated at runtime by mapping the orchestration platform's connection object to dbt's profile schema. This approach is preferred when:

  • Credentials are managed in the orchestrator's connection store (e.g., Airflow Connections UI or environment variables).
  • The same connection is shared across multiple DAGs and should be maintained in a single location.
  • Credentials may rotate frequently, and the orchestrator handles rotation.
  • The deployment environment varies (dev/staging/prod), and connections are environment-specific.

The mapping strategy requires a profile mapping object that knows how to translate a specific connection type (e.g., Postgres, BigQuery, Snowflake, Redshift) from the orchestrator's format into the corresponding dbt profile YAML structure.
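
A minimal sketch of what such a mapping object might do for Postgres is shown below. OrchestratorConnection and postgres_profile are hypothetical names invented for this illustration, not astronomer-cosmos classes. The sketch assumes the orchestrator stores the database name in the connection's schema field, so the dbt schema is passed in separately.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class OrchestratorConnection:
    """Hypothetical stand-in for an orchestrator connection record."""
    conn_id: str
    conn_type: str
    host: str
    port: int
    login: str
    password: str
    schema: str  # Assumption: holds the database name for Postgres.
    extra: Dict[str, Any] = field(default_factory=dict)


def postgres_profile(
    conn: OrchestratorConnection,
    profile_name: str,
    target_name: str,
    dbt_schema: str,
    threads: int = 4,
) -> Dict[str, Any]:
    """Translate a connection record into a dbt profile dictionary."""
    if conn.conn_type != "postgres":
        raise ValueError(f"unsupported connection type: {conn.conn_type!r}")
    mapped = {
        "type": "postgres",
        "host": conn.host,
        "port": conn.port,
        "user": conn.login,
        "password": conn.password,
        "dbname": conn.schema,
        "schema": dbt_schema,
        "threads": threads,
    }
    # Extras carry adapter-specific settings; explicitly mapped keys win.
    output = {**conn.extra, **mapped}
    return {profile_name: {"target": target_name, "outputs": {target_name: output}}}
```

The returned dictionary mirrors the nesting of profiles.yml and would still need to be serialized to YAML before dbt could read it. That step is left out here because YAML support is not part of the Python standard library.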

Usage

Any time a dbt command needs to connect to a database, a profile configuration must be established. This principle applies universally across all dbt execution contexts:

  • Local Execution: When dbt runs directly on the Airflow worker, the profile must be available as a file on the local filesystem.
  • Containerized Execution: When dbt runs inside a Docker container or Kubernetes pod, the profile must be injected into the container's filesystem.
  • Virtual Environment Execution: When dbt runs in an isolated Python virtual environment, the profile must be accessible from within that environment.

The profile configuration is consumed together with the Project Path Configuration. Between them, they answer the two essential questions: where is the project, and how does it connect to the warehouse?
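
To make the pairing of project path and profile concrete, the sketch below assembles a dbt command line from both pieces. The flags --project-dir, --profiles-dir, --profile and --target are standard dbt CLI options. The function build_dbt_command is a hypothetical helper and does not claim to reproduce how astronomer-cosmos builds its commands.

```python
from typing import List


def build_dbt_command(
    subcommand: str,
    project_dir: str,
    profiles_dir: str,
    profile_name: str,
    target_name: str,
) -> List[str]:
    """Build an argument list for a dbt invocation (hypothetical helper)."""
    return [
        "dbt",
        subcommand,
        "--project-dir", project_dir,    # where is the project?
        "--profiles-dir", profiles_dir,  # directory containing profiles.yml
        "--profile", profile_name,       # which profile in that file
        "--target", target_name,         # which target within the profile
    ]
```

The same argument list applies whether dbt runs on a worker, in a container, or in a virtual environment. Only the way profiles_dir is made available inside that context differs.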

Theoretical Basis

dbt profiles follow a well-defined YAML schema:

my_profile_name:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      port: 5432
      user: dbt_user
      password: secret
      dbname: analytics
      schema: public
      threads: 4

The schema hierarchy is:

  • Profile Name — A named collection of connection targets.
  • Target Name — The active target within the profile (e.g., dev, prod).
  • Connection Type — The database adapter type (e.g., postgres, bigquery, snowflake).
  • Credentials — Type-specific connection parameters.
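
The hierarchy above can be walked programmatically. The sketch below represents the example profile as a Python dictionary and resolves the active target into its adapter type and remaining settings. EXAMPLE_PROFILES and resolve_target are hypothetical names used only for this illustration.

```python
from typing import Any, Dict, Optional, Tuple

# The example profile from above, expressed as a nested dictionary.
EXAMPLE_PROFILES: Dict[str, Any] = {
    "my_profile_name": {
        "target": "dev",
        "outputs": {
            "dev": {
                "type": "postgres",
                "host": "localhost",
                "port": 5432,
                "user": "dbt_user",
                "password": "secret",
                "dbname": "analytics",
                "schema": "public",
                "threads": 4,
            }
        },
    }
}


def resolve_target(
    profiles: Dict[str, Any],
    profile_name: str,
    target_override: Optional[str] = None,
) -> Tuple[str, str, Dict[str, Any]]:
    """Return (target name, adapter type, remaining settings)."""
    profile = profiles[profile_name]
    # An explicit override takes precedence over the profile's default target.
    target_name = target_override or profile["target"]
    outputs = profile["outputs"]
    if target_name not in outputs:
        raise KeyError(f"target {target_name!r} not defined in {profile_name!r}")
    settings = dict(outputs[target_name])  # copy so the input is not mutated
    adapter_type = settings.pop("type")
    return target_name, adapter_type, settings
```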

This principle bridges orchestrator-managed credentials with dbt's expected format. Conceptually, each field of an orchestrator connection corresponds to a part of the dbt profile:

Orchestrator Concept | dbt Profile Concept
Connection ID | Profile Name + Target Name
Connection Type | Adapter Type (type:)
Host / Port / Schema | Connection parameters
Login / Password | Authentication credentials
Extra JSON | Additional adapter-specific settings

The separation of file-based and mapping-based strategies reflects a fundamental design tension in configuration management: static configuration (predictable, auditable, but rigid) versus dynamic configuration (flexible, DRY, but with runtime generation complexity).
