Principle:TobikoData Sqlmesh Dbt Full Conversion
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Dbt_Migration |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
The comprehensive transformation of dbt model configurations into native SQLMesh model objects with full feature mapping.
Description
Dbt full conversion represents the complete translation of a dbt model's configuration, SQL logic, and metadata into a SQLMesh model. This goes beyond simple structure mapping to include sophisticated transformations of materialization strategies, incremental logic, partitioning schemes, clustering specifications, and physical properties. The conversion must handle database-specific features (like BigQuery partition expiration or Snowflake dynamic tables) while maintaining cross-platform compatibility through SQLMesh's engine adapter system.
The conversion process is the culmination of earlier migration steps (assessment, initialization, discovery), producing fully functional SQLMesh models that can be executed through SQLMesh's plan/apply workflow. It handles complex scenarios like dbt's microbatch materialization (mapping to SQLMesh's time-based incremental models), SCD Type 2 snapshots, and various incremental strategies (merge, delete+insert, append, insert_overwrite).
Usage
Use dbt full conversion when:
- Migrating dbt models to native SQLMesh format
- Converting dbt materializations to SQLMesh model kinds
- Adapting dbt incremental strategies to SQLMesh patterns
- Translating database-specific dbt configurations to SQLMesh
- Generating SQLMesh models for execution through plan/apply
- Converting dbt tests to SQLMesh audits and attaching them to models
Theoretical Basis
The full conversion process performs comprehensive model transformation:
1. Materialization Mapping:
* table → FULL kind (full refresh on each run) * view → VIEW kind (virtual, no data storage) * incremental → INCREMENTAL_BY_TIME_RANGE or INCREMENTAL_BY_UNIQUE_KEY kind * ephemeral → EMBEDDED kind (CTE, not materialized) * snapshot → SCD_TYPE_2 kind (Type 2 slowly changing dimension) * dynamic_table (Snowflake) → FULL kind with physical_properties
2. Incremental Strategy Translation:
* merge → use_merge=True with unique_key * delete+insert → time-range deletion with insert * append → insert-only incremental * insert_overwrite → partition-level overwrite * microbatch → interval-based processing with batch_size
3. Partitioning Conversion:
Parse partition_by configurations:
* List format: ["column1", "function(column2)"]
* Dict format (BigQuery): {field: "column", data_type: "date", granularity: "day"}
Convert to SQLMesh partitioned_by expressions
Handle engine-specific partition syntax
4. Clustering Conversion:
Parse cluster_by specifications: * List of column names or expressions Convert to SQLMesh clustered_by expressions Validate clustering is supported for model kind
5. Physical Properties Mapping:
Engine-specific property translation: * BigQuery: partition_expiration_days, require_partition_filter * Snowflake: warehouse, target_lag (for dynamic tables) * ClickHouse: engine (storage format), order_by, primary_key, ttl, settings * Databricks: various Spark properties
6. Pre-Statements Conversion:
* Convert sql_header to pre_statements * Add dbt_max_partition_blob for BigQuery incremental models * Preserve pre-/post-hooks as pre_statements/post_statements
7. Audit Integration:
* Map dbt tests (uniqueness, not_null, relationships) to SQLMesh audits * Attach audit definitions to converted model * Configure audit execution timing (before/after model run)
8. Schema and Naming:
* Resolve custom_schema to full schema name * Apply alias if specified * Construct fully-qualified model name * Handle package prefixes for namespacing
9. Grants and Access Control:
* Convert dbt grants config to SQLMesh grants * Only apply to model kinds that support grants * Handle empty/None grants as unmanaged
10. Special Handling:
* Set allow_partials=True for dbt models (preserves dbt semantics) * Convert begin to start for incremental models * Map batch_size to interval_unit for time-based incremental * Handle concurrent_batches by setting depends_on_past
11. Error Handling:
* Validate configurations are compatible with target engine * Log warnings for unsupported features * Raise ConfigError for invalid configurations * Provide clear error messages with model name and file path
The conversion must be idempotent: converting the same dbt model multiple times should produce the same SQLMesh model. It must also be reversible in concept: the SQLMesh model should preserve enough information that one could understand the original dbt configuration.
Critical conversion principles:
- Preserve original SQL with minimal changes (wrap in Jinja query)
- Map to most appropriate SQLMesh kind for semantics
- Warn when features cannot be fully translated
- Fail explicitly for unsupported configurations
- Maintain compatibility with SQLMesh's virtual environment system