Workflow: Tobiko Data SQLMesh Model Development and Testing
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, SQL_Transformation, Testing |
| Last Updated | 2026-02-07 21:00 GMT |
Overview
End-to-end process for developing SQL transformation models in isolated virtual environments, validating them with unit tests and audits, and promoting them to production.
Description
This workflow guides a data engineer through the complete model development lifecycle: writing a SQL or Python model definition, creating unit tests to validate correctness, using a virtual development environment to test against real data without affecting production, running data quality audits, and finally promoting the changes. SQLMesh's virtual environments share unchanged data between environments, making development nearly zero-cost in terms of warehouse resources.
Key outcomes:
- A validated model definition with comprehensive unit tests
- An isolated development environment for safe experimentation
- Data quality validation through audits before production promotion
- A clean promotion path to production via plan/apply
Usage
Execute this workflow when building a new data transformation model or modifying an existing one. This is the standard development loop for data engineers working with SQLMesh projects.
Execution Steps
Step 1: Model Definition
Create or modify a SQL model file in the project's models directory. The model definition includes a MODEL block specifying the model's name, kind (FULL, INCREMENTAL_BY_TIME_RANGE, SCD_TYPE_2, VIEW, etc.), and other configuration such as grain, cron schedule, and column descriptions. The query body defines the transformation logic.
Key considerations:
- Choose the appropriate model kind based on data volume and update patterns
- Define a grain (unique key columns) for models that support it
- Use the @DEF macro for reusable configuration values
- Reference upstream models by their fully qualified name
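The considerations above can be sketched as a minimal incremental model definition. The model name, upstream table, and columns here are hypothetical placeholders; the MODEL block fields and the @start_date/@end_date macros follow standard SQLMesh conventions.

```sql
MODEL (
  name analytics.daily_order_totals,          -- hypothetical model name
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column order_date                    -- column used to partition backfills
  ),
  grain (order_id, order_date),               -- unique key for the model
  cron '@daily'
);

SELECT
  order_id,
  order_date,
  SUM(amount) AS total_amount
FROM analytics.stg_orders                     -- hypothetical upstream model
WHERE order_date BETWEEN @start_date AND @end_date  -- macros bound per run
GROUP BY order_id, order_date
```

The WHERE clause on the time column is what allows SQLMesh to backfill this model one interval at a time rather than recomputing all history.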
Step 2: Unit Test Creation
Write YAML-based unit tests that validate model logic by specifying mock input data and expected output. Tests can be auto-generated using the create_test command, which captures a sample of real data from upstream models and records the expected output from running the model query.
Key considerations:
- Tests execute against a local DuckDB engine for speed
- Each test specifies inputs for upstream models and the expected output
- Multiple test cases can be defined per model
- Tests are stored in the project's tests directory
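A unit test for the hypothetical model above might look like the following YAML sketch, stored in the tests directory and run with `sqlmesh test`. The rows and values are illustrative, not captured from real data.

```yaml
test_daily_order_totals:
  model: analytics.daily_order_totals   # hypothetical model under test
  inputs:
    analytics.stg_orders:               # mock rows for the upstream model
      rows:
        - order_id: 1
          order_date: 2023-01-01
          amount: 10.0
        - order_id: 1
          order_date: 2023-01-01
          amount: 5.0
  outputs:
    query:                              # expected output of the model query
      rows:
        - order_id: 1
          order_date: 2023-01-01
          total_amount: 15.0
```

A scaffold in this shape can also be generated from sampled upstream data with the create_test command rather than written by hand.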
Step 3: Development Environment Creation
Create an isolated virtual development environment by running plan against a named dev environment. SQLMesh creates a namespace-separated copy of the project state where only changed models are physically materialized. Unchanged models reference the same physical data as production.
Key considerations:
- Dev environments are lightweight and share unchanged data
- Specify a limited date range for backfill to reduce compute costs
- Multiple developers can have independent environments simultaneously
- Environment names are sanitized to valid SQL identifiers
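A dev environment is created by naming it in the plan command. The environment name and date range below are examples; limiting the backfill window is what keeps the initial materialization cheap.

```shell
# Create or update an isolated environment named "dev_alice" (example name),
# backfilling changed models only for a one-week window.
sqlmesh plan dev_alice --start 2023-01-01 --end 2023-01-07
```

Unchanged models in dev_alice continue to point at the same physical tables as production, so only the edited models consume warehouse compute.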
Step 4: Data Exploration and Validation
Query the development environment to validate that model output matches expectations. Use the render command to inspect compiled SQL, the evaluate command to execute the model and view results, or the fetchdf command to run arbitrary queries against the dev environment.
Key considerations:
- Render shows the fully expanded SQL after macro evaluation
- Evaluate executes the model for a specific time range
- The web UI provides an interactive editor with live preview
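The three inspection commands might be used as follows, assuming the hypothetical model and environment names from earlier steps. Note that SQLMesh exposes environment-specific views under a suffixed schema (e.g. analytics__dev_alice), which the fetchdf example relies on.

```shell
# Show the fully expanded SQL after macro evaluation
sqlmesh render analytics.daily_order_totals

# Execute the model for a small time range and print the result set
sqlmesh evaluate analytics.daily_order_totals --start 2023-01-01 --end 2023-01-02

# Run an arbitrary query against the dev environment's views
sqlmesh fetchdf "SELECT * FROM analytics__dev_alice.daily_order_totals LIMIT 10"
```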
Step 5: Audit Execution
Run data quality audits against the development environment to catch data issues before promotion. Audits include built-in checks (not_null, unique_values, accepted_values) and custom SQL assertions defined per model. Failed audits can be configured to block deployment.
Key considerations:
- Audits run automatically during plan application
- Custom audits are SQL queries that should return zero rows on success
- Audit severity can be set to warn or error
- Standalone audits can validate cross-model relationships
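A custom audit is a SELECT that returns the violating rows, so an empty result means the audit passes. The audit name and condition below are hypothetical; @this_model is the standard SQLMesh macro that resolves to the model being audited.

```sql
AUDIT (
  name assert_non_negative_totals   -- hypothetical audit name
);
SELECT *
FROM @this_model                    -- resolves to the audited model's table
WHERE total_amount < 0              -- any returned row is a violation
```

The audit is attached to a model by listing it in the MODEL block's audits field; setting its blocking behavior to false downgrades a failure from an error to a warning.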
Step 6: Production Promotion
Once the development environment is validated, promote changes to production by running plan against the production environment. The plan compares the dev-tested changes against production state and generates the appropriate backfill operations. Apply the plan to deploy.
Key considerations:
- Production plans may require full historical backfill
- Changes are categorized and users confirm before execution
- The virtual layer atomically switches to new data
- Dev environment can be cleaned up after promotion
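The promotion and cleanup steps reduce to two commands, again using the example environment name from earlier. Running plan with no environment argument targets production.

```shell
# Compare dev-tested changes against production, review the categorized
# changes, confirm, and apply the backfill.
sqlmesh plan

# Mark the dev environment for cleanup once its changes are promoted.
sqlmesh invalidate dev_alice
```

Because promotion happens in the virtual layer, the switch to new data is atomic: production views are repointed to the already-materialized tables rather than rebuilt.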