Implementation:TobikoData Sqlmesh Generate Test
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Model_Development, Testing |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
Concrete test fixture generator for SQL models provided by SQLMesh.
Description
The generate_test function automatically creates unit test fixtures for SQLMesh models by executing user-provided queries against a production or staging database to capture input data, then running the model against that captured data to generate expected outputs. The result is a self-contained YAML test file that can be executed quickly without database access during development and CI/CD.
This function handles the complete test generation workflow: executing input queries to sample upstream dependencies, optionally capturing intermediate CTE results for debugging, loading the captured data into an in-memory test engine, executing the target model, serializing both inputs and outputs to YAML format, and writing the test file to the project's test directory. The generated tests are deterministic and version-controlled alongside the model code.
Usage
Use this function when you want to create regression tests for a model after validating it produces correct results in a development environment. Specify queries that retrieve representative samples from upstream tables, and the function will capture this data along with the model's output as a test fixture. This is especially useful when refactoring complex models or documenting expected behavior with concrete examples.
Code Reference
Source Location
- Repository: sqlmesh
- File:
sqlmesh/core/test/definition.py(Lines 826-857)
Signature
def generate_test(
model: Model,
input_queries: t.Dict[str, str],
models: UniqueKeyDict[str, Model],
engine_adapter: EngineAdapter,
test_engine_adapter: EngineAdapter,
project_path: Path,
overwrite: bool = False,
variables: t.Optional[t.Dict[str, str]] = None,
path: t.Optional[str] = None,
name: t.Optional[str] = None,
include_ctes: bool = False,
) -> None
Import
from sqlmesh.core.test import generate_test
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | Model | Yes | The SQLMesh model to generate a test for |
| input_queries | Dict[str, str] | Yes | Mapping of model names to SQL queries that fetch sample input data |
| models | UniqueKeyDict[str, Model] | Yes | All models in the context for dependency resolution |
| engine_adapter | EngineAdapter | Yes | Database adapter for executing input queries against production/staging |
| test_engine_adapter | EngineAdapter | Yes | In-memory database adapter for executing the model during test generation |
| project_path | Path | Yes | Root directory of the SQLMesh project |
| overwrite | bool | No | Whether to overwrite existing test files (default: False) |
| variables | Dict[str, str] | No | Variable values to use when rendering the model |
| path | str | No | Custom path for the test file relative to the test directory |
| name | str | No | Custom test name (default: inferred from model name as "test_{model_name}") |
| include_ctes | bool | No | Whether to capture intermediate CTE results as additional fixtures (default: False) |
Outputs
| Name | Type | Description |
|---|---|---|
| None | None | Function writes test file to disk but returns nothing |
Usage Examples
Basic Usage
from sqlmesh import Context
from sqlmesh.core.test import generate_test
# Initialize context
context = Context(paths="path/to/project")
# Get the model to test
model = context.get_model("my_schema.my_model")
# Define queries to capture input data
input_queries = {
"raw.transactions": """
SELECT * FROM raw.transactions
WHERE date = '2024-01-01'
LIMIT 100
""",
"raw.customers": """
SELECT * FROM raw.customers
WHERE id IN (
SELECT DISTINCT customer_id
FROM raw.transactions
WHERE date = '2024-01-01'
)
"""
}
# Generate the test
generate_test(
model=model,
input_queries=input_queries,
models=context.models,
engine_adapter=context.engine_adapter,
test_engine_adapter=context.test_engine_adapter,
project_path=context.path,
overwrite=False
)
print("Test generated successfully!")
Advanced Usage with Custom Test Name and CTEs
from pathlib import Path
from sqlmesh.core.test import generate_test
# Generate test with custom configuration
generate_test(
model=model,
input_queries={
"upstream_model": "SELECT * FROM upstream WHERE key = 'test_case_1'"
},
models=context.models,
engine_adapter=context.engine_adapter,
test_engine_adapter=context.test_engine_adapter,
project_path=Path("/path/to/project"),
overwrite=True, # Overwrite existing test
variables={"env": "dev", "cutoff_date": "2024-01-01"},
path="edge_cases/test_boundary_condition.yaml", # Custom path
name="test_boundary_condition", # Custom name
include_ctes=True # Capture intermediate CTE results
)
# The generated test file will be at:
# /path/to/project/tests/edge_cases/test_boundary_condition.yaml
Integration with Development Workflow
# Typical workflow: validate model, then generate test
# 1. Evaluate model to verify it works
results = context.evaluate(
"my_schema.my_model",
start="2024-01-01",
end="2024-01-01",
limit=10
)
print(results)
# 2. If results look good, generate test
generate_test(
model=context.get_model("my_schema.my_model"),
input_queries={
"dependency_1": "SELECT * FROM source_table WHERE date = '2024-01-01'",
},
models=context.models,
engine_adapter=context.engine_adapter,
test_engine_adapter=context.test_engine_adapter,
project_path=context.path
)
# 3. Run the generated test
context.test()