Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:TobikoData Sqlmesh Generate Test

From Leeroopedia


Knowledge Sources
Domains Data_Engineering, Model_Development, Testing
Last Updated 2026-02-07 00:00 GMT

Overview

Concrete test fixture generator for SQL models provided by SQLMesh.

Description

The generate_test function automatically creates unit test fixtures for SQLMesh models by executing user-provided queries against a production or staging database to capture input data, then running the model against that captured data to generate expected outputs. The result is a self-contained YAML test file that can be executed quickly without database access during development and CI/CD.

This function handles the complete test generation workflow: executing input queries to sample upstream dependencies, optionally capturing intermediate CTE results for debugging, loading the captured data into an in-memory test engine, executing the target model, serializing both inputs and outputs to YAML format, and writing the test file to the project's test directory. The generated tests are deterministic and version-controlled alongside the model code.

Usage

Use this function when you want to create regression tests for a model after validating it produces correct results in a development environment. Specify queries that retrieve representative samples from upstream tables, and the function will capture this data along with the model's output as a test fixture. This is especially useful when refactoring complex models or documenting expected behavior with concrete examples.

Code Reference

Source Location

  • Repository: sqlmesh
  • File: sqlmesh/core/test/definition.py (Lines 826-857)

Signature

def generate_test(
    model: Model,
    input_queries: t.Dict[str, str],
    models: UniqueKeyDict[str, Model],
    engine_adapter: EngineAdapter,
    test_engine_adapter: EngineAdapter,
    project_path: Path,
    overwrite: bool = False,
    variables: t.Optional[t.Dict[str, str]] = None,
    path: t.Optional[str] = None,
    name: t.Optional[str] = None,
    include_ctes: bool = False,
) -> None

Import

from sqlmesh.core.test import generate_test

I/O Contract

Inputs

Name Type Required Description
model Model Yes The SQLMesh model to generate a test for
input_queries Dict[str, str] Yes Mapping of model names to SQL queries that fetch sample input data
models UniqueKeyDict[str, Model] Yes All models in the context for dependency resolution
engine_adapter EngineAdapter Yes Database adapter for executing input queries against production/staging
test_engine_adapter EngineAdapter Yes In-memory database adapter for executing the model during test generation
project_path Path Yes Root directory of the SQLMesh project
overwrite bool No Whether to overwrite existing test files (default: False)
variables Dict[str, str] No Variable values to use when rendering the model
path str No Custom path for the test file relative to the test directory
name str No Custom test name (default: inferred from model name as "test_{model_name}")
include_ctes bool No Whether to capture intermediate CTE results as additional fixtures (default: False)

Outputs

Name Type Description
None None Function writes test file to disk but returns nothing

Usage Examples

Basic Usage

from sqlmesh import Context
from sqlmesh.core.test import generate_test

# Initialize context
context = Context(paths="path/to/project")

# Get the model to test
model = context.get_model("my_schema.my_model")

# Define queries to capture input data
input_queries = {
    "raw.transactions": """
        SELECT * FROM raw.transactions
        WHERE date = '2024-01-01'
        LIMIT 100
    """,
    "raw.customers": """
        SELECT * FROM raw.customers
        WHERE id IN (
            SELECT DISTINCT customer_id
            FROM raw.transactions
            WHERE date = '2024-01-01'
        )
    """
}

# Generate the test
generate_test(
    model=model,
    input_queries=input_queries,
    models=context.models,
    engine_adapter=context.engine_adapter,
    test_engine_adapter=context.test_engine_adapter,
    project_path=context.path,
    overwrite=False
)

print("Test generated successfully!")

Advanced Usage with Custom Test Name and CTEs

from pathlib import Path
from sqlmesh.core.test import generate_test

# Generate test with custom configuration
generate_test(
    model=model,
    input_queries={
        "upstream_model": "SELECT * FROM upstream WHERE key = 'test_case_1'"
    },
    models=context.models,
    engine_adapter=context.engine_adapter,
    test_engine_adapter=context.test_engine_adapter,
    project_path=Path("/path/to/project"),
    overwrite=True,  # Overwrite existing test
    variables={"env": "dev", "cutoff_date": "2024-01-01"},
    path="edge_cases/test_boundary_condition.yaml",  # Custom path
    name="test_boundary_condition",  # Custom name
    include_ctes=True  # Capture intermediate CTE results
)

# The generated test file will be at:
# /path/to/project/tests/edge_cases/test_boundary_condition.yaml

Integration with Development Workflow

# Typical workflow: validate model, then generate test

# 1. Evaluate model to verify it works
results = context.evaluate(
    "my_schema.my_model",
    start="2024-01-01",
    end="2024-01-01",
    limit=10
)
print(results)

# 2. If results look good, generate test
generate_test(
    model=context.get_model("my_schema.my_model"),
    input_queries={
        "dependency_1": "SELECT * FROM source_table WHERE date = '2024-01-01'",
    },
    models=context.models,
    engine_adapter=context.engine_adapter,
    test_engine_adapter=context.test_engine_adapter,
    project_path=context.path
)

# 3. Run the generated test
context.test()

Related Pages

Implements Principle

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment