Principle:TobikoData Sqlmesh Data Exploration And Validation

From Leeroopedia


Knowledge Sources
Domains: Data_Engineering, Model_Development, Data_Quality
Last Updated: 2026-02-07 00:00 GMT

Overview

Interactive techniques for inspecting, executing, and validating data transformation logic during development through query rendering, execution, and result retrieval.

Description

Developing SQL transformations is inherently iterative. Engineers need to understand what SQL will actually execute, inspect intermediate results, validate outputs against expectations, and debug unexpected behavior. However, data transformation frameworks often abstract away the raw SQL, making it difficult to see what is really happening. Models may include macros, templating, dependency resolution, and other abstractions that obscure the final query.

Data exploration and validation capabilities bridge this gap by providing multiple levels of inspection. Query rendering shows the final SQL after macro expansion and dependency resolution without executing it, which is useful for understanding what will run and for debugging syntax errors. Query evaluation executes the rendered SQL against a database and returns results, enabling validation of transformation logic with real data. Direct query execution allows ad-hoc exploration of tables and testing of query fragments outside the model framework.
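The three levels can be illustrated with a toy example. The following is a minimal sketch, not SQLMesh's implementation: it assumes an in-memory SQLite database, uses `string.Template` placeholders as a stand-in for macros, and the names `MODEL_QUERY`, `render`, `evaluate`, and `fetch` are hypothetical.

```python
import sqlite3
from string import Template

# Hypothetical model: $start/$end placeholders stand in for framework macros.
MODEL_QUERY = Template(
    "SELECT order_id, amount FROM orders "
    "WHERE order_date BETWEEN '$start' AND '$end'"
)

def render(start: str, end: str) -> str:
    """Level 1: produce the final SQL text without touching the database."""
    return MODEL_QUERY.substitute(start=start, end=end)

def evaluate(conn, start, end, limit=None):
    """Level 2: execute the rendered model query, optionally sampling rows."""
    sql = render(start, end)
    if limit is not None:
        sql = f"SELECT * FROM ({sql}) LIMIT {int(limit)}"
    return conn.execute(sql).fetchall()

def fetch(conn, raw_sql):
    """Level 3: run an ad-hoc query with no model context."""
    return conn.execute(raw_sql).fetchall()

# Toy data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, order_date TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 25.5, "2024-01-02"), (3, 7.25, "2024-01-05")],
)

rendered_sql = render("2024-01-01", "2024-01-02")        # inspect only
model_rows = evaluate(conn, "2024-01-01", "2024-01-02")  # run the model logic
total_rows = fetch(conn, "SELECT COUNT(*) FROM orders")  # ad-hoc exploration
```

Rendering is free of side effects and needs no database connection, which is why it is the cheapest first step when a model misbehaves.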

These capabilities form a tight feedback loop during development. Engineers can render a model to verify macro expansion, evaluate it with a small time range to check logic, fetch sample rows to inspect data quality, iterate on the SQL, and repeat. This workflow catches errors early, before changes are committed or expensive backfills are run.
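One iteration of this loop might look like the sketch below. It is an illustration under stated assumptions: SQLite stands in for the warehouse, and `evaluate`, `check_sample`, the `events` table, and the specific checks (NULLs in required columns, duplicate keys) are hypothetical choices rather than anything prescribed by the framework.

```python
import sqlite3

def evaluate(conn, sql, params, limit):
    """Run a model query over a small window; return at most `limit` rows as dicts."""
    cur = conn.execute(f"SELECT * FROM ({sql}) LIMIT {int(limit)}", params)
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]

def check_sample(rows, key, required):
    """Return human-readable problems found in the sampled rows."""
    problems = []
    seen = set()
    for row in rows:
        for col in required:
            if row[col] is None:
                problems.append(f"NULL {col} for {key}={row[key]}")
        if row[key] in seen:
            problems.append(f"duplicate {key}={row[key]}")
        seen.add(row[key])
    return problems

# Toy upstream data containing one NULL and one duplicated key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id INTEGER, user_id INTEGER, ds TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    (1, 100, "2024-01-01"),
    (2, None, "2024-01-01"),
    (2, 101, "2024-01-01"),
    (3, 102, "2024-01-02"),
])

MODEL_SQL = "SELECT event_id, user_id FROM events WHERE ds BETWEEN ? AND ?"

# Evaluate a single day only, then inspect the sample before iterating on the SQL.
sample = evaluate(conn, MODEL_SQL, ("2024-01-01", "2024-01-01"), limit=100)
problems = check_sample(sample, key="event_id", required=["user_id"])
```

Keeping the window to one day and capping the row count keeps each iteration cheap; the same checks can be rerun after every change to the model SQL.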

Usage

Use query rendering to debug macro expansion issues, verify dependency resolution, or understand generated SQL before execution. Use query evaluation to validate model logic with production or test data, check incremental processing boundaries, or generate sample outputs for testing. Use direct query execution for ad-hoc data exploration, validating assumptions about upstream data, or debugging specific data quality issues.
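Checking incremental processing boundaries can be sketched as follows. This is a hypothetical check, assuming inclusive date ranges and a text `ds` column in a toy SQLite table named `source`; the helpers `evaluate_interval` and `check_daily_boundaries` are illustrative names. The idea is to evaluate each day separately and compare the result with a single evaluation over the whole range.

```python
import sqlite3
from datetime import date, timedelta

def evaluate_interval(conn, start: date, end: date):
    """Return the ids the model would process for the inclusive range [start, end]."""
    rows = conn.execute(
        "SELECT id FROM source WHERE ds BETWEEN ? AND ?",
        (start.isoformat(), end.isoformat()),
    ).fetchall()
    return {r[0] for r in rows}

def check_daily_boundaries(conn, start: date, end: date):
    """Compare per-day evaluations with one evaluation over the full range."""
    full = evaluate_interval(conn, start, end)
    union, overlap = set(), set()
    day = start
    while day <= end:
        ids = evaluate_interval(conn, day, day)
        overlap |= union & ids   # rows picked up by more than one day
        union |= ids
        day += timedelta(days=1)
    return {"missing": full - union, "overlap": overlap}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER, ds TEXT)")
conn.executemany("INSERT INTO source VALUES (?, ?)", [
    (1, "2024-01-01"),
    (2, "2024-01-02"),
    (3, "2024-01-02 23:30:00"),  # timestamp-valued row in a date-filtered column
])

report = check_daily_boundaries(conn, date(2024, 1, 1), date(2024, 1, 3))
```

In this toy data the timestamp-valued row falls inside the full range but outside every single-day filter, so it shows up under `missing`. That is the kind of boundary defect a small evaluation can surface before a backfill.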

Theoretical Basis

The exploration workflow combines SQL compilation with optional execution:

FUNCTION render_model(model, start, end, expand_refs):
    # Resolve snapshot versions for dependencies
    dependency_snapshots = resolve_dependencies(model, current_environment)

    # Build macro execution context
    macro_context = {
        execution_time: now(),
        start: start,
        end: end,
        variables: user_variables
    }

    # Expand macros in model query
    expanded_query = expand_macros(model.query, macro_context)

    # Optionally inline upstream model queries
    IF expand_refs THEN
        FOR EACH reference IN extract_references(expanded_query):
            upstream_query = render_model(reference, start, end, True)
            expanded_query = replace_reference(
                expanded_query,
                reference,
                upstream_query
            )
        END FOR
    ELSE
        # Replace model references with physical table names
        FOR EACH reference IN extract_references(expanded_query):
            physical_table = get_snapshot_table(
                reference,
                dependency_snapshots
            )
            expanded_query = replace_reference(
                expanded_query,
                reference,
                physical_table
            )
        END FOR
    END IF

    RETURN expanded_query
END FUNCTION

FUNCTION evaluate_model(model, start, end, limit):
    # Render query with all transformations
    query = render_model(model, start, end, expand_refs=False)

    # Apply row limit for sampling
    IF limit IS NOT NULL THEN
        query = add_limit_clause(query, limit)
    END IF

    # Execute against database
    engine = get_engine_adapter(model.gateway)
    result_dataframe = engine.execute_query(query)

    RETURN result_dataframe
END FUNCTION

FUNCTION fetch_dataframe(raw_query):
    # Direct execution without model context
    engine = get_default_engine_adapter()
    result = engine.execute_query(raw_query)
    RETURN result
END FUNCTION

The key idea is to provide multiple abstraction levels: render for SQL generation without execution, evaluate for model-aware execution with time boundaries, and fetchdf (fetch_dataframe in the pseudocode above) for raw query execution. Each serves a different exploration need.
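The reference-handling branch of render_model can be made concrete with a small Python sketch. It is a simplification under several assumptions: the `MODELS` registry, its physical table names, and the regular expression are all hypothetical, the dependency graph is assumed to be acyclic, and a real implementation would rewrite a parsed syntax tree rather than match text.

```python
import re

# Hypothetical registry: model name -> (query text, physical table name).
MODELS = {
    "staging.orders": ("SELECT id, amount FROM raw.orders", "phys.staging__orders__123"),
    "marts.revenue": (
        "SELECT SUM(amount) AS revenue FROM staging.orders",
        "phys.marts__revenue__456",
    ),
}

# Toy reference extraction: the identifier following FROM or JOIN.
REF_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)

def render_model(name: str, expand_refs: bool) -> str:
    """Rewrite references to known models, either inlined or as physical tables."""
    query, _ = MODELS[name]

    def replace(match):
        ref = match.group(1)
        if ref not in MODELS:
            return match.group(0)  # external table: leave untouched
        keyword = match.group(0)[: match.start(1) - match.start(0)]
        if expand_refs:
            # Inline the upstream query recursively as an aliased subquery.
            alias = ref.split(".")[-1]
            return f"{keyword}({render_model(ref, True)}) AS {alias}"
        # Otherwise point at the physical table backing the upstream model.
        return f"{keyword}{MODELS[ref][1]}"

    return REF_PATTERN.sub(replace, query)

physical_sql = render_model("marts.revenue", expand_refs=False)
expanded_sql = render_model("marts.revenue", expand_refs=True)
```

With `expand_refs=False` the result reads from the upstream physical table and is suitable for execution; with `expand_refs=True` the upstream logic is inlined, which makes the full lineage of a value visible in a single statement.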

