Principle:TobikoData Sqlmesh Audit Execution
| Knowledge Sources | |
|---|---|
| Domains | Data_Engineering, Data_Quality, Model_Development |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A systematic approach to executing data quality checks against model outputs to detect anomalies, constraint violations, and business rule failures.
Description
Data quality cannot be assumed—it must be verified. Even well-designed transformation logic can produce incorrect results due to upstream data issues, edge cases not covered in testing, or subtle bugs in complex SQL. Production data pipelines need automated quality checks that run continuously, catching problems before they propagate downstream or affect business decisions.
Audits provide this verification layer through executable data quality rules. Unlike unit tests that validate logic with fixed inputs, audits run against real production data within specified time ranges. They can check for null values in critical columns, verify row counts fall within expected ranges, ensure referential integrity across tables, validate business metrics against thresholds, and detect anomalies using statistical methods. Audits are defined alongside models, making quality requirements explicit and version-controlled.
The audit execution framework handles orchestration complexity. It determines which audits apply to which models, resolves dependencies to ensure audited data is available, executes audit queries against the database, collects and aggregates results, and reports failures with sufficient context for debugging. Audits can be run on-demand during development, automatically during deployment, or on a schedule for continuous monitoring. This flexibility enables both preventive quality checks (before deployment) and detective monitoring (after deployment).
Usage
Use audit execution during development to validate that changes maintain data quality, in deployment pipelines to prevent bad data from reaching production, and on production schedules to continuously monitor data quality. Define audits for critical business metrics, data integrity constraints, and known failure modes from past incidents.
Theoretical Basis
The audit execution algorithm orchestrates quality checks across models:
FUNCTION execute_audits(start, end, models, execution_time):
# Resolve which models to audit
IF models IS NULL THEN
target_snapshots = all_snapshots_in_environment()
ELSE
target_snapshots = [get_snapshot(m) FOR m IN models]
END IF
# Collect all audits to execute
audit_plan = []
FOR EACH snapshot IN target_snapshots:
FOR EACH audit IN snapshot.audits:
audit_plan.append({
snapshot: snapshot,
audit: audit,
start: start,
end: end
})
END FOR
END FOR
# Execute each audit
results = []
FOR EACH plan_item IN audit_plan:
# Resolve audit query with model context
audit_query = render_audit_query(
plan_item.audit,
plan_item.snapshot,
plan_item.start,
plan_item.end,
execution_time
)
# Check if audit should be skipped
IF should_skip_audit(plan_item, snapshot_intervals) THEN
results.append({
audit: plan_item.audit,
model: plan_item.snapshot.name,
status: SKIPPED
})
CONTINUE
END IF
# Execute audit query
TRY:
audit_result = execute_query(audit_query)
# Evaluate audit predicate
IF audit_predicate_passes(audit_result, plan_item.audit) THEN
results.append({
audit: plan_item.audit,
model: plan_item.snapshot.name,
status: PASSED
})
ELSE
results.append({
audit: plan_item.audit,
model: plan_item.snapshot.name,
status: FAILED,
details: audit_result
})
END IF
CATCH error:
results.append({
audit: plan_item.audit,
model: plan_item.snapshot.name,
status: ERROR,
error: error
})
END TRY
END FOR
# Report results
failures = [r FOR r IN results IF r.status = FAILED OR r.status = ERROR]
log_audit_summary(results)
RETURN failures.is_empty()
END FUNCTION
The key insight is treating audits as first-class queries that execute in the same context as models, with access to time ranges, execution time, and snapshot references, but with pass/fail evaluation semantics rather than data output.