Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Heuristic:TobikoData Sqlmesh Snapshot TTL Defaults

From Leeroopedia



Knowledge Sources
Domains Configuration, Storage_Management
Last Updated 2026-02-07 21:00 GMT

Overview

SQLMesh defaults to 1 week TTL for both snapshots and environments to prevent warehouse clutter from orphaned development artifacts.

Description

SQLMesh automatically manages the lifecycle of snapshots (model versions) and virtual environments through time-to-live (TTL) settings. By default, both snapshots and environments expire after 1 week of inactivity. This is complemented by DATA_VERSION_LIMIT of 10, which caps the number of historical versions kept per model regardless of TTL. The janitor process (sqlmesh/core/janitor.py) handles cleanup of expired views, schemas, and snapshots, preventing unbounded storage growth in the data warehouse.

Usage

Adjust TTL settings when:

  • Long-running PR review cycles require environments to persist beyond 1 week
  • High-volume development teams need shorter TTL to reduce storage costs
  • Compliance requirements mandate longer retention of historical versions
  • Testing different retention policies for cost optimization
  • Managing warehouse storage limits in cloud environments

The Insight (Rule of Thumb)

  • Action: Accept 1 week default TTL for dev environments; increase for long PR cycles; decrease for high-volume teams
  • Value: DEFAULT_SNAPSHOT_TTL = "in 1 week", DEFAULT_ENVIRONMENT_TTL = "in 1 week", DATA_VERSION_LIMIT = 10
  • Trade-off: Shorter TTL saves storage but may expire active work; longer TTL accumulates warehouse clutter

Reasoning

Development workflows in data transformation typically involve rapid iteration with many temporary environments and snapshot versions. Without automatic cleanup, these artifacts would accumulate indefinitely, consuming warehouse storage and making the state database unwieldy. The 1 week default strikes a balance:

1. Long enough for typical PR review cycles (a few days) 2. Short enough to prevent significant storage waste 3. Automatically cleans up abandoned experiments and stale branches

The DATA_VERSION_LIMIT of 10 provides an additional safeguard, ensuring that even frequently-updated models don't accumulate unlimited historical versions. This is critical for high-churn models that might otherwise bypass the TTL mechanism by staying "active."

Other related defaults that support this strategy:

  • DEFAULT_MAX_LIMIT = 1000 (row limit for model evaluation)
  • DEFAULT_LOG_LIMIT = 20 (number of logs to keep)

These limits work together to keep SQLMesh's operational footprint manageable across both the data warehouse and the state database.

Code Evidence

# sqlmesh/core/constants.py:20-23

DEFAULT_SNAPSHOT_TTL = "in 1 week"
"""Default snapshot TTL"""
DEFAULT_ENVIRONMENT_TTL = "in 1 week"
"""Default environment TTL"""
# sqlmesh/core/constants.py (other related limits)

DATA_VERSION_LIMIT = 10
"""Data version limit"""
DEFAULT_MAX_LIMIT = 1000
"""The default maximum row limit that is used when evaluating a model."""
DEFAULT_LOG_LIMIT = 20
"""The default number of logs to keep."""

Configuration example:

# config.yaml
default_snapshot_ttl: in 2 weeks
default_environment_ttl: in 2 weeks

Janitor cleanup process reference:

# sqlmesh/core/janitor.py
# Handles cleanup of expired:
# - Views
# - Schemas
# - Snapshots
# Based on TTL settings

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment