Principle:TobikoData Sqlmesh Production Promotion

From Leeroopedia


Knowledge Sources
Domains: Data_Engineering, Deployment, Model_Development
Last Updated: 2026-02-07 00:00 GMT

Overview

A controlled workflow for safely promoting validated development changes to production through planning, review, and atomic application of data transformations.

Description

Deploying data transformation changes to production carries significant risk. Unlike stateless application deployments, data pipelines carry state: historical data that must remain consistent, downstream dependencies that must not break, and incremental processing that must account for model changes. A naive deployment that simply updates SQL and re-runs models can corrupt historical data, break downstream consumers, or trigger expensive full recomputation.

Production promotion provides a safe deployment workflow through a plan-and-apply pattern. The planning phase compares the current production state with proposed changes, determines which models are affected, categorizes changes as breaking or non-breaking, computes the minimal backfill required, and generates a preview of what will happen. This plan becomes a reviewable artifact that can be inspected by engineers or approved through governance workflows before execution.
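
The comparison step of the planning phase can be illustrated with a small sketch. It assumes each model is just a name mapped to a SQL string and uses a hash of the whitespace-normalized query as the fingerprint; `fingerprint`, `PlanDiff`, and `diff_models` are hypothetical names for illustration, not SQLMesh APIs, and real fingerprinting is considerably more involved.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(sql: str) -> str:
    """Hash a whitespace-normalized query (a crude stand-in for real fingerprinting)."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class PlanDiff:
    added: list = field(default_factory=list)
    modified: list = field(default_factory=list)
    removed: list = field(default_factory=list)


def diff_models(proposed: dict, production: dict) -> PlanDiff:
    """Compare two {model_name: sql} mappings by fingerprint."""
    diff = PlanDiff()
    for name, sql in sorted(proposed.items()):
        if name not in production:
            diff.added.append(name)
        elif fingerprint(sql) != fingerprint(production[name]):
            diff.modified.append(name)
    # Anything production knows about that the proposal dropped is a removal.
    diff.removed = sorted(set(production) - set(proposed))
    return diff


production = {
    "orders": "SELECT id, amount FROM raw.orders",
    "legacy_report": "SELECT 1",
}
proposed = {
    "orders": "SELECT id, amount * 100 AS amount FROM raw.orders",
    "customers": "SELECT id FROM raw.customers",
}
diff = diff_models(proposed, production)
print(diff.added, diff.modified, diff.removed)
# ['customers'] ['orders'] ['legacy_report']
```

The resulting diff is plain data, which is what makes it reviewable before anything is executed.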

The application phase executes the plan. It creates new snapshot versions for changed models, backfills the necessary historical data, validates data integrity, and then updates the environment pointers to the new snapshots. Critically, this happens without disrupting production: queries continue to use the old snapshots until the new ones are validated and ready. Only the final pointer update needs to be atomic, and because it is, rollback is fast (revert the pointer) for as long as the previous snapshots are retained. The framework can also maintain multiple versions simultaneously for gradual migration or A/B testing.
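
The pointer-swap idea can be sketched with a toy in-memory state store. This is a minimal illustration under stated assumptions: `Catalog` and its methods are hypothetical, snapshots are plain Python lists, and a real system would perform the swap inside a database transaction.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy state store: snapshot versions plus one pointer map per environment."""
    snapshots: dict = field(default_factory=dict)     # (model, version) -> rows
    environments: dict = field(default_factory=dict)  # env -> {model: version}
    history: dict = field(default_factory=dict)       # env -> earlier pointer maps

    def add_snapshot(self, model, version, rows):
        self.snapshots[(model, version)] = rows

    def promote(self, env, pointers):
        """Replace the pointer map in one assignment, remembering the previous one."""
        previous = self.environments.get(env, {})
        self.history.setdefault(env, []).append(previous)
        self.environments[env] = {**previous, **pointers}

    def rollback(self, env):
        """Revert to the previous pointer map; no data is rewritten."""
        self.environments[env] = self.history[env].pop()

    def query(self, env, model):
        return self.snapshots[(model, self.environments[env][model])]


catalog = Catalog()
catalog.add_snapshot("orders", "v1", [1, 2])
catalog.promote("prod", {"orders": "v1"})

# v2 is built and backfilled next to v1; prod keeps reading v1 meanwhile.
catalog.add_snapshot("orders", "v2", [100, 200])
print(catalog.query("prod", "orders"))  # [1, 2]

catalog.promote("prod", {"orders": "v2"})
print(catalog.query("prod", "orders"))  # [100, 200]

catalog.rollback("prod")
print(catalog.query("prod", "orders"))  # [1, 2]
```

Because both versions stay in `snapshots`, rollback only touches the pointer map. That property disappears once the old snapshot is deleted.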

Usage

Use production promotion workflows when deploying changes that affect model logic, materialization strategy, or dependencies. Generate plans from feature branches after successful development testing, review plans for breaking changes and backfill requirements, obtain necessary approvals, then apply to production during maintenance windows or through automated CI/CD pipelines. This workflow ensures changes are deliberate, reviewable, and safely rolled out.
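
As an illustration of the review step, a CI pipeline might gate automatic application on what the plan contains. The sketch below is hypothetical: `PlanSummary`, `review_decision`, and the `max_auto_backfill` threshold are assumptions made for this example and are not part of any SQLMesh interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanSummary:
    changed_models: tuple      # every model touched by the plan
    breaking_models: tuple     # subset categorized as breaking
    backfill_intervals: int    # number of intervals the plan would recompute


def review_decision(summary, approvers=(), max_auto_backfill=30):
    """Return 'noop', 'needs_approval', or 'apply' for a plan summary."""
    if not summary.changed_models:
        return "noop"
    # Breaking changes and large backfills are treated as risky (assumed policy).
    risky = bool(summary.breaking_models) or summary.backfill_intervals > max_auto_backfill
    if risky and not approvers:
        return "needs_approval"
    return "apply"


small = PlanSummary(("orders",), (), 3)
breaking = PlanSummary(("orders", "revenue"), ("orders",), 3)

print(review_decision(small))                           # apply
print(review_decision(breaking))                        # needs_approval
print(review_decision(breaking, approvers=("alice",)))  # apply
```

The policy itself is a team decision. The point is only that a plan expressed as data can be checked mechanically before anyone runs it.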

Theoretical Basis

The promotion algorithm uses differential snapshot planning and atomic environment updates:

FUNCTION plan_production_promotion(new_code, start_date, end_date):
    # Compute fingerprint differences
    new_snapshots = compute_snapshots(new_code)
    prod_snapshots = get_environment_snapshots("prod")

    # Categorize changes
    changes = {
        new_models: [],
        modified_models: [],
        removed_models: []
    }

    FOR EACH model IN new_snapshots:
        IF model NOT IN prod_snapshots THEN
            changes.new_models.append(model)
        ELSE IF new_snapshots[model].fingerprint != prod_snapshots[model].fingerprint THEN
            change_type = categorize_change(
                prod_snapshots[model],
                new_snapshots[model]
            )
            changes.modified_models.append({
                model: model,
                type: change_type,  # breaking, non-breaking, forward-only
                requires_backfill: compute_backfill_needed(
                    model,
                    prod_snapshots[model],
                    new_snapshots[model]
                )
            })
        END IF
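    END FOR

    # Models present in production but absent from the new code are removals
    FOR EACH model IN prod_snapshots:
        IF model NOT IN new_snapshots THEN
            changes.removed_models.append(model)
        END IF
    END FOR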

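    # Compute downstream impact
    # Iterate over a copy, because modified_models grows inside the loop,
    # and track membership by model name rather than by change record
    direct_changes = copy(changes.modified_models)
    changed_names = SET OF change.model FOR EACH change IN direct_changes
    FOR EACH change IN direct_changes:
        IF change.type = BREAKING THEN
            downstream = get_all_downstream(change.model)
            FOR EACH downstream_model IN downstream:
                # Downstream models need new snapshots
                IF downstream_model NOT IN changed_names THEN
                    changes.modified_models.append({
                        model: downstream_model,
                        type: INDIRECT_BREAKING,
                        requires_backfill: True
                    })
                    changed_names.add(downstream_model)
                END IF
            END FOR
        END IF
    END FOR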

    # Compute backfill plan
    backfill_plan = compute_minimal_backfill(
        changes.modified_models,
        prod_snapshots,
        start_date,
        end_date
    )

    RETURN Plan(changes, backfill_plan)
END FUNCTION
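
The downstream-impact step of the planning function can be sketched in runnable form as a breadth-first traversal of the dependency graph. This is a minimal sketch: it assumes the graph is given as a mapping from each model to its direct children, and `propagate_breaking` and the category strings are hypothetical names chosen for the example.

```python
from collections import deque


def propagate_breaking(children, direct_changes):
    """Mark everything downstream of a breaking change as indirectly breaking.

    children: {model: [directly downstream models]}
    direct_changes: {model: "breaking" | "non-breaking"}
    Returns a new dict; directly changed models keep their own category.
    """
    result = dict(direct_changes)
    queue = deque(m for m, kind in direct_changes.items() if kind == "breaking")
    visited = set(queue)
    while queue:
        model = queue.popleft()
        for child in children.get(model, []):
            if child in visited:
                continue
            visited.add(child)
            result.setdefault(child, "indirect-breaking")
            queue.append(child)  # keep walking: impact is transitive
    return result


children = {
    "orders": ["revenue", "orders_daily"],
    "revenue": ["dashboard"],
    "customers": ["segments"],
}
direct = {"orders": "breaking", "customers": "non-breaking"}

for model, kind in sorted(propagate_breaking(children, direct).items()):
    print(model, kind)
```

Here `segments` is left out of the result because its only changed parent is non-breaking, while `dashboard` is picked up two hops below `orders`.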

FUNCTION apply_plan(plan):
    # Validate plan is categorized
    IF plan.has_uncategorized_changes THEN
        RAISE error "Cannot apply plan with uncategorized changes"
    END IF

    # Push new snapshots to state
    FOR EACH snapshot IN plan.new_snapshots:
        state_sync.push_snapshot(snapshot)
    END FOR

    # Execute backfill for changed models
    backfill_scheduler.run(
        snapshots=plan.snapshots_requiring_backfill,
        intervals=plan.backfill_intervals
    )

    # Atomically update production environment
    TRANSACTION:
        old_environment = get_environment("prod")
        new_environment = {
            name: "prod",
            snapshots: merge_snapshots(
                old_environment.snapshots,
                plan.new_snapshots
            ),
            plan_id: plan.plan_id,
            previous_plan_id: old_environment.plan_id
        }
        state_sync.update_environment("prod", new_environment)
    END TRANSACTION

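    # Clean up snapshots that no environment references any longer
    # (deferring deletion, e.g. until a retention period expires, keeps rollback possible)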
    cleanup_unreferenced_snapshots()

    RETURN success
END FUNCTION
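
The intervals handed to the backfill step can be kept minimal by subtracting what has already been processed from the requested range. The sketch below assumes daily intervals and a set of already-processed dates per snapshot; `missing_days` and `to_ranges` are hypothetical helpers written for this example.

```python
from datetime import date, timedelta


def missing_days(start, end, processed):
    """Days in [start, end] (inclusive) that are not in the processed set."""
    days = []
    current = start
    while current <= end:
        if current not in processed:
            days.append(current)
        current += timedelta(days=1)
    return days


def to_ranges(days):
    """Collapse a sorted list of days into contiguous (first, last) ranges."""
    ranges = []
    for day in days:
        if ranges and day - ranges[-1][1] == timedelta(days=1):
            ranges[-1] = (ranges[-1][0], day)
        else:
            ranges.append((day, day))
    return ranges


processed = {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5)}
gaps = missing_days(date(2024, 1, 1), date(2024, 1, 7), processed)

for first, last in to_ranges(gaps):
    print(first.isoformat(), "->", last.isoformat())
# 2024-01-03 -> 2024-01-04
# 2024-01-06 -> 2024-01-07
```

Under this model, a snapshot created for a breaking change starts with an empty processed set and therefore needs the full range, whereas a snapshot that can reuse existing data only needs the gaps.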

The key insight is separating intent (plan) from execution (apply), with an atomic environment pointer update that enables instant rollback and multi-version coexistence.

Related Pages

Implemented By
