Heuristic:MaterializeInc Materialize Pipeline Test Trimming Rules
| Knowledge Sources | |
|---|---|
| Domains | CI_CD, Optimization |
| Last Updated | 2026-02-08 21:00 GMT |
Overview
CI optimization technique that skips pipeline steps whose code inputs have not changed relative to the main branch, using mzbuild image dependencies and explicit input globs to determine which tests to run.
Description
The `trim_tests_pipeline()` function in `ci/mkpipeline.py` analyzes each pipeline step to determine if any of its inputs have changed since the merge base with main. Inputs are derived from two sources: explicit glob patterns in the step's `inputs` key, and implicit mzbuild image dependencies discovered via the mzcompose plugin. Additionally, Python module imports are transitively traced using `bin/ci-python-imports` to ensure all dependent code is considered. Steps with unchanged inputs are trimmed (skipped) from the pipeline.
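As a rough illustration of how the two input sources might combine, the sketch below models a step whose effective inputs are the union of its explicit globs and the paths contributed by its image dependencies. The `Step` fields, the `image_inputs` mapping, and all path and image names are hypothetical, chosen for this example only; they are not the structures used in `ci/mkpipeline.py`.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """Hypothetical stand-in for a pipeline step; not the real mkpipeline type."""

    id: str
    # Globs taken from the step's `inputs` key.
    explicit_inputs: set[str] = field(default_factory=set)
    # Names of the images the step depends on (assumed to be known already).
    image_deps: set[str] = field(default_factory=set)

    def inputs(self, image_inputs: dict[str, set[str]]) -> set[str]:
        """Return explicit globs plus the source paths of every image dependency."""
        combined = set(self.explicit_inputs)
        for image in self.image_deps:
            # Images with no recorded inputs contribute nothing.
            combined |= image_inputs.get(image, set())
        return combined


# Illustrative values only: one image built from two source directories.
image_inputs = {"example-image": {"src/example-a", "src/example-b"}}
step = Step("example-test", {"test/example"}, {"example-image"})
print(sorted(step.inputs(image_inputs)))
```

A change under either `src/example-a` or `test/example` would then count as a changed input for this step, which is the behaviour the description attributes to explicit and implicit inputs together.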
Usage
Apply this heuristic when investigating why certain CI steps were skipped on a pull request, or when adding new test steps that need correct input declarations. It underpins the Trim_Tests_Pipeline implementation and is the primary mechanism for reducing CI runtime on pull requests.
The Insight (Rule of Thumb)
- Action: Declare all code dependencies (inputs) for each pipeline step; the system auto-trims steps with unchanged inputs.
- Value: Reduces CI runtime significantly on PRs that only touch a subset of the codebase.
- Trade-off: If inputs are incorrectly declared, tests may be skipped when they should run. Safety net: when CI glue code changes, trimming runs only as a dry run, so no steps are skipped.
Key rules:
- Pipeline trimming is only active for the `test` pipeline on PRs. It is skipped on the `main` branch, on tags, on coverage or sanitizer builds, and when `CI_TEST_IDS` is set.
- If any file in `CI_GLUE_GLOBS` (`bin`, `ci`, `misc/python/materialize/cli/ci_annotate_errors.py`) has changed, trimming runs in dry-run mode only (all steps still execute).
- Transitive step dependencies are preserved: if step B depends on step A and step B has changed inputs, both A and B are retained, even when A's own inputs are unchanged.
- Python imports of mzcompose compositions are automatically traced as implicit inputs.
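The dependency-preservation rule can be sketched as a small graph walk. This is a minimal illustration, assuming step dependencies are available as a mapping from step id to the ids it depends on; the function name `steps_to_keep` and the `depends_on` mapping are hypothetical and do not come from `ci/mkpipeline.py`.

```python
def steps_to_keep(depends_on: dict[str, list[str]], changed: set[str]) -> set[str]:
    """Return the changed steps plus everything they transitively depend on."""
    keep: set[str] = set()
    stack = list(changed)
    while stack:
        step_id = stack.pop()
        if step_id in keep:
            continue
        keep.add(step_id)
        # Walk upstream: a retained step needs its dependencies to run first.
        stack.extend(depends_on.get(step_id, []))
    return keep


# B depends on A; C is independent. Only B has changed inputs.
depends_on = {"A": [], "B": ["A"], "C": []}
print(sorted(steps_to_keep(depends_on, {"B"})))  # ['A', 'B']
```

Step C is dropped because nothing that changed depends on it, while A survives solely because B needs it.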
Reasoning
Running the full test suite on every PR is wasteful when most PRs only touch a small fraction of the codebase. The trimming system provides a safe mechanism by:
- Using Git's diff to detect changed files relative to the merge base
- Leveraging the mzbuild dependency graph to understand which images (and thus tests) are affected by code changes
- Preserving transitive dependencies to avoid broken dependency chains
- Disabling trimming on `main` and release builds where full coverage is essential
- Running a dry-run trim when CI glue code changes, to validate the trimming logic without skipping tests
The `CI_GLUE_GLOBS` safety net is intentionally broad (includes all of `bin/` and `ci/`) because under-declaring glue code dependencies risks silently skipping tests.
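The change-detection idea can be approximated with two Git commands and a path check. The sketch below is an assumption-laden simplification, not the body of `have_paths_changed`: the helper names `changed_files` and `any_path_changed` are hypothetical, the base ref `origin/main` is assumed, and each input is treated as a file or directory prefix rather than a full glob.

```python
import subprocess
from pathlib import PurePosixPath


def changed_files(base: str = "origin/main") -> list[str]:
    """List files that differ between HEAD and its merge base with `base`."""
    merge_base = subprocess.run(
        ["git", "merge-base", "HEAD", base],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    diff = subprocess.run(
        ["git", "diff", "--name-only", merge_base, "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return diff.splitlines()


def any_path_changed(files: list[str], inputs: list[str]) -> bool:
    """True if any file equals an input path or lives beneath it."""
    for name in files:
        path = PurePosixPath(name)
        for entry in inputs:
            target = PurePosixPath(entry)
            # Simplification: prefix match only, no wildcard expansion.
            if path == target or target in path.parents:
                return True
    return False


CI_GLUE_GLOBS = ["bin", "ci", "misc/python/materialize/cli/ci_annotate_errors.py"]
print(any_path_changed(["ci/mkpipeline.py"], CI_GLUE_GLOBS))  # True
print(any_path_changed(["src/example.rs"], CI_GLUE_GLOBS))    # False
```

Because `bin` and `ci` are matched as whole directories, any edit beneath them counts as a glue change, which mirrors the intentionally broad safety net described above.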
Code Evidence
CI glue glob definition from `mkpipeline.py:54-60`:
CI_GLUE_GLOBS = ["bin", "ci", "misc/python/materialize/cli/ci_annotate_errors.py"]
Trimming decision logic from `mkpipeline.py:152-177`:
if args.pipeline == "test" and not os.getenv("CI_TEST_IDS"):
    if args.coverage or args.sanitizer != Sanitizer.none:
        print("Coverage/Sanitizer build, not trimming pipeline")
    elif os.environ["BUILDKITE_BRANCH"] == "main" or os.environ["BUILDKITE_TAG"]:
        print("On main branch or tag, so not trimming pipeline")
    elif have_paths_changed(CI_GLUE_GLOBS):
        print("[DRY RUN] Trimming unchanged steps from pipeline")
        trim_tests_pipeline(copy.deepcopy(pipeline), ...)
    else:
        print("Trimming unchanged steps from pipeline")
        trim_tests_pipeline(pipeline, ...)
Input change detection from `mkpipeline.py:766-777`:
for step in steps.values():
    inputs = step.inputs()
    if not inputs:
        # No inputs = "diff nothing", not "diff everything"
        continue
    if have_paths_changed(inputs):
        changed.add(step.id)