Environment:MaterializeInc Materialize Buildkite CI Runtime
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, CI_CD |
| Last Updated | 2026-02-08 21:00 GMT |
Overview
Buildkite-based CI runtime environment providing agent queues on Hetzner and AWS infrastructure with x86_64 and aarch64 architectures for pipeline generation, Docker image building, and test execution.
Description
This environment defines the CI infrastructure used by the Materialize project. It is centered on Buildkite as the CI orchestrator, with agents distributed across Hetzner bare-metal and AWS EC2 instances. The environment supports dual-architecture builds (x86_64 and aarch64) using a ci-builder Docker image that encapsulates all build dependencies. The bootstrap process (ci/mkpipeline.sh) dynamically generates the pipeline YAML and uploads it via the buildkite-agent CLI.
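As a rough illustration of the generate-then-upload flow described above, the following Python sketch builds a tiny pipeline and hands it to `buildkite-agent pipeline upload` on standard input. It is not the contents of `ci/mkpipeline.sh`: the step names and the helpers `build_pipeline`, `upload_command`, and `upload` are hypothetical, and JSON is used instead of YAML only to keep the sketch dependency-free (the agent also accepts JSON pipelines).

```python
import json
import subprocess


def build_pipeline(test_names: list[str]) -> dict:
    """Build a minimal Buildkite pipeline with one command step per test."""
    return {
        "steps": [
            {"label": name, "key": name, "command": f"echo running {name}"}
            for name in test_names
        ]
    }


def upload_command() -> list[str]:
    # With no file argument, `buildkite-agent pipeline upload` reads the
    # pipeline definition from standard input.
    return ["buildkite-agent", "pipeline", "upload"]


def upload(pipeline: dict, dry_run: bool = True) -> str:
    """Serialize the pipeline and, unless dry_run is set, pass it to the agent."""
    document = json.dumps(pipeline, indent=2)
    if not dry_run:
        # Only meaningful on a machine where buildkite-agent is installed
        # and a job is running.
        subprocess.run(upload_command(), input=document, text=True, check=True)
    return document


# Hypothetical step names, used purely for illustration.
print(upload(build_pipeline(["example-test-a", "example-test-b"])))
```

The `dry_run` default keeps the sketch safe to run outside CI; inside a Buildkite job one would pass `dry_run=False`.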
Usage
Use this environment for all CI pipeline execution, including pipeline generation (mkpipeline), Docker image builds, integration tests, and release processes. It is a prerequisite for running the Mkpipeline_Sh, Mkpipeline_Main, and Trim_Tests_Pipeline implementations.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux (Ubuntu-based) | Both x86_64 and aarch64 architectures |
| Hardware | Hetzner bare-metal or AWS EC2 | Queues range from 2cpu-4gb to dedi-48cpu-192gb |
| Disk | SSD with sufficient space | Docker image builds require significant disk I/O |
| Network | Internet access to GHCR and Docker Hub | Required for image pulls and pushes |
Dependencies
System Packages
- `buildkite-agent` (CLI for pipeline upload, artifact management, metadata)
- `docker` (Docker Engine for image builds and container execution)
- `docker-compose` or `docker compose` (for mzcompose test orchestration)
- `git` (for repository operations and merge-base detection)
- `bash` (for bootstrap scripts)
Python Packages
- `requests` (HTTP client for Docker Hub / GHCR API calls)
- `pyyaml` (YAML parsing for pipeline templates)
- `materialize` (internal Python package, installed via `bin/pyactivate`)
Credentials
The following environment variables must be available in the CI environment:
Buildkite Agent Variables (auto-set by agent):
- `BUILDKITE`: Set to `"true"` when running in Buildkite
- `BUILDKITE_AGENT_META_DATA_AWS_INSTANCE_TYPE`: EC2 instance type metadata
- `BUILDKITE_PULL_REQUEST`: PR number or `"false"`
- `BUILDKITE_BUILD_NUMBER`: Numeric build identifier
- `BUILDKITE_BUILD_ID`: Unique build UUID
- `BUILDKITE_PIPELINE_DEFAULT_BRANCH`: Default branch (typically `main`)
- `BUILDKITE_BRANCH`: Current branch being built
- `BUILDKITE_COMMIT`: Git commit SHA
- `BUILDKITE_TAG`: Git tag (empty string if none)
- `BUILDKITE_BUILD_URL`: URL to the build in Buildkite UI
- `BUILDKITE_JOB_ID`: Unique job identifier
- `BUILDKITE_STEP_KEY`: Step key within the pipeline
- `BUILDKITE_STEP_ID`: Step identifier (same across retries)
- `BUILDKITE_RETRY_COUNT`: Number of retries for current step
- `BUILDKITE_LABEL`: Human-readable step label
- `BUILDKITE_ORGANIZATION_SLUG`: Organization slug for API calls
- `BUILDKITE_PIPELINE_SLUG`: Pipeline slug for API calls
- `BUILDKITE_PARALLEL_JOB`: Parallel job index (0-based)
- `BUILDKITE_PARALLEL_JOB_COUNT`: Total parallel jobs
- `BUILDKITE_BUILD_AUTHOR`: Author of the build trigger
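To show how a script might consume the agent variables listed above, here is a minimal sketch that reads a few of them into a typed record. The `BuildContext` class and `read_build_context` function are hypothetical and not part of the Materialize codebase; the sketch also compares `BUILDKITE` against `"true"` directly, whereas the real code uses `ui.env_is_truthy`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BuildContext:
    in_buildkite: bool
    pull_request: Optional[int]  # None when BUILDKITE_PULL_REQUEST is "false"
    branch: str
    commit: str
    tag: Optional[str]  # None when BUILDKITE_TAG is empty
    retry_count: int
    parallel_job: int  # 0-based index
    parallel_job_count: int


def read_build_context(env: Mapping[str, str] = os.environ) -> BuildContext:
    """Parse selected BUILDKITE_* variables, with defaults for local runs."""
    pr = env.get("BUILDKITE_PULL_REQUEST", "false")
    return BuildContext(
        in_buildkite=env.get("BUILDKITE") == "true",
        pull_request=int(pr) if pr.isdigit() else None,
        branch=env.get("BUILDKITE_BRANCH", ""),
        commit=env.get("BUILDKITE_COMMIT", ""),
        tag=env.get("BUILDKITE_TAG") or None,
        # `or` also covers variables that are set but empty.
        retry_count=int(env.get("BUILDKITE_RETRY_COUNT") or "0"),
        parallel_job=int(env.get("BUILDKITE_PARALLEL_JOB") or "0"),
        parallel_job_count=int(env.get("BUILDKITE_PARALLEL_JOB_COUNT") or "1"),
    )
```

Passing the environment as a parameter makes the parser easy to exercise with a plain dictionary outside of CI.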
CI Control Variables:
- `CI_SANITIZER`: Sanitizer mode (`none`, `address`, `thread`, etc.)
- `CI_PRIORITY`: Pipeline priority override
- `CI_TEST_IDS`: Comma-separated test step IDs for selective execution
- `CI_TEST_SELECTION`: Comma-separated test step names
- `CI_COVERAGE_ENABLED`: Enable code coverage instrumentation
- `CI_FORCE_SWITCH_TO_AWS`: Force all jobs off Hetzner to AWS
- `CI_SYSTEM_PARAMETERS`: Set to `random` for random system parameter permutation
- `CI_RELEASE_LTO_BUILD`: Enable LTO builds for releases
- `MZ_DEV_CI_BUILDER`: Set when running inside the ci-builder container
- `MZ_DEV_CI_BUILDER_ARCH`: Architecture override for ci-builder
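The selective-execution variables above are comma-separated lists. The sketch below shows one plausible way to apply them to a list of pipeline steps. It assumes that `CI_TEST_IDS` matches a step's `id` field and `CI_TEST_SELECTION` matches its `label`; those field names, and the helpers `split_csv` and `select_steps`, are illustrative assumptions rather than the project's actual trimming logic.

```python
import os
from typing import Mapping


def split_csv(value: str) -> set[str]:
    """Split a comma-separated value, ignoring blanks and surrounding spaces."""
    return {item.strip() for item in value.split(",") if item.strip()}


def select_steps(
    steps: list[dict], env: Mapping[str, str] = os.environ
) -> list[dict]:
    """Keep only the steps named by CI_TEST_IDS or CI_TEST_SELECTION."""
    wanted_ids = split_csv(env.get("CI_TEST_IDS", ""))
    wanted_names = split_csv(env.get("CI_TEST_SELECTION", ""))
    if not wanted_ids and not wanted_names:
        # Neither variable is set: run everything.
        return steps
    return [
        step
        for step in steps
        if step.get("id") in wanted_ids or step.get("label") in wanted_names
    ]
```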
Docker Registry Credentials:
- `DOCKERHUB_USERNAME`: DockerHub username for authenticated API calls
- `DOCKERHUB_ACCESS_TOKEN`: DockerHub access token for rate-limit avoidance
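A small sketch of how these two variables might be turned into credentials for an HTTP client. The helper name `dockerhub_auth` is hypothetical, and the sketch only assumes that the target endpoint accepts HTTP basic authentication; it does not reproduce the project's actual registry code.

```python
import os
from typing import Mapping, Optional


def dockerhub_auth(
    env: Mapping[str, str] = os.environ,
) -> Optional[tuple[str, str]]:
    """Return (username, token) if both variables are set, else None."""
    user = env.get("DOCKERHUB_USERNAME")
    token = env.get("DOCKERHUB_ACCESS_TOKEN")
    if user and token:
        return (user, token)
    # Fall back to anonymous access, which is subject to stricter rate limits.
    return None
```

The returned tuple has the shape that `requests` expects for its `auth=` argument, and `None` leaves a request unauthenticated.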
Quick Install
# Bootstrap is handled by the Buildkite agent; ci-builder contains all deps.
# To run locally, use the ci-builder wrapper:
bin/ci-builder run stable bin/pyactivate -m ci.mkpipeline test
Code Evidence
Buildkite environment detection from `buildkite.py:59-61`:
def is_in_buildkite() -> bool:
    return ui.env_is_truthy("BUILDKITE")
Pull request heuristic from `buildkite.py:63-81`:
def is_in_pull_request() -> bool:
    """Note that this is a heuristic."""
    if not is_in_buildkite():
        return False
    if is_pull_request_marker_set():
        return True
    if is_on_default_branch():
        return False
    if git.is_on_release_version():
        return False
    if git.contains_commit("HEAD", "main"):
        return False
    return True
CI-builder entry from `xcompile.py:198-213`:
def _enter_builder(arch: Arch, channel: str | None = None) -> list[str]:
    if "MZ_DEV_CI_BUILDER" in os.environ or sys.platform == "darwin":
        return []
    else:
        default_channel = (
            "stable"
            if Sanitizer[os.getenv("CI_SANITIZER", "none")] == Sanitizer.none
            else "nightly"
        )
        return [
            "env", f"MZ_DEV_CI_BUILDER_ARCH={arch}",
            "bin/ci-builder", "run",
            channel if channel else default_channel,
        ]
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| Agent connection lost (exit_status -1) | Buildkite agent lost connection to the server | Automatic retry (limit 2) is configured via `set_retry_on_agent_lost` |
| Agent stopped (signal_reason: agent_stop) | OS terminated the Buildkite agent | Automatic retry (limit 2) is configured |
| Exit status 128 | Temporary GitHub connection issue during git operations | Automatic retry (limit 2) is configured |
| Exit status 199 | Rust ICE (Internal Compiler Error) | Automatic retry (limit 2); see rust-lang/rust#148581 |
| Annotation body must be less than 1 MB | Buildkite annotation size limit exceeded | Annotations are auto-truncated to 900,000 characters |
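The retry and truncation behaviour in the table above can be sketched as follows. This is an illustration built on Buildkite's `retry.automatic` step attribute (`exit_status`, `signal_reason`, `limit`), not the body of `set_retry_on_agent_lost`; the helper names `with_automatic_retries` and `truncate_annotation`, and the truncation marker text, are assumptions.

```python
MAX_ANNOTATION_CHARS = 900_000
TRUNCATION_MARKER = "\n... (truncated)"


def with_automatic_retries(step: dict, limit: int = 2) -> dict:
    """Return a copy of the step with retry rules for known transient failures."""
    rules = [
        {"exit_status": -1, "limit": limit},  # agent connection lost
        {"signal_reason": "agent_stop", "limit": limit},  # agent stopped
        {"exit_status": 128, "limit": limit},  # transient GitHub/git failure
        {"exit_status": 199, "limit": limit},  # Rust internal compiler error
    ]
    updated = dict(step)
    retry = dict(updated.get("retry", {}))
    # Assumes any existing `automatic` entry is already a list of rules.
    retry["automatic"] = list(retry.get("automatic", [])) + rules
    updated["retry"] = retry
    return updated


def truncate_annotation(body: str, limit: int = MAX_ANNOTATION_CHARS) -> str:
    """Shorten an annotation body so it stays within the character limit."""
    if len(body) <= limit:
        return body
    return body[: limit - len(TRUNCATION_MARKER)] + TRUNCATION_MARKER
```

Truncating to 900,000 characters leaves headroom under the 1 MB limit, since non-ASCII characters occupy more than one byte each.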
Compatibility Notes
- Hetzner aarch64 availability: Hetzner aarch64 agents have known availability issues. The pipeline automatically detects stuck queues (jobs waiting more than 20 minutes) and falls back to x86_64 Hetzner or AWS agents.
- `CI_FORCE_SWITCH_TO_AWS`: Emergency switch that moves all jobs from Hetzner to AWS when Hetzner infrastructure is entirely broken.
- Sanitizer builds: Require the Rust nightly toolchain, automatically escalate agent sizes (e.g., `linux-aarch64-small` → `linux-aarch64`), and multiply timeouts by 10.
- Coverage builds: Multiply timeouts by 3 and disable parallelism for cargo-test.
- macOS cross-compilation: Supported via Homebrew-installed cross-compilation toolchains (lld, materializeinc/crosstools).
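The sanitizer and coverage adjustments listed above can be sketched as a small step transformation. The function `adjust_step` is hypothetical, the escalation table contains only the one mapping named in the notes, and the sketch treats sanitizer and coverage modes as mutually exclusive because the notes do not say how they combine.

```python
# Only the mapping mentioned in the compatibility notes; others are unknown.
QUEUE_ESCALATION = {"linux-aarch64-small": "linux-aarch64"}


def adjust_step(step: dict, sanitizer: str = "none", coverage: bool = False) -> dict:
    """Return a copy of the step with timeout and agent queue adjusted."""
    adjusted = dict(step)
    sanitizing = sanitizer != "none"

    if sanitizing:
        factor = 10
    elif coverage:
        factor = 3
    else:
        factor = 1

    if "timeout_in_minutes" in adjusted:
        adjusted["timeout_in_minutes"] = adjusted["timeout_in_minutes"] * factor

    if sanitizing and "agents" in adjusted:
        # Copy the nested dict so the caller's step is left untouched.
        agents = dict(adjusted["agents"])
        queue = agents.get("queue")
        if queue in QUEUE_ESCALATION:
            agents["queue"] = QUEUE_ESCALATION[queue]
        adjusted["agents"] = agents

    return adjusted
```

For example, a 30-minute step on `linux-aarch64-small` would become a 300-minute step on `linux-aarch64` under a sanitizer build, and a 90-minute step on the original queue under a coverage build.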
Related Pages
- Implementation:MaterializeInc_Materialize_Mkpipeline_Sh
- Implementation:MaterializeInc_Materialize_Mkpipeline_Main
- Implementation:MaterializeInc_Materialize_Trim_Tests_Pipeline
- Implementation:MaterializeInc_Materialize_Pipeline_Template_Step
- Implementation:MaterializeInc_Materialize_Annotate_Logged_Errors