Principle:Vespa engine Vespa Version Preparation

From Leeroopedia


CI_CD Build_Systems

Overview

Version preparation ensures all build artifacts share a consistent version string by replacing snapshot/development versions in Maven POM files with a release version. This is a standard practice in CI/CD pipelines to produce deterministic, versioned builds. In the Vespa build pipeline, version preparation is the first stage executed before any compilation occurs, guaranteeing that every downstream artifact -- JARs, RPMs, and container images -- carries the same version identifier.

Motivation

During development, Maven modules use SNAPSHOT version suffixes to indicate that the code is in flux and not yet released. When a CI/CD pipeline produces a release build, all of these snapshot references must be replaced with the target version number. Without this step, artifacts would carry inconsistent or non-deterministic version strings, making it impossible to trace a deployed binary back to a specific build.

Version preparation addresses three core concerns:

  • Reproducibility: A given version string always corresponds to the same set of source code and build inputs.
  • Traceability: Operators and developers can map a running binary back to its exact build by inspecting the embedded version.
  • Dependency consistency: Internal module references within a multi-module Maven project must all resolve to the same version to avoid classpath conflicts.

How It Works

The version preparation phase operates through two scripts that run sequentially:

Step 1: POM Version Replacement

The script replace-vespa-version-in-poms.sh traverses the entire source tree using find and applies sed substitutions to every pom.xml file. Three distinct patterns are matched and replaced:

| Pattern | Purpose | Example Before | Example After |
|---|---|---|---|
| <version>.*SNAPSHOT.*</version> | Module version | <version>8.999.1-SNAPSHOT</version> | <version>8.432.17</version> |
| <vespaversion>.*project.version.*</vespaversion> | Vespa dependency version | <vespaversion>${project.version}</vespaversion> | <vespaversion>8.432.17</vespaversion> |
| <test-framework.version>.*project.version.*</test-framework.version> | Test framework version | <test-framework.version>${project.version}</test-framework.version> | <test-framework.version>8.432.17</test-framework.version> |

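The script follows symbolic links (find -L) to ensure all POM files are discovered, including those in symlinked module directories. It also handles platform differences by detecting whether it is running on macOS (BSD sed) or Linux (GNU sed), because the two implementations take different arguments for in-place editing: BSD sed requires a backup suffix after -i, even an empty one, while GNU sed does not.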
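The replacement step can be sketched roughly as follows. This is a minimal illustration built from the patterns in the table above, not the contents of replace-vespa-version-in-poms.sh; the function name replace_pom_versions, the Darwin check via uname, and the temporary demo POM are all assumptions made for the example.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail

# Hypothetical helper: rewrite version tags in every pom.xml under $1 to version $2.
replace_pom_versions() {
    local source_dir="$1" version="$2"
    # GNU sed accepts a bare -i; BSD sed (macOS) needs a backup suffix, here empty.
    local -a sed_in_place=(sed -i)
    if [[ "$(uname)" == "Darwin" ]]; then
        sed_in_place=(sed -i '')
    fi
    # -L follows symbolic links so symlinked module directories are included.
    find -L "$source_dir" -name pom.xml -exec "${sed_in_place[@]}" \
        -e "s|<version>.*SNAPSHOT.*</version>|<version>${version}</version>|" \
        -e "s|<vespaversion>.*project.version.*</vespaversion>|<vespaversion>${version}</vespaversion>|" \
        -e "s|<test-framework.version>.*project.version.*</test-framework.version>|<test-framework.version>${version}</test-framework.version>|" \
        {} +
}

# Demo on a throwaway directory so no real checkout is modified.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/pom.xml" <<'EOF'
<project>
  <version>8.999.1-SNAPSHOT</version>
  <properties>
    <vespaversion>${project.version}</vespaversion>
    <test-framework.version>${project.version}</test-framework.version>
  </properties>
</project>
EOF

replace_pom_versions "$demo_dir" "8.432.17"
cat "$demo_dir/pom.xml"
```

The sketch assumes the version string contains no characters that are special in a sed replacement (such as | or &), which holds for dotted numeric versions like 8.432.17.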

Step 2: Artifact Directory Creation

After POM replacement, the prepare.sh script creates the directory structure needed by subsequent build stages:

$WORKDIR/
  artifacts/
    $ARCH/
      rpms/         # Will hold built RPM packages
      maven-repo/   # Will hold Maven repository artifacts

The $ARCH variable (e.g., x86_64 or aarch64) ensures that artifacts for different CPU architectures are kept separate, which is essential for the multi-architecture build matrix.
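A minimal sketch of the directory setup, assuming prepare.sh does little more than create the tree shown above. The fallback values for WORKDIR and ARCH are illustrative only, so that the snippet runs standalone; a real pipeline would export both before invoking the script.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail

# Illustrative fallbacks (assumptions): a temp dir and the host architecture.
WORKDIR="${WORKDIR:-$(mktemp -d)}"
ARCH="${ARCH:-$(uname -m)}"

# -p creates missing parents and succeeds if the directories already exist,
# so the step can be rerun safely.
mkdir -p "$WORKDIR/artifacts/$ARCH/rpms" "$WORKDIR/artifacts/$ARCH/maven-repo"

# Show the resulting layout.
find "$WORKDIR/artifacts" -type d | sort
```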

Environment Variables

| Variable | Description | Example Value |
|---|---|---|
| VESPA_VERSION | The target version string to inject into all POM files | 8.432.17 |
| SOURCE_DIR | Root directory of the Vespa source checkout | /vespa |
| WORKDIR | Working directory for build outputs and intermediate files | /tmp/vespa-build |
| ARCH | CPU architecture label for artifact partitioning | x86_64 |
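Since every later step depends on these four variables, a script can check them up front. The sketch below is one possible way to do that; require_env is a hypothetical helper, not something the Vespa scripts are known to define, and the exported values are the example values from the table, set here only so the snippet runs on its own.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail

# Hypothetical helper: fail and name the first variable that is unset or empty.
require_env() {
    local name
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion with an empty default,
        # which stays safe under nounset.
        if [[ -z "${!name:-}" ]]; then
            echo "missing required variable: $name" >&2
            return 1
        fi
    done
}

# Example values only; a CI job would normally provide these.
export VESPA_VERSION="8.432.17" SOURCE_DIR="/vespa" WORKDIR="/tmp/vespa-build" ARCH="x86_64"

require_env VESPA_VERSION SOURCE_DIR WORKDIR ARCH
echo "preparing version $VESPA_VERSION for $ARCH"
```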

Design Considerations

In-place modification vs. copy-on-write: The Vespa pipeline modifies POM files in-place using sed -i. This is acceptable because CI/CD builds operate on a fresh checkout that is discarded after the build. An alternative approach would be to copy the source tree and modify the copy, but this would double disk usage for a large repository like Vespa (thousands of POM files).

Fail-fast behavior: Both scripts use set -o errexit, set -o nounset, and set -o pipefail to terminate immediately on any error. This is critical for version preparation because a partial replacement could produce a build with mixed versions -- a failure mode that would be difficult to diagnose downstream.
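The effect of the three options can be observed in isolation. The following sketch runs each probe in a child bash process so that the deliberate failures do not stop the demo itself; the variable names are arbitrary and chosen for this example.

```shell
#!/usr/bin/env bash

# errexit: the child stops at the first failing command, so "unreachable" is never printed.
errexit_out="$(bash -c 'set -o errexit; false; echo unreachable' || true)"

# nounset: expanding an unset variable becomes an error instead of an empty string.
nounset_status=0
bash -c 'set -o nounset; echo "$UNDEFINED_VARIABLE_FOR_DEMO"' 2>/dev/null || nounset_status=$?

# pipefail: a pipeline fails if any stage fails, not only the last one.
without_pipefail=0
bash -c 'false | cat' || without_pipefail=$?
with_pipefail=0
bash -c 'set -o pipefail; false | cat' || with_pipefail=$?

echo "errexit output: '${errexit_out}'"
echo "nounset exit status: ${nounset_status}"
echo "pipeline status without/with pipefail: ${without_pipefail}/${with_pipefail}"
```

Without pipefail, a failing find feeding a later pipeline stage would go unnoticed, which is the kind of silent partial replacement the paragraph above warns about.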

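Idempotency: The sed patterns are designed so that running the replacement twice with the same version produces the same result. The first run replaces every version string containing "SNAPSHOT" and every ${project.version} placeholder with a literal version string. After that, none of the three patterns matches anything, so a second run leaves the files unchanged. A consequence is that a rerun with a different version also changes nothing; switching versions requires a fresh checkout.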
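The idempotency property can be checked on a single line of input. This sketch applies the module-version substitution from the table twice through a pipe, without touching any files; the variable names are assumptions for the example.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset -o pipefail

version="8.432.17"
sed_expr="s|<version>.*SNAPSHOT.*</version>|<version>${version}</version>|"

# First pass: the SNAPSHOT version is replaced by the literal release version.
first_pass="$(printf '%s\n' '<version>8.999.1-SNAPSHOT</version>' | sed -e "$sed_expr")"

# Second pass: the pattern no longer matches, so the line passes through unchanged.
second_pass="$(printf '%s\n' "$first_pass" | sed -e "$sed_expr")"

echo "first pass:  $first_pass"
echo "second pass: $second_pass"
```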

Relationship to Other Build Stages

Version preparation must complete before any other build stage begins. The pipeline dependency graph is:

Version Preparation
  |
  +-- Java Bootstrap and Maven Build
  |
  +-- CMake Configuration --> C++ Compilation
  |
  +-- RPM Package Creation --> Container Image Building --> Artifact Signing and Publishing
