Principle:Vespa engine Vespa Java Bootstrap and Maven Build

From Leeroopedia


CI_CD Build_Systems

Overview

The Java bootstrap and Maven build stage compiles all Java modules, resolves dependencies from Maven Central, and produces JAR artifacts. The bootstrap phase sets up the Maven wrapper and builds the dependency-version POMs first; the main build then compiles the remaining modules with parallel threads. In the Vespa build pipeline, this stage produces the Java artifacts that are later packaged into RPMs and container images.

Motivation

Vespa is a large-scale distributed system with hundreds of Java modules organized in a multi-module Maven project. Compiling this codebase requires careful orchestration:

  • Plugin resolution order: Maven cannot resolve references to a plugin if the same reactor build also builds that plugin. Therefore, plugins must be built first in a separate pass.
  • Dependency hierarchy: Parent POMs and dependency-version POMs must be installed into the local repository before child modules can resolve their dependencies.
  • Build performance: With hundreds of modules, sequential compilation would be prohibitively slow. Parallel thread pools are essential to achieve reasonable build times.
  • C++ test support: Some C++ unit tests require Java JAR files. The bootstrap phase collects these JARs into a dedicated directory.

How It Works

The Java build proceeds in two distinct phases: bootstrap and main build.

Phase 1: Bootstrap

The bootstrap phase is handled by the root-level bootstrap.sh script, which supports multiple modes:

| Mode | Description | When Used |
|------|-------------|-----------|
| wrapper | Only set up the Maven wrapper | Minimal setup |
| java | Build only Maven plugins | Plugin-only builds |
| default | Build plugins and minimal modules needed by CMake | Standard C++ builds |
| full | Build plugins and all modules needed by C++ tests | Full CI builds |
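
The mode table can be read as a simple dispatch on the first script argument. The following is a minimal sketch of that idea, not the actual contents of bootstrap.sh: the function name steps_for_mode and the step labels are hypothetical, chosen only to mirror the table rows.

```shell
# Hypothetical sketch: map a bootstrap mode to the work it implies.
# The step labels below are illustrative and mirror the mode table;
# they are not identifiers taken from bootstrap.sh.
steps_for_mode() {
    mode="${1:-default}"
    case "$mode" in
        wrapper) echo "wrapper" ;;
        java)    echo "plugins" ;;
        default) echo "plugins cmake-modules" ;;
        full)    echo "plugins cpp-test-modules" ;;
        *)       echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

steps_for_mode full
```

Rejecting unknown modes with a non-zero exit status is a design choice of this sketch; it keeps a typo in a CI configuration from silently falling back to a smaller build.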

The bootstrap sequence for the full mode is:

  1. Maven wrapper setup: Installs the Maven wrapper (mvnw) using Maven 3.9.12. This ensures all builds use the same Maven version regardless of what is installed on the build host.
  2. Parent POM installation: Builds and installs dependency-versions, container-dependency-versions, and parent POMs.
  3. Root POM installation: Installs the root POM (-N flag for non-recursive).
  4. Plugin build: Builds all custom Maven plugins under maven-plugins/.
  5. C++ test dependencies: Builds jrt, linguistics, and messagebus modules with tests and Javadoc skipped for speed.
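
The ordering above can be sketched as a dry run that only prints the planned invocations. This is an illustration under stated assumptions: the function name bootstrap_plan_full is hypothetical, the pom.xml paths assume each listed module lives in a top-level directory of the same name, and the skip properties are the standard Maven ones rather than flags confirmed from bootstrap.sh.

```shell
# Dry-run sketch of the "full" bootstrap order. Nothing is built; each
# step is printed in the order it would run. Paths and flags are
# assumptions made for illustration.
bootstrap_plan_full() {
    echo "setup maven wrapper"
    # Dependency-version and parent POMs go first so later modules can resolve them.
    for pom_dir in dependency-versions container-dependency-versions parent; do
        echo "mvn_install -f $pom_dir/pom.xml"
    done
    # -N (non-recursive) installs only the root POM, not its modules.
    echo "mvn_install -N"
    echo "mvn_install -f maven-plugins/pom.xml"
    # Modules needed by C++ tests, with tests and Javadoc skipped for speed.
    for module in jrt linguistics messagebus; do
        echo "mvn_install -f $module/pom.xml -Dmaven.test.skip=true -Dmaven.javadoc.skip=true"
    done
}

bootstrap_plan_full
```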

Phase 2: Main Maven Build

The main build is handled by .buildkite/java.sh, which invokes the Maven wrapper with parallel threads:

./mvnw -T "$NUM_MVN_THREADS" "${MVN_EXTRA_OPTS[@]}" "$VESPA_MAVEN_TARGET"

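The -T flag sets the number of threads used by Maven's parallel builder. Maven schedules modules across those threads according to the reactor's dependency graph, so a module starts building only after the modules it depends on have finished. Modules with no dependency on each other can build concurrently.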
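
A build host may not always set NUM_MVN_THREADS. The sketch below shows one way a wrapper script could choose a value; the helper name default_mvn_threads and the fallback to the CPU count are assumptions for illustration, not behavior taken from .buildkite/java.sh.

```shell
# Sketch: pick a thread count for -T when NUM_MVN_THREADS is not set.
# Falling back to the CPU count is an assumption, not Vespa's documented default.
default_mvn_threads() {
    if [ -n "${NUM_MVN_THREADS:-}" ]; then
        echo "$NUM_MVN_THREADS"
    elif command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

threads="$(default_mvn_threads)"
# Print the command instead of running it, so the sketch has no side effects.
echo ./mvnw -T "$threads" install
```

Maven also accepts a per-core multiplier such as -T 1C, which avoids computing the count in the script at all.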

The mvn_install Function

Both phases use a shared mvn_install function that provides consistent Maven invocation:

mvn_install() {
    ${MAVEN_CMD} --batch-mode --no-snapshot-updates \
        -Dmaven.wagon.http.retryHandler.count=5 \
        clean "${MAVEN_TARGET}" ${MAVEN_EXTRA_OPTS} "$@"
}

Key flags:

  • --batch-mode: Disables interactive prompts and produces output suitable for CI logs.
  • --no-snapshot-updates: Prevents Maven from checking remote repositories for updated SNAPSHOT artifacts, since the version preparation stage already replaced all SNAPSHOTs.
  • -Dmaven.wagon.http.retryHandler.count=5: Retries failed HTTP requests up to 5 times, improving resilience against transient network failures when downloading dependencies.
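
To see how these flags and the caller's arguments combine, the function can be exercised in a dry run. In this sketch MAVEN_CMD is set to echo so the final command line is printed rather than executed; the values assigned to MAVEN_TARGET and MAVEN_EXTRA_OPTS are example values, not defaults confirmed from the script.

```shell
# Dry run: MAVEN_CMD is "echo", so the assembled command line is printed
# instead of invoking Maven. The variable values are examples only.
MAVEN_CMD="echo"
MAVEN_TARGET="install"
MAVEN_EXTRA_OPTS="-Dmaven.test.skip=true"

mvn_install() {
    # MAVEN_EXTRA_OPTS is deliberately unquoted so that it splits into
    # separate options; "$@" appends whatever the caller passes.
    ${MAVEN_CMD} --batch-mode --no-snapshot-updates \
        -Dmaven.wagon.http.retryHandler.count=5 \
        clean "${MAVEN_TARGET}" ${MAVEN_EXTRA_OPTS} "$@"
}

mvn_install -N
```

The caller-supplied -N ends up last on the command line, after the shared flags and the extra options.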

Environment Variables

| Variable | Description | Example Value |
|----------|-------------|---------------|
| SOURCE_DIR | Root directory of the Vespa source checkout | /vespa |
| VESPA_CPP_TEST_JARS | Directory where JAR files for C++ tests are collected | /tmp/vespa-build/test-jars |
| NUM_MVN_THREADS | Number of parallel Maven threads | 4 |
| VESPA_MAVEN_TARGET | Maven lifecycle target to execute | install |
| VESPA_MAVEN_EXTRA_OPTS | Additional Maven options | -Dmaven.test.skip=true |
| VESPA_MAVEN_COMMAND | Override for the Maven command (defaults to ./mvnw) | ./mvnw |
| MAVEN_OPTS | JVM options for the Maven process | -Xms256m -Xmx2g |
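
One plausible way for a script to consume these variables is shell parameter expansion with fallbacks. The sketch below is an assumption about how that wiring could look: the function name load_build_env is hypothetical, and apart from the ./mvnw default stated in the table, the fallback values are the table's example values rather than confirmed defaults.

```shell
# Sketch: derive the internal variables from the VESPA_* overrides.
# Only the ./mvnw default comes from the table; the other fallbacks
# reuse the table's example values and are assumptions.
load_build_env() {
    MAVEN_CMD="${VESPA_MAVEN_COMMAND:-./mvnw}"
    MAVEN_TARGET="${VESPA_MAVEN_TARGET:-install}"
    MAVEN_EXTRA_OPTS="${VESPA_MAVEN_EXTRA_OPTS:-}"
    NUM_MVN_THREADS="${NUM_MVN_THREADS:-4}"
}

load_build_env
echo "command=$MAVEN_CMD target=$MAVEN_TARGET threads=$NUM_MVN_THREADS"
```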

Design Considerations

Two-phase build: The separation into bootstrap and main build is a deliberate design choice forced by Maven's inability to resolve references to a plugin that is built in the same reactor build. This limitation affects projects that both build and use their own custom plugins.

GCC toolset activation: The Java build sources /etc/profile.d/enable-gcc-toolset.sh before running Maven. This is because some Java modules include JNI (Java Native Interface) code that requires a C++ compiler. The GCC toolset ensures that the correct compiler version is available.
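
A guarded version of that sourcing step might look like the following sketch. The function name enable_gcc_toolset and the decision to fail when the script is missing are assumptions; the real build may simply source the file unconditionally.

```shell
# Sketch: source the GCC toolset profile script if it is readable.
# Returning an error when it is missing is an assumption of this sketch.
enable_gcc_toolset() {
    script="${1:-/etc/profile.d/enable-gcc-toolset.sh}"
    if [ -r "$script" ]; then
        # Source into the current shell so PATH changes persist for Maven.
        . "$script"
    else
        echo "gcc toolset script not found: $script" >&2
        return 1
    fi
}
```

Sourcing, rather than executing, matters here: the script has to modify the environment of the shell that later launches Maven.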

JAR collection for C++ tests: The bootstrap script uses a find-and-xargs pipeline to copy all JAR files from Maven target/ directories into a single flat directory. This allows C++ test binaries to locate Java dependencies without understanding the Maven module structure.
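
A find-and-xargs pipeline of this kind can be sketched as follows. The function name collect_test_jars and the exact find predicates are assumptions, not the pipeline used in bootstrap.sh; the -print0 and -0 options assume GNU or BSD versions of find and xargs.

```shell
# Sketch: copy every .jar found beneath a target/ directory into one
# flat directory. Predicates are illustrative; files with the same
# name overwrite each other in the destination.
collect_test_jars() {
    src_root="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    # NUL-separated names keep paths with spaces intact.
    find "$src_root" -type f -path '*/target/*.jar' -print0 \
        | xargs -0 -I{} cp {} "$dest_dir"
}
```

It could be called as collect_test_jars "$SOURCE_DIR" "$VESPA_CPP_TEST_JARS" once the bootstrap modules have been built.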

Relationship to Other Build Stages

The Java bootstrap and Maven build stage depends on version preparation completing first. Its outputs feed into:

  • C++ Compilation: C++ tests require JAR files collected during bootstrap.
  • RPM Package Creation: RPM packages include compiled Java artifacts.
  • Container Image Building: Container images include the Maven local repository.
