

Principle:Vespa engine Vespa CMake Configuration

From Leeroopedia


CI_CD Build_Systems

Overview

CMake configuration generates a platform-specific build system from declarative CMakeLists.txt files. It resolves compiler toolchains, library dependencies, and build options. In the Vespa CI/CD pipeline, CMake configuration is the bridge between the Java bootstrap phase and C++ compilation, translating high-level build declarations into Makefiles that make can execute.

Motivation

Vespa's C++ codebase consists of hundreds of libraries and executables that depend on both internal and external libraries. Hand-written Makefiles for a project of this scale would be unmaintainable. CMake provides:

  • Declarative dependency management: Each module declares its dependencies in CMakeLists.txt, and CMake resolves the full transitive dependency graph.
  • Platform abstraction: CMake detects the compiler, linker, and system libraries automatically, allowing the same build definitions to work across different Linux distributions.
  • Build option configuration: Options such as sanitizer support, ccache integration, and Valgrind testing can be toggled through CMake variables without modifying build files.

How It Works

CMake Invocation

The CMake configuration step is a single invocation that processes the top-level CMakeLists.txt and, through it, the CMakeLists.txt of every subdirectory the project adds:

cmake -DVESPA_UNPRIVILEGED=no \
      -DVALGRIND_UNIT_TESTS="$VALGRIND_UNIT_TESTS" \
      "$VESPA_CMAKE_SANITIZERS_OPTION" \
      "$VESPA_CMAKE_CCACHE_OPTION" \
      "$SOURCE_DIR"

This invocation runs from a separate build directory (out-of-source build), keeping generated files separate from the source tree.
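
The out-of-source pattern can be sketched as a small shell helper. This is an illustration under stated assumptions, not the pipeline's actual script: the function name configure_out_of_source and the CMAKE override variable are hypothetical, and the build directory path is chosen by the caller.

```shell
# Hypothetical helper: configure a source tree from a separate build directory.
# CMAKE is an assumed override hook; it defaults to the real cmake binary.
configure_out_of_source() (
    source_dir="$1"
    build_dir="$2"
    shift 2
    mkdir -p "$build_dir"
    # The function body is a subshell, so this cd does not affect the caller.
    cd "$build_dir" || exit 1
    # Remaining arguments (for example -D options) are passed through unchanged.
    "${CMAKE:-cmake}" "$@" "$source_dir"
)
```

A call such as configure_out_of_source "$SOURCE_DIR" /tmp/vespa-build -DVESPA_UNPRIVILEGED=no would leave all generated files under /tmp/vespa-build (an example path) and the source tree untouched.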

Key CMake Variables

Variable | Type | Description | Default
VESPA_UNPRIVILEGED | Boolean | Whether to build for unprivileged (non-root) installation | no
VALGRIND_UNIT_TESTS | Boolean | Whether to run unit tests under Valgrind for memory checking | true (unless PR or sanitizer build)
VESPA_USE_SANITIZER | String | Address/thread/undefined behavior sanitizer to enable | null (none)
VESPA_USE_CCACHE | Boolean | Whether to use ccache for compilation caching | true (unless sanitizer is active)

Sanitizer and Ccache Interaction

When a sanitizer is enabled, ccache is automatically disabled. Sanitizer-instrumented object files differ from those of a regular build, so keeping them out of the cache avoids filling it with entries that regular builds cannot reuse. The configuration script applies two rules:

  1. If VESPA_USE_SANITIZER is set to a value other than null, enable the sanitizer CMake option and force ccache off.
  2. If the build is for a pull request (BUILDKITE_PULL_REQUEST != "false"), disable Valgrind tests to speed up the PR validation cycle.
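
The two rules above, combined with the defaults from the variable table, can be sketched as follows. This is a minimal sketch, not the actual build script: the function name select_cmake_options and the exact -D spellings it prints are assumptions; only the variable names and their defaults come from this article.

```shell
# Sketch only: prints one CMake option per line.
# Assumed defaults: no sanitizer, not a PR build, Valgrind on, ccache on.
select_cmake_options() (
    sanitizer="${VESPA_USE_SANITIZER:-null}"
    pull_request="${BUILDKITE_PULL_REQUEST:-false}"
    valgrind="${VALGRIND_UNIT_TESTS:-true}"
    ccache=true

    if [ "$sanitizer" != "null" ]; then
        # Sanitizer build: pass the sanitizer through, force ccache off,
        # and skip Valgrind.
        ccache=false
        valgrind=false
        printf '%s\n' "-DVESPA_USE_SANITIZER=$sanitizer"
    fi
    if [ "$pull_request" != "false" ]; then
        # Pull request build: skip Valgrind for faster feedback.
        valgrind=false
    fi
    printf '%s\n' "-DVESPA_USE_CCACHE=$ccache"
    printf '%s\n' "-DVALGRIND_UNIT_TESTS=$valgrind"
)
```

Printing one option per line keeps the decision logic separate from the cmake call itself, and avoids passing an empty argument to cmake when no sanitizer is selected.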

Valgrind Unit Tests

Valgrind is a memory error detector that catches use-after-free, buffer overflows, and memory leaks at runtime. Enabling Valgrind for unit tests significantly increases test execution time (typically 10-50x slower), so it is conditionally disabled for:

  • Pull request builds: Fast feedback is more valuable than exhaustive memory checking.
  • Sanitizer builds: Address Sanitizer (ASan) detects much of the same class of memory errors with far less overhead, and Thread Sanitizer (TSan) targets data races rather than memory errors, so running Valgrind on top of either adds little.

Environment Variables

Variable | Description | Example Value
VESPA_USE_SANITIZER | Sanitizer type to enable, or null to disable | address, thread, or null
BUILDKITE_PULL_REQUEST | Pull request number, or "false" if not a PR build | 35801 or false
VALGRIND_UNIT_TESTS | Whether to enable Valgrind for unit tests | true or false
SOURCE_DIR | Root directory of the Vespa source checkout | /vespa
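
A pre-flight check over these variables might look like the sketch below. The function validate_build_env is hypothetical, and its strictness (requiring every variable to be set, and SOURCE_DIR to contain a top-level CMakeLists.txt) is an assumption made for illustration; the real pipeline may rely on defaults instead.

```shell
# Hypothetical pre-flight check for the environment variables listed above.
# Returns 0 when all values look plausible, 1 otherwise.
validate_build_env() (
    errors=0
    if [ -z "${VESPA_USE_SANITIZER:-}" ]; then
        echo "VESPA_USE_SANITIZER must be set (use null to disable)" >&2
        errors=1
    fi
    case "${BUILDKITE_PULL_REQUEST:-}" in
        false) ;;                       # not a pull request build
        ''|*[!0-9]*)                    # empty, or contains a non-digit
            echo "BUILDKITE_PULL_REQUEST must be a PR number or false" >&2
            errors=1 ;;
    esac
    case "${VALGRIND_UNIT_TESTS:-}" in
        true|false) ;;
        *)
            echo "VALGRIND_UNIT_TESTS must be true or false" >&2
            errors=1 ;;
    esac
    if [ ! -f "${SOURCE_DIR:-}/CMakeLists.txt" ]; then
        echo "SOURCE_DIR must contain a top-level CMakeLists.txt" >&2
        errors=1
    fi
    exit "$errors"
)
```

Running such a check before cmake surfaces a mistyped variable as a one-line message rather than as a configuration failure deep in CMake's output.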

Design Considerations

Out-of-source builds: CMake is invoked from a build directory that is separate from the source tree. This keeps generated Makefiles, object files, and binaries out of the source directory, allowing the same source tree to support multiple build configurations simultaneously.

GCC toolset activation: The configuration script sources /etc/profile.d/enable-gcc-toolset.sh to ensure the correct GCC version is available. Vespa requires C++20 support, which necessitates GCC 12 or later (or Clang 16+).

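Dependency path setup: The script prepends /opt/vespa-deps/bin to the PATH. This directory contains pre-built Vespa dependency binaries (e.g., protobuf, abseil, gRPC) that CMake's find_package must locate during configuration. Because find_package also treats the parent of any PATH entry ending in /bin as a search prefix, this makes /opt/vespa-deps visible to CMake as well.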
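
The toolset activation and PATH setup described above can be sketched as one function. The name setup_toolchain_env, the readability guard around the toolset script, and the duplicate check on PATH are assumptions added for illustration; the two default paths are the ones named in this section.

```shell
# Hypothetical sketch of the environment setup that precedes cmake.
# Both arguments are optional and default to the paths named in the text.
setup_toolchain_env() {
    toolset_script="${1:-/etc/profile.d/enable-gcc-toolset.sh}"
    deps_bin="${2:-/opt/vespa-deps/bin}"

    # Source the toolset script only if it exists and is readable (assumed guard).
    if [ -r "$toolset_script" ]; then
        . "$toolset_script"
    fi

    # Prepend the dependency directory unless PATH already contains it.
    case ":$PATH:" in
        *":$deps_bin:"*) ;;
        *) PATH="$deps_bin:$PATH" ;;
    esac
    export PATH
}
```

The function has to run in the current shell rather than a subshell, since both the sourced toolset settings and the PATH change must be visible to the later cmake call.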

What CMake Produces

After configuration completes, the build directory contains:

  • Makefiles: Generated build rules for every target in the project.
  • CMakeCache.txt: A cache file recording all resolved variables and paths.
  • cmake_install.cmake: Installation rules for make install.
  • config.h / vespa-config.h: Generated header files with platform-specific defines.

These files are consumed by the subsequent C++ compilation stage, which runs make -j against the generated Makefiles.
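
To confirm what configuration actually resolved, a value can be read back from CMakeCache.txt, whose entries have the form NAME:TYPE=VALUE. The helper below is a minimal sketch with a hypothetical name, cmake_cache_value, and it assumes the variable name contains no regular-expression metacharacters.

```shell
# Hypothetical helper: print the value of one entry from a CMakeCache.txt file.
# Prints nothing if the entry is absent.
cmake_cache_value() {
    cache_file="$1"
    name="$2"
    # Strip the "NAME:TYPE=" prefix and print only the first matching line.
    # Comment lines (starting with # or //) never match this pattern.
    sed -n "s/^${name}:[^=]*=//p" "$cache_file" | head -n 1
}
```

For example, cmake_cache_value CMakeCache.txt VALGRIND_UNIT_TESTS, run in the build directory, would show whether Valgrind was enabled for this configuration. CMake can also list cached variables directly with cmake -N -L run in the build directory.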

Relationship to Other Build Stages

CMake configuration depends on the Java bootstrap completing first (because some CMakeLists.txt files reference Java JAR paths). Its output feeds directly into C++ compilation:

Java Bootstrap --> CMake Configuration --> C++ Compilation
                                            |
                                            v
                                     RPM Package Creation
