Implementation:Vespa engine Vespa Cpp Sh
Overview
This page documents the implementation of the Vespa C++ compilation script: .buildkite/cpp.sh. The script optionally reports resource limits, activates the GCC toolset, prepends the Vespa dependency directory to PATH, and runs make with a configurable number of parallel jobs to compile all C++ components of Vespa.
Type: External Tool Doc
Code Reference
#!/usr/bin/env bash
# .buildkite/cpp.sh (L1-29)
set -o errexit
set -o nounset
set -o pipefail
if [[ -n "${DEBUG:-}" ]]; then
    set -o xtrace
fi
mydir=${0%/*}
shlim=${mydir}/show-limits.sh
if [ -x "${shlim}" ]; then
    "${shlim}" || echo "failed: ${shlim}"
fi
echo "--- Building C++ components"
# shellcheck disable=1091
source /etc/profile.d/enable-gcc-toolset.sh
PATH=/opt/vespa-deps/bin:$PATH
cd "$SOURCE_DIR"
echo "Running make with $NUM_CPP_THREADS threads..."
make -j "$NUM_CPP_THREADS"
I/O Contract
Inputs (Environment Variables)
| Variable | Required | Description | Example |
|---|---|---|---|
| NUM_CPP_THREADS | Yes | Number of parallel compilation threads for make -j | 16 |
| SOURCE_DIR | Yes | Root directory of the Vespa source checkout | /vespa |
| DEBUG | No | If set to a non-empty value, enables bash xtrace for debugging | 1 |
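As a minimal sketch of how a caller might satisfy this contract, the snippet below checks the required variables before invoking the script. The function name require_env is hypothetical and is not part of cpp.sh; the script itself relies on set -o nounset to abort when a variable is unset.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of .buildkite/cpp.sh.
# Returns 0 if every named variable is set to a non-empty value, 1 otherwise.
require_env() {
    local name missing=0
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion: the value of the variable named by $name.
        if [[ -z "${!name:-}" ]]; then
            echo "missing required variable: ${name}" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example invocation (values taken from the table above):
#   export SOURCE_DIR=/vespa NUM_CPP_THREADS=16
#   require_env SOURCE_DIR NUM_CPP_THREADS && .buildkite/cpp.sh
```

A check like this reports every missing variable at once, whereas nounset stops at the first unset variable the script happens to expand.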
Inputs (System Dependencies)
| Dependency | Path | Purpose |
|---|---|---|
| GCC Toolset | /etc/profile.d/enable-gcc-toolset.sh | Activates GCC 12+ with C++20 support |
| Vespa Dependencies | /opt/vespa-deps/bin | Pre-built third-party libraries (protobuf, abseil, gRPC, etc.) |
| Generated Makefiles | Current working directory | Output from the CMake configuration stage |
| show-limits.sh | .buildkite/show-limits.sh | Optional resource limit reporter |
Inputs (Prerequisite Build Artifacts)
| Artifact | Produced By | Purpose |
|---|---|---|
| Makefiles and CMakeCache.txt | CMake configuration stage | Build rules for all C++ targets |
| Java JAR files | Java bootstrap stage | JNI headers and test dependencies |
Outputs (Compiled Artifacts)
| Output | Type | Description |
|---|---|---|
| *.so files | Shared libraries | Vespa's native libraries (search core, document storage, etc.) |
| Executable binaries | Executables | Server processes and CLI utilities |
| *.a files | Static libraries | Libraries statically linked into other targets |
| Test binaries | Executables | Unit test programs for subsequent test stages |
Key Implementation Details
Resource Limit Reporting
Before starting compilation, the script optionally runs show-limits.sh:
mydir=${0%/*}
shlim=${mydir}/show-limits.sh
if [ -x "${shlim}" ]; then
    "${shlim}" || echo "failed: ${shlim}"
fi
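The ${0%/*} expansion strips the final path component from $0, leaving the directory part of the path used to invoke the script. If show-limits.sh exists in that directory and is executable, it runs and logs system resource limits (ulimits, memory, file descriptors). If it fails, the failure is logged and the build continues; the step is purely diagnostic. Note that if the script is invoked without a slash in its path (for example, bash cpp.sh from inside .buildkite), the expansion leaves $0 unchanged, the -x test fails, and the step is silently skipped.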
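The behaviour of this parameter expansion can be illustrated on literal paths. The helper name dir_of below is hypothetical and exists only for the demonstration; cpp.sh applies the expansion directly to $0.

```shell
#!/usr/bin/env bash
# Illustration only: applies the same ${var%/*} expansion to literal paths.
# dir_of is a hypothetical helper, not part of cpp.sh.
dir_of() {
    local path=$1
    # %/* removes the shortest trailing match of "/*", i.e. the last path component.
    echo "${path%/*}"
}

dir_of ".buildkite/cpp.sh"        # prints .buildkite
dir_of "/vespa/.buildkite/cpp.sh" # prints /vespa/.buildkite
dir_of "cpp.sh"                   # prints cpp.sh (no slash, nothing is removed)
```

The last case shows why the -x test matters: with no slash in the path, the computed location of show-limits.sh is not a real file, so the diagnostic step is skipped rather than failing the build.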
GCC Toolset Activation
source /etc/profile.d/enable-gcc-toolset.sh
This script is provided by the build container and activates a modern GCC version (12+) that supports C++20. Without this, the system default compiler (often GCC 8 on CentOS/AlmaLinux 8) would be used, which lacks full C++20 support.
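cpp.sh does not verify which compiler ends up active after sourcing the toolset script. As a hedged sketch, a caller could add a sanity check along the following lines; version_major_at_least is a hypothetical helper, and the threshold of 12 is taken from the description above rather than from the script.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check; cpp.sh itself does not verify the compiler version.
# Succeeds if the major component of version string $1 is >= $2.
version_major_at_least() {
    local version=$1 minimum=$2
    local major=${version%%.*}   # keep everything before the first dot
    [[ "$major" =~ ^[0-9]+$ ]] && (( major >= minimum ))
}

# Example use after sourcing the toolset:
#   source /etc/profile.d/enable-gcc-toolset.sh
#   version_major_at_least "$(gcc -dumpversion)" 12 \
#       || echo "unexpected GCC version: $(gcc -dumpversion)"
```

Depending on how GCC was configured, gcc -dumpversion prints either the major version alone or a full dotted version, which is why the helper only inspects the part before the first dot.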
Dependency PATH Setup
PATH=/opt/vespa-deps/bin:$PATH
Pre-built Vespa dependencies are installed under /opt/vespa-deps/. Adding this to the PATH ensures that tools like protoc (Protocol Buffer compiler) and other code generators are found during compilation.
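The effect of prepending a directory to PATH can be shown with a throwaway directory. This is an illustration only: the tool name mytool and the temporary directory are made up for the demo and have nothing to do with the real contents of /opt/vespa-deps/bin.

```shell
#!/usr/bin/env bash
# Illustration only: shows that a prepended directory is searched first.
# The tool name "mytool" and the temporary directory are made up for this demo.
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "from demo dir"\n' > "${demo_dir}/mytool"
chmod +x "${demo_dir}/mytool"

PATH=${demo_dir}:$PATH      # same form as PATH=/opt/vespa-deps/bin:$PATH
resolved=$(command -v mytool)
echo "mytool resolves to: ${resolved}"

rm -rf "${demo_dir}"        # clean up the demo directory
```

On a build host, running command -v protoc after the PATH line in cpp.sh would be one way to confirm which protoc the build picks up, assuming protoc is installed under /opt/vespa-deps/bin.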
Parallel Make Invocation
cd "$SOURCE_DIR"
make -j "$NUM_CPP_THREADS"
The script changes to $SOURCE_DIR and runs make with the -j flag for parallel execution. Key aspects:
- The Makefiles in the build directory were generated by the preceding CMake configuration stage.
- make -j "$NUM_CPP_THREADS" launches up to $NUM_CPP_THREADS concurrent compilation jobs.
- Make's dependency tracking ensures that targets are built in the correct order -- a shared library is not linked until all its object files are compiled.
- The set -o errexit flag causes the script to terminate immediately if make returns a non-zero exit code (indicating a compilation or linking error).
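cpp.sh takes NUM_CPP_THREADS as given and does not compute a default. As a sketch of how a caller might choose a value, the hypothetical helper pick_threads below accepts a requested job count if it is a positive integer and otherwise falls back to the number of available processing units reported by nproc.

```shell
#!/usr/bin/env bash
# Hypothetical helper for a caller of cpp.sh; the script itself takes
# NUM_CPP_THREADS as given and does not compute a default.
pick_threads() {
    local requested=${1:-}
    if [[ "$requested" =~ ^[1-9][0-9]*$ ]]; then
        echo "$requested"
    else
        # nproc (GNU coreutils) reports the number of available processing units;
        # fall back to 1 if it is unavailable.
        nproc 2>/dev/null || echo 1
    fi
}

# Example:
#   export NUM_CPP_THREADS=$(pick_threads "${NUM_CPP_THREADS:-}")
```

Rejecting non-numeric values up front avoids passing a malformed argument to make -j, which would otherwise surface as a make usage error at the start of the build.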
Working Directory
The script changes to $SOURCE_DIR before running make, so that directory must contain the Makefiles generated by CMake. The I/O contract describes $SOURCE_DIR as the root of the source checkout, which corresponds to an in-source build. Whether a given pipeline instead configures CMake in a separate directory (out-of-source build) depends on pipeline configuration; in that case make would need to run in that build directory.
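cpp.sh does not check for Makefiles itself; if none are present, make simply fails and errexit stops the script. A caller wanting an earlier and clearer error could use a guard like the following sketch, where has_makefile is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical guard; cpp.sh does not perform this check and simply lets make fail.
# Succeeds if directory $1 contains a file that GNU make would pick up by default
# (GNUmakefile, makefile, or Makefile).
has_makefile() {
    local dir=$1
    [[ -f "${dir}/GNUmakefile" || -f "${dir}/makefile" || -f "${dir}/Makefile" ]]
}

# Example:
#   has_makefile "$SOURCE_DIR" || { echo "no Makefile in $SOURCE_DIR; run CMake first" >&2; exit 1; }
```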
Execution Context
The compilation script is invoked after CMake configuration completes:
Buildkite Pipeline
--> .buildkite/bootstrap.sh (Java bootstrap)
--> .buildkite/bootstrap-cmake.sh (CMake configuration)
--> .buildkite/cpp.sh
--> source enable-gcc-toolset.sh
--> PATH=/opt/vespa-deps/bin:$PATH
--> cd $SOURCE_DIR
--> make -j $NUM_CPP_THREADS
The compilation step is typically the longest stage in the pipeline, taking 20-60 minutes depending on the number of available CPU cores.
Source File Locations
- .buildkite/cpp.sh (lines 1-29)
See Also
- C++ Compilation Principle -- The design rationale for parallel C++ compilation.
- CMake Configuration Implementation -- The preceding stage that generates Makefiles.
- RPM Package Creation Implementation -- The next stage that packages compiled binaries.