Principle:Vespa engine Vespa RPM Package Creation
Overview
RPM package creation bundles compiled binaries and configuration files into installable RPM packages. The process generates a source RPM from a spec file, rebuilds it into binary RPMs with zstd compression, then creates a YUM repository for distribution. This stage takes the raw build outputs from the C++ compilation and Java build stages and produces self-contained, versioned packages that can be installed on Red Hat-based Linux distributions.
Motivation
RPM (Red Hat Package Manager) packages are the standard mechanism for distributing software on RHEL, CentOS, AlmaLinux, and Fedora systems. Packaging Vespa as RPMs provides:
- Dependency management: RPM packages declare their runtime dependencies (e.g., JDK, system libraries), and the package manager ensures these are satisfied before installation.
- Atomic installation and upgrades: The RPM system supports atomic install, upgrade, and rollback operations, making it safe to update Vespa on production systems.
- Versioned distribution: Each RPM carries a version number, architecture, and release tag, enabling operators to track exactly which build is deployed.
- Repository-based distribution: Creating a YUM repository allows hosts to install Vespa using standard tools (yum install vespa or dnf install vespa).
How It Works
The RPM build process follows three steps:
Step 1: Source RPM Generation
make -f .copr/Makefile srpm outdir="$WORKDIR"
The .copr/Makefile contains rules for generating a source RPM (SRPM) from the Vespa spec file and source tarball. The SRPM is a self-contained package that includes the spec file and all source materials needed to reproduce the build. The srpm target with an outdir argument follows the "make srpm" convention of Copr, Fedora's community build service.
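Because the rebuild step selects the SRPM with a glob, it can help to confirm that exactly one file matches before continuing. The following is a minimal sketch of such a check; find_srpm is a hypothetical helper introduced here for illustration and is not taken from build-rpms.sh.

```shell
# Sketch only: find_srpm is a hypothetical helper, not part of build-rpms.sh.
# Prints the path of the single SRPM for a version, or fails with status 1.
find_srpm() (
    dir=$1
    version=$2
    # Expand the same glob pattern that the rebuild step uses.
    set -- "$dir"/vespa-"$version"-*.src.rpm
    # An unmatched glob stays literal, so test that the first match exists.
    if [ ! -e "$1" ]; then
        echo "no SRPM for version $version in $dir" >&2
        exit 1
    fi
    if [ "$#" -ne 1 ]; then
        echo "expected one SRPM, found $#" >&2
        exit 1
    fi
    printf '%s\n' "$1"
)
```

A caller could then write srpm=$(find_srpm "$WORKDIR" "$VESPA_VERSION") and stop the build if the call fails.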
Step 2: Binary RPM Rebuild
rpmbuild --rebuild \
--define="_topdir $WORKDIR/vespa-rpmbuild" \
--define "_debugsource_template %{nil}" \
--define "_binary_payload w10T4.zstdio" \
--define "installdir $WORKDIR/vespa-install" \
"$WORKDIR"/vespa-"$VESPA_VERSION"-*.src.rpm
The rpmbuild --rebuild command takes the SRPM and produces binary RPM packages. Key options:
| Option | Purpose |
|---|---|
| _topdir | Sets the RPM build tree root directory, keeping build artifacts isolated |
| _debugsource_template %{nil} | Disables debug source package generation to reduce build time and disk usage |
| _binary_payload w10T4.zstdio | Uses zstd compression at level 10 with 4 threads for the RPM payload |
| installdir | Points to the pre-built installation directory, avoiding a full recompilation during RPM build |
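Each --define takes the macro name and its value as one shell argument, which is why the quoting in Step 2 matters. As a rough illustration, the sketch below prints the rebuild command one argument per line instead of running it; print_rebuild_cmd is a hypothetical helper, and the paths passed to it are placeholders.

```shell
# Sketch only: print_rebuild_cmd is a hypothetical dry-run helper.
# It assembles the Step 2 arguments and prints them without calling rpmbuild.
print_rebuild_cmd() (
    workdir=$1
    srpm=$2
    set -- rpmbuild --rebuild \
        --define "_topdir $workdir/vespa-rpmbuild" \
        --define "_debugsource_template %{nil}" \
        --define "_binary_payload w10T4.zstdio" \
        --define "installdir $workdir/vespa-install" \
        "$srpm"
    # One argument per line makes the quoting boundaries visible.
    printf '%s\n' "$@"
)
```

Calling print_rebuild_cmd /tmp/vespa-build some.src.rpm shows that "_topdir /tmp/vespa-build/vespa-rpmbuild" reaches rpmbuild as a single argument.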
Step 3: Repository Creation
mv "$WORKDIR"/vespa-rpmbuild/RPMS/*/*.rpm "$WORKDIR/artifacts/$ARCH/rpms"
createrepo "$WORKDIR/artifacts/$ARCH/rpms"
After building, all RPM files are moved to the artifact directory and createrepo generates the YUM repository metadata (repodata directory with XML files describing available packages).
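The move in Step 3 assumes that the destination directory exists and that the build produced at least one RPM. A defensive variant might look like the sketch below; collect_rpms is a hypothetical helper, and creating the destination with mkdir -p is an assumption rather than something the source states.

```shell
# Sketch only: collect_rpms is a hypothetical helper, not part of build-rpms.sh.
# Moves every binary RPM under <topdir>/RPMS/<arch>/ into <dest>.
collect_rpms() (
    topdir=$1
    dest=$2
    # Assumption: the destination may not exist yet, so create it.
    mkdir -p "$dest" || exit 1
    set -- "$topdir"/RPMS/*/*.rpm
    # An unmatched glob stays literal, so test that the first match exists.
    if [ ! -e "$1" ]; then
        echo "no RPMs found under $topdir/RPMS" >&2
        exit 1
    fi
    mv "$@" "$dest"/ || exit 1
    echo "moved $# package(s) to $dest"
)
```

After a successful call such as collect_rpms "$WORKDIR/vespa-rpmbuild" "$WORKDIR/artifacts/$ARCH/rpms", createrepo would be run on the destination directory as in Step 3.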
Environment Variables
| Variable | Description | Example Value |
|---|---|---|
| WORKDIR | Working directory for build artifacts | /tmp/vespa-build |
| VESPA_VERSION | Version string embedded in RPM package names | 8.432.17 |
| ARCH | CPU architecture for the RPM packages | x86_64 or aarch64 |
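Since all three steps expand these variables into paths, an unset variable can silently produce a wrong path. The sketch below shows one way to fail early; check_build_env is a hypothetical helper and is not described in the source.

```shell
# Sketch only: check_build_env is a hypothetical helper.
# Returns status 1 if any variable used by the packaging steps is unset or empty.
check_build_env() (
    for name in WORKDIR VESPA_VERSION ARCH; do
        # Indirect lookup of the variable named in $name; empty if unset.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "required variable $name is not set" >&2
            exit 1
        fi
    done
)
```

A script could call check_build_env || exit 1 before Step 1 so that a missing variable stops the run before any files are created.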
Design Considerations
Zstd compression: The w10T4.zstdio payload setting specifies Zstandard compression at level 10 using 4 threads. Zstd typically achieves a better compression ratio than gzip, the traditional RPM payload compression, and also decompresses faster. The multi-threaded compression reduces the time needed to create the RPM payload for large packages like Vespa.
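To make the structure of the payload string concrete, the sketch below splits a value such as w10T4.zstdio into its parts using only shell parameter expansion. parse_payload is a hypothetical helper for illustration; it handles only the level and the optional T thread flag, not every form RPM accepts.

```shell
# Sketch only: parse_payload is a hypothetical helper that handles the
# "w<level>[T<threads>].<io>" shape and nothing else.
parse_payload() (
    spec=$1                 # e.g. w10T4.zstdio
    flags=${spec%%.*}       # part before the first dot: w10T4
    io=${spec#*.}           # part after the first dot: zstdio
    flags=${flags#w}        # drop the leading write-mode letter: 10T4
    level=${flags%%T*}      # digits before any T: 10
    case $flags in
        *T*) threads=${flags#*T} ;;   # digits after T: 4
        *)   threads=unset ;;         # no thread flag present
    esac
    printf 'io=%s level=%s threads=%s\n' "$io" "$level" "$threads"
)
```

For the value used in Step 2, parse_payload w10T4.zstdio prints io=zstdio level=10 threads=4.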
Debug source suppression: The _debugsource_template %{nil} define disables the generation of debugsource RPMs. Vespa's debug information is large, and generating debug source packages would significantly increase build time and storage requirements. Debug information is still available in the main debuginfo RPM if needed.
Pre-built install directory: The installdir define tells the RPM spec file to use pre-built binaries from a previous make install step rather than recompiling from source during RPM build. This is a common optimization in CI/CD pipelines where the compile step has already completed.
Core dump suppression: The script sets ulimit -c 0 to disable core dumps during the RPM build process. This prevents the build from consuming disk space with large core files if a process crashes during packaging.
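The effect of ulimit -c 0 is inherited by child processes, which is what makes it useful for a build that spawns many tools. The sketch below is a variant that scopes the limit to a single command by using a subshell; without_core_dumps is a hypothetical helper, whereas the script itself is described as setting the limit for the whole run.

```shell
# Sketch only: without_core_dumps is a hypothetical wrapper.
# The subshell body keeps the lowered limit from affecting the caller.
without_core_dumps() (
    # Set the core file size limit to zero for this subshell and its children.
    ulimit -c 0
    "$@"
)
```

Running without_core_dumps sh -c 'ulimit -c' reports a limit of 0 inside the child, while the calling shell keeps its original setting.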
RPM Package Structure
The Vespa RPM build typically produces several packages:
| Package | Contents |
|---|---|
| vespa | Main Vespa package with all binaries and libraries |
| vespa-config-model-fat | Self-contained config model JAR for cloud deployments |
| vespa-debuginfo | Debug symbols for the Vespa binaries |
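Binary RPM file names conventionally follow the name-version-release.arch.rpm pattern, so the package, version, and architecture can be recovered from a file name alone. The sketch below shows one way to split such a name; parse_rpm_name is a hypothetical helper, and the release tag 1.el8 in the usage note is an invented example value, not one taken from a real Vespa build.

```shell
# Sketch only: parse_rpm_name is a hypothetical helper that assumes the
# conventional name-version-release.arch.rpm file name layout.
parse_rpm_name() (
    file=${1##*/}           # drop any directory part
    base=${file%.rpm}       # drop the .rpm suffix
    arch=${base##*.}        # text after the last dot
    base=${base%.*}
    release=${base##*-}     # text after the last hyphen
    base=${base%-*}
    version=${base##*-}     # text after the next-to-last hyphen
    name=${base%-*}         # everything that remains, hyphens included
    printf 'name=%s version=%s release=%s arch=%s\n' \
        "$name" "$version" "$release" "$arch"
)
```

For example, parse_rpm_name vespa-config-model-fat-8.432.17-1.el8.x86_64.rpm prints name=vespa-config-model-fat version=8.432.17 release=1.el8 arch=x86_64. Splitting from the right is what allows package names that themselves contain hyphens.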
Relationship to Other Build Stages
RPM package creation depends on both the Java build (for JAR artifacts) and C++ compilation (for native binaries):
Java Build ------+
                 +--> RPM Package Creation --> Container Image Building
C++ Compilation -+              |
                                +--> Artifact Signing and Publishing
The produced RPMs are consumed by two downstream stages:
- Container Image Building: Installs RPMs into Docker images.
- Artifact Signing and Publishing: Signs and uploads RPMs to persistent storage.
See Also
- RPM Build Implementation -- The implementation details of the build-rpms.sh script.
- C++ Compilation -- The preceding stage that produces compiled binaries.
- Container Image Building -- The next stage that consumes RPMs.
- Source: build-rpms.sh