Principle:Duckdb Duckdb Third Party Dependency Compilation
Metadata
| Field | Value |
|---|---|
| Type | Principle |
| Sources | CMake add_subdirectory documentation, static linking best practices |
| Domains | Build_System |
Overview
Third-Party Dependency Compilation is the practice of compiling vendored (in-tree) third-party source code into static libraries as part of a project's own build process. Rather than relying on system-installed shared libraries or external package managers at build time, the project carries its dependencies in a dedicated directory (commonly third_party/) and compiles them from source into static archives. These archives are then linked directly into the final binaries, producing self-contained executables and libraries that do not need the third-party libraries to be installed on the target system at runtime.
Description
The Vendoring Approach
Vendoring means copying the source code of a dependency into the consuming project's repository. The vendored copy lives alongside the project's own source code and is compiled using the project's own build system.
This approach differs from two common alternatives, shown here alongside vendoring for comparison:
| Strategy | Mechanism | Trade-offs |
|---|---|---|
| System packages | apt install libfoo-dev, brew install foo | Depends on host environment; version may differ across machines |
| Build-time fetch | CMake FetchContent, ExternalProject, Conan, vcpkg | Requires network access; reproducibility depends on remote availability |
| Vendoring | Source committed in-tree under third_party/ | Fully self-contained; larger repository; update burden on maintainers |
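As a rough illustration of the vendoring row, a vendored dependency typically ships with (or is given) a small CMakeLists.txt that turns its sources into a static library. The sketch below assumes a hypothetical library foo under third_party/foo/ with made-up file names; it is not taken from DuckDB's build files.

```cmake
# third_party/foo/CMakeLists.txt
# Hypothetical vendored library "foo"; file names are illustrative only.

# STATIC produces an archive (libfoo.a / foo.lib) instead of a shared library.
add_library(foo STATIC
  foo.c
  foo_util.c)

# PUBLIC: targets that link against foo also get this include path.
target_include_directories(foo PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include)

# Needed if the archive may later be linked into a shared library.
set_target_properties(foo PROPERTIES POSITION_INDEPENDENT_CODE ON)
```

Because the target is declared inside the project's own build, it inherits the same compiler and toolchain settings as the rest of the project, which is the point of compiling in-tree.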
Why Static Linking of Third-Party Libraries?
Compiling vendored dependencies into static libraries (rather than shared/dynamic libraries) provides several benefits:
- Self-contained binaries -- The final executable or library carries all required third-party code. There are no .so or .dll files to distribute alongside it, and no risk of a missing or incompatible copy of those libraries on the target system at runtime.
- Reproducibility -- Every build uses the exact same dependency source code, regardless of what is installed on the host system. Two developers building from the same commit always link against the same dependency versions.
- Simplified distribution -- Users and downstream packagers receive a single artefact with no transitive dependency chain to satisfy.
- Build isolation -- The project's vendored copy can carry patches, configuration tweaks, or build-flag overrides without affecting any system-wide installation of the same library.
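The build-isolation point can be sketched with per-target settings. The fragment below assumes a static library target named foo has already been declared elsewhere in the project (for example in a hypothetical third_party/foo/CMakeLists.txt); the FOO_DISABLE_THREADS macro is invented for illustration.

```cmake
# Assumes a target named "foo" already exists (hypothetical vendored library).

# Configuration tweak applied only while compiling foo's own sources.
# FOO_DISABLE_THREADS is a made-up macro standing in for a real option.
target_compile_definitions(foo PRIVATE FOO_DISABLE_THREADS=1)

# Silence warnings from third-party code without relaxing the
# project's own warning level.
if(MSVC)
  target_compile_options(foo PRIVATE /w)
else()
  target_compile_options(foo PRIVATE -w)
endif()
```

Since the settings are PRIVATE to the vendored target, they affect neither the project's own sources nor any system-wide installation of the same library.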
Trade-offs
Vendoring and static linking are not without costs:
- Repository size -- Carrying full source trees of dependencies increases the size of the repository.
- Update burden -- Upgrading a dependency requires manually replacing the vendored copy and verifying compatibility.
- Duplication -- If multiple projects on the same system vendor the same library, disk and memory usage increase because the code is duplicated in every binary.
- License compliance -- Vendored source must comply with its license terms, and attribution must be maintained.
Usage
Third-party dependency compilation occurs at a specific point in the build lifecycle:
- During the build, before core library linking -- The build system first compiles each vendored dependency into its own static archive. These archives are then listed as link inputs when the main project targets (libraries, executables) are linked.
- Triggered automatically -- Because the vendored libraries are declared as build targets in the project's CMake files, the build system's dependency graph ensures they are compiled before any target that depends on them.
- No separate invocation required -- Developers do not run a separate "install dependencies" step. The standard cmake --build . (or make) command compiles both the vendored dependencies and the project itself.
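The lifecycle described above can be sketched as a minimal top-level CMakeLists.txt. All target and file names here (myproject, mycore, mycli, foo) are hypothetical, and the layout is an assumption for illustration rather than DuckDB's actual build configuration.

```cmake
# CMakeLists.txt at the repository root (illustrative sketch)
cmake_minimum_required(VERSION 3.15)
project(myproject C CXX)

# Pull the vendored library's targets into this build.
# third_party/foo/CMakeLists.txt is assumed to declare a STATIC target "foo".
add_subdirectory(third_party/foo)

# The project's core library; linking against foo records a dependency edge,
# so the build system builds foo before linking anything that uses it.
add_library(mycore STATIC src/core.cpp)
target_link_libraries(mycore PRIVATE foo)

# The final executable links the core library; foo's archive is pulled in
# transitively on the link line.
add_executable(mycli src/main.cpp)
target_link_libraries(mycli PRIVATE mycore)

# Configure and build from the repository root:
#   cmake -S . -B build
#   cmake --build build
```

A single build command is enough because the vendored target is part of the same dependency graph as the project's own targets.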
Theoretical Basis
Static vs. Dynamic Linking
| Aspect | Static Linking | Dynamic Linking |
|---|---|---|
| Artefact | Code is copied into the final binary at link time | Code remains in a separate .so/.dll loaded at runtime |
| Distribution | Single self-contained file | Binary + shared libraries must be co-distributed |
| Startup | No load-time symbol resolution for the statically linked code | Dynamic linker resolves symbols at load time |
| Updates | Dependency update requires relinking | Shared library can be replaced without relinking |
| Binary size | Larger (includes all dependency code) | Smaller (references external code) |
For a project like DuckDB that is distributed as an embeddable library and standalone CLI tool across many platforms, static linking of vendored dependencies is the pragmatic default. It eliminates an entire class of "works on my machine" deployment issues.
Vendoring Trade-offs in Practice
The decision to vendor is driven by the project's priorities:
- Portability -- DuckDB targets Linux, macOS, Windows, WebAssembly, and more. Vendoring avoids per-platform packaging of transitive dependencies.
- Embeddability -- As an in-process database, DuckDB is linked into other applications. Minimising external dependencies simplifies integration for downstream consumers.
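- Determinism -- CI/CD pipelines compile the same dependency sources regardless of which packages are installed on the build host. Bit-for-bit identical output additionally requires a fixed toolchain and fixed build flags.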