Principle:Duckdb Duckdb Third Party Dependency Compilation

From Leeroopedia


Metadata

Field | Value
Type | Principle
Sources | CMake add_subdirectory documentation, static linking best practices
Domains | Build_System

Overview

Third-Party Dependency Compilation is the practice of compiling vendored (in-tree) third-party source code into static libraries as part of a project's own build process. Rather than relying on system-installed shared libraries or external package managers at build time, the project carries its dependencies in a dedicated directory (commonly third_party/) and compiles them from source into static archives. These archives are then linked directly into the final binaries, producing self-contained executables and libraries that do not depend on separately installed copies of those libraries at runtime.

Description

The Vendoring Approach

Vendoring means copying the source code of a dependency into the consuming project's repository. The vendored copy lives alongside the project's own source code and is compiled using the project's own build system.

This approach differs from two common alternatives:

Strategy | Mechanism | Trade-offs
System packages | apt install libfoo-dev, brew install foo | Depends on host environment; version may differ across machines
Build-time fetch | CMake FetchContent, ExternalProject, Conan, vcpkg | Requires network access; reproducibility depends on remote availability
Vendoring | Source committed in-tree under third_party/ | Fully self-contained; larger repository; update burden on maintainers
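
As a rough illustration of how the three strategies differ in a CMake project, the sketch below selects one of them with a switch. The library name foo, the FOO_SOURCE option, the URL, and the tag are all hypothetical placeholders, and the sketch assumes a third_party/foo directory containing its own CMakeLists.txt. It is not taken from DuckDB's build files.

```cmake
cmake_minimum_required(VERSION 3.14)
project(example_app CXX)

# Hypothetical switch, used here only to contrast the three strategies.
set(FOO_SOURCE "vendored" CACHE STRING "One of: system, fetch, vendored")

if(FOO_SOURCE STREQUAL "system")
  # System package: uses whatever version the host has installed.
  find_package(Foo REQUIRED)
elseif(FOO_SOURCE STREQUAL "fetch")
  # Fetch: downloads the sources when the project is configured,
  # so it needs network access and a reachable remote.
  include(FetchContent)
  FetchContent_Declare(foo
    GIT_REPOSITORY https://example.com/foo.git  # placeholder URL
    GIT_TAG        v1.2.3)                      # placeholder tag
  FetchContent_MakeAvailable(foo)
else()
  # Vendoring: the sources are already committed under third_party/.
  add_subdirectory(third_party/foo)
endif()
```

Only the last branch works without anything installed on the host or reachable over the network, which is the property the vendoring row of the table describes.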

Why Static Linking of Third-Party Libraries?

Compiling vendored dependencies into static libraries (rather than shared/dynamic libraries) provides several benefits:

  • Self-contained binaries -- The final executable or library carries the code of its vendored dependencies. There are no .so or .dll files for them to distribute alongside it, and no risk that a system copy of those libraries is missing or incompatible at runtime.
  • Reproducibility -- Every build uses the exact same dependency source code, regardless of what is installed on the host system. Two developers building from the same commit always link against the same dependency versions.
  • Simplified distribution -- Users and downstream packagers receive a single artefact with no transitive dependency chain to satisfy.
  • Build isolation -- The project's vendored copy can carry patches, configuration tweaks, or build-flag overrides without affecting any system-wide installation of the same library.
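
A minimal sketch of what the CMakeLists.txt of a single vendored dependency might look like is shown below. The target name foo, the source file names, and the FOO_NO_THREADS definition are hypothetical, chosen only to show where a project-local configuration tweak would go.

```cmake
# third_party/foo/CMakeLists.txt (hypothetical vendored library)

# STATIC produces an archive (libfoo.a or foo.lib) instead of a shared library.
add_library(foo STATIC
  foo.c
  foo_util.c)

# Consumers that link against foo also get its public headers.
target_include_directories(foo PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include)

# Build isolation: this definition applies only to the vendored copy,
# not to any system-wide installation of the same library.
target_compile_definitions(foo PRIVATE FOO_NO_THREADS=1)

# Allows the archive to be linked into a shared library later on.
set_target_properties(foo PROPERTIES POSITION_INDEPENDENT_CODE ON)
```

Because the definition is PRIVATE, it affects how foo itself is compiled without leaking into the targets that link against it.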

Trade-offs

Vendoring and static linking are not without costs:

  • Repository size -- Carrying full source trees of dependencies increases the size of the repository.
  • Update burden -- Upgrading a dependency, including picking up upstream bug and security fixes, requires manually replacing the vendored copy, re-applying any local patches, and verifying compatibility.
  • Duplication -- If multiple projects on the same system vendor the same library, disk and memory usage increase because the code is duplicated in every binary.
  • License compliance -- The project must comply with the license terms of every vendored dependency, and attribution must be maintained.

Usage

Third-party dependency compilation occurs at a specific point in the build lifecycle:

  • During the build, before core library linking -- The build system first compiles each vendored dependency into its own static archive. These archives are then listed as link inputs when the main project targets (libraries, executables) are linked.
  • Triggered automatically -- Because the vendored libraries are declared as build targets in the project's CMake files, the build system's dependency graph ensures they are compiled before any target that depends on them.
  • No separate invocation required -- Developers do not run a separate "install dependencies" step. The standard cmake --build . (or make) command compiles both the vendored dependencies and the project itself.
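
The ordering described above can be sketched with a small top-level CMakeLists.txt. All target and path names (example_db, foo, bar, example_core, example_cli) are hypothetical, and the layout is an assumption for illustration rather than a copy of DuckDB's actual build files.

```cmake
cmake_minimum_required(VERSION 3.14)
project(example_db CXX)

# Each vendored directory declares its own static library target.
add_subdirectory(third_party/foo)   # assumed to define target `foo`
add_subdirectory(third_party/bar)   # assumed to define target `bar`

# The core library lists the vendored archives as link inputs.
add_library(example_core STATIC src/core.cpp)
target_link_libraries(example_core PRIVATE foo bar)

# The executable depends on the core library; the dependency graph
# ensures foo and bar are built before anything that links them.
add_executable(example_cli src/main.cpp)
target_link_libraries(example_cli PRIVATE example_core)

# Configure and build in one pass, with no separate dependency step:
#   cmake -S . -B build
#   cmake --build build
```

Nothing in this file states a build order explicitly. The order follows from the target_link_libraries relationships alone.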

Theoretical Basis

Static vs. Dynamic Linking

Aspect | Static Linking | Dynamic Linking
Artefact | Code is copied into the final binary at link time | Code remains in a separate .so/.dll loaded at runtime
Distribution | Single self-contained file | Binary + shared libraries must be co-distributed
Startup | No runtime symbol resolution overhead | Dynamic linker resolves symbols at load time
Updates | Dependency update requires relinking | Shared library can be replaced without relinking
Binary size | Larger (includes all dependency code) | Smaller (references external code)

For a project like DuckDB that is distributed as an embeddable library and standalone CLI tool across many platforms, static linking of vendored dependencies is the pragmatic default. It eliminates an entire class of "works on my machine" deployment issues.
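
In CMake, the choice between the two columns of the table is made per target by the library type keyword. The sketch below uses hypothetical targets foo and bar to show how a vendored library can be pinned to static linking.

```cmake
# Explicit STATIC: always built as an archive, even when the user
# configures the project with -DBUILD_SHARED_LIBS=ON.
add_library(foo STATIC foo.c)

# No type keyword: built as a shared library if BUILD_SHARED_LIBS is ON,
# otherwise as a static one.
add_library(bar bar.c)
```

Stating STATIC explicitly on vendored targets is one way to keep the static-by-default behaviour independent of how a downstream user configures the build.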

Vendoring Trade-offs in Practice

The decision to vendor is driven by the project's priorities:

  1. Portability -- DuckDB targets Linux, macOS, Windows, WebAssembly, and more. Vendoring avoids per-platform packaging of transitive dependencies.
  2. Embeddability -- As an in-process database, DuckDB is linked into other applications. Minimising external dependencies simplifies integration for downstream consumers.
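  3. Determinism -- CI/CD pipelines compile the same dependency sources regardless of host package state. Bit-identical output additionally depends on pinning the toolchain and build flags, which vendoring alone does not guarantee.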

Related

Implementation:Duckdb_Duckdb_Add_Third_Party
