Principle:Duckdb Duckdb Source Package Building
Overview
Source package building assembles distributable source packages from amalgamated and extension source files. The process collects the amalgamated DuckDB source, public headers, extension source files, and version metadata into self-contained zip archives suitable for distribution to downstream consumers.
Description
After amalgamation produces duckdb.cpp and duckdb.hpp, these files must be assembled into distributable packages along with supporting files. The source package building process handles several responsibilities:
Package Contents
A typical source package (libduckdb-src.zip) contains:
| File | Purpose |
|---|---|
| duckdb.hpp | Amalgamated C++ header (all declarations) |
| duckdb.cpp | Amalgamated C++ source (all implementations) |
| duckdb.h | Public C API header |
| duckdb_extension.h | Extension API header for building loadable extensions |
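As a rough illustration of the assembly step, the following Python sketch collects the four files listed above into a zip archive. The names `build_source_zip` and `PACKAGE_FILES` are hypothetical and chosen only for this example; they are not taken from the DuckDB build scripts, and the real packaging logic may differ.

```python
import zipfile
from pathlib import Path

# File names taken from the table above; the constant name is hypothetical.
PACKAGE_FILES = ["duckdb.hpp", "duckdb.cpp", "duckdb.h", "duckdb_extension.h"]


def build_source_zip(source_dir, zip_path, files=PACKAGE_FILES):
    """Write the given files from source_dir into a zip archive at zip_path.

    Returns the archive member names in the order they were written.
    """
    source_dir = Path(source_dir)

    # Fail early if amalgamation did not produce every expected file.
    missing = [name for name in files if not (source_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing package files: {missing}")

    members = sorted(files)  # fixed ordering keeps the member list stable
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in members:
            # arcname stores the file at the archive root, without source_dir.
            archive.write(source_dir / name, arcname=name)
    return members
```

A call such as `build_source_zip("build/amalgamation", "libduckdb-src.zip")` would produce the archive, where the input directory is an assumed location rather than a documented one.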
Version Detection
The package building process derives version information from git tags. The version string is computed from the output of git describe, giving strings such as v0.9.2 for a tagged release or v0.9.2-dev123 for a development build. This version is embedded in the package metadata and used for naming release artifacts.
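The mapping from git state to a version string can be sketched as below. This is a minimal example under stated assumptions: it assumes tags of the form `vMAJOR.MINOR.PATCH`, assumes that the `-dev<N>` suffix is the number of commits since the most recent tag, and uses the hypothetical function names `version_from_describe` and `detect_version`. The actual DuckDB scripts may compute the version differently.

```python
import re
import subprocess

# `git describe --tags --long` prints <tag>-<commits since tag>-g<short hash>.
# The optional group also lets a bare tag such as "v0.9.2" match.
DESCRIBE_PATTERN = re.compile(r"^(v\d+\.\d+\.\d+)(?:-(\d+)-g([0-9a-f]+))?$")


def version_from_describe(describe):
    """Convert git describe output into a release or development version."""
    match = DESCRIBE_PATTERN.match(describe.strip())
    if match is None:
        raise ValueError(f"unrecognised git describe output: {describe!r}")
    tag, distance, _commit = match.groups()
    if distance is None or int(distance) == 0:
        return tag  # exactly on a tag: release version
    # Assumption: the dev suffix is the commit distance from the tag.
    return f"{tag}-dev{distance}"


def detect_version(repo_dir="."):
    """Run git describe in repo_dir and return the derived version string."""
    result = subprocess.run(
        ["git", "describe", "--tags", "--long"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return version_from_describe(result.stdout)
```

Keeping the parsing separate from the subprocess call makes the string handling easy to check without a git checkout.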
Extension Inclusion
The packaging process can optionally include extension source files. Extensions are specified via the --extensions flag as a semicolon-delimited list. Each extension's source and headers are included in the package alongside the core amalgamated files.
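One way to handle a semicolon-delimited `--extensions` value is sketched below with Python's `argparse`. Only the flag name and the delimiter come from the description above; the helper names `parse_extension_list` and `build_parser`, and the extension names in the usage note, are illustrative assumptions.

```python
import argparse


def parse_extension_list(raw):
    """Split a semicolon-delimited string into a list of extension names."""
    # Drop empty entries so "a;;b" or a trailing ";" do not yield blank names.
    return [name.strip() for name in raw.split(";") if name.strip()]


def build_parser():
    """Build a command-line parser exposing the --extensions flag."""
    parser = argparse.ArgumentParser(description="Package builder (sketch)")
    parser.add_argument(
        "--extensions",
        type=parse_extension_list,
        default=[],
        help="semicolon-delimited list of extensions to include",
    )
    return parser
```

For example, `build_parser().parse_args(["--extensions", "parquet;json"]).extensions` evaluates to `["parquet", "json"]`. On a shell command line the value needs quoting, since an unquoted semicolon ends the command.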
Third-Party Dependencies
The package builder identifies third-party include directories and source files that are needed for compilation. These are discovered by scanning CMakeLists.txt files and are included in the package to ensure self-contained compilation.
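The discovery step could look roughly like the sketch below, which walks a source tree and collects the literal directories passed to `include_directories(...)` in each CMakeLists.txt. This is a simplified assumption about what "scanning" means here: it ignores arguments containing CMake variables, skips absolute paths, does not look at `target_include_directories`, and the function name `find_include_dirs` is hypothetical.

```python
import re
from pathlib import Path

# Match plain include_directories(...) calls only; the lookbehind excludes
# target_include_directories(...), whose arguments have a different shape.
INCLUDE_CALL = re.compile(
    r"(?<![A-Za-z_])include_directories\s*\(([^)]*)\)", re.IGNORECASE
)
CMAKE_KEYWORDS = {"AFTER", "BEFORE", "SYSTEM"}


def find_include_dirs(root):
    """Return sorted include directories, as paths relative to root."""
    root = Path(root)
    found = set()
    for cmake_file in root.rglob("CMakeLists.txt"):
        text = cmake_file.read_text(encoding="utf-8", errors="replace")
        for call in INCLUDE_CALL.finditer(text):
            for token in call.group(1).split():
                token = token.strip('"')
                # Skip keywords, variable references, and absolute paths.
                if token.upper() in CMAKE_KEYWORDS or "$" in token:
                    continue
                if not token or Path(token).is_absolute():
                    continue
                # Arguments are relative to the CMakeLists.txt directory.
                directory = cmake_file.parent / token
                found.add(directory.relative_to(root).as_posix())
    return sorted(found)
```

Returning a sorted list keeps the result independent of filesystem traversal order, which helps when the same list later drives file collection.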
Usage
This principle applies when:
- Preparing source distributions for release -- creating the libduckdb-src.zip that appears on GitHub Releases
- Building platform binary packages -- combining compiled libraries with headers for platform-specific distributions
- CI release pipelines -- automated packaging steps that produce artifacts for every release tag
- Extension packaging -- including extension source files in distributable archives
Theoretical Basis
| Concept | Description |
|---|---|
| Reproducible Packaging | Given the same source tree and git state, the packaging process produces identical output. This is achieved by deriving all metadata (version, commit hash) from git and using deterministic file collection. |
| Version Stamping | Embedding version information derived from the version control system (git tags) into the package, ensuring traceability from artifact back to source commit. |
| Source Distribution | The practice of distributing software as source code (rather than binaries), allowing consumers to compile for their target platform. Common in C/C++ ecosystems where ABI compatibility across platforms is not guaranteed. |