Principle:Duckdb Duckdb Package Validation
Overview
Package validation verifies that packaged source files compile independently of the original build system. It confirms that the amalgamated source distribution is self-contained and can be compiled by an end user who has only a C++ compiler and no access to the DuckDB repository or build system.
Description
After the amalgamation process produces duckdb.cpp and duckdb.hpp, there is no guarantee that these files are correct. The amalgamation script may have:
- Missed an include -- a header that was needed but not resolved during amalgamation
- Misordered includes -- headers placed before their dependencies
- Included incompatible code -- conditional compilation macros that resolve differently in the amalgamated context
- Generated invalid C++ -- syntax errors introduced by concatenation or macro conflicts
Package validation addresses these risks by performing an independent compilation test: the amalgamated source is compiled using a standalone C++ compiler with no access to the original source tree. If the compilation succeeds, the package passes validation. If it fails, the amalgamation process has a defect that must be fixed before distribution.
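The independent compilation test described above can be sketched as follows. This is a minimal illustration, not DuckDB's own test script: the function names, the default compiler name c++, and the caller-supplied language standard are assumptions made for the example. The two amalgamated files are copied into an empty temporary directory so the compiler cannot see anything from the source tree, and -c stops the build after producing an object file.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# File names come from the amalgamation step; everything else is illustrative.
PACKAGE_FILES = ("duckdb.cpp", "duckdb.hpp")


def build_compile_command(source, output, std, compiler="c++"):
    """Return a compile-only command line (-c stops before linking)."""
    return [compiler, f"-std={std}", "-c", str(source), "-o", str(output)]


def validate_package(package_dir, std, compiler="c++"):
    """Compile the amalgamation in an empty directory; True if it compiles."""
    package_dir = Path(package_dir)
    with tempfile.TemporaryDirectory() as scratch:
        scratch = Path(scratch)
        # Copy only the packaged files so nothing else can be included.
        for name in PACKAGE_FILES:
            shutil.copy(package_dir / name, scratch / name)
        cmd = build_compile_command(
            scratch / "duckdb.cpp", scratch / "duckdb.o", std, compiler=compiler
        )
        # Run inside the scratch directory; a zero exit code means success.
        result = subprocess.run(cmd, cwd=scratch)
        return result.returncode == 0
```

A caller would pass the directory that holds the amalgamated files and the language standard to test against, then treat a False result as a defect in the amalgamation.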
What Validation Tests
The validation process checks:
- Syntactic correctness -- the amalgamated source is valid C++17
- Include completeness -- all required headers are present within the amalgamation
- Symbol resolution -- every referenced name is declared within the amalgamation (checked at the compilation stage; missing definitions surface only at link time, which is not tested)
- Macro consistency -- preprocessor macros do not conflict when all source is in a single translation unit
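A cheap pre-check for include completeness is to scan the amalgamated files for quoted #include directives that still point at files not shipped in the package. The sketch below is a heuristic under stated assumptions, not part of DuckDB's tooling: it assumes project headers use the quoted form and system headers use angle brackets, the helper name find_unresolved_includes is hypothetical, and the missing header path in the sample is invented for illustration. It ignores conditional compilation, so it complements the compilation test rather than replacing it.

```python
import re

# Matches quoted includes such as: #include "duckdb.hpp"
QUOTED_INCLUDE = re.compile(r'^[ \t]*#[ \t]*include[ \t]*"([^"]+)"', re.MULTILINE)


def find_unresolved_includes(source_text, shipped_files):
    """Return quoted include targets that are not part of the package."""
    shipped = set(shipped_files)
    unresolved = []
    for target in QUOTED_INCLUDE.findall(source_text):
        # Keep first-seen order and report each missing target once.
        if target not in shipped and target not in unresolved:
            unresolved.append(target)
    return unresolved


# Hypothetical sample: the second quoted include is not shipped.
sample = '''
#include <vector>
#include "duckdb.hpp"
#include "duckdb/common/missing_header.hpp"
'''
print(find_unresolved_includes(sample, ["duckdb.hpp", "duckdb.cpp"]))
```

An empty result does not prove completeness, since only the compiler evaluates the preprocessor conditions that decide which includes are active.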
What Validation Does Not Test
- Linking -- the test uses -S (compile to assembly) or -c (compile to object), not full linking
- Runtime correctness -- the test does not execute the compiled code
- Performance -- no benchmarking is performed
Usage
This principle applies in the following scenarios:
- After creating amalgamated source -- as a mandatory gate before any distribution step
- Before distribution -- no package should be released without passing compilation validation
- In CI pipelines -- automated compilation tests run after every amalgamation to catch regressions
- During development -- when modifying the amalgamation script itself, validation ensures changes are correct
```shell
# Typical validation workflow:
python3 scripts/amalgamation.py
python3 scripts/test_compile.py
# If test_compile.py exits with 0, the amalgamation is valid.
# If it exits non-zero, there is a defect in the amalgamation.
```
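In a CI job, the same two steps can be chained so that a non-zero exit from either script blocks later distribution steps. The sketch below assumes it is run from the repository root and reuses the script paths shown above; the run_gate helper and its injectable runner parameter are hypothetical names added for illustration and testing.

```python
import subprocess
import sys

# Script paths as shown in the workflow above; run from the repository root.
STEPS = [
    [sys.executable, "scripts/amalgamation.py"],
    [sys.executable, "scripts/test_compile.py"],
]


def run_gate(steps=STEPS, runner=subprocess.call):
    """Run each step in order; return the first non-zero exit code, else 0."""
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            # Stop at the first failure so later steps never run.
            print("validation gate failed at:", " ".join(cmd), file=sys.stderr)
            return code
    return 0
```

A CI step could end with sys.exit(run_gate()) so the job's status mirrors the gate result. Stopping at the first failure keeps the compile test from running against a stale or partial amalgamation.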
Theoretical Basis
| Concept | Description |
|---|---|
| Independent Compilation Verification | Testing that a source artifact compiles without external dependencies, confirming the artifact is self-contained. This is analogous to building a package in a clean chroot or Docker container. |
| Self-Contained Artifact Validation | Ensuring that a distributable artifact contains everything needed to be used by a consumer. In source distribution, this means all headers, all source, and no unresolved references. |
| Compilation as a Correctness Check | Using the C++ compiler's type checker and syntax analysis as a lightweight correctness verification tool. While compilation does not prove runtime correctness, it does prove structural integrity of the source. |