Implementation:Duckdb Duckdb Test Compile Py
Overview
Concrete tool for verifying that DuckDB's amalgamated source compiles independently. The test_compile.py script invokes a C++ compiler on the amalgamated source files to confirm they are syntactically correct, include-complete, and free of compile-time symbol resolution errors (the check stops before linking) -- all without the original DuckDB build system.
Code Reference
- Source Location: scripts/test_compile.py (lines 1--86)
Key Functions
| Function | Signature | Purpose |
|---|---|---|
| get_git_hash | get_git_hash() | Returns the current git HEAD commit hash. Used to determine whether the compilation cache is still valid. |
| try_compilation | try_compilation(fpath, cache) | Attempts to compile a single source file using clang++ -std=c++17 -S -O0. Records success in the cache. Returns True on success, False on failure. |
| compile_dir | compile_dir(dir, cache) | Recursively walks a directory and compiles every .cpp file found. Skips files already in the cache (when resuming). |
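The directory walk described for compile_dir can be sketched as below. This is a minimal illustration, not the script's actual code: the third parameter try_compile is a hypothetical hook added here so the walk can be exercised without a compiler, and returning a list of failures is likewise an assumption.

```python
import os


def compile_dir(directory, cache, try_compile):
    """Walk `directory` and try every .cpp file that is not already cached.

    `cache` is a set of paths that compiled successfully earlier.
    `try_compile` is a hypothetical callable taking a path and returning
    True on success; the real script calls its own try_compilation instead.
    Returns the list of paths that failed to compile.
    """
    failures = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(".cpp"):
                continue  # only C++ translation units are checked
            path = os.path.join(root, name)
            if path in cache:
                continue  # compiled in a previous run; skip when resuming
            if try_compile(path):
                cache.add(path)  # record success so a later run can skip it
            else:
                failures.append(path)
    return failures
```

Passing the compile step in as a callable keeps the traversal and the cache bookkeeping separate from the compiler invocation, which makes the skip-on-resume behavior easy to check in isolation.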
Compilation Command
The script uses the following compilation command internally:
clang++ -std=c++17 -S -O0 -o /dev/null <source_file>
- -std=c++17 -- DuckDB requires C++17
- -S -- compile to assembly only (no object file, no linking)
- -O0 -- no optimization (fastest compilation for validation purposes)
- -o /dev/null -- discard the assembly output
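As a rough sketch of how such a command might be issued from Python, the following builds the argument list shown above and runs it with subprocess. The helper names build_compile_command and compiles, and the compiler parameter, are hypothetical and not taken from the script.

```python
import subprocess


def build_compile_command(source_file, compiler="clang++"):
    """Return the argument list for a compile-only check of one file.

    Mirrors the flags listed above; `compiler` is a hypothetical parameter
    added for illustration.
    """
    return [compiler, "-std=c++17", "-S", "-O0", "-o", "/dev/null", source_file]


def compiles(source_file, compiler="clang++"):
    """Return True if the compiler exits with status 0 for `source_file`.

    Raises FileNotFoundError if the compiler is not on PATH.
    """
    result = subprocess.run(build_compile_command(source_file, compiler))
    return result.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems when a source path contains spaces.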
Caching Mechanism
The script uses Python's pickle module to maintain a compilation cache (amalgamation.cache). This cache stores:
- The git commit hash at the time of caching
- A set of file paths that compiled successfully
The cache has three resume modes:
| Mode | Constant | Behavior |
|---|---|---|
| Auto | RESUME_AUTO | Resume from cache only if the current git hash matches the cached hash. Otherwise, start fresh. |
| Always | RESUME_ALWAYS | Always resume from cache, regardless of git hash. Useful during iterative development. |
| Never | RESUME_NEVER | Ignore any existing cache and recompile everything from scratch. |
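The three resume modes can be illustrated with a small pickle-backed cache. This is a sketch under stated assumptions: the numeric values of the constants, the (hash, set) tuple layout of the pickle, and the helper names load_cache and save_cache are all hypothetical; only the mode names and the behavior in the table come from the description above.

```python
import os
import pickle

# Constant names follow the table above; the numeric values are assumptions.
RESUME_AUTO, RESUME_ALWAYS, RESUME_NEVER = 0, 1, 2


def load_cache(cache_path, current_hash, mode):
    """Return the set of already-compiled paths to skip for this run."""
    if mode == RESUME_NEVER or not os.path.exists(cache_path):
        return set()  # start from scratch
    with open(cache_path, "rb") as f:
        cached_hash, compiled = pickle.load(f)  # assumed (hash, set) layout
    if mode == RESUME_AUTO and cached_hash != current_hash:
        return set()  # cache belongs to a different commit; discard it
    return set(compiled)


def save_cache(cache_path, current_hash, compiled):
    """Persist the commit hash together with the successfully compiled paths."""
    with open(cache_path, "wb") as f:
        pickle.dump((current_hash, set(compiled)), f)
```

Because pickle can execute arbitrary code when loading, a cache file like this should only ever be read from a trusted working tree.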
I/O Contract
Command-Line Interface
python3 scripts/test_compile.py [OPTIONS]
Options:
--resume Use RESUME_ALWAYS mode (resume from cache regardless of commit)
--restart Use RESUME_NEVER mode (ignore cache, recompile everything)
Default (no flags): RESUME_AUTO mode
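One way the two flags could map onto the resume modes is sketched below with argparse. Whether the script actually uses argparse, and whether it rejects passing both flags together, is not stated here, so both are assumptions, as are the constant values and the function name parse_resume_mode.

```python
import argparse

# Constant names follow the resume-mode table; the values are assumptions.
RESUME_AUTO, RESUME_ALWAYS, RESUME_NEVER = 0, 1, 2


def parse_resume_mode(argv):
    """Map --resume / --restart onto a resume mode; no flag means auto."""
    parser = argparse.ArgumentParser(description="Compile-check the amalgamation")
    # Treating the flags as mutually exclusive is an assumption of this sketch.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--resume", action="store_true",
                       help="resume from cache regardless of commit")
    group.add_argument("--restart", action="store_true",
                       help="ignore cache and recompile everything")
    args = parser.parse_args(argv)
    if args.resume:
        return RESUME_ALWAYS
    if args.restart:
        return RESUME_NEVER
    return RESUME_AUTO
```

For example, parse_resume_mode(["--restart"]) yields RESUME_NEVER, and an empty argument list yields RESUME_AUTO.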
External Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| python3 | 3.7+ | Script runtime |
| clang++ or g++ | C++17 support required | Compilation verification |
Inputs
- Amalgamated source: src/amalgamation/duckdb.cpp
- Amalgamated header: src/amalgamation/duckdb.hpp
- Cache file (optional): amalgamation.cache (pickle format)
Outputs
| Output | Description |
|---|---|
| Exit code 0 | All files compiled successfully; the amalgamation is valid. |
| Exit code non-zero | One or more files failed to compile; error messages are printed to stderr. |
| amalgamation.cache | Updated cache file recording which files compiled successfully. |
Usage Examples
Basic Validation
# After creating the amalgamation, validate it:
python3 scripts/amalgamation.py
python3 scripts/test_compile.py
echo "Exit code: $?"
# 0 = success, non-zero = failure
Force Full Recompilation
# Ignore any cached results and recompile everything
python3 scripts/test_compile.py --restart
Resume from Cache (Iterative Development)
# When iterating on amalgamation fixes, resume from where you left off
# (even if the commit hash has changed)
python3 scripts/test_compile.py --resume
CI Pipeline Integration
#!/usr/bin/env bash
set -euo pipefail
# Full validation pipeline in CI
python3 scripts/amalgamation.py --extended
# Test the command directly: under `set -e`, a separate `$?` check
# would never reach the failure branch.
if python3 scripts/test_compile.py --restart; then
  echo "Amalgamation validation PASSED"
else
  echo "Amalgamation validation FAILED" >&2
  exit 1
fi
Using g++ Instead of clang++
# The script defaults to clang++. To use g++, set the CXX environment variable
# (if the script supports it) or modify the script:
CXX=g++ python3 scripts/test_compile.py