Workflow:Duckdb Duckdb Building From Source

From Leeroopedia


Knowledge Sources
Domains: Database_Engineering, Build_Systems, C_Plus_Plus
Last Updated: 2026-02-07 11:00 GMT

Overview

End-to-end process for compiling the DuckDB analytical database system from source, producing the core library, CLI shell, test runner, and benchmark runner executables.

Description

This workflow covers the complete build process for DuckDB from a clean source checkout. DuckDB uses CMake as its build system with support for cross-platform compilation, unity builds for faster compile times, ccache integration, and configurable sanitizers. The process handles compiler detection, version extraction from git tags, platform-specific settings, third-party dependency compilation, extension selection and linking, and final binary production. The build produces the core DuckDB library (static and/or shared), the interactive CLI shell, the unit test runner, and optionally the benchmark runner.

Usage

Execute this workflow when you need to compile DuckDB from source for development, testing, or custom deployment. Typical cases are a debug build for development, a release build with specific extensions enabled, a build with sanitizers for diagnosing memory or threading issues, and a build that targets a specific platform or architecture.

Execution Steps

Step 1: Environment Setup

Ensure the build environment has the required dependencies: CMake (3.x+), Python 3, and a C++11-compliant compiler (GCC, Clang, or MSVC). Optionally install Ninja for faster parallel builds and ccache for accelerating rebuilds. On Ubuntu 18.04, a setup script provisions all required packages including cross-compilation toolchains.

Key considerations:

  • The build system auto-detects and uses ccache if available
  • Ninja can be selected as the generator via the GEN=ninja environment variable
  • CMAKE_BUILD_PARALLEL_LEVEL controls the number of parallel build jobs
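
The prerequisites above can be checked before the first build. The following is a minimal sketch, not part of the DuckDB repository: missing_tools is a hypothetical helper, the tool names passed to it (in particular c++ as a stand-in for GCC or Clang) are assumptions, and the job count is an arbitrary example value.

```shell
#!/bin/sh
# Hypothetical helper: print the names of tools that are NOT on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a program is available
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space; prints an empty line when nothing is missing
    printf '%s\n' "${missing# }"
}

# Required tools named in this step (c++ is an assumed compiler name).
echo "missing required: $(missing_tools cmake python3 c++)"
# Optional accelerators.
echo "missing optional: $(missing_tools ninja ccache)"

# Select Ninja as the generator and cap the number of parallel build jobs.
export GEN=ninja
export CMAKE_BUILD_PARALLEL_LEVEL=4   # example value, tune to your machine
```

If Ninja is not installed, leave GEN unset so the default generator is used.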

Step 2: Configure Extensions

Determine which extensions to include in the build. Extensions can be specified via the DUCKDB_EXTENSIONS environment variable, individual BUILD_<name> flags, or through CMake configuration files. The extension system supports in-tree extensions (bundled in the repo), out-of-tree extensions (from external repositories), and can fetch extensions directly from GitHub URLs.

Key considerations:

  • The base configuration in extension/extension_config.cmake is always loaded
  • Local overrides go in extension/extension_config_local.cmake (gitignored)
  • Extensions can be statically linked or built as loadable binaries only
  • VCPKG integration handles dependencies for extensions that require external libraries
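
As an illustration of the environment-variable route, the sketch below assembles a value for DUCKDB_EXTENSIONS. It assumes the variable takes a semicolon-separated list (the usual CMake list form); join_extensions is a hypothetical helper, and json and icu are example extension names only. Check the Makefile in your checkout for the exact format it expects.

```shell
#!/bin/sh
# Hypothetical helper: join extension names with semicolons
# (assumed separator, following the CMake list convention).
join_extensions() {
    list=""
    for ext in "$@"; do
        # Prepend "<list>;" only when the list is already non-empty
        list="${list:+$list;}$ext"
    done
    printf '%s\n' "$list"
}

# json and icu are example names; substitute the extensions you need.
DUCKDB_EXTENSIONS=$(join_extensions json icu)
export DUCKDB_EXTENSIONS
echo "DUCKDB_EXTENSIONS=$DUCKDB_EXTENSIONS"   # prints DUCKDB_EXTENSIONS=json;icu
```

With the variable exported, a subsequent make invocation in the same shell picks it up. Persistent selections are better placed in extension/extension_config_local.cmake.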

Step 3: CMake Configuration

Run CMake to generate the build files. The root CMakeLists.txt extracts version information from git tags, configures compiler flags, sets up include paths for all third-party libraries, registers extension build targets, and generates the final build system files. Configuration options include build type (Debug/Release/RelAssert), sanitizer selection (address, thread, undefined behavior), and feature flags.

Key considerations:

  • Version is extracted from git describe with pattern vX.Y.Z
  • Unity builds combine multiple source files into single translation units for faster compilation
  • Platform detection determines architecture-specific compiler flags
  • Extension configuration files are loaded in reverse priority order
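
To make the version-extraction step concrete, the sketch below splits a git describe string of the form vX.Y.Z into its three components. It mirrors the idea only and is not the logic in CMakeLists.txt: parse_version is a hypothetical helper, and the --tags and --long flags are assumptions about how one might call git describe by hand.

```shell
#!/bin/sh
# Hypothetical helper: extract "major minor patch" from a string that
# starts with vX.Y.Z, e.g. v1.2.3-45-gabc1234. Prints nothing on mismatch.
parse_version() {
    printf '%s\n' "$1" |
        sed -n 's/^v\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\).*$/\1 \2 \3/p'
}

parse_version "v1.2.3-45-gabc1234"   # prints: 1 2 3

# Inside a checkout with tags, the same helper can be fed real output.
# The guard keeps this silent when git or tags are unavailable.
if desc=$(git describe --tags --long 2>/dev/null); then
    echo "checkout version: $(parse_version "$desc")"
fi
```

A shallow clone without tags yields no match, which is a common reason for an unexpected version string in a build.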

Step 4: Compile Third-Party Dependencies

Build all vendored third-party libraries as static libraries. These include compression libraries (Brotli, LZ4, Zstd, Miniz, Snappy, FSST), the PostgreSQL SQL parser (libpg_query), cryptography (mbedTLS), text processing (Snowball stemmers, utf8proc, RE2), data structures (HyperLogLog, FastPFor), and formatting ({fmt}).

Key considerations:

  • Third-party libraries are compiled with their own specific compiler flags
  • mbedTLS is configured with a minimal feature set for DuckDB's needs (SHA-256, RSA, AES-GCM)
  • Snowball stemmers cover 30 languages and are machine-generated from grammar specifications

Step 5: Compile Core Library

Build the DuckDB core library from all source subdirectories: parser, planner, optimizer, execution engine, catalog, storage, transaction management, and common utilities. This produces the main duckdb static library and optionally a shared library.

Key considerations:

  • All functions in the core must be in the duckdb namespace
  • The source tree follows a modular architecture: the parser transforms SQL text into an AST, the planner converts the AST into logical plans, the optimizer rewrites those plans, and the execution layer converts logical plans into physical operators
  • Query processing uses a push-based execution model

Step 6: Build Executables

Link the compiled core library with third-party dependencies and selected extensions to produce the final executables: the duckdb CLI shell (tools/), the unittest binary (test/), and optionally the benchmark_runner (benchmark/). Extensions configured for static linking are embedded into these binaries.

Key considerations:

  • The CLI shell provides interactive SQL access
  • The unittest binary uses the Catch2 testing framework
  • The benchmark runner requires BUILD_BENCHMARK=1 and optionally BUILD_TPCH=1
  • Debug builds use make debug; release builds use plain make
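
The build invocations named above can be composed as follows. This is a sketch that only prints the command so it can be reviewed before running; build_command is a hypothetical helper, and the flavour names debug and release are labels chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the make command for a build flavour.
# $1 = debug | release, $2 = 1 to include the benchmark runner (default 0).
build_command() {
    flavour=$1
    with_benchmark=${2:-0}
    prefix=""
    if [ "$with_benchmark" = "1" ]; then
        # Flags named in this step for the optional benchmark runner
        prefix="BUILD_BENCHMARK=1 BUILD_TPCH=1 "
    fi
    case "$flavour" in
        debug)   printf '%smake debug\n' "$prefix" ;;
        release) printf '%smake\n' "$prefix" ;;
        *)       echo "unknown flavour: $flavour" >&2; return 1 ;;
    esac
}

build_command debug        # prints: make debug
build_command release 1    # prints: BUILD_BENCHMARK=1 BUILD_TPCH=1 make
```

To run the printed command, execute it from the repository root, for example with eval "$(build_command release 1)".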

Step 7: Verify Build

Run the fast unit test suite to confirm the build produces correct results. The fast tests complete in approximately one minute, while the full test suite runs for approximately one hour. Tests use both the sqllogictest framework (.test files) and C++ Catch2 tests.

Key considerations:

  • make unit runs the fast test suite
  • make allunit runs the complete test suite
  • Slower tests should be added as .test_slow files
  • Code coverage can be checked using scripts/coverage_check.sh
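
The choice between the two test targets, and between the two test file suffixes, can be summarised in a short sketch. Both functions are hypothetical helpers, and the scope names fast, full, and slow are labels chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper: map a test scope to the make target from this step.
test_target() {
    case "$1" in
        fast) echo "unit" ;;      # roughly one minute
        full) echo "allunit" ;;   # roughly one hour
        *)    echo "unknown scope: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: choose the suffix for a new sqllogictest file.
test_suffix() {
    if [ "$1" = "slow" ]; then
        echo ".test_slow"
    else
        echo ".test"
    fi
}

echo "make $(test_target fast)"   # prints: make unit
echo "make $(test_target full)"   # prints: make allunit
```

Running the fast target after every build and reserving the full target for pre-merge checks keeps the feedback loop short.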
