Principle:Duckdb Duckdb C API Header Generation
Overview
This principle covers generating stable C API headers from machine-readable function specifications. Instead of being hand-written into the C header files that define a public API surface, functions are declared in structured JSON definition files, and a code generator produces the final headers. This keeps multiple output formats consistent (main C header, extension header, Go extension header, internal extension API).
Description
The practice is to generate C API bindings from JSON definitions rather than hand-writing headers. Because every output is derived from a single source of truth, the generator can keep the headers consistent and handle deprecation marking, versioning, and extension API generation in one place.
In the DuckDB project, every public C API function is declared in a JSON file that specifies:
- The function name, return type, and parameters (with types and names)
- A group classification (e.g., open_connect, query_execution, appender)
- Deprecation status at the group level
- Comment blocks including parameter descriptions and return value documentation
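To make the fields above concrete, here is a minimal sketch of what one definition entry and its rendered declaration might look like. The JSON key names (`group`, `deprecated`, `entries`, `params`, `comment`, and so on) and the comment wording are assumptions chosen for illustration, not the project's actual schema; only the `duckdb_open` signature is taken from the public C API. The helper functions are hypothetical.

```python
import json

# Hypothetical group file. The key names are assumptions for illustration,
# not the actual schema used by the DuckDB generator.
GROUP_JSON = """
{
  "group": "open_connect",
  "deprecated": false,
  "entries": [
    {
      "name": "duckdb_open",
      "return_type": "duckdb_state",
      "params": [
        {"type": "const char *", "name": "path"},
        {"type": "duckdb_database *", "name": "out_database"}
      ],
      "comment": {
        "description": "Illustrative description text.",
        "param_comments": {
          "path": "Illustrative parameter text.",
          "out_database": "Illustrative parameter text."
        },
        "return_value": "Illustrative return value text."
      }
    }
  ]
}
"""


def format_param(param):
    """Join a C type and a parameter name, keeping '*' attached to the name."""
    sep = "" if param["type"].endswith("*") else " "
    return f"{param['type']}{sep}{param['name']}"


def render_declaration(entry):
    """Render one JSON entry as a C function declaration."""
    params = ", ".join(format_param(p) for p in entry["params"]) or "void"
    return f"{entry['return_type']} {entry['name']}({params});"


group = json.loads(GROUP_JSON)
declaration = render_declaration(group["entries"][0])
print(declaration)
# prints: duckdb_state duckdb_open(const char *path, duckdb_database *out_database);
```

Keeping the type and name as separate fields, rather than storing a pre-formatted signature string, is what lets the same entry be rendered differently for each output header.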
A Python code generator reads all of these JSON definitions, validates them for duplicates and completeness, and produces multiple output headers:
- Main C header (duckdb.h) -- the unified public C API for linking with DuckDB
- Extension C header (duckdb_extension.h) -- the API surface available to C extensions
- Go extension header (duckdb_go_extension.h) -- the API surface for Go-based extensions
- Internal extension API (extension_api.hpp) -- an internal C++ header for extension API dispatch
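The validation step mentioned above (checking for duplicates and completeness before any header is written) could be sketched as follows. This is an illustration under assumed field names, not the generator's actual code; the `example_*` functions and group names are hypothetical.

```python
def validate(groups):
    """Return a list of error strings; an empty list means the definitions passed."""
    errors = []
    seen = {}  # function name -> group that first declared it
    for group in groups:
        for entry in group["entries"]:
            name = entry["name"]
            # Duplicate check: a function may be declared in only one group.
            if name in seen:
                errors.append(
                    f"{name}: declared in both '{seen[name]}' and '{group['group']}'"
                )
            else:
                seen[name] = group["group"]
            # Completeness check: every function needs a description.
            comment = entry.get("comment") or {}
            if not comment.get("description"):
                errors.append(f"{name}: missing description")
            # Completeness check: every parameter needs a comment.
            documented = comment.get("param_comments") or {}
            for param in entry.get("params", []):
                if param["name"] not in documented:
                    errors.append(f"{name}: parameter '{param['name']}' is undocumented")
    return errors


# Hypothetical definitions containing one duplicate and one undocumented function.
groups = [
    {"group": "group_a", "entries": [
        {"name": "example_open",
         "params": [{"type": "const char *", "name": "path"}],
         "comment": {"description": "Opens something.",
                     "param_comments": {"path": "Where to open."}}},
    ]},
    {"group": "group_b", "entries": [
        {"name": "example_open", "params": [],
         "comment": {"description": "Duplicate of the entry in group_a."}},
        {"name": "example_close",
         "params": [{"type": "void *", "name": "handle"}],
         "comment": {}},
    ]},
]

errors = validate(groups)
for error in errors:
    print(error)
```

Collecting all errors instead of stopping at the first one means a contributor sees every problem in a single generator run.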
The generator also manages:
- Function ordering, kept stable so that regenerated headers produce minimal diffs in review
- Extension API versioning with stable and unstable version tags
- Exclusion lists for functions intentionally omitted from the extension API
- Doxygen-style comments generated from the JSON comment fields
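As a rough illustration of how version tags, stable ordering, and an exclusion list might interact, the sketch below renders a C struct of function pointers from a list of versions. Everything here is hypothetical: the `example_*` function names, the version labels, the struct name, and the data layout are assumptions made for the example and do not reflect the real extension API.

```python
# Hypothetical function table: name -> (return type, parameter list).
FUNCTIONS = {
    "example_open": ("int", "const char *path"),
    "example_close": ("void", "void *handle"),
    "example_debug_dump": ("void", "void *handle"),
    "example_scan": ("int", "void *handle"),
}

# Hypothetical version files, listed in the order they were introduced.
VERSIONS = [
    {"version": "v1.0.0", "unstable": False,
     "entries": ["example_open", "example_close", "example_debug_dump"]},
    {"version": "unstable_new_scan", "unstable": True,
     "entries": ["example_scan"]},
]

# Hypothetical exclusion list: functions deliberately left out of the struct.
EXCLUDED = {"example_debug_dump"}


def render_extension_struct(functions, versions, excluded, struct_name="example_ext_api"):
    """Render a C struct of function pointers, one section per API version."""
    lines = ["typedef struct {"]
    for version in versions:
        tag = "unstable" if version["unstable"] else "stable"
        lines.append(f"    // {version['version']} ({tag})")
        for name in version["entries"]:
            if name in excluded:
                continue  # intentionally omitted from the extension API
            return_type, params = functions[name]
            lines.append(f"    {return_type} (*{name})({params});")
    lines.append(f"}} {struct_name};")
    return "\n".join(lines)


struct_text = render_extension_struct(FUNCTIONS, VERSIONS, EXCLUDED)
print(struct_text)
```

In this sketch, members are emitted strictly in version order and never re-sorted, so adding a version only appends to the struct. For a table of function pointers, that append-only layout is what keeps earlier members at the same offsets.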
Usage
This principle applies when maintaining a stable C API surface that needs to be consistent across multiple output formats. Specific scenarios include:
- Adding a new C API function -- add a JSON entry to the appropriate group file; the generator produces the header declaration, comment block, and extension struct entry
- Deprecating a function -- mark the group or function as deprecated in JSON; the generator adds appropriate deprecation annotations
- Adding a new extension API version -- create a new version JSON under the apis/v1/ directory; the generator integrates it into the extension struct with proper versioning
- Auditing the public API -- the JSON definitions serve as a machine-readable manifest of every public symbol
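The deprecation scenario above can be sketched as a small rendering step: a flag in the definition changes both the generated comment and the surrounding preprocessor guard. The guard macro `EXAMPLE_API_NO_DEPRECATED`, the notice wording, and the field names are assumptions for illustration; the actual annotations emitted by the generator may differ.

```python
# Hypothetical macro that lets consumers compile out deprecated functions.
GUARD = "EXAMPLE_API_NO_DEPRECATED"


def render_comment(comment, deprecated):
    """Build a Doxygen-style comment block from the JSON comment fields."""
    lines = ["/*!"]
    if deprecated:
        lines.append("DEPRECATED: this function may be removed in a future release.")
        lines.append("")
    lines.append(comment["description"])
    lines.append("")
    for name, text in comment.get("param_comments", {}).items():
        lines.append(f"@param {name} {text}")
    if comment.get("return_value"):
        lines.append(f"@return {comment['return_value']}")
    lines.append("*/")
    return "\n".join(lines)


def render_entry(group, entry):
    """Render comment plus declaration, guarded if the group or entry is deprecated."""
    deprecated = group.get("deprecated", False) or entry.get("deprecated", False)
    body = render_comment(entry["comment"], deprecated) + "\n" + entry["declaration"]
    if deprecated:
        body = f"#ifndef {GUARD}\n{body}\n#endif"
    return body


# Hypothetical group marked as deprecated at the group level.
old_group = {"deprecated": True}
entry = {
    "declaration": "int example_old_fetch(void *handle);",
    "comment": {
        "description": "Fetches a value the old way.",
        "param_comments": {"handle": "The handle to fetch from."},
        "return_value": "Zero on success.",
    },
}

deprecated_text = render_entry(old_group, entry)
current_text = render_entry({"deprecated": False}, entry)
print(deprecated_text)
```

Because the flag lives in the definition rather than in the header, flipping it once updates every generated output on the next run.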
Theoretical Basis
- Code generation from schemas -- a declarative specification (JSON) drives the production of multiple output artifacts, eliminating drift between them
- API surface management -- a single canonical source for every public symbol enables automated validation (no duplicates, no missing comments, consistent signatures)
- FFI boundary definition -- C headers serve as the Foreign Function Interface for all language bindings; generating them from a common source ensures that the C, Go, and extension APIs remain synchronized
- Deprecation lifecycle -- versioned API definitions with stable/unstable tags support incremental migration without breaking existing consumers