Principle:Duckdb Duckdb Function Registration Generation

From Leeroopedia


Overview

This principle covers generating function registration and lookup table code from declarative function definitions. Rather than manually writing C++ registration code for every built-in and extension function, DuckDB uses Python scripts to read declarative JSON-based function definitions and produce the necessary registration infrastructure automatically.

Description

The Function Registration Generation principle governs the auto-generation of C++ function registration code and extension-to-function mapping tables. This encompasses two distinct but related concerns:

  1. Built-in function registration -- A code generator reads function definition JSON files and produces C++ source that registers each function (scalar, aggregate, table, pragma, etc.) with the DuckDB catalog at startup.
  2. Extension function lookup tables -- A separate code generator scans extension source trees, discovers which functions each extension provides, and builds lookup tables that map function names to their owning extension. This enables DuckDB to suggest or auto-load the correct extension when a user invokes an unrecognized function.
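As a rough illustration of the first concern, the sketch below turns a JSON list of function definitions into C++ registration source. The JSON field names (`name`, `type`), the `REGISTER_SCALAR` and `REGISTER_AGGREGATE` macros, the `RegistrationEntry` type, and the `...Fun` class-naming rule are all assumptions made for this example; they are not taken from DuckDB's actual schema or generator.

```python
import json

# Hypothetical definitions; the field names are illustrative, not DuckDB's real schema.
DEFINITIONS_JSON = """
[
  {"name": "list_sort", "type": "scalar_function"},
  {"name": "approx_count_distinct", "type": "aggregate_function"},
  {"name": "abs", "type": "scalar_function"}
]
"""

# Hypothetical mapping from a declared function type to a C++ registration macro.
MACRO_BY_TYPE = {
    "scalar_function": "REGISTER_SCALAR",
    "aggregate_function": "REGISTER_AGGREGATE",
}


def to_class_name(function_name):
    """Convert a snake_case function name to an assumed C++ class name."""
    return "".join(part.capitalize() for part in function_name.split("_")) + "Fun"


def generate_registration(definitions):
    """Return C++ source text for a registration table, sorted by function name."""
    lines = [
        "// Generated file, do not edit by hand.",
        "static const RegistrationEntry entries[] = {",
    ]
    for entry in sorted(definitions, key=lambda d: d["name"]):
        macro = MACRO_BY_TYPE[entry["type"]]  # unknown types fail loudly with KeyError
        lines.append(f"\t{macro}({to_class_name(entry['name'])}),")
    lines.append("};")
    return "\n".join(lines)


definitions = json.loads(DEFINITIONS_JSON)
generated_cpp = generate_registration(definitions)
print(generated_cpp)
```

Sorting the entries makes the output deterministic, so re-running the generator on unchanged input produces an identical file and keeps diffs small.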

By deriving registration code from a single declarative source, this principle ensures that:

  • There is no divergence between the declared set of functions and what is actually registered at runtime.
  • Extension lookup tables are always consistent with the extension source code.
  • Adding a new function requires only updating a JSON definition or adding the function to an extension source tree; the code generator handles the rest.
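The extension side can be sketched in the same spirit. The example below assumes, purely for illustration, that extension sources register functions through a hypothetical `RegisterFunction("name")` call that a regular expression can find, and that the source trees are represented by an in-memory dictionary. The real generator may discover functions by other means, and `ExtensionFunctionEntry` and `EXTENSION_FUNCTIONS` are invented names.

```python
import re

# Hypothetical stand-in for extension source trees: extension name -> source text.
EXTENSION_SOURCES = {
    "json": 'RegisterFunction("json_extract"); RegisterFunction("json_valid");',
    "icu": 'RegisterFunction("icu_sort_key");',
}

# Assumed registration convention used only by this sketch.
REGISTER_CALL = re.compile(r'RegisterFunction\("([a-z0-9_]+)"\)')


def discover_functions(sources):
    """Map each discovered function name to the extension that owns it."""
    mapping = {}
    for extension, text in sources.items():
        for name in REGISTER_CALL.findall(text):
            owner = mapping.get(name)
            if owner is not None and owner != extension:
                raise ValueError(f"{name} is claimed by both {owner} and {extension}")
            mapping[name] = extension
    return mapping


def render_lookup_table(mapping):
    """Render the mapping as a sorted C++ array of {function, extension} pairs."""
    lines = ["static constexpr ExtensionFunctionEntry EXTENSION_FUNCTIONS[] = {"]
    for name in sorted(mapping):
        lines.append(f'\t{{"{name}", "{mapping[name]}"}},')
    lines.append("};")
    return "\n".join(lines)


function_owner = discover_functions(EXTENSION_SOURCES)
lookup_table_cpp = render_lookup_table(function_owner)
print(lookup_table_cpp)
```

Because the table is derived from the sources on every run, a function added to an extension shows up in the lookup table without any manual bookkeeping.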

Usage

Apply this principle when adding new built-in functions or extension functions to the DuckDB codebase. Specifically:

  • When a new scalar, aggregate, table, or pragma function is introduced, its definition should be added to the appropriate JSON input file so that the generator can produce the registration code.
  • When a new extension is created or an existing extension gains new functions, the extension function generator should be re-run to update the lookup tables.
  • During the build process, both generators are invoked as part of the code generation pipeline (steps 5-7), which keeps the generated artifacts in sync with their inputs.
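One common way to keep generated artifacts in sync without touching unchanged files is to write output only when it differs from what is already on disk. The helper below is a minimal sketch of that idea, not a description of how DuckDB's scripts behave; the function name `write_if_changed` and the file name `function_list.cpp` are hypothetical.

```python
import tempfile
from pathlib import Path


def write_if_changed(path, new_content):
    """Write new_content to path only when it differs; return True if the file was updated."""
    path = Path(path)
    if path.exists() and path.read_text() == new_content:
        return False
    path.write_text(new_content)
    return True


# Demonstration in a temporary directory, with a hypothetical output file name.
with tempfile.TemporaryDirectory() as directory:
    target = Path(directory) / "function_list.cpp"
    first = write_if_changed(target, "// generated v1\n")   # file is created
    second = write_if_changed(target, "// generated v1\n")  # identical content, skipped
    third = write_if_changed(target, "// generated v2\n")   # content changed, rewritten

print(first, second, third)
```

Skipping identical writes preserves file timestamps, which can prevent a build system from recompiling sources that did not actually change.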

Theoretical Basis

This principle draws on several well-established software engineering concepts:

  • Function registries -- A central registry pattern where all available functions are catalogued in one place, enabling uniform discovery and dispatch.
  • Lookup tables -- Static data structures that map keys (function names) to values (extensions, registration entries), generated at build time for efficient runtime access.
  • Declarative function definition -- Instead of imperative registration calls scattered across the codebase, functions are described declaratively in JSON. A generator translates these declarations into the imperative C++ code the engine requires.
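The registry and lookup-table concepts can be combined in a small model. The sketch below uses an invented `FunctionRegistry` class: registered functions are dispatched from one central dictionary, and an unknown name falls back to a static name-to-extension table to produce a helpful error. It illustrates the pattern only and does not mirror DuckDB's catalog API.

```python
class FunctionRegistry:
    """Central registry with a fallback lookup table for extension functions."""

    def __init__(self, extension_lookup):
        self._functions = {}
        # Static table produced at build time in the real system; a plain dict here.
        self._extension_lookup = dict(extension_lookup)

    def register(self, name, implementation):
        if name in self._functions:
            raise ValueError(f'Function "{name}" is already registered.')
        self._functions[name] = implementation

    def call(self, name, *args):
        function = self._functions.get(name)
        if function is not None:
            return function(*args)
        extension = self._extension_lookup.get(name)
        if extension is not None:
            raise LookupError(
                f'Function "{name}" is provided by the "{extension}" extension; load it first.'
            )
        raise LookupError(f'Function "{name}" does not exist.')


registry = FunctionRegistry({"json_valid": "json"})
registry.register("abs", abs)

print(registry.call("abs", -3))
try:
    registry.call("json_valid", "{}")
except LookupError as error:
    print(error)
```

Keeping the fallback table separate from the registered functions lets the engine name the owning extension even though that extension's code has not been loaded.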

Related

Implementation:Duckdb_Duckdb_Generate_Functions
