Environment:Duckdb Duckdb Code Generation Tools

From Leeroopedia


Knowledge Sources
Domains: Infrastructure, Code_Generation
Last Updated: 2026-02-07 12:00 GMT

Overview

Python 3 environment with GNU Bison 2.3+ and Flex 2.5+ for DuckDB's code generation pipeline (SQL grammar, enums, serialization, C API headers).

Description

This environment provides the tooling required to run DuckDB's code generation scripts. The code generation pipeline consists of approximately 15 Python scripts that auto-generate C++ source files including the SQL parser (via Bison/Flex), enum utilities, serialization code, C API headers, function registration boilerplate, and settings configuration. All generators are Python 3 scripts that read JSON or header definitions and output C++ source files.

Usage

Use this environment when running any of the `generate_*.py` scripts in the `scripts/` directory, or when executing the full code generation pipeline as a prerequisite to source amalgamation. It is required before building whenever the generated files are out of date.

System Requirements

| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Cross-platform Python scripts |
| Hardware | Any | No special hardware requirements |
| Disk | 100 MB free space | For generated output files |

Dependencies

System Packages

  • `bison` (GNU Bison) 2.3+ (for SQL grammar generation)
  • `flex` 2.5+ (for lexer/scanner generation)
  • `python3` (for all code generation scripts)
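
Before running any generator, it can help to confirm that the three tools listed above are reachable on `PATH`. The following is a minimal sketch, not part of DuckDB's scripts: the function name `find_missing_tools` and its `which` parameter are illustrative assumptions, and the check only looks for the executables without validating their versions.

```python
import shutil


def find_missing_tools(tools=("bison", "flex", "python3"), which=shutil.which):
    """Return the tools from `tools` that cannot be located on PATH.

    `which` is injectable so the check can be exercised without the
    real tools installed (hypothetical helper, not a DuckDB API).
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All code generation tools were found on PATH")
```

An empty list means every tool was found; any names returned are the packages to install first.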

Python Packages

No additional Python packages are required beyond the standard library. All generation scripts use only built-in modules (`os`, `subprocess`, `re`, `sys`, `json`, `struct`).

Credentials

No credentials are required for code generation.

Quick Install

# Ubuntu/Debian
sudo apt-get install -y python3 bison flex

# macOS
brew install python3 bison flex

# Run all generators
python3 scripts/generate_grammar.py
python3 scripts/generate_flex.py
python3 scripts/generate_enum_util.py
python3 scripts/generate_serialization.py
python3 scripts/generate_c_api.py
python3 scripts/generate_functions.py
python3 scripts/generate_settings.py
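
The generator commands above can also be driven from a single Python entry point that stops at the first failure. This is a sketch under stated assumptions: the script list is copied from the commands above, while `run_generators` and its `run` parameter are hypothetical names introduced for illustration and are not part of the DuckDB repository.

```python
import subprocess
import sys

# Script paths copied from the Quick Install commands; run from the repository root.
GENERATORS = [
    "scripts/generate_grammar.py",
    "scripts/generate_flex.py",
    "scripts/generate_enum_util.py",
    "scripts/generate_serialization.py",
    "scripts/generate_c_api.py",
    "scripts/generate_functions.py",
    "scripts/generate_settings.py",
]


def run_generators(scripts=GENERATORS, run=subprocess.run):
    """Run each script in order with the current interpreter.

    Returns the path of the first script that exits with a non-zero
    status, or None if every script succeeds. `run` is injectable so
    the loop can be tested without executing anything.
    """
    for script in scripts:
        result = run([sys.executable, script])
        if result.returncode != 0:
            return script
    return None
```

Calling `run_generators()` from the repository root and checking for a non-`None` result gives a simple pass/fail signal for the whole pipeline.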

Code Evidence

Bison version comment from `scripts/generate_grammar.py:1-3`:

# use bison to generate the parser files
# the following version of bison is used:
# bison (GNU Bison) 2.3

Flex invocation from `scripts/generate_flex.py:28-30`:

proc = subprocess.Popen(
    [flex_bin, '--nounistd', '-o', target_file, flex_file_path],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE
)
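
The excerpt above only starts the process. A typical way to finish such a call is to collect both pipes and check the exit status; the sketch below illustrates that pattern and is not the code from `generate_flex.py`. The helper name `run_tool` and the error text are assumptions made for the example.

```python
import subprocess
import sys


def run_tool(cmd):
    """Run `cmd`, capture its output, and raise if it exits non-zero."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes and waits for the process to exit,
    # which avoids deadlocks when the tool writes a lot of output.
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        message = stderr.decode(errors="replace").strip()
        raise RuntimeError(f"{cmd[0]} failed with exit code {proc.returncode}: {message}")
    return stdout.decode(errors="replace")


if __name__ == "__main__":
    # Demonstrate with the current interpreter so no external tool is required.
    print(run_tool([sys.executable, "-c", "print('ok')"]).strip())
```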

Flex version comment from `scripts/generate_flex.py:1-3`:

# use flex to generate the scanner file for the parser
# the following version of bison is used:
# flex 2.5.35 Apple(flex-32)
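
The version strings quoted in these comments (`bison (GNU Bison) 2.3` and `flex 2.5.35 Apple(flex-32)`) suggest a simple way to compare an installed tool against the minimums listed under Dependencies. The sketch below assumes the first dotted number in the tool's version line is its version; `parse_version` and `meets_minimum` are hypothetical helpers, not functions from the DuckDB scripts.

```python
import re


def parse_version(version_line):
    """Extract the first dotted version number as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", version_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version_line, minimum):
    """Return True if the parsed version is at least `minimum` (a tuple of ints)."""
    version = parse_version(version_line)
    return version is not None and version >= minimum


if __name__ == "__main__":
    print(parse_version("bison (GNU Bison) 2.3"))
    print(meets_minimum("flex 2.5.35 Apple(flex-32)", (2, 5)))
```

Tuples compare element by element, so `(2, 5, 35)` is treated as newer than `(2, 5)` without any string handling.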

Custom bison path support from `scripts/generate_grammar.py:19-20`:

    if arg.startswith("--bison="):
        bison_location = arg.replace("--bison=", "")
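
The same prefix-matching idea can be written as a small standalone function. This is an illustrative sketch rather than the script's actual argument handling: `parse_tool_paths` is a hypothetical name, and handling `--bison=` and `--flex=` in one place is an assumption made to keep the example compact.

```python
def parse_tool_paths(argv, defaults=None):
    """Return a dict of tool locations, overridden by --<tool>=<path> arguments."""
    paths = dict(defaults or {"bison": "bison", "flex": "flex"})
    for arg in argv:
        for tool in paths:
            prefix = f"--{tool}="
            if arg.startswith(prefix):
                # Slice off the prefix so a path that itself contains
                # the text "--bison=" is left untouched.
                paths[tool] = arg[len(prefix):]
    return paths


if __name__ == "__main__":
    print(parse_tool_paths(["--bison=/usr/local/opt/bison/bin/bison"]))
```

Arguments that match no known prefix are ignored, so the function can share `sys.argv[1:]` with other option handling.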

Common Errors

| Error Message | Cause | Solution |
|---|---|---|
| `Flex failed` | Flex not installed or wrong version | Install flex: `sudo apt-get install flex` |
| `bison: command not found` | Bison not installed | Install bison: `sudo apt-get install bison` |
| `FileNotFoundError: python3` | Python 3 not installed | Install Python 3: `sudo apt-get install python3` |
| Grammar conflicts in bison output | Modified grammar rules with ambiguities | Pass the `--counterexamples` flag to `generate_grammar.py` for debugging (Bison's counterexample output was added in Bison 3.7, so a newer bison than 2.3 is likely needed) |

Compatibility Notes

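  • macOS: The system-provided bison is old (2.3). A Homebrew bison can be used instead via the `--bison=/usr/local/opt/bison/bin/bison` argument; on Apple Silicon Macs the default Homebrew prefix is `/opt/homebrew` rather than `/usr/local`.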
  • Windows: Bison and flex are typically installed via MSYS2 or WSL.
  • Custom paths: Both bison and flex locations can be overridden via command-line arguments (`--bison=`, `--flex=`).
  • Custom directory prefix: All generators support `--custom_dir_prefix` for out-of-tree builds.
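
One way to act on the macOS note is to look for a Homebrew bison first and fall back to whatever is on `PATH`, then pass the result through `--bison=`. The sketch below is an assumption-laden illustration: `pick_bison` is a hypothetical helper, the first candidate path comes from the note above, and the second is the usual Homebrew prefix on Apple Silicon.

```python
import os
import shutil

# Candidate Homebrew locations (assumed defaults; adjust for custom prefixes).
HOMEBREW_BISON_CANDIDATES = (
    "/usr/local/opt/bison/bin/bison",     # path from the macOS note (Intel Macs)
    "/opt/homebrew/opt/bison/bin/bison",  # typical Apple Silicon prefix
)


def pick_bison(candidates=HOMEBREW_BISON_CANDIDATES,
               exists=os.path.exists, which=shutil.which):
    """Return the first existing candidate path, else the bison found on PATH.

    Returns None when no bison can be located at all.
    """
    for path in candidates:
        if exists(path):
            return path
    return which("bison")


if __name__ == "__main__":
    location = pick_bison()
    if location is None:
        print("bison not found")
    else:
        print(f"--bison={location}")
```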
