Implementation:Duckdb Duckdb Generate Enum Util
Overview
Concrete tool for generating enum utility functions (to-string, from-string conversions) from C++ enum definitions. Two scripts collaborate: generate_enum_util.py scans C++ headers for enum class declarations, while generate_enums.py processes JSON-defined enums for extensions.
Code Reference
| Field | Value |
|---|---|
| Source (enum_util) | scripts/generate_enum_util.py (lines 1-272) |
| Source (enums) | scripts/generate_enums.py (lines 1-161) |
| Language | Python 3 |
| API | python3 scripts/generate_enum_util.py, python3 scripts/generate_enums.py |
| External Dependencies | python3 (no third-party packages required) |
I/O Contract
generate_enum_util.py
Inputs
The script walks the entire src/ directory tree, scanning all .hpp files for enum class declarations. It uses the regex pattern:
re.finditer(r"enum class (\w*)\s*:\s*(\w*)\s*{((?:\s*[^}])*)}", text, re.MULTILINE)
This captures:
- Group 1: Enum name
- Group 2: Underlying type (e.g., uint8_t)
- Group 3: Enum body with all members
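To see what the three capture groups hold, the pattern can be run against a small header snippet. This is a minimal sketch: the `SampleMode` enum and the `parse_members` helper are hypothetical and only illustrate one way the captured body could be split into member names. It is not the script's actual parsing code.

```python
import re

# Pattern quoted from the section above.
ENUM_PATTERN = r"enum class (\w*)\s*:\s*(\w*)\s*{((?:\s*[^}])*)}"

# Hypothetical header content, used only for illustration.
SAMPLE_HEADER = """
enum class SampleMode : uint8_t {
    INVALID = 0,
    ASCENDING = 1,
    DESCENDING = 2
};
"""


def parse_members(body):
    """Split an enum body into member names, dropping explicit values.

    Assumption: members are comma-separated and the body has no comments.
    """
    members = []
    for entry in body.split(","):
        name = entry.split("=")[0].strip()
        if name:
            members.append(name)
    return members


def find_enums(text):
    """Return (name, underlying_type, members) for each enum class in text."""
    found = []
    for match in re.finditer(ENUM_PATTERN, text, re.MULTILINE):
        name, underlying, body = match.group(1), match.group(2), match.group(3)
        found.append((name, underlying, parse_members(body)))
    return found


enums = find_enums(SAMPLE_HEADER)
print(enums)
```

Because the pattern requires a `:` after the enum name, an `enum class` declared without an explicit underlying type is not matched by it.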
| Input | Description |
|---|---|
| src/include/duckdb/common/enums/*.hpp | Primary location of enum definitions |
| src/**/*.hpp (all headers) | Any header under src/ containing enum class declarations |
Outputs
| Output File | Description |
|---|---|
| src/include/duckdb/common/enum_util.hpp | Header with EnumUtil struct, forward declarations of all enums, and template specialization declarations |
| src/common/enum_util.cpp | Source file with EnumUtil::ToChars and EnumUtil::FromString template specialization implementations |
Configuration
| Config | Purpose |
|---|---|
| blacklist | List of enum names to skip (e.g., RegexOptions, Flags, ContainerType, Type, DictionaryAppendState, DictFSSTMode, ComplexJSONType, UnavailableReason) |
| overrides | Dictionary mapping enum names to member-level string overrides (e.g., LogicalTypeId::SQLNULL maps to "NULL", CompressionType::COMPRESSION_AUTO maps to "AUTO") |
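The two settings could be applied along the following lines. This is a sketch only: the enum and member names come from the table above, but the exact shape of the data structures and the helper names `should_generate` and `string_for` are assumptions, not the script's own code.

```python
# Names taken from the examples in the table above; the list is shortened
# and the nested-dictionary layout is an assumption.
blacklist = ["RegexOptions", "Flags", "ContainerType", "Type"]
overrides = {
    "LogicalTypeId": {"SQLNULL": "NULL"},
    "CompressionType": {"COMPRESSION_AUTO": "AUTO"},
}


def should_generate(enum_name):
    """Skip enums that are listed in the blacklist."""
    return enum_name not in blacklist


def string_for(enum_name, member):
    """Return the override string if one exists, else the member name."""
    return overrides.get(enum_name, {}).get(member, member)


print(should_generate("Flags"))
print(string_for("LogicalTypeId", "SQLNULL"))
print(string_for("LogicalTypeId", "INTEGER"))
```

Members without an override fall back to their own name, so only the exceptions need to be listed.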
Generated Code Pattern
For each discovered enum, the script generates:
// In enum_util.hpp:
template<>
const char* EnumUtil::ToChars<MyEnum>(MyEnum value);
template<>
MyEnum EnumUtil::FromString<MyEnum>(const char *value);
// In enum_util.cpp:
const StringUtil::EnumStringLiteral *GetMyEnumValues() {
static constexpr StringUtil::EnumStringLiteral values[] {
{ static_cast<uint32_t>(MyEnum::MEMBER_A), "MEMBER_A" },
{ static_cast<uint32_t>(MyEnum::MEMBER_B), "MEMBER_B" }
};
return values;
}
template<>
const char* EnumUtil::ToChars<MyEnum>(MyEnum value) {
return StringUtil::EnumToString(GetMyEnumValues(), 2, "MyEnum",
static_cast<uint32_t>(value));
}
template<>
MyEnum EnumUtil::FromString<MyEnum>(const char *value) {
return static_cast<MyEnum>(
StringUtil::StringToEnum(GetMyEnumValues(), 2, "MyEnum", value));
}
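The pattern above is regular enough to produce with plain string templates. The following sketch shows one way a generator might render the values table and the `ToChars` specialization. The function names `render_values_function` and `render_to_chars` are hypothetical, and the real script may build its output differently.

```python
def render_values_function(enum_name, members):
    """Render the Get<Enum>Values() table from (member, string) pairs."""
    entries = ",\n".join(
        f'\t\t{{ static_cast<uint32_t>({enum_name}::{member}), "{text}" }}'
        for member, text in members
    )
    return (
        f"const StringUtil::EnumStringLiteral *Get{enum_name}Values() {{\n"
        f"\tstatic constexpr StringUtil::EnumStringLiteral values[] {{\n"
        f"{entries}\n"
        f"\t}};\n"
        f"\treturn values;\n"
        f"}}\n"
    )


def render_to_chars(enum_name, member_count):
    """Render the ToChars specialization that looks up the values table."""
    return (
        "template<>\n"
        f"const char* EnumUtil::ToChars<{enum_name}>({enum_name} value) {{\n"
        f"\treturn StringUtil::EnumToString(Get{enum_name}Values(), "
        f'{member_count}, "{enum_name}", static_cast<uint32_t>(value));\n'
        "}\n"
    )


pairs = [("MEMBER_A", "MEMBER_A"), ("MEMBER_B", "MEMBER_B")]
print(render_values_function("MyEnum", pairs))
print(render_to_chars("MyEnum", len(pairs)))
```

Passing (member, string) pairs rather than bare member names leaves room for string overrides, where the emitted literal differs from the C++ identifier.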
generate_enums.py
Inputs
JSON files with _enums.json suffix defining enum classes for extensions:
[
{
"name": "MyExtEnum",
"values": ["VALUE_A", "VALUE_B", "VALUE_C"],
"includes": ["duckdb/common/constants.hpp"]
}
]
Currently configured for extension/json/include/*_enums.json.
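A definition file of this shape can be read with the standard `json` module. The sketch below loads the sample entry and derives output file names from an input path. The helpers `load_enum_definitions` and `output_paths` are hypothetical, the validation rule is an assumption, and `json_enums.json` is only an example name that fits the `*_enums.json` pattern.

```python
import json
from pathlib import PurePosixPath

# Sample definition copied from the JSON example above.
SAMPLE_JSON = """
[
  {
    "name": "MyExtEnum",
    "values": ["VALUE_A", "VALUE_B", "VALUE_C"],
    "includes": ["duckdb/common/constants.hpp"]
  }
]
"""


def load_enum_definitions(text):
    """Parse the JSON text and check that each entry has a name and values."""
    definitions = json.loads(text)
    for entry in definitions:
        if not entry.get("name") or not entry.get("values"):
            raise ValueError(f"incomplete enum definition: {entry!r}")
    return definitions


def output_paths(json_path):
    """Map an *_enums.json path to a header path and a source path.

    Assumption: the header sits next to the JSON file and the source file
    sits one directory up.
    """
    path = PurePosixPath(json_path)
    header = path.with_suffix(".hpp")
    source = path.parent.parent / (path.stem + ".cpp")
    return str(header), str(source)


definitions = load_enum_definitions(SAMPLE_JSON)
print(definitions[0]["name"], definitions[0]["values"])
print(output_paths("extension/json/include/json_enums.json"))
```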
Outputs
| Output | Description |
|---|---|
| extension/json/include/*_enums.hpp | Generated enum class header with EnumUtil specialization declarations |
| extension/json/*_enums.cpp | Generated source with switch-based ToChars and if-chain FromString implementations |
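To make "switch-based" and "if-chain" concrete, the sketch below renders both shapes for one enum. The function signatures follow the EnumUtil declarations shown earlier. The comparison helper `StringUtil::Equals`, the fallback `throw` statements, and the renderer names are assumptions about what the generated code could look like, not a copy of the script's output.

```python
def render_switch_to_chars(enum_name, values):
    """Render a ToChars specialization as a switch over the enum members."""
    lines = [
        "template<>",
        f"const char* EnumUtil::ToChars<{enum_name}>({enum_name} value) {{",
        "\tswitch (value) {",
    ]
    for member in values:
        lines.append(f"\tcase {enum_name}::{member}:")
        lines.append(f'\t\treturn "{member}";')
    # Assumed fallback for values without a case label.
    lines.append("\tdefault:")
    lines.append('\t\tthrow NotImplementedException("Enum value not implemented");')
    lines.append("\t}")
    lines.append("}")
    return "\n".join(lines)


def render_if_chain_from_string(enum_name, values):
    """Render a FromString specialization as a chain of string comparisons."""
    lines = [
        "template<>",
        f"{enum_name} EnumUtil::FromString<{enum_name}>(const char *value) {{",
    ]
    for member in values:
        # Assumed comparison helper; one `if` per member.
        lines.append(f'\tif (StringUtil::Equals(value, "{member}")) {{')
        lines.append(f"\t\treturn {enum_name}::{member};")
        lines.append("\t}")
    # Assumed fallback when no member name matches.
    lines.append('\tthrow NotImplementedException("Enum value not implemented");')
    lines.append("}")
    return "\n".join(lines)


values = ["VALUE_A", "VALUE_B", "VALUE_C"]
print(render_switch_to_chars("MyExtEnum", values))
print(render_if_chain_from_string("MyExtEnum", values))
```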
Usage Examples
Generate enum utilities from the repository root:
# Generate enum_util.hpp and enum_util.cpp from C++ header scanning
python3 scripts/generate_enum_util.py
# Generate extension enum files from JSON definitions
python3 scripts/generate_enums.py
Typical Workflow
- Add a new member to an existing enum class in a .hpp header
- Run python3 scripts/generate_enum_util.py from the repository root
- The new member automatically appears in the generated EnumUtil::ToChars and EnumUtil::FromString specializations
- If the string representation must differ from the member name, add an override entry to the overrides dictionary