Implementation:Duckdb Duckdb Interpreted Benchmark
Overview
InterpretedBenchmark is the concrete class in DuckDB's benchmark framework that parses and executes declarative .benchmark files. It subclasses Benchmark, reads a plain-text benchmark definition file, interprets its DSL keywords, and maps them onto the standard benchmark lifecycle operations (initialize, run, verify, cleanup). Benchmark authors can therefore define SQL-based performance tests without writing any C++ code.
Code Reference
InterpretedBenchmark Class
Source: benchmark/include/interpreted_benchmark.hpp:L34-120
class InterpretedBenchmark : public Benchmark {
public:
    InterpretedBenchmark(string full_path);

    void LoadBenchmark();

    unique_ptr<BenchmarkState> Initialize(BenchmarkConfiguration &config) override;
    void Run(BenchmarkState *state) override;
    void Cleanup(BenchmarkState *state) override;
    string Verify(BenchmarkState *state) override;
    void Interrupt(BenchmarkState *state) override;
    string BenchmarkInfo() override;
    string GetLogOutput(BenchmarkState *state) override;
    bool RequireReinit() override;
};
Key methods:
- `InterpretedBenchmark(string full_path)` -- Constructor that takes the path to a `.benchmark` file. The benchmark self-registers with the global registry using the file path as its name.
- `LoadBenchmark()` -- Parses the `.benchmark` file and populates internal data structures for each DSL section.
- `Initialize()` -- Executes the `load` SQL statements to set up the database.
- `Run()` -- Executes the `run` SQL query (the workload being measured).
- `Verify()` -- Compares the query result against the declared `result` section.
- `Cleanup()` -- Resets the benchmark state between runs.
- `RequireReinit()` -- Returns `true` if the benchmark declares `require_reinit`, meaning the database must be fully reinitialized between runs.
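The way these methods fit together can be illustrated with a small, self-contained sketch. It is not DuckDB's runner: `MiniState`, `MiniBenchmark`, `FixedResultBenchmark`, `LifecycleReport`, and `RunLifecycle` are hypothetical names introduced here, and the real runner additionally handles timeouts, interrupts, and profiling. The sketch only assumes what is described above: initialization happens once unless `RequireReinit()` asks for it on every run, each run is timed, `Verify()` returns an empty string on success, and `Cleanup()` resets state between runs.

```cpp
#include <chrono>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for BenchmarkState: holds whatever a run produces.
struct MiniState {
    std::string last_result;
};

// Hypothetical stand-in for the Benchmark interface (not the DuckDB class).
class MiniBenchmark {
public:
    virtual ~MiniBenchmark() = default;
    virtual std::unique_ptr<MiniState> Initialize() = 0; // set up data ("load")
    virtual void Run(MiniState &state) = 0;               // timed workload ("run")
    virtual std::string Verify(MiniState &state) = 0;     // "" means success
    virtual void Cleanup(MiniState &state) = 0;           // reset between runs
    virtual bool RequireReinit() { return false; }
};

// Toy benchmark whose "query" simply yields a fixed string.
class FixedResultBenchmark : public MiniBenchmark {
public:
    FixedResultBenchmark(std::string actual, std::string expected, bool reinit)
        : actual_(std::move(actual)), expected_(std::move(expected)), reinit_(reinit) {}

    std::unique_ptr<MiniState> Initialize() override {
        return std::unique_ptr<MiniState>(new MiniState());
    }
    void Run(MiniState &state) override { state.last_result = actual_; }
    std::string Verify(MiniState &state) override {
        if (state.last_result == expected_) {
            return std::string();
        }
        return "expected " + expected_ + ", got " + state.last_result;
    }
    void Cleanup(MiniState &state) override { state.last_result.clear(); }
    bool RequireReinit() override { return reinit_; }

private:
    std::string actual_;
    std::string expected_;
    bool reinit_;
};

struct LifecycleReport {
    std::vector<double> seconds; // wall-clock time of each completed run
    std::string error;           // first verification error, "" if none
    int initialize_calls = 0;
};

// Drive the lifecycle: (re)initialize, time Run, verify, clean up.
inline LifecycleReport RunLifecycle(MiniBenchmark &bench, int runs) {
    LifecycleReport report;
    std::unique_ptr<MiniState> state;
    for (int i = 0; i < runs; i++) {
        if (!state || bench.RequireReinit()) {
            state = bench.Initialize();
            report.initialize_calls++;
        }
        const auto start = std::chrono::steady_clock::now();
        bench.Run(*state);
        const auto end = std::chrono::steady_clock::now();
        report.seconds.push_back(std::chrono::duration<double>(end - start).count());

        const std::string error = bench.Verify(*state);
        bench.Cleanup(*state);
        if (!error.empty()) {
            report.error = error; // stop at the first failed verification
            break;
        }
    }
    return report;
}
```

Calling `RunLifecycle` on a `FixedResultBenchmark` whose actual and expected strings agree yields one timing per run and an empty `error`; with `reinit` set to `true`, `Initialize()` is invoked once per run instead of once overall.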
Implementation File
Source: benchmark/interpreted_benchmark.cpp:L1-829
This file contains the full DSL parser and lifecycle execution logic, spanning approximately 829 lines. It handles all supported keywords, template expansion, and result verification.
DSL Keywords
The .benchmark file format supports the following keywords:
| Keyword | Description |
|---|---|
| `name` | Unique name for the benchmark. |
| `group` | Logical group for categorization (e.g., tpch, micro). |
| `subgroup` | Optional sub-categorization within a group. |
| `load` | SQL statements executed during initialization to set up schema and data. |
| `run` | The SQL query to benchmark (the workload under measurement). |
| `result` | Expected query output for verification. The runner compares actual output against this. |
| `template` | References a template file for parameterized benchmarks. |
| `cache` | Specifies caching behavior for loaded data across runs. |
| `require` | Declares a precondition (e.g., a required extension) that must be met. |
| `require_reinit` | Forces full database reinitialization between each run. |
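To make the keyword handling concrete, the following is a minimal parsing sketch. It is not the parser in interpreted_benchmark.cpp: `ParsedBenchmark`, `ReadBlock`, `JoinLines`, and `ParseBenchmarkText` are hypothetical names, template expansion is omitted, and the sketch assumes that `load`, `run`, and `result` bodies end at the next blank line and that lines starting with `#` are comments.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container for the parsed sections of one .benchmark file.
struct ParsedBenchmark {
    std::map<std::string, std::string> settings; // name, group, subgroup, template, cache
    std::vector<std::string> required;           // one entry per `require` line
    bool require_reinit = false;
    std::string load_sql;
    std::string run_sql;
    std::string result_columns;                  // text after `result`, e.g. "II"
    std::vector<std::string> result_rows;
};

// Collect lines up to the next blank line or end of input (assumed terminator).
inline std::vector<std::string> ReadBlock(std::istream &in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line) && !line.empty()) {
        lines.push_back(line);
    }
    return lines;
}

inline std::string JoinLines(const std::vector<std::string> &lines) {
    std::string joined;
    for (std::size_t i = 0; i < lines.size(); i++) {
        if (i > 0) {
            joined += "\n";
        }
        joined += lines[i];
    }
    return joined;
}

inline ParsedBenchmark ParseBenchmarkText(const std::string &text) {
    ParsedBenchmark parsed;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') {
            continue; // skip blank lines and (assumed) comment lines
        }
        // Split "keyword argument..." at the first run of whitespace.
        const std::size_t split = line.find_first_of(" \t");
        const std::string keyword = line.substr(0, split);
        std::string argument;
        if (split != std::string::npos) {
            const std::size_t start = line.find_first_not_of(" \t", split);
            if (start != std::string::npos) {
                argument = line.substr(start);
            }
        }

        if (keyword == "load") {
            parsed.load_sql = JoinLines(ReadBlock(in));
        } else if (keyword == "run") {
            parsed.run_sql = JoinLines(ReadBlock(in));
        } else if (keyword == "result") {
            parsed.result_columns = argument;
            parsed.result_rows = ReadBlock(in);
        } else if (keyword == "require_reinit") {
            parsed.require_reinit = true;
        } else if (keyword == "require") {
            parsed.required.push_back(argument);
        } else if (keyword == "name" || keyword == "group" || keyword == "subgroup" ||
                   keyword == "template" || keyword == "cache") {
            parsed.settings[keyword] = argument;
        } else {
            throw std::runtime_error("unknown keyword: " + keyword);
        }
    }
    return parsed;
}
```

Keeping single-line settings in a map and multi-line sections in dedicated fields mirrors the split in the table above between descriptive keywords and keywords that carry SQL or result data.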
BenchmarkConfiguration
The execution of interpreted benchmarks is controlled by BenchmarkConfiguration:
| Field | Type | Default | Description |
|---|---|---|---|
| `name_pattern` | `string` | `""` | Pattern to filter benchmarks by name. |
| `timeout_duration` | `optional_idx` | `30` | Maximum time in seconds for each benchmark run. |
| `profile_info` | `BenchmarkProfileInfo` | `NONE` | Controls profiling output. |
| `meta` | `BenchmarkMetaType` | `NONE` | Controls meta-information output. |
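As an illustration of how a name pattern could select benchmarks, here is a small sketch built on `std::regex`. `MiniConfig` and `SelectBenchmarks` are hypothetical names, not part of DuckDB; the sketch assumes that the pattern must match the whole benchmark name and that an empty pattern selects everything, neither of which is confirmed by the table above. The benchmark paths used with it are likewise made up for the example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical, reduced configuration carrying only the fields used here.
struct MiniConfig {
    std::string name_pattern;      // "" is treated as "no filter" in this sketch
    unsigned timeout_seconds = 30; // mirrors the documented default
};

// Return the benchmark names that match the configured pattern in full.
inline std::vector<std::string> SelectBenchmarks(const std::vector<std::string> &names,
                                                 const MiniConfig &config) {
    if (config.name_pattern.empty()) {
        return names;
    }
    const std::regex pattern(config.name_pattern);
    std::vector<std::string> selected;
    for (const std::string &name : names) {
        if (std::regex_match(name, pattern)) { // whole-string match (assumption)
            selected.push_back(name);
        }
    }
    return selected;
}
```

With a pattern such as `benchmark/micro/.*`, only names beginning with `benchmark/micro/` are kept, which is the behavior the group-level invocations rely on.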
I/O Contract
Inputs
- `.benchmark` files -- Plain-text files written in the benchmark DSL format. Each file defines one benchmark with its name, group, initialization SQL, workload query, and expected result.
- `BenchmarkConfiguration` -- Runtime configuration specifying timeout, filtering pattern, and profiling options.
Outputs
- Timing results -- Wall-clock execution time for each run of the `run` query, typically repeated for `NRuns()` iterations (default 5).
- Verification pass/fail -- A string result from `Verify()`: an empty string indicates success; a non-empty string contains the verification error message describing the mismatch.
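The string-based verification contract can be sketched as a plain row comparison. This is an illustrative helper, not DuckDB's `Verify()` implementation: `Row` and `VerifyRows` are hypothetical names, values are compared as exact strings, and row order is assumed to be significant.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Compare expected and actual rows; "" means success, anything else is the error.
inline std::string VerifyRows(const std::vector<Row> &expected, const std::vector<Row> &actual) {
    if (expected.size() != actual.size()) {
        return "row count mismatch: expected " + std::to_string(expected.size()) +
               ", got " + std::to_string(actual.size());
    }
    for (std::size_t r = 0; r < expected.size(); r++) {
        if (expected[r].size() != actual[r].size()) {
            return "column count mismatch in row " + std::to_string(r);
        }
        for (std::size_t c = 0; c < expected[r].size(); c++) {
            if (expected[r][c] != actual[r][c]) {
                return "mismatch at row " + std::to_string(r) + ", column " + std::to_string(c) +
                       ": expected \"" + expected[r][c] + "\", got \"" + actual[r][c] + "\"";
            }
        }
    }
    return std::string();
}
```

Returning a message rather than a boolean lets the caller report exactly which cell differed, which matches the "empty string means success" convention described above.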
Usage Examples
Sample .benchmark File
name aggregate_few_groups
group micro
subgroup aggregate

load
CREATE TABLE integers AS SELECT i % 10 AS grp, i AS value
FROM range(10000000) t(i);

run
SELECT grp, SUM(value) FROM integers GROUP BY grp ORDER BY grp;

result II
0 4999995000000
1 4999996000000
2 4999997000000
3 4999998000000
4 4999999000000
5 5000000000000
6 5000001000000
7 5000002000000
8 5000003000000
9 5000004000000
Running a Single Benchmark
# Run a specific benchmark by name
build/release/benchmark/benchmark_runner "benchmark/micro/aggregate/aggregate_few_groups.benchmark"
Running All Benchmarks in a Group
# Run all benchmarks whose name (file path) matches the pattern, here everything under benchmark/micro/
build/release/benchmark/benchmark_runner "benchmark/micro/.*"
Running with a Custom Timeout
# Run with a 60-second timeout per benchmark
build/release/benchmark/benchmark_runner --timeout=60 "benchmark/tpch/.*"