Implementation:Duckdb Duckdb Interpreted Benchmark

Overview

InterpretedBenchmark is the class in DuckDB's benchmark framework that parses and executes declarative .benchmark files. It is a subclass of Benchmark that reads a plain-text benchmark definition file, interprets its DSL keywords, and maps them onto the standard benchmark lifecycle operations (initialize, run, verify, cleanup). Benchmark authors can therefore define SQL-based performance tests without writing any C++ code.

Code Reference

InterpretedBenchmark Class

Source: benchmark/include/interpreted_benchmark.hpp:L34-120

class InterpretedBenchmark : public Benchmark {
public:
    InterpretedBenchmark(string full_path);

    void LoadBenchmark();

    unique_ptr<BenchmarkState> Initialize(BenchmarkConfiguration &config) override;
    void Run(BenchmarkState *state) override;
    void Cleanup(BenchmarkState *state) override;
    string Verify(BenchmarkState *state) override;
    void Interrupt(BenchmarkState *state) override;
    string BenchmarkInfo() override;
    string GetLogOutput(BenchmarkState *state) override;
    bool RequireReinit() override;
};

Key methods:

InterpretedBenchmark(string full_path) -- Constructor that takes the path to a .benchmark file. The benchmark self-registers with the global registry using the file path as its name.
LoadBenchmark() -- Parses the .benchmark file and populates internal data structures for each DSL section.
Initialize() -- Executes the load SQL statements to set up the database.
Run() -- Executes the run SQL query (the workload being measured).
Verify() -- Compares the query result against the declared result section.
Cleanup() -- Resets the benchmark state between runs.
RequireReinit() -- Returns true if the benchmark declares require_reinit, meaning the database must be fully reinitialized between runs.
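
The lifecycle formed by these methods can be sketched with a small stand-alone model. This is a simplified illustration, not DuckDB's actual runner: ToyState, ToyBenchmark, SumBenchmark, LifecycleReport, and RunLifecycle are hypothetical names, the exact ordering of Verify() and Cleanup() is an assumption, and the real runner also handles timeouts, interrupts, and logging.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for BenchmarkState and Benchmark; not DuckDB's real types.
struct ToyState {
    int rows_loaded = 0;
    int last_result = 0;
};

struct ToyBenchmark {
    virtual ~ToyBenchmark() = default;
    virtual std::unique_ptr<ToyState> Initialize() = 0;
    virtual void Run(ToyState *state) = 0;
    virtual std::string Verify(ToyState *state) = 0; // empty string means success
    virtual void Cleanup(ToyState *state) = 0;
    virtual bool RequireReinit() { return false; }
    virtual std::size_t NRuns() { return 5; }
};

struct LifecycleReport {
    std::vector<double> seconds;      // one wall-clock timing per measured run
    std::string error;                // first verification failure, if any
    std::size_t initializations = 0;  // how often Initialize() was called
};

// Initialize once, then time Run() NRuns() times. Re-initialize before each
// repeated run only when the benchmark asks for it via RequireReinit().
LifecycleReport RunLifecycle(ToyBenchmark &bench) {
    assert(bench.NRuns() > 0);
    LifecycleReport report;
    auto state = bench.Initialize();
    report.initializations++;
    for (std::size_t i = 0; i < bench.NRuns(); i++) {
        if (i > 0 && bench.RequireReinit()) {
            state = bench.Initialize();
            report.initializations++;
        }
        auto start = std::chrono::steady_clock::now();
        bench.Run(state.get());
        auto end = std::chrono::steady_clock::now();
        report.seconds.push_back(std::chrono::duration<double>(end - start).count());

        report.error = bench.Verify(state.get());
        bench.Cleanup(state.get());
        if (!report.error.empty()) {
            break; // stop at the first verification failure
        }
    }
    return report;
}

// A toy workload: sum the integers 0..999 and check the answer.
struct SumBenchmark : ToyBenchmark {
    bool reinit;
    explicit SumBenchmark(bool reinit_p = false) : reinit(reinit_p) {}

    std::unique_ptr<ToyState> Initialize() override {
        auto state = std::make_unique<ToyState>();
        state->rows_loaded = 1000;
        return state;
    }
    void Run(ToyState *state) override {
        int sum = 0;
        for (int i = 0; i < state->rows_loaded; i++) {
            sum += i;
        }
        state->last_result = sum;
    }
    std::string Verify(ToyState *state) override {
        return state->last_result == 499500 ? "" : "unexpected sum";
    }
    void Cleanup(ToyState *state) override { state->last_result = 0; }
    bool RequireReinit() override { return reinit; }
};
```

Calling RunLifecycle on a SumBenchmark yields five timings and a single initialization; constructing it with SumBenchmark(true) makes the sketch re-initialize before every repeated run.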

Implementation File

Source: benchmark/interpreted_benchmark.cpp:L1-829

This file contains the DSL parser and the lifecycle execution logic. It handles all supported keywords, template expansion, and result verification.
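
Template expansion can be pictured as plain text substitution. The sketch below assumes placeholders written as ${NAME}, which this page does not specify, and ExpandTemplate is a hypothetical helper rather than a function from interpreted_benchmark.cpp.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Replace every occurrence of ${KEY} in `text` with the matching value.
// The ${KEY} placeholder syntax is an assumption made for this illustration.
std::string ExpandTemplate(std::string text, const std::map<std::string, std::string> &parameters) {
    for (const auto &entry : parameters) {
        const std::string placeholder = "${" + entry.first + "}";
        std::size_t pos = 0;
        while ((pos = text.find(placeholder, pos)) != std::string::npos) {
            text.replace(pos, placeholder.size(), entry.second);
            pos += entry.second.size(); // continue after the inserted value
        }
    }
    return text;
}
```

For example, expanding "SELECT ${COLUMN} FROM ${TABLE};" with COLUMN=grp and TABLE=integers gives "SELECT grp FROM integers;". Placeholders without a matching parameter are left untouched in this sketch.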

DSL Keywords

The .benchmark file format supports the following keywords:

Keyword	Description
`name`	Unique name for the benchmark.
`group`	Logical group for categorization (e.g., `tpch`, `micro`).
`subgroup`	Optional sub-categorization within a group.
`load`	SQL statements executed during initialization to set up schema and data.
`run`	The SQL query to benchmark (the workload under measurement).
`result`	Expected query output for verification. The runner compares actual output against this.
`template`	References a template file for parameterized benchmarks.
`cache`	Specifies caching behavior for loaded data across runs.
`require`	Declares a precondition (e.g., a required extension) that must be met.
`require_reinit`	Forces full database reinitialization between each run.
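
How keyword-driven sections might be split apart can be sketched as follows. This is a deliberately small model, not the parser from interpreted_benchmark.cpp: Section and ParseBenchmarkText are hypothetical names, only load, run, and result are treated as multi-line blocks, and a blank line is assumed to end a block.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>

struct Section {
    std::string argument; // text on the keyword line after the keyword
    std::string body;     // following lines, up to the next blank line
};

// Split benchmark text into sections keyed by their DSL keyword.
std::map<std::string, Section> ParseBenchmarkText(const std::string &text) {
    static const std::set<std::string> keywords = {
        "name", "group", "subgroup", "load", "run",
        "result", "template", "cache", "require", "require_reinit"};
    // Keywords assumed to be followed by a multi-line body.
    static const std::set<std::string> block_keywords = {"load", "run", "result"};

    std::map<std::string, Section> sections;
    Section *current = nullptr; // block currently collecting body lines
    std::istringstream input(text);
    std::string line;
    while (std::getline(input, line)) {
        if (line.empty()) {
            current = nullptr; // a blank line closes the open block
            continue;
        }
        if (current) {
            if (!current->body.empty()) {
                current->body += "\n";
            }
            current->body += line;
            continue;
        }
        const auto split = line.find(' ');
        const std::string keyword = line.substr(0, split);
        if (!keywords.count(keyword)) {
            throw std::runtime_error("unknown keyword: " + keyword);
        }
        Section &section = sections[keyword];
        section.argument = split == std::string::npos ? "" : line.substr(split + 1);
        if (block_keywords.count(keyword)) {
            current = &section;
        }
    }
    return sections;
}
```

Fed a file like the sample in the usage section, this returns the benchmark name under sections["name"].argument, the setup SQL under sections["load"].body, and the column-type string of the result header under sections["result"].argument.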

BenchmarkConfiguration

The execution of interpreted benchmarks is controlled by BenchmarkConfiguration:

Field	Type	Default	Description
`name_pattern`	`string`	`""`	Pattern to filter benchmarks by name.
`timeout_duration`	`optional_idx`	`30`	Maximum time in seconds for each benchmark run.
`profile_info`	`BenchmarkProfileInfo`	`NONE`	Controls profiling output.
`meta`	`BenchmarkMetaType`	`NONE`	Controls meta-information output.
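
As an illustration of how name_pattern could narrow the set of benchmarks, the sketch below filters registered names with a regular expression. ToyConfiguration and SelectBenchmarks are hypothetical; treating an empty pattern as "match everything" and requiring a full-string match are assumptions, not documented behavior of the real runner.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical, reduced stand-in for BenchmarkConfiguration.
struct ToyConfiguration {
    std::string name_pattern;         // empty: assumed to select every benchmark
    std::size_t timeout_seconds = 30; // mirrors the documented default
};

// Return the benchmark names that match the configured pattern in full.
std::vector<std::string> SelectBenchmarks(const std::vector<std::string> &names,
                                          const ToyConfiguration &config) {
    if (config.name_pattern.empty()) {
        return names;
    }
    const std::regex pattern(config.name_pattern);
    std::vector<std::string> selected;
    for (const auto &name : names) {
        if (std::regex_match(name, pattern)) {
            selected.push_back(name);
        }
    }
    return selected;
}
```

With the pattern "benchmark/micro/.*", only names under benchmark/micro/ are selected, which matches the way patterns are passed to the runner in the usage examples.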

I/O Contract

Inputs

.benchmark files -- Plain-text files written in the benchmark DSL format. Each file defines one benchmark with its name, group, initialization SQL, workload query, and expected result.
BenchmarkConfiguration -- Runtime configuration specifying timeout, filtering pattern, and profiling options.

Outputs

Timing results -- Wall-clock execution time of the run query for each iteration. The query is typically repeated NRuns() times (default 5).
Verification pass/fail -- A string returned by Verify(): an empty string indicates success; a non-empty string contains the error message describing the mismatch.
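
The string-based verification contract can be sketched as a row-by-row comparison. VerifyRows and the Row alias are hypothetical, and the error wording is invented for this example; only the convention that an empty string means success comes from the description above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Compare expected and actual rows cell by cell.
// Returns "" on success, otherwise a message describing the first mismatch.
std::string VerifyRows(const std::vector<Row> &expected, const std::vector<Row> &actual) {
    if (expected.size() != actual.size()) {
        return "row count mismatch: expected " + std::to_string(expected.size()) +
               ", got " + std::to_string(actual.size());
    }
    for (std::size_t r = 0; r < expected.size(); r++) {
        if (expected[r].size() != actual[r].size()) {
            return "column count mismatch in row " + std::to_string(r + 1);
        }
        for (std::size_t c = 0; c < expected[r].size(); c++) {
            if (expected[r][c] != actual[r][c]) {
                return "mismatch at row " + std::to_string(r + 1) + ", column " +
                       std::to_string(c + 1) + ": expected '" + expected[r][c] +
                       "', got '" + actual[r][c] + "'";
            }
        }
    }
    return "";
}
```

A caller only needs to test whether the returned string is empty; any non-empty value can be reported to the user as-is.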

Usage Examples

Sample .benchmark File

name aggregate_few_groups
group micro
subgroup aggregate

load
CREATE TABLE integers AS SELECT i % 10 AS grp, i AS value
FROM range(10000000) t(i);

run
SELECT grp, SUM(value) FROM integers GROUP BY grp ORDER BY grp;

result II
0	4999995000000
1	4999996000000
2	4999997000000
3	4999998000000
4	4999999000000
5	5000000000000
6	5000001000000
7	5000002000000
8	5000003000000
9	5000004000000

Running a Single Benchmark

# Run a specific benchmark by name
build/release/benchmark/benchmark_runner "benchmark/micro/aggregate/aggregate_few_groups.benchmark"

Running All Benchmarks in a Group

# Run all benchmarks in the "micro" group
build/release/benchmark/benchmark_runner "benchmark/micro/.*"

Running with a Custom Timeout

# Run with a 60-second timeout per benchmark
build/release/benchmark/benchmark_runner --timeout=60 "benchmark/tpch/.*"
