Implementation:Duckdb Duckdb Interpreted Benchmark

From Leeroopedia


Overview

InterpretedBenchmark is the concrete class in DuckDB's benchmark framework that parses and executes declarative .benchmark files. It is a subclass of Benchmark that reads a plain-text benchmark definition file, interprets its DSL keywords, and maps them onto the standard benchmark lifecycle operations (initialize, run, verify, cleanup). Benchmark authors can therefore define SQL-based performance tests without writing any C++ code.

Code Reference

InterpretedBenchmark Class

Source: benchmark/include/interpreted_benchmark.hpp:L34-120

class InterpretedBenchmark : public Benchmark {
public:
    InterpretedBenchmark(string full_path);

    void LoadBenchmark();

    unique_ptr<BenchmarkState> Initialize(BenchmarkConfiguration &config) override;
    void Run(BenchmarkState *state) override;
    void Cleanup(BenchmarkState *state) override;
    string Verify(BenchmarkState *state) override;
    void Interrupt(BenchmarkState *state) override;
    string BenchmarkInfo() override;
    string GetLogOutput(BenchmarkState *state) override;
    bool RequireReinit() override;
};

Key methods:

  • InterpretedBenchmark(string full_path) -- Constructor that takes the path to a .benchmark file. The benchmark self-registers with the global registry using the file path as its name.
  • LoadBenchmark() -- Parses the .benchmark file and populates internal data structures for each DSL section.
  • Initialize() -- Executes the load SQL statements to set up the database.
  • Run() -- Executes the run SQL query (the workload being measured).
  • Verify() -- Compares the query result against the declared result section.
  • Cleanup() -- Resets the benchmark state between runs.
  • RequireReinit() -- Returns true if the benchmark declares require_reinit, meaning the database must be fully reinitialized between runs.
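
The order in which these methods are called can be sketched with a small stand-in. This is a minimal sketch, not DuckDB's runner: MockState, MockBenchmark, and RunLifecycle are hypothetical names, and the loop assumes that initialization happens once and is repeated before later runs only when RequireReinit() returns true.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-benchmark state; not DuckDB's BenchmarkState.
struct MockState {
    int runs_seen = 0;
};

// Hypothetical benchmark that exposes the same lifecycle method names
// and records the order in which they are called.
class MockBenchmark {
public:
    explicit MockBenchmark(bool require_reinit) : require_reinit_(require_reinit) {}

    std::unique_ptr<MockState> Initialize() {
        log.push_back("Initialize");
        return std::make_unique<MockState>();
    }
    void Run(MockState *state) {
        state->runs_seen++;
        log.push_back("Run");
    }
    // An empty string means the result matched.
    std::string Verify(MockState *) {
        log.push_back("Verify");
        return "";
    }
    void Cleanup(MockState *) { log.push_back("Cleanup"); }
    bool RequireReinit() const { return require_reinit_; }

    std::vector<std::string> log;

private:
    bool require_reinit_;
};

// Assumed driver loop: Initialize once, then Run/Verify/Cleanup per iteration,
// re-initializing before later iterations only if RequireReinit() is true.
// Returns the first verification error, or an empty string on success.
std::string RunLifecycle(MockBenchmark &bench, int nruns) {
    auto state = bench.Initialize();
    for (int i = 0; i < nruns; i++) {
        if (i > 0 && bench.RequireReinit()) {
            state = bench.Initialize();
        }
        bench.Run(state.get());
        std::string error = bench.Verify(state.get());
        bench.Cleanup(state.get());
        if (!error.empty()) {
            return error;
        }
    }
    return "";
}
```

With two runs, a benchmark that does not require reinitialization logs one Initialize call followed by two Run/Verify/Cleanup triples; one that does require it logs a second Initialize before the second triple.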

Implementation File

Source: benchmark/interpreted_benchmark.cpp:L1-829

This file contains the DSL parser and the lifecycle execution logic (829 lines, per the source reference above). It handles all supported keywords, template expansion, and result verification.

DSL Keywords

The .benchmark file format supports the following keywords:

Keyword         Description
name            Unique name for the benchmark.
group           Logical group for categorization (e.g., tpch, micro).
subgroup        Optional sub-categorization within a group.
load            SQL statements executed during initialization to set up schema and data.
run             The SQL query to benchmark (the workload under measurement).
result          Expected query output for verification. The runner compares actual output against this.
template        References a template file for parameterized benchmarks.
cache           Specifies caching behavior for loaded data across runs.
require         Declares a precondition (e.g., a required extension) that must be met.
require_reinit  Forces full database reinitialization between each run.
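
To make the keyword layout concrete, the following is a simplified section splitter, not the parser in interpreted_benchmark.cpp. ParseSections and the "_header" key suffix are hypothetical. The sketch assumes that single-line keywords carry their value on the same line, that load, run, and result are followed by body lines up to the next blank line, and that lines starting with # are comments.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Splits benchmark text into keyword -> value pairs (simplified; see assumptions above).
std::map<std::string, std::string> ParseSections(const std::string &text) {
    static const std::set<std::string> block_keywords = {"load", "run", "result"};
    std::map<std::string, std::string> sections;
    std::istringstream input(text);
    std::string line;
    while (std::getline(input, line)) {
        if (line.empty() || line[0] == '#') {
            continue; // skip blank lines and comment lines
        }
        // Split the line into "keyword" and the rest of the line.
        auto space = line.find(' ');
        std::string keyword = line.substr(0, space);
        std::string rest = space == std::string::npos ? "" : line.substr(space + 1);
        if (block_keywords.count(keyword) == 0) {
            sections[keyword] = rest; // single-line keyword such as name or group
            continue;
        }
        if (!rest.empty()) {
            // Text after a block keyword (e.g. the column markers after "result").
            sections[keyword + "_header"] = rest;
        }
        // Gather body lines until a blank line or the end of the input.
        std::string body;
        while (std::getline(input, line) && !line.empty()) {
            if (!body.empty()) {
                body += "\n";
            }
            body += line;
        }
        sections[keyword] = body;
    }
    return sections;
}
```

Template expansion, cache handling, and require checks are deliberately left out of this sketch.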

BenchmarkConfiguration

The execution of interpreted benchmarks is controlled by BenchmarkConfiguration:

Field             Type                  Default  Description
name_pattern      string                ""       Pattern to filter benchmarks by name.
timeout_duration  optional_idx          30       Maximum time in seconds for each benchmark run.
profile_info      BenchmarkProfileInfo  NONE     Controls profiling output.
meta              BenchmarkMetaType     NONE     Controls meta-information output.
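
The effect of name_pattern can be illustrated with a small filter. This is a sketch under two assumptions that the page does not confirm: an empty pattern selects every benchmark, and a non-empty pattern is a regular expression that must match the whole benchmark name. FilterByPattern is a hypothetical helper built on std::regex, not a function from the benchmark runner.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the benchmark names selected by a name_pattern-style filter.
std::vector<std::string> FilterByPattern(const std::vector<std::string> &names,
                                         const std::string &pattern) {
    if (pattern.empty()) {
        return names; // assumed: no pattern means no filtering
    }
    std::regex re(pattern);
    std::vector<std::string> selected;
    for (const auto &name : names) {
        // regex_match requires the pattern to cover the entire name.
        if (std::regex_match(name, re)) {
            selected.push_back(name);
        }
    }
    return selected;
}
```

Under these assumptions, a pattern such as benchmark/micro/.* keeps every name under benchmark/micro/ and drops names from other groups.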

I/O Contract

Inputs

  • .benchmark files -- Plain-text files written in the benchmark DSL format. Each file defines one benchmark with its name, group, initialization SQL, workload query, and expected result.
  • BenchmarkConfiguration -- Runtime configuration specifying timeout, filtering pattern, and profiling options.

Outputs

  • Timing results -- Wall-clock execution time for each run of the run query, typically repeated for NRuns() iterations (default 5).
  • Verification pass/fail -- A string result from Verify(): empty string indicates success, non-empty string contains the verification error message describing the mismatch.
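
Both outputs can be sketched in a few lines. TimeRuns and CompareRows are hypothetical helpers, not part of DuckDB: the first times a workload with a monotonic clock for a given number of runs (defaulting to 5, as stated above), and the second follows the Verify() convention of returning an empty string on success. The row-by-row string comparison is an assumption about how a mismatch might be detected.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Executes the workload nruns times and returns the wall-clock seconds of each run.
std::vector<double> TimeRuns(const std::function<void()> &workload, int nruns = 5) {
    std::vector<double> seconds;
    for (int i = 0; i < nruns; i++) {
        auto start = std::chrono::steady_clock::now();
        workload();
        auto end = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(end - start).count());
    }
    return seconds;
}

// Returns "" when the rows match, otherwise a message describing the first mismatch.
std::string CompareRows(const std::vector<std::string> &expected,
                        const std::vector<std::string> &actual) {
    if (expected.size() != actual.size()) {
        return "row count mismatch: expected " + std::to_string(expected.size()) +
               ", got " + std::to_string(actual.size());
    }
    for (std::size_t i = 0; i < expected.size(); i++) {
        if (expected[i] != actual[i]) {
            return "row " + std::to_string(i) + " mismatch: expected \"" + expected[i] +
                   "\", got \"" + actual[i] + "\"";
        }
    }
    return "";
}
```

A caller would treat any non-empty return value from CompareRows as a failed verification and report the message as is.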

Usage Examples

Sample .benchmark File

name aggregate_few_groups
group micro
subgroup aggregate

load
CREATE TABLE integers AS SELECT i % 10 AS grp, i AS value
FROM range(10000000) t(i);

run
SELECT grp, SUM(value) FROM integers GROUP BY grp ORDER BY grp;

result II
0	4999995000000
1	4999996000000
2	4999997000000
3	4999998000000
4	4999999000000
5	5000000000000
6	5000001000000
7	5000002000000
8	5000003000000
9	5000004000000
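
When writing a result section by hand, it can help to recompute the expected values independently of the database. The sketch below does this for the query in the sample: it sums i per group i % 10 over the range [0, n). GroupSums is a hypothetical helper, and 64-bit integers are used because the sums for n = 10000000 do not fit in 32 bits.

```cpp
#include <array>
#include <cstdint>

// Recomputes SUM(value) per group for i in [0, n), where grp = i % 10.
std::array<std::int64_t, 10> GroupSums(std::int64_t n) {
    std::array<std::int64_t, 10> sums{}; // zero-initialized accumulators, one per group
    for (std::int64_t i = 0; i < n; i++) {
        sums[i % 10] += i;
    }
    return sums;
}
```

Calling GroupSums(10000000) yields one value per group, in group order, that can be compared line by line with the result section.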

Running a Single Benchmark

# Run a specific benchmark by name
build/release/benchmark/benchmark_runner "benchmark/micro/aggregate/aggregate_few_groups.benchmark"

Running All Benchmarks in a Group

# Run all benchmarks in the "micro" group
build/release/benchmark/benchmark_runner "benchmark/micro/.*"

Running with a Custom Timeout

# Run with a 60-second timeout per benchmark
build/release/benchmark/benchmark_runner --timeout=60 "benchmark/tpch/.*"
