Implementation:Duckdb Duckdb Regression Check
Overview
Regression_Check is DuckDB's tool for detecting benchmark timing regressions. It is a Python script that loads two CSV files of benchmark timing results (one from a baseline run, one from a candidate run), computes the median timing for each benchmark, and reports every benchmark whose new median exceeds the old median by more than the configured thresholds. The script uses the DuckDB Python package internally to perform the comparison via SQL queries.
Code Reference
Source: scripts/regression_check.py:L1-115
The script is approximately 115 lines of Python and performs the following steps:
- Parses command-line arguments to obtain paths to old and new CSV files.
- Loads both CSV files into an in-memory DuckDB instance.
- Computes the median timing for each benchmark in both datasets.
- Joins the old and new medians by benchmark name.
- Applies the dual-threshold regression criteria.
- Outputs a report of any detected regressions.
- Exits with code 1 if regressions are found, 0 otherwise.
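The load, median, and join steps above can be sketched in plain Python. This is an illustration only: the script itself runs these steps as SQL inside DuckDB, whereas the sketch uses the standard library so it runs without the duckdb package. The column names name and timing, and the helper names load_medians and join_medians, are assumptions made for the example and are not taken from the script.

```python
import csv
import io
import statistics
from collections import defaultdict


def load_medians(csv_file, name_col="name", timing_col="timing"):
    """Return {benchmark name: median timing} for one results file.

    csv_file is any open text file object. The column names are
    hypothetical; adjust them to match the real CSV header.
    """
    timings = defaultdict(list)
    for row in csv.DictReader(csv_file):
        timings[row[name_col]].append(float(row[timing_col]))
    return {name: statistics.median(values) for name, values in timings.items()}


def join_medians(old, new):
    """Inner-join two median maps by benchmark name.

    Benchmarks present in only one of the two runs are dropped.
    """
    return {name: (old[name], new[name]) for name in old.keys() & new.keys()}


# Small in-memory demo; real input would come from open("old_results.csv").
old = load_medians(io.StringIO("name,timing\na,1.0\na,3.0\na,2.0\nb,0.5\n"))
new = load_medians(io.StringIO("name,timing\na,2.5\nc,1.0\n"))
print(join_medians(old, new))
```

The inner join means a benchmark that exists in only one run is ignored by this sketch; whether the real script reports such benchmarks is not stated here.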
Regression Threshold Logic
# A benchmark is flagged as a regression when BOTH conditions are true:
# 1. The new median is more than 10% slower than the old median
# 2. The absolute difference exceeds 0.01 seconds
The dual-threshold approach prevents false positives on very fast benchmarks where a small absolute increase (e.g., 0.001s) could appear as a large relative increase (e.g., 50%).
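A minimal sketch of the dual-threshold test as a Python predicate. The function name is_regression and its parameter names are hypothetical; only the 10% and 0.01-second values come from the description above.

```python
def is_regression(old_median, new_median, rel_threshold=0.10, abs_threshold=0.01):
    """Return True only when BOTH thresholds are exceeded.

    rel_threshold: fractional slowdown allowed (0.10 means 10%).
    abs_threshold: absolute slowdown allowed, in seconds.
    """
    slower_relative = new_median > old_median * (1 + rel_threshold)
    slower_absolute = (new_median - old_median) > abs_threshold
    return slower_relative and slower_absolute


# Very fast benchmark: 50% slower, but only 0.001s in absolute terms.
print(is_regression(0.002, 0.003))
# Slower benchmark: 0.017s and roughly 38% slower, so both conditions hold.
print(is_regression(0.045, 0.062))
```

The first call shows the false positive that the absolute threshold filters out; the second matches the kind of case the script is meant to flag.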
API
python3 scripts/regression_check.py --old <old_csv> --new <new_csv>
| Argument | Type | Description |
|---|---|---|
| --old | string (file path) | Path to the CSV file containing baseline benchmark timing results. |
| --new | string (file path) | Path to the CSV file containing candidate (new) benchmark timing results. |
External Dependencies
- Python 3 -- The script requires Python 3.x.
- DuckDB Python package -- The script uses the duckdb Python module to load CSV files and execute comparison queries. This must be installed (e.g., pip install duckdb).
I/O Contract
Inputs
- Old CSV file (--old) -- A CSV file containing benchmark timing results from the baseline run. Expected columns include the benchmark name and timing value(s) for each run.
- New CSV file (--new) -- A CSV file containing benchmark timing results from the candidate run, in the same format as the old CSV.
Outputs
- Regression report -- Printed to standard output. Lists each benchmark that exceeds the regression threshold, showing the old median, new median, absolute difference, and percentage change.
- Exit code -- 1 if one or more regressions are detected; 0 if no regressions are found. This enables integration with CI/CD pipelines that gate on exit codes.
Usage Examples
Basic Regression Check
# Run benchmarks on the old and new versions, saving results to CSV
build/old/benchmark/benchmark_runner --out=old_results.csv "benchmark/.*"
build/new/benchmark/benchmark_runner --out=new_results.csv "benchmark/.*"
# Compare the results
python3 scripts/regression_check.py --old old_results.csv --new new_results.csv
Interpreting Output
When regressions are detected, the script prints a report like:
REGRESSION DETECTED:
benchmark/micro/join/hash_join_small.benchmark:
old median: 0.045s
new median: 0.062s
difference: +0.017s (+37.8%)
When no regressions are detected:
No regressions detected.
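The difference and percentage figures in the sample report can be checked with a short calculation. The sketch below reproduces one report entry from the two medians; the function name format_regression and the exact layout are assumptions modeled on the sample above, not the script's actual formatting code.

```python
def format_regression(name, old_median, new_median):
    """Build one report entry from a pair of medians (in seconds)."""
    diff = new_median - old_median
    # Percentage change is relative to the old (baseline) median.
    pct = diff / old_median * 100
    return (
        f"{name}:\n"
        f"  old median: {old_median:.3f}s\n"
        f"  new median: {new_median:.3f}s\n"
        f"  difference: {diff:+.3f}s ({pct:+.1f}%)"
    )


entry = format_regression(
    "benchmark/micro/join/hash_join_small.benchmark", 0.045, 0.062
)
print(entry)
```

For the sample values, the difference is 0.062 - 0.045 = 0.017 seconds, and 0.017 / 0.045 is about 37.8%.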
CI/CD Integration
# In a CI pipeline, use the exit code to gate merging
python3 scripts/regression_check.py --old baseline.csv --new candidate.csv
if [ $? -ne 0 ]; then
echo "Performance regression detected. Blocking merge."
exit 1
fi
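The same gate can be written in Python when the pipeline step is itself a Python program. This is a sketch under the assumption that only the exit code matters; the helper name gate_on_regression is hypothetical.

```python
import subprocess
import sys


def gate_on_regression(command):
    """Run the check command and return the exit code the CI step should use.

    command is an argument list, for example:
    ["python3", "scripts/regression_check.py",
     "--old", "baseline.csv", "--new", "candidate.csv"]
    """
    result = subprocess.run(command)
    if result.returncode != 0:
        print("Performance regression detected. Blocking merge.")
        return 1
    return 0
```

A CI step would pass the return value to sys.exit so that a non-zero status from the regression check fails the job.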