Implementation:Duckdb Duckdb Regression Check

From Leeroopedia


Overview

Regression_Check is DuckDB's tool for detecting benchmark timing regressions, implemented as the Python script scripts/regression_check.py. It loads two CSV files of benchmark timing results (one from a baseline run, one from a candidate run), computes the median timing of each benchmark in each file, and reports every benchmark whose new median exceeds the old one by more than the regression thresholds (10% relative and 0.01 seconds absolute; see Regression Threshold Logic). The script uses the DuckDB Python package internally to perform the comparison via SQL queries.

Code Reference

Source: scripts/regression_check.py:L1-115

The script is approximately 115 lines of Python and performs the following steps:

  1. Parses command-line arguments to obtain paths to old and new CSV files.
  2. Loads both CSV files into an in-memory DuckDB instance.
  3. Computes the median timing for each benchmark in both datasets.
  4. Joins the old and new medians by benchmark name.
  5. Applies the dual-threshold regression criteria.
  6. Outputs a report of any detected regressions.
  7. Exits with code 1 if regressions are found, 0 otherwise.
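Steps 2 to 4 can be sketched in a few lines. This is an illustration only, not the script's code: it swaps the DuckDB SQL queries for Python's standard library so that it runs without extra packages, the column names `name` and `time` are hypothetical, and an inner join on benchmark name is assumed.

```python
import csv
import io
import statistics
from collections import defaultdict


def median_by_benchmark(csv_text, name_col="name", time_col="time"):
    """Return {benchmark name: median timing} for one CSV of per-run timings."""
    timings = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        timings[row[name_col]].append(float(row[time_col]))
    return {name: statistics.median(values) for name, values in timings.items()}


def join_medians(old, new):
    """Inner-join two median maps on benchmark name -> (old, new) pairs."""
    return {name: (old[name], new[name]) for name in old.keys() & new.keys()}


# Hypothetical sample data; real input would be read from the --old / --new paths.
OLD_CSV = "name,time\nq1,0.040\nq1,0.045\nq1,0.050\nq2,1.00\n"
NEW_CSV = "name,time\nq1,0.060\nq1,0.062\nq1,0.070\nq2,1.02\n"

paired = join_medians(median_by_benchmark(OLD_CSV), median_by_benchmark(NEW_CSV))
for name in sorted(paired):
    print(name, paired[name])
```

Using the median rather than the mean keeps a single slow outlier run from dominating the comparison.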

Regression Threshold Logic

# A benchmark is flagged as a regression when BOTH conditions are true:
# 1. The new median is more than 10% slower than the old median
# 2. The absolute difference exceeds 0.01 seconds

The dual-threshold approach prevents false positives on very fast benchmarks where a small absolute increase (e.g., 0.001s) could appear as a large relative increase (e.g., 50%).
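The two conditions can be written as a small predicate. The function name and parameter names below are hypothetical; only the 10% and 0.01 second limits come from the description above.

```python
def is_regression(old_median, new_median, rel_threshold=0.10, abs_threshold=0.01):
    """True only when the slowdown exceeds both the relative and absolute limits."""
    slower_relative = new_median > old_median * (1 + rel_threshold)
    slower_absolute = (new_median - old_median) > abs_threshold
    return slower_relative and slower_absolute


# 0.002s -> 0.003s is +50% but only +0.001s: not flagged.
print(is_regression(0.002, 0.003))
# 0.045s -> 0.062s is about +38% and +0.017s: flagged.
print(is_regression(0.045, 0.062))
# 1.00s -> 1.05s is +0.05s but only +5%: not flagged.
print(is_regression(1.00, 1.05))
```

The first and third calls show why both limits are needed: each one passes a single threshold but not the other.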

API

python3 scripts/regression_check.py --old <old_csv> --new <new_csv>

Argument   Type                 Description
--old      string (file path)   Path to the CSV file containing baseline benchmark timing results.
--new      string (file path)   Path to the CSV file containing candidate (new) benchmark timing results.
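A command-line interface of this shape could be declared with argparse as follows. Whether the actual script uses argparse, and whether both arguments are mandatory, are assumptions; the help strings are illustrative.

```python
import argparse


def build_parser():
    """Build a parser that accepts the two CSV paths described above."""
    parser = argparse.ArgumentParser(
        description="Compare two benchmark timing CSV files."
    )
    # Both paths are treated as required in this sketch.
    parser.add_argument("--old", required=True, help="CSV with baseline timings")
    parser.add_argument("--new", required=True, help="CSV with candidate timings")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--old", "old_results.csv", "--new", "new_results.csv"]
)
print(args.old, args.new)
```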

External Dependencies

  • Python 3 -- The script requires Python 3.x.
  • DuckDB Python package -- The script uses the duckdb Python module to load CSV files and execute comparison queries. This must be installed (e.g., pip install duckdb).

I/O Contract

Inputs

  • Old CSV file (--old) -- A CSV file containing benchmark timing results from the baseline run. Expected columns include the benchmark name and timing value(s) for each run.
  • New CSV file (--new) -- A CSV file containing benchmark timing results from the candidate run, in the same format as the old CSV.

Outputs

  • Regression report -- Printed to standard output. Lists each benchmark that exceeds the regression threshold, showing the old median, new median, absolute difference, and percentage change.
  • Exit code -- 1 if one or more regressions are detected; 0 if no regressions are found. This enables integration with CI/CD pipelines that gate on exit codes.
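The output contract can be sketched as one function that prints the report and returns the exit code. The function name and the exact layout are hypothetical; the layout mirrors the sample in Interpreting Output, which is itself illustrative.

```python
def report(regressions):
    """Print one entry per regression and return the process exit code.

    `regressions` is a list of (name, old_median, new_median) tuples.
    Assumes old_median > 0 so the percentage change is defined.
    """
    if not regressions:
        print("No regressions detected.")
        return 0
    print("REGRESSION DETECTED:")
    for name, old, new in regressions:
        diff = new - old
        pct = diff / old * 100
        print(f"  {name}:")
        print(f"    old median: {old:.3f}s")
        print(f"    new median: {new:.3f}s")
        print(f"    difference: {diff:+.3f}s ({pct:+.1f}%)")
    return 1


exit_code = report(
    [("benchmark/micro/join/hash_join_small.benchmark", 0.045, 0.062)]
)
# A real script would finish with sys.exit(exit_code).
```

Returning the code instead of calling sys.exit inside the function keeps the reporting logic easy to test.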

Usage Examples

Basic Regression Check

# Run benchmarks on the old and new versions, saving results to CSV
build/old/benchmark/benchmark_runner --out=old_results.csv "benchmark/.*"
build/new/benchmark/benchmark_runner --out=new_results.csv "benchmark/.*"

# Compare the results
python3 scripts/regression_check.py --old old_results.csv --new new_results.csv

Interpreting Output

When regressions are detected, the script prints a report like:

REGRESSION DETECTED:
  benchmark/micro/join/hash_join_small.benchmark:
    old median: 0.045s
    new median: 0.062s
    difference: +0.017s (+37.8%)

When no regressions are detected:

No regressions detected.

CI/CD Integration

# In a CI pipeline, use the exit code to gate merging.
# Testing the command directly in the `if` also works when the shell runs with `set -e`.
if ! python3 scripts/regression_check.py --old baseline.csv --new candidate.csv; then
    echo "Performance regression detected. Blocking merge."
    exit 1
fi
