Implementation:Duckdb Duckdb Regression Test Scripts
Overview
Regression_Test_Scripts are the concrete tools DuckDB uses for regression testing along three dimensions: storage size, extension size, and Python client performance. DuckDB provides one dedicated Python script per dimension. Together, they guard against code changes that degrade storage efficiency, inflate extension binary sizes, or slow down the Python client library.
Code Reference
Storage Size Regression Test
Source: scripts/regression_test_storage_size.py:L1-87
This script measures the on-disk size of DuckDB database files produced by a standard workload and compares them against a baseline. It detects regressions in the storage layer that cause database files to grow unexpectedly.
# Pseudocode summary of the storage size regression test:
# 1. Create a DuckDB database and run a standard workload (CREATE TABLE, INSERT, etc.)
# 2. Measure the resulting .duckdb file size
# 3. Compare against the known baseline size
# 4. Report regression if the size exceeds the baseline by more than the threshold
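Steps 2 to 4 of the summary above can be sketched as follows. This is a minimal illustration, not the code in regression_test_storage_size.py: the function name `size_regressed`, the 10% default threshold, and the stand-in file are all assumptions made for the example.

```python
import os
import tempfile


def size_regressed(path, baseline_bytes, threshold=0.10):
    """Return True if the file at `path` exceeds the baseline by more than `threshold`.

    `threshold` is a fraction (0.10 means 10%). The value is an assumption
    for this sketch, not the threshold used by the real script.
    """
    current_bytes = os.path.getsize(path)
    return current_bytes > baseline_bytes * (1 + threshold)


# Stand-in for a database file: 1200 bytes written to a temporary file.
with tempfile.NamedTemporaryFile(suffix=".duckdb", delete=False) as handle:
    handle.write(b"\0" * 1200)
    demo_path = handle.name

within_limit = size_regressed(demo_path, baseline_bytes=1150)  # 1200 is within 10% of 1150
over_limit = size_regressed(demo_path, baseline_bytes=1000)    # 1200 is more than 10% above 1000
os.remove(demo_path)
```

In a real run, `path` would point at the database file produced by the workload rather than a placeholder file.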
Extension Size Regression Test
Source: scripts/regression_test_extension_size.py:L1-80
This script checks the compiled size of DuckDB extension binaries and compares them against baseline measurements. It requires access to S3 storage where extension binaries are published.
# Pseudocode summary of the extension size regression test:
# 1. Download or locate compiled extension binaries (.duckdb_extension files)
# 2. Measure the file size of each extension
# 3. Compare against baseline sizes stored in S3 or locally
# 4. Report regression if any extension exceeds its baseline by more than the threshold
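The per-extension comparison described above might look like the sketch below. It only covers locally available files and does not touch S3. The helper name `find_oversized_extensions`, the baseline dictionary, the file sizes, and the 10% threshold are hypothetical and chosen for illustration.

```python
import tempfile
from pathlib import Path


def find_oversized_extensions(directory, baselines, threshold=0.10):
    """Return {file name: (baseline, current)} for extensions that grew too much.

    `baselines` maps file names to baseline sizes in bytes (hypothetical format).
    """
    regressions = {}
    for path in sorted(Path(directory).glob("*.duckdb_extension")):
        baseline = baselines.get(path.name)
        if baseline is None:
            continue  # no baseline recorded, nothing to compare against
        current = path.stat().st_size
        if current > baseline * (1 + threshold):
            regressions[path.name] = (baseline, current)
    return regressions


# Demo with placeholder files; the sizes are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "httpfs.duckdb_extension").write_bytes(b"\0" * 2400)
    (Path(tmp) / "json.duckdb_extension").write_bytes(b"\0" * 1000)
    demo_baselines = {
        "httpfs.duckdb_extension": 1800,
        "json.duckdb_extension": 1000,
    }
    demo_regressions = find_oversized_extensions(tmp, demo_baselines)
```

Here only the first placeholder file is reported, because it is the only one that grew past its baseline plus the threshold.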
Python Client Regression Test
Source: scripts/regression_test_python.py:L1-402
This is the most comprehensive of the three scripts, spanning approximately 402 lines. It runs a suite of benchmarks through the DuckDB Python client API, measuring the performance of data ingestion, query execution, and result retrieval through the Python binding layer.
# Pseudocode summary of the Python client regression test:
# 1. Import duckdb Python package
# 2. Execute a suite of benchmark operations via the Python API
#    - Table creation, data loading, query execution, result fetching
# 3. Measure wall-clock time for each operation
# 4. Compare against baseline timings
# 5. Report regressions exceeding the configured threshold
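The timing steps (3 to 5) can be sketched with the standard library alone. The helper names, the use of a median over repeated runs, and the 10% threshold are assumptions for this example. The page does not say how the real script aggregates timings. The placeholder workload stands in for calls made through the duckdb Python API.

```python
import statistics
import time


def median_runtime(operation, repeats=5):
    """Run `operation` several times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def timing_regressed(current_seconds, baseline_seconds, threshold=0.10):
    """Return True if the current timing exceeds the baseline by more than `threshold`."""
    return current_seconds > baseline_seconds * (1 + threshold)


def sample_operation():
    # Placeholder workload; a real benchmark would call the duckdb Python API here.
    sum(range(10_000))


measured_seconds = median_runtime(sample_operation)
```

Taking the median of several runs is one common way to reduce noise from a single slow run. It is a design choice of this sketch, not a documented property of regression_test_python.py.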
API
Storage Size Regression Test
python3 scripts/regression_test_storage_size.py
Extension Size Regression Test
python3 scripts/regression_test_extension_size.py
Python Client Regression Test
python3 scripts/regression_test_python.py
External Dependencies
| Dependency | Required By | Description |
|---|---|---|
| Python 3 | All scripts | Runtime environment for executing the scripts. |
| duckdb (Python package) | All scripts | Used both as the system under test (Python client regression) and as a utility for data analysis. |
| boto3 | regression_test_extension_size.py | AWS SDK for Python, used to access S3 storage where extension binaries and baseline measurements are stored. |
I/O Contract
Inputs
| Script | Inputs |
|---|---|
| regression_test_storage_size.py | Built DuckDB binary (or Python package), baseline storage size measurements. |
| regression_test_extension_size.py | Compiled extension binaries, baseline extension sizes, S3 credentials for accessing extension storage. |
| regression_test_python.py | DuckDB Python package (installed or built locally), baseline timing measurements. |
Outputs
All three scripts produce:
- Regression pass/fail per metric -- Each script reports whether each measured metric (file size, binary size, or timing) passes or fails the regression check.
- Exit code -- 0 if all metrics pass; non-zero if one or more regressions are detected.
- Console output -- Detailed report showing measured values, baseline values, and the percentage difference for each metric.
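The exit-code contract can be illustrated with a small sketch. The function `exit_code_for` and the metric names are hypothetical. The only behavior taken from this page is that 0 means every metric passed and a non-zero code means at least one regression was detected.

```python
def exit_code_for(results):
    """Map per-metric pass/fail flags to a process exit code (0 = all passed)."""
    return 0 if all(results.values()) else 1


# Hypothetical per-metric outcomes: True means the metric passed its check.
demo_results = {"storage_size": True, "extension_size": False}
for metric, passed in demo_results.items():
    print(f"{metric}: {'PASS' if passed else 'FAIL'}")

demo_exit_code = exit_code_for(demo_results)
# A script following this contract would finish with sys.exit(demo_exit_code).
```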
Usage Examples
Running All Regression Tests in CI
# Run storage size regression test
python3 scripts/regression_test_storage_size.py
STORAGE_RESULT=$?
# Run extension size regression test (requires S3 credentials)
python3 scripts/regression_test_extension_size.py
EXTENSION_RESULT=$?
# Run Python client regression test
python3 scripts/regression_test_python.py
PYTHON_RESULT=$?
# Aggregate results
if [ "$STORAGE_RESULT" -ne 0 ] || [ "$EXTENSION_RESULT" -ne 0 ] || [ "$PYTHON_RESULT" -ne 0 ]; then
    echo "One or more regression tests failed."
    exit 1
fi
echo "All regression tests passed."
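The same aggregation can be written in Python, which may be convenient when the CI step is itself a Python script. This is a sketch of an equivalent driver, not an existing DuckDB tool. The names `run_all`, `overall_exit_code`, and `regression_commands` are made up for the example, and the demo at the end runs trivial stand-in commands rather than the real scripts.

```python
import subprocess
import sys


def run_all(commands):
    """Run every command, even after a failure, and return {name: exit code}."""
    return {name: subprocess.run(cmd).returncode for name, cmd in commands.items()}


def overall_exit_code(exit_codes):
    """Return 0 only if every command exited with 0."""
    return 0 if all(code == 0 for code in exit_codes.values()) else 1


# The three invocations from the shell example above (not executed in this sketch).
regression_commands = {
    "storage": [sys.executable, "scripts/regression_test_storage_size.py"],
    "extension": [sys.executable, "scripts/regression_test_extension_size.py"],
    "python": [sys.executable, "scripts/regression_test_python.py"],
}

# Stand-in commands so the sketch runs anywhere: one succeeds, one fails.
demo_commands = {
    "passing": [sys.executable, "-c", "raise SystemExit(0)"],
    "failing": [sys.executable, "-c", "raise SystemExit(3)"],
}
demo_codes = run_all(demo_commands)
demo_overall = overall_exit_code(demo_codes)
```

As in the shell version, every test runs to completion before the results are combined, so one failure does not hide another.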
Running a Single Regression Test Locally
# Test storage size regression only
pip install duckdb
python3 scripts/regression_test_storage_size.py
Interpreting Output
Sample output from the storage size regression test:
Storage size regression test:
Baseline: 4.2 MB
Current: 4.3 MB
Change: +2.4%
Status: PASS
Sample output from the extension size regression test when a regression is found:
Extension size regression test:
Extension: httpfs.duckdb_extension
Baseline: 1.8 MB
Current: 2.4 MB
Change: +33.3%
Status: FAIL - REGRESSION DETECTED
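The Change values in both samples are the relative difference between the current and baseline sizes. The short check below reproduces them. The function name `percent_change` is an assumption for illustration. The threshold that separates PASS from FAIL is not stated on this page and is therefore left out.

```python
def percent_change(baseline, current):
    """Relative change from baseline to current, in percent."""
    return (current - baseline) / baseline * 100.0


# Values taken from the two sample outputs above (sizes in MB).
storage_change = f"{percent_change(4.2, 4.3):+.1f}%"    # "+2.4%"
extension_change = f"{percent_change(1.8, 2.4):+.1f}%"  # "+33.3%"
```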