Implementation:MaterializeInc Materialize Query Fitness Module
| Knowledge Sources | |
|---|---|
| Domains | Testing, Query Analysis, Regression Testing |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The Query Fitness Module scores SQL queries for regression-test suitability by checking that every part of a query is essential to its result. A query is selected only if commenting out any contiguous range of its tokens changes the output.
Description
This module implements a fitness function framework for selecting high-quality SQL queries for inclusion in regression test suites. The core class AllPartsEssential extends FitnessFunction and works by systematically commenting out contiguous token ranges of a query using SQL block comments (/* ... */) and checking whether the result changes. If any part can be removed without affecting the result, the query has non-essential constructs and receives a fitness score of 0. Only queries where every part is essential (fitness = 1) are suitable for regression testing, since losing any part during optimization or execution would produce a detectably different result. The companion pick.py module reads queries from stdin, filters them through the fitness function, and outputs SLT (SQL Logic Test) formatted test cases.
Usage
Use this module to filter candidate SQL queries for regression test inclusion. Pipe queries through pick.py to automatically select those where all parts are essential and format them as SLT test cases with EXPLAIN plans.
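The filtering step can be pictured as a small loop over input lines. The sketch below is illustrative only: the function name `pick_queries`, the strict `>` comparison against the threshold, and the skipping of blank lines are assumptions, and the toy scoring function stands in for `AllPartsEssential.fitness`. It does not reproduce the SLT formatting performed by `dump_slt`.

```python
import io
from collections.abc import Callable, Iterable, Iterator

THRESHOLD = 0.5  # mirrors `threshold = 0.5` from the pick.py signature


def pick_queries(
    lines: Iterable[str],
    fitness: Callable[[str], float],
    threshold: float = THRESHOLD,
) -> Iterator[str]:
    """Yield each non-empty input line whose fitness exceeds the threshold."""
    for line in lines:
        query = line.strip()
        if not query:
            continue  # assumption: blank lines are ignored
        if fitness(query) > threshold:
            yield query


def toy_fitness(query: str) -> float:
    """Hypothetical stand-in scorer: accept only queries with a WHERE clause."""
    return 1.0 if "WHERE" in query else 0.0


candidates = io.StringIO("SELECT 1\n\nSELECT a FROM t WHERE a > 1\n")
selected = list(pick_queries(candidates, toy_fitness))
print(selected)
```

Because the fitness function only returns 0.0 or 1.0, any threshold strictly between the two values separates accepted from rejected queries.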
Code Reference
Source Location
- Repository: MaterializeInc_Materialize
- File: misc/python/materialize/query_fitness/all_parts_essential.py
- File: misc/python/materialize/query_fitness/pick.py
Signature
# all_parts_essential.py
class AllPartsEssential(FitnessFunction):
    def _result_checksum(self, query: str) -> str | None: ...
    def fitness(self, query: str) -> float: ...

# pick.py
threshold = 0.5
def main() -> None: ...
def dump_slt(conn: pg8000.Connection, query: str) -> None: ...
Import
from materialize.query_fitness.all_parts_essential import AllPartsEssential
from materialize.query_fitness.pick import main, dump_slt
I/O Contract
| Input | Type | Description |
|---|---|---|
| query | str | SQL query string to evaluate for test fitness |
| conn | pg8000.Connection | Database connection for executing queries |
| stdin | text stream | Stream of candidate SQL queries (one per line) for pick.py |

| Output | Type | Description |
|---|---|---|
| fitness | float | 1.0 if all parts essential, 0.0 if any part is non-essential |
| SLT output | stdout | SQL Logic Test formatted test cases with query, expected row counts, and EXPLAIN plans |
Usage Examples
import pg8000
from materialize.query_fitness.all_parts_essential import AllPartsEssential
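# pg8000.connect requires a user; the credentials below are placeholders for a local instance
conn = pg8000.connect(user="materialize", database="materialize", password="materialize")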
fitness_func = AllPartsEssential(conn=conn)
# Evaluate a query
score = fitness_func.fitness("SELECT a, b FROM t WHERE a > 10 AND b < 20")
if score > 0.5:
    print("Query suitable for regression testing")
# Shell: filter queries from stdin and output SLT test cases
cat candidate_queries.txt | python -m materialize.query_fitness.pick