Implementation:Haifengl Smile SQL Constructor
Overview
The SQL Constructor creates an instance of smile.data.SQL, which wraps a DuckDB JDBC connection for in-process analytical queries. The constructor is available in two forms: a no-argument constructor for in-memory databases and a single-argument constructor that takes a file path for persistent databases.
API Signature
public class SQL implements AutoCloseable {
/**
* Constructor of in-memory database.
* @throws SQLException if fail to create an in-memory database.
*/
public SQL() throws SQLException
/**
* Constructor to open or create a persistent database.
* @param path DuckDB file path.
* @throws SQLException if fail to open or create the persistent database.
*/
public SQL(String path) throws SQLException
/**
* Closes the database connection.
* @throws SQLException if fail to close the connection.
*/
@Override
public void close() throws SQLException
}
Source
base/src/main/java/smile/data/SQL.java:L45-56
The SQL class also includes a static initializer block (lines 33-39) that loads the DuckDB JDBC driver via Class.forName("org.duckdb.DuckDBDriver").
Import
import smile.data.SQL;
External Dependencies
| Dependency | Group ID | Artifact ID | Purpose |
|---|---|---|---|
| DuckDB JDBC Driver | org.duckdb | duckdb_jdbc | Provides the in-process OLAP database engine |
The DuckDB driver is loaded via the JDBC DriverManager mechanism. The static initializer block in the SQL class calls Class.forName("org.duckdb.DuckDBDriver") to ensure the driver is registered before any connections are created.
Type
API Doc
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
path |
String |
No | File path for a persistent DuckDB database. If omitted (no-arg constructor), an in-memory database is created. |
Outputs
| Return Type | Description |
|---|---|
SQL |
A new SQL instance wrapping a DuckDB JDBC Connection. Implements AutoCloseable for use with try-with-resources.
|
Exceptions
| Exception | Condition |
|---|---|
java.sql.SQLException |
Thrown if the DuckDB driver is not available or the database file cannot be opened/created. |
Internal Mechanism
The constructor delegates to java.sql.DriverManager.getConnection() with the appropriate JDBC URL:
- In-memory:
"jdbc:duckdb:"(empty path after the scheme prefix) - Persistent:
"jdbc:duckdb:" + path(path appended to the scheme prefix)
The resulting java.sql.Connection object is stored as a private final field and used by all subsequent operations (queries, data loading, metadata inspection).
The close() method delegates to Connection.close(), releasing all database resources. Because SQL implements AutoCloseable, it can be used in Java's try-with-resources statement for automatic cleanup.
Usage Examples
In-Memory Database (Ephemeral)
import smile.data.SQL;
import smile.data.DataFrame;
// Create an in-memory SQL database using try-with-resources
try (var db = new SQL()) {
// Load data and run queries
db.csv("sales", "data/sales_2024.csv");
DataFrame result = db.query("SELECT region, SUM(amount) AS total FROM sales GROUP BY region");
System.out.println(result);
}
// Connection is automatically closed here
Persistent Database (Durable)
import smile.data.SQL;
import smile.data.DataFrame;
// Create or open a persistent database file
try (var db = new SQL("/tmp/analytics.duckdb")) {
// Data persists across restarts
db.parquet("transactions", "s3://bucket/transactions/*.parquet");
DataFrame summary = db.query("""
SELECT date_trunc('month', tx_date) AS month,
COUNT(*) AS tx_count,
AVG(amount) AS avg_amount
FROM transactions
GROUP BY month
ORDER BY month
""");
System.out.println(summary);
}
// Database file remains at /tmp/analytics.duckdb for future use
Error Handling
import smile.data.SQL;
import java.sql.SQLException;
try {
var db = new SQL("/readonly/path/analytics.duckdb");
// Use db...
db.close();
} catch (SQLException e) {
System.err.println("Failed to open database: " + e.getMessage());
// Handle missing DuckDB driver or file permission errors
}
Key Behaviors
- The
toString()method returns"SQL(catalog_name)"for named databases or"SQL(memory)"for in-memory databases. - The static initializer runs once per class loading, so the DuckDB driver registration happens only once regardless of how many
SQLinstances are created. - Multiple
SQLinstances can coexist, each with their own independent database. - In-memory databases are fully isolated -- tables created in one instance are not visible in another.
Related
- Principle:Haifengl_Smile_SQL_Database_Connection - The theoretical basis for in-process database connections.
- Implementation:Haifengl_Smile_SQL_Data_Loaders - After constructing a SQL instance, data is loaded using csv(), parquet(), or iceberg().
- Implementation:Haifengl_Smile_SQL_Query_API - The SQL instance is used to execute queries via query(), execute(), and update().
- Environment:Haifengl_Smile_Java_25_Runtime