Principle: Apache ShardingSphere DDL Metadata Reload
| Knowledge Sources | |
|---|---|
| Domains | Metadata_Management, DDL_Processing |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
After a DDL statement changes the physical database schema, the middleware must reload the affected metadata from the actual data sources rather than attempting to infer the changes from the DDL statement alone.
Description
The DDL Metadata Reload principle addresses the challenge of keeping the in-memory metadata model synchronized with the physical database after DDL operations. Instead of parsing and interpreting the DDL statement to derive the resulting schema structure (which would be error-prone across different database vendors), the system takes a reload-from-source approach: it queries the actual database's information schema (or equivalent metadata tables) to obtain the authoritative, post-DDL metadata.
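As a minimal sketch of the reload-from-source idea, the snippet below executes a DDL statement and then queries the database's own catalog for the resulting schema, instead of parsing the DDL text. It uses Python's sqlite3 with SQLite's PRAGMA table_info standing in for information_schema; none of these names are ShardingSphere's actual API.

```python
import sqlite3

def reload_table_metadata(conn, table):
    """Reload authoritative column metadata from the database catalog,
    not from the DDL text. PRAGMA table_info rows are
    (cid, name, type, notnull, dflt_value, pk)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [{"name": r[1], "type": r[2], "primary_key": bool(r[5])} for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_order (order_id INTEGER PRIMARY KEY, user_id INTEGER, status TEXT)"
)
# After the DDL executes, the catalog is the authoritative post-DDL schema.
columns = reload_table_metadata(conn, "t_order")
```

Because the catalog is queried after the fact, the same reload code works no matter which vendor-specific DDL syntax produced the change.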
This reload is performed using GenericSchemaBuilder, which connects to the backend database through the configured storage units and retrieves complete table metadata including columns, indexes, and constraints. The builder uses the rule metadata to understand sharding topology, so it queries the correct physical data source.
Each DDL type has its own reload strategy:
- CREATE TABLE: Builds metadata for the newly created table from the database, then registers single table data nodes if applicable, and persists via createTable().
- ALTER TABLE: Reloads the altered table's metadata. If the ALTER includes a RENAME, the renamed table is loaded and the old table name is dropped.
- DROP TABLE: No reload is needed since the table no longer exists; the metadata is simply removed.
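The per-type strategies above can be sketched as a dispatch table. This is an illustrative Python sketch, not ShardingSphere's refresher registry; `load_from_database` stands in for the catalog reload and `metadata` for the in-memory model.

```python
# Hypothetical per-DDL-type refresh strategies (names are illustrative).
def refresh_create(metadata, load_from_database, table):
    metadata[table] = load_from_database(table)  # reload the new table from source

def refresh_alter(metadata, load_from_database, table):
    metadata[table] = load_from_database(table)  # reload the altered table

def refresh_drop(metadata, load_from_database, table):
    metadata.pop(table, None)                    # no reload: the table is gone

REFRESHERS = {
    "CREATE_TABLE": refresh_create,
    "ALTER_TABLE": refresh_alter,
    "DROP_TABLE": refresh_drop,
}

metadata = {}
loader = lambda table: {"columns": ["id"]}       # fake catalog loader
REFRESHERS["CREATE_TABLE"](metadata, loader, "t_order")
```

Note the asymmetry: CREATE and ALTER both reload from the source, while DROP only removes the entry, since there is nothing left in the database to query.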
A key aspect is single table awareness: when the affected table is a single table (not distributed across shards), the refresher updates the mutable data node rule attributes to register the table's data source mapping before reloading metadata.
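A rough sketch of that registration step, assuming a dict-of-dicts as a stand-in for the mutable data node attributes (the function and structure are hypothetical, not ShardingSphere's API):

```python
# Before reloading a single (non-sharded) table's metadata, record which
# storage unit it lives on so the reload queries the right physical source.
def register_single_table(data_node_mapping, logic_data_source, schema, table):
    data_node_mapping.setdefault(schema, {})[table] = logic_data_source

data_nodes = {}
register_single_table(data_nodes, "ds_0", "public", "t_address")
# A subsequent metadata reload can now route its catalog query to ds_0.
source = data_nodes["public"]["t_address"]
```

The ordering matters: if the mapping were updated after the reload, the schema builder would have no way to know which data source holds the new table.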
Usage
Use this principle when:
- Implementing a new type-specific DDL metadata refresher.
- Understanding why ShardingSphere queries the database after DDL rather than parsing DDL syntax.
- Troubleshooting metadata inconsistencies after DDL operations.
- Extending the system to support new DDL types (e.g., stored procedures, sequences).
Theoretical Basis
The reload workflow for a representative CREATE TABLE operation:
```
function refreshCreateTable(database, logicDataSourceName, schemaName,
                            databaseType, sqlStatement, props):
    // Step 1: Extract table name from DDL statement
    tableName = getTableName(sqlStatement.getTable(), databaseType)

    // Step 2: Handle single table registration
    ruleMetaData = clone(database.getRuleMetaData())
    if isSingleTable(tableName, database):
        for each mutableAttribute in ruleMetaData.getMutableDataNodeAttributes():
            mutableAttribute.put(logicDataSourceName, schemaName, tableName)

    // Step 3: Reload from actual database (NOT from DDL parsing)
    material = new GenericSchemaBuilderMaterial(
        database.storageUnits, ruleMetaData.rules, props, schemaName)
    schemas = GenericSchemaBuilder.build([tableName], database.protocolType, material)
    loadedTable = schemas[schemaName].getTable(tableName)

    // Step 4: Persist to metadata repository
    persistService.createTable(database, schemaName, loadedTable)
```
For ALTER TABLE with rename:
```
function refreshAlterTable(database, logicDataSourceName, schemaName,
                           databaseType, sqlStatement, props):
    tableName = getTableName(sqlStatement.getTable(), databaseType)
    if sqlStatement.hasRenameTable():
        renameTableName = sqlStatement.getRenameTable().getName()
        alteredTables = [loadFromDatabase(renameTableName)]
        droppedTables = [tableName]  // old name is dropped
    else:
        alteredTables = [loadFromDatabase(tableName)]
        droppedTables = []
    persistService.alterTables(database, schemaName, alteredTables)
    persistService.dropTables(database, schemaName, droppedTables)
```
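The rename branch of that workflow can be expressed as runnable logic. In this sketch the parsed statement is a plain dict and `load_from_database` a fake catalog loader; both are illustrative assumptions, not ShardingSphere types.

```python
def plan_alter_refresh(statement, load_from_database):
    """Decide which tables to reload vs drop after an ALTER TABLE.
    `statement` is a hypothetical dict form of the parsed statement."""
    rename_to = statement.get("rename_to")
    if rename_to:
        # RENAME: load metadata under the new name, drop the old name.
        return [load_from_database(rename_to)], [statement["table"]]
    return [load_from_database(statement["table"])], []

loader = lambda table: {"name": table}  # fake catalog loader
altered, dropped = plan_alter_refresh(
    {"table": "t_order", "rename_to": "t_order_v2"}, loader)
```

Keeping the decision (what to reload, what to drop) separate from the persistence calls makes the rename case easy to test in isolation.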
Key design decisions:
- Reload over inference: Querying the database ensures accuracy regardless of vendor-specific DDL syntax.
- Rule-aware loading: The schema builder uses rule metadata to resolve physical data sources from logical names.
- Single table handling: Mutable data node attributes are updated before reload to ensure correct data source routing.
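To illustrate the rule-aware loading decision, the sketch below resolves a logic table to the physical data sources its data nodes live on, so a reload knows where to send catalog queries. The rule mapping and function names are hypothetical.

```python
# Illustrative sharding rule: a logic table maps to actual data nodes
# ("data_source.physical_table"). Not ShardingSphere's real rule structure.
RULE_DATA_NODES = {
    "t_order": ["ds_0.t_order_0", "ds_0.t_order_1",
                "ds_1.t_order_0", "ds_1.t_order_1"],
}

def physical_sources_for(logic_table):
    """Resolve the set of physical data sources holding a logic table.
    Unruled (single) tables default to one assumed source here."""
    nodes = RULE_DATA_NODES.get(logic_table, [f"ds_0.{logic_table}"])
    return sorted({node.split(".")[0] for node in nodes})

sources = physical_sources_for("t_order")
```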