Principle:Apache Shardingsphere Cluster Metadata Synchronization
| Knowledge Sources | |
|---|---|
| Domains | Metadata_Management, DDL_Processing |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
In a multi-node cluster, metadata changes persisted by one node must be propagated to all other nodes through an event-driven mechanism that updates each node's in-memory metadata model.
Description
The Cluster Metadata Synchronization principle addresses the final step in the DDL metadata refresh workflow: ensuring that all nodes in a ShardingSphere cluster have a consistent, up-to-date view of the database schema after a DDL operation. When one proxy node persists metadata changes to the governance repository (e.g., ZooKeeper), the repository's watch mechanism delivers change events to every other node. Each node registers event handlers that process these events and update its local in-memory metadata.
The synchronization architecture uses a handler chain organized by metadata scope:
- TableCoordinatorChangedHandler: Handles coordinated table changes propagated through the global coordinator path. This handler processes CREATE and DROP coordination events at the global level.
- TableChangedHandler: Handles per-database table metadata changes by watching the table metadata node path. Reacts to ADDED, UPDATED, and DELETED events.
- ViewChangedHandler: Handles per-database view metadata changes similarly to TableChangedHandler.
- SchemaChangedHandler: Handles schema-level changes (schema creation and deletion).
Each handler follows a consistent pattern:
- Subscribe to a specific node path pattern in the governance repository.
- Parse the event key to extract identifiers (database name, schema name, table/view name).
- Dispatch to the appropriate handler method based on the event type (ADDED/UPDATED vs. DELETED).
- Update the in-memory metadata via ContextManager.getMetaDataContextManager().getDatabaseMetaDataManager().
- Trigger an asynchronous statistics refresh via StatisticsRefreshEngine.asyncRefresh().
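The five steps above can be sketched in Java as follows. This is a minimal, self-contained illustration rather than ShardingSphere's code: the table path layout is taken from the flow described on this page, while `TableHandlerSketch`, `Event`, `Type`, the two maps standing in for the governance repository and the in-memory metadata, and the refresh counter are all hypothetical names introduced for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a per-database table handler; not ShardingSphere's API.
final class TableHandlerSketch {

    enum Type { ADDED, UPDATED, DELETED }

    static final class Event {
        final String key;
        final Type type;

        Event(String key, Type type) {
            this.key = key;
            this.type = type;
        }
    }

    // Step 1: the node path pattern this handler subscribes to.
    private static final Pattern TABLE_PATH =
            Pattern.compile("/metadata/([^/]+)/schemas/([^/]+)/tables/([^/]+)");

    // Stand-in for the governance repository: path -> serialized table definition.
    final Map<String, String> repository = new ConcurrentHashMap<>();
    // Stand-in for the in-memory metadata: "db.schema.table" -> table definition.
    final Map<String, String> inMemoryTables = new ConcurrentHashMap<>();
    // Counts refresh requests; a real engine would run the refresh asynchronously.
    final AtomicInteger refreshRequests = new AtomicInteger();

    /** Returns true if the event key matched the subscribed path and was applied. */
    boolean handle(Event event) {
        // Step 2: parse the identifiers out of the event key.
        Matcher matcher = TABLE_PATH.matcher(event.key);
        if (!matcher.matches()) {
            return false;
        }
        String qualifiedName = matcher.group(1) + "." + matcher.group(2) + "." + matcher.group(3);
        // Step 3: dispatch on the event type.
        switch (event.type) {
            case ADDED:
            case UPDATED: {
                // Step 4: reload the definition and upsert it into memory.
                String definition = repository.get(event.key);
                if (definition != null) {
                    inMemoryTables.put(qualifiedName, definition);
                }
                break;
            }
            case DELETED:
                inMemoryTables.remove(qualifiedName);
                break;
            default:
                break;
        }
        // Step 5: request a statistics refresh.
        refreshRequests.incrementAndGet();
        return true;
    }
}
```

Keeping the path pattern next to the handler means a key that does not match, such as a view path, is rejected before any metadata is touched.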
The handler system operates at two levels:
- Global handlers (GlobalDataChangedEventHandler): Handle events on global paths (e.g., coordinator table changes). These are singletons that process events for any database.
- Database handlers (DatabaseLeafValueChangedHandler / DatabaseNodeValueChangedHandler): Handle events scoped to a specific database, receiving the database name as a parameter.
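One way to picture the difference between the two levels is the shape of the handler interface. The sketch below is an assumption-laden illustration: `GlobalHandlerSketch`, `DatabaseHandlerSketch`, `dispatch`, and the prefix/segment matching are hypothetical and do not reproduce the signatures of the ShardingSphere interfaces named above. It only shows that a global handler works from the event key alone, while a database handler is handed the database name by the dispatcher.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two handler levels; interface names are illustrative only.
final class HandlerLevelsSketch {

    /** Global scope: receives only the event key and derives any database from it. */
    interface GlobalHandlerSketch {
        String subscribedPrefix();
        String handle(String eventKey);
    }

    /** Database scope: the dispatcher passes the database name as a parameter. */
    interface DatabaseHandlerSketch {
        String subscribedSegment();
        String handle(String databaseName, String eventKey);
    }

    private static final String METADATA_PREFIX = "/metadata/";

    /** Routes one event key to the matching handlers and returns what each one reported. */
    static List<String> dispatch(String eventKey, List<GlobalHandlerSketch> globalHandlers,
                                 List<DatabaseHandlerSketch> databaseHandlers) {
        List<String> handled = new ArrayList<>();
        for (GlobalHandlerSketch each : globalHandlers) {
            if (eventKey.startsWith(each.subscribedPrefix())) {
                handled.add(each.handle(eventKey));
            }
        }
        if (eventKey.startsWith(METADATA_PREFIX)) {
            String rest = eventKey.substring(METADATA_PREFIX.length());
            int slash = rest.indexOf('/');
            if (slash > 0) {
                // The dispatcher, not the handler, extracts the database name.
                String databaseName = rest.substring(0, slash);
                for (DatabaseHandlerSketch each : databaseHandlers) {
                    if (rest.substring(slash).contains(each.subscribedSegment())) {
                        handled.add(each.handle(databaseName, eventKey));
                    }
                }
            }
        }
        return handled;
    }

    static GlobalHandlerSketch coordinatorHandler() {
        return new GlobalHandlerSketch() {
            @Override public String subscribedPrefix() { return "/states/coordinator/tables/"; }
            @Override public String handle(String eventKey) { return "global:" + eventKey; }
        };
    }

    static DatabaseHandlerSketch tableHandler() {
        return new DatabaseHandlerSketch() {
            @Override public String subscribedSegment() { return "/tables/"; }
            @Override public String handle(String databaseName, String eventKey) {
                return "database:" + databaseName;
            }
        };
    }
}
```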
Usage
Use this principle when:
- Understanding how DDL changes on one proxy node become visible to queries on other nodes.
- Diagnosing metadata inconsistencies across cluster nodes.
- Implementing new metadata change handlers for additional metadata types.
- Configuring event processing behavior in the cluster dispatch system.
Theoretical Basis
The event-driven synchronization flow:
Node A: DDL Execution
  |
  +-- Push-Down Refresh
  |     +-- Reload metadata from database
  |     +-- Persist to governance repository (ZooKeeper/etcd)
  |           +-- Write: /metadata/{db}/schemas/{schema}/tables/{table}
  |           +-- Write: /states/coordinator/tables/{db}.{schema}.{table}/CREATE
  |
  +-- Repository generates watch events
        |
        +-- Node B receives DataChangedEvent (ADDED/UPDATED)
        |     +-- TableChangedHandler.handle(databaseName, event)
        |     |     +-- Load table metadata from repository
        |     |     +-- contextManager.alterTable(db, schema, table)
        |     |     +-- statisticsRefreshEngine.asyncRefresh()
        |     +-- TableCoordinatorChangedHandler.handle(contextManager, event)
        |           +-- Deserialize table from event value
        |           +-- contextManager.alterTable(db, schema, table)
        |           +-- statisticsRefreshEngine.asyncRefresh()
        |
        +-- Node C receives the same events (same handling)
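The push-based part of this flow can be modeled in a few lines of Java. The sketch below is a hypothetical in-memory stand-in, not a ZooKeeper or etcd client: `Repository`, `Node`, `watch`, and `persist` are names invented for the example, and events are delivered synchronously on the writer's thread, which a real repository would not do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical in-memory model of watch-based propagation; not a ZooKeeper or etcd client.
final class WatchPropagationSketch {

    static final class Repository {
        private final Map<String, String> data = new HashMap<>();
        private final List<BiConsumer<String, String>> watchers = new ArrayList<>();

        void watch(BiConsumer<String, String> watcher) {
            watchers.add(watcher);
        }

        // Persisting a value pushes a change event to every registered watcher.
        void persist(String key, String value) {
            data.put(key, value);
            for (BiConsumer<String, String> each : watchers) {
                each.accept(key, value);
            }
        }
    }

    static final class Node {
        final Map<String, String> inMemoryMetaData = new HashMap<>();

        Node(Repository repository) {
            // Each node registers a watcher once and never polls the repository.
            repository.watch((key, value) -> inMemoryMetaData.put(key, value));
        }
    }
}
```

Under these assumptions, creating two `Node` instances on the same `Repository` and calling `persist` once, as Node A would, leaves both nodes holding the same entry.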
The handler dispatch pattern:
// Table changed handler (per-database scope)
function handleTableChanged(databaseName, event):
    schemaName = extractFromPath(event.key, "schema")
    tableName = extractFromPath(event.key, "table")
    switch event.type:
        case ADDED, UPDATED:
            table = persistService.loadTable(databaseName, schemaName, tableName)
            metaDataManager.alterTable(databaseName, schemaName, table)
            statisticsEngine.asyncRefresh()
        case DELETED:
            metaDataManager.dropTable(databaseName, schemaName, tableName)
            statisticsEngine.asyncRefresh()
// Table coordinator handler (global scope)
function handleTableCoordinator(contextManager, event):
    qualifiedName = extractFromPath(event.key)            // "db.schema.table"
    db, schema, tableName = split(qualifiedName, ".")     // identifiers used below
    coordinatorType = extractCoordinatorType(event.key)   // CREATE or DROP
    switch coordinatorType:
        case CREATE:
            table = deserializeFromYAML(event.value)
            metaDataManager.alterTable(db, schema, table)
            statisticsEngine.asyncRefresh()
        case DROP:
            metaDataManager.dropTable(db, schema, tableName)
            statisticsEngine.asyncRefresh()
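The key-parsing step of the coordinator handler can be made concrete in Java. This sketch assumes the key layout `/states/coordinator/tables/{db}.{schema}.{table}/CREATE` shown in the flow above and further assumes that database, schema, and table names contain no dots or slashes. `CoordinatorKeySketch`, `ParsedKey`, and `parse` are hypothetical names, not part of ShardingSphere.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for coordinator keys; assumes identifiers contain no '.' or '/'.
final class CoordinatorKeySketch {

    enum CoordinatorType { CREATE, DROP }

    static final class ParsedKey {
        final String database;
        final String schema;
        final String table;
        final CoordinatorType type;

        ParsedKey(String database, String schema, String table, CoordinatorType type) {
            this.database = database;
            this.schema = schema;
            this.table = table;
            this.type = type;
        }
    }

    // Groups: 1 = database, 2 = schema, 3 = table, 4 = coordinator type.
    private static final Pattern KEY = Pattern.compile(
            "/states/coordinator/tables/([^./]+)\\.([^./]+)\\.([^./]+)/(CREATE|DROP)");

    /** Returns the parsed identifiers, or empty if the key is not a coordinator table key. */
    static Optional<ParsedKey> parse(String eventKey) {
        Matcher matcher = KEY.matcher(eventKey);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        return Optional.of(new ParsedKey(matcher.group(1), matcher.group(2), matcher.group(3),
                CoordinatorType.valueOf(matcher.group(4))));
    }
}
```

Returning an empty `Optional` for unrecognized keys lets the caller skip events it does not own instead of failing on them.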
Key design decisions:
- Event-driven architecture: Nodes do not poll for changes; the repository pushes events to registered watchers.
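- Dual propagation paths: Both per-database table/view paths and global coordinator paths carry change events, which provides redundancy. The two paths also differ in how the table is obtained: the per-database handler reloads it from the repository, while the coordinator handler deserializes it from the event value.
- Idempotent handlers: Handlers use alterTable() (upsert semantics) for both ADDED and UPDATED events, so a duplicated event, or an UPDATED event that arrives before its ADDED, leaves the in-memory metadata in the same state.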
- Asynchronous statistics refresh: After metadata changes, statistics are refreshed asynchronously to avoid blocking the event processing thread.
- Path-based routing: Events are routed to handlers based on the repository path pattern they subscribe to, enabling fine-grained event filtering.
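The idempotence decision can be checked with a small replay experiment. The sketch below follows the per-database handler, which reloads the current definition from the repository on every ADDED or UPDATED event. It is an illustration under that assumption: `IdempotentReplaySketch`, `Event`, and `replay` are hypothetical names, and plain maps stand in for the repository and the in-memory metadata.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical replay of change events with upsert semantics; not ShardingSphere code.
final class IdempotentReplaySketch {

    enum Type { ADDED, UPDATED, DELETED }

    static final class Event {
        final String table;
        final Type type;

        Event(String table, Type type) {
            this.table = table;
            this.type = type;
        }
    }

    /** Applies the events in order and returns the resulting in-memory state. */
    static Map<String, String> replay(Map<String, String> repository, List<Event> events) {
        Map<String, String> state = new TreeMap<>();
        for (Event each : events) {
            if (each.type == Type.DELETED) {
                state.remove(each.table);
            } else {
                // ADDED and UPDATED take the same path: reload, then upsert.
                String definition = repository.get(each.table);
                if (definition != null) {
                    state.put(each.table, definition);
                }
            }
        }
        return state;
    }
}
```

Because ADDED and UPDATED share one code path and the value always comes from the repository, replaying `ADDED, UPDATED` and `UPDATED, ADDED, ADDED` for the same table yields the same map in this model.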