Principle:Apache ShardingSphere Metadata Repository Persistence
| Knowledge Sources | |
|---|---|
| Domains | Metadata_Management, DDL_Processing |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
After metadata is reloaded from the actual database, it must be persisted to a shared cluster repository so that change events propagate to all nodes and the metadata survives process restarts.
Description
The Metadata Repository Persistence principle governs how refreshed metadata transitions from a transient in-memory state to a durable, version-tracked representation in the cluster's governance repository (typically ZooKeeper, etcd, or a similar distributed coordination service).
This persistence layer serves two critical purposes:
- Durability: Metadata changes survive node restarts and failures. When a node starts or recovers, it reads the current metadata state from the repository.
- Change propagation: Writing to the repository generates change events (e.g., ZooKeeper watches) that notify other cluster nodes to update their in-memory metadata. This is the mechanism by which DDL changes on one node propagate to all nodes in the cluster.
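The two purposes above can be illustrated with a minimal in-memory sketch. `WatchableRepositorySketch`, its method names, and the event type strings are hypothetical and exist only for illustration; they are not ShardingSphere or ZooKeeper APIs. The sketch shows one write doing both jobs: the value is stored, and every registered watcher is notified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical in-memory stand-in for a governance repository (not a ShardingSphere class).
final class WatchableRepositorySketch {

    private final Map<String, String> store = new ConcurrentHashMap<>();

    private final List<BiConsumer<String, String>> watchers = new ArrayList<>();

    // Register a callback that receives (eventType, key) for every change.
    void watch(BiConsumer<String, String> watcher) {
        watchers.add(watcher);
    }

    // Store the value, then notify watchers with ADDED for a new key or UPDATED otherwise.
    void persist(String key, String value) {
        String eventType = store.put(key, value) == null ? "ADDED" : "UPDATED";
        watchers.forEach(each -> each.accept(eventType, key));
    }

    // Remove the key and notify watchers only if something was actually removed.
    void delete(String key) {
        if (store.remove(key) != null) {
            watchers.forEach(each -> each.accept("DELETED", key));
        }
    }

    // What a starting or recovering node would read back.
    String query(String key) {
        return store.get(key);
    }
}
```

In a real cluster the watchers live in other processes and the store is the coordination service; the sketch only captures the ordering of "write first, then notify".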
The persistence facade organizes metadata into a hierarchical structure:
- Database level: Database existence and metadata
- Schema level: Schema creation, deletion, and renaming
- Table level: Individual table metadata (columns, indexes, constraints)
- View level: View definitions
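One way to picture the hierarchy is as nested repository keys. The layout below (`/metadata/<database>/schemas/<schema>/tables/<table>`) and the `MetaDataPathSketch` helper are assumptions made for illustration; the actual key format used by the repository may differ.

```java
// Hypothetical key layout for illustration; real repository paths may differ.
final class MetaDataPathSketch {

    // Database level.
    static String database(String databaseName) {
        return "/metadata/" + databaseName;
    }

    // Schema level, nested under its database.
    static String schema(String databaseName, String schemaName) {
        return database(databaseName) + "/schemas/" + schemaName;
    }

    // Table level, nested under its schema.
    static String table(String databaseName, String schemaName, String tableName) {
        return schema(databaseName, schemaName) + "/tables/" + tableName;
    }

    // View level, a sibling of the tables node under the same schema.
    static String view(String databaseName, String schemaName, String viewName) {
        return schema(databaseName, schemaName) + "/views/" + viewName;
    }
}
```

Because each level is a prefix of the level below it, dropping a schema or database corresponds to removing a whole subtree of keys.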
The facade supports version tracking when schema persistence is enabled (cluster mode). Each metadata write creates a new version, allowing the system to detect and propagate incremental changes. In standalone mode (persistence disabled), a simpler storage mechanism is used without versioning.
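A minimal sketch of the versioning idea, assuming that versions are consecutive integers starting at 0 and that the newest version is always the active one. `VersionedNodeSketch` is a hypothetical class; the source does not describe how versions are numbered or stored.

```java
import java.util.TreeMap;

// Hypothetical versioned metadata node: every write is kept under a new version number.
final class VersionedNodeSketch {

    private final TreeMap<Integer, String> versions = new TreeMap<>();

    private int activeVersion = -1;

    // Store the content under the next version number and mark that version active.
    int persist(String content) {
        int next = versions.isEmpty() ? 0 : versions.lastKey() + 1;
        versions.put(next, content);
        activeVersion = next;
        return next;
    }

    // Returns -1 until the first write.
    int getActiveVersion() {
        return activeVersion;
    }

    // Returns null until the first write.
    String getActiveContent() {
        return versions.get(activeVersion);
    }
}
```

A watcher that remembers the last version it applied can compare it with `getActiveVersion()` to decide whether it has missed a change.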
Key operations include:
- persistReloadDatabase: Compares the reloaded database schemas against the current state, then uses GenericSchemaManager to persist only the additions and removals.
- persistAlteredTables: Rebuilds table metadata from the database and persists only the changed tables.
- renameSchema: Handles schema rename by persisting all tables/views under the new name and dropping the old schema.
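The rename operation is the least obvious of the three, so here is a rough sketch of its copy-then-drop shape. It uses a plain `Map` as the store and an assumed `/metadata/<database>/schemas/<schema>/...` key layout; `RenameSchemaSketch` is hypothetical and not the facade's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rename a schema by copying its entries to a new prefix, then dropping the old ones.
final class RenameSchemaSketch {

    static void renameSchema(Map<String, String> store, String databaseName, String oldName, String newName) {
        if (oldName.equals(newName)) {
            return;
        }
        String oldPrefix = "/metadata/" + databaseName + "/schemas/" + oldName + "/";
        String newPrefix = "/metadata/" + databaseName + "/schemas/" + newName + "/";
        // Collect keys first so the map is not modified while it is being iterated.
        List<String> oldKeys = new ArrayList<>();
        for (String each : store.keySet()) {
            if (each.startsWith(oldPrefix)) {
                oldKeys.add(each);
            }
        }
        // Persist every table and view under the new schema name.
        for (String each : oldKeys) {
            store.put(newPrefix + each.substring(oldPrefix.length()), store.get(each));
        }
        // Drop the old schema's entries only after the copies exist.
        for (String each : oldKeys) {
            store.remove(each);
        }
    }
}
```

Writing the new entries before removing the old ones means a watcher never observes a moment in which the tables exist under neither name.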
Usage
Use this principle when:
- Understanding how DDL changes on one proxy node become visible to other cluster nodes.
- Investigating metadata synchronization issues in cluster mode.
- Implementing new persistence operations for additional metadata types.
- Configuring the persistence backend (ZooKeeper, etcd) for governance mode.
Theoretical Basis
The persistence follows a differential update pattern:
    function persistReloadDatabase(databaseName, reloadDatabase, currentDatabase):
        // Compare reloaded state against current state
        droppedSchemas = GenericSchemaManager.getToBeAlteredSchemasWithTablesDropped(
            reloadDatabase, currentDatabase)
        addedSchemas = GenericSchemaManager.getToBeAlteredSchemasWithTablesAdded(
            reloadDatabase, currentDatabase)
        // Persist additions (triggers ADDED events for watchers)
        for each schema in addedSchemas:
            tableService.persist(databaseName, schema.name, schema.allTables)
        // Persist removals (triggers DELETED events for watchers)
        for each schema in droppedSchemas:
            tableService.drop(databaseName, schema.name, schema.allTables)

    function persistAlteredTables(databaseName, reloadMetaDataContexts, needReloadTables):
        database = reloadMetaDataContexts.getDatabase(databaseName)
        material = buildSchemaBuilderMaterial(database)
        // Rebuild from database
        schemas = GenericSchemaBuilder.build(needReloadTables, database.protocolType, material)
        // Collect the persisted tables per schema so they can be returned to the caller
        result = empty map
        // Persist only changed tables
        for each (schemaName, schema) in schemas:
            addedTables = GenericSchemaManager.getToBeAddedTables(schema, database.getSchema(schemaName))
            tableService.persist(databaseName, schemaName, addedTables)
            result[schemaName] = addedTables
        return result
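The comparison step in the pseudocode above can be made concrete with a small, self-contained sketch. It models each table as a name mapped to a definition string and treats a table as "to be added" when it is new or its definition differs. That comparison rule and the `SchemaDiffSketch` class are assumptions for illustration; the source does not show how GenericSchemaManager compares tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical diff helper: tables are modelled as name -> definition strings.
final class SchemaDiffSketch {

    // Tables that are new in the reloaded state or whose definition has changed.
    static Map<String, String> tablesToAdd(Map<String, String> reloaded, Map<String, String> current) {
        Map<String, String> result = new HashMap<>();
        reloaded.forEach((name, definition) -> {
            if (!definition.equals(current.get(name))) {
                result.put(name, definition);
            }
        });
        return result;
    }

    // Tables that exist in the current state but no longer exist after the reload.
    static Set<String> tablesToDrop(Map<String, String> reloaded, Map<String, String> current) {
        Set<String> result = new TreeSet<>(current.keySet());
        result.removeAll(reloaded.keySet());
        return result;
    }
}
```

Only the entries returned by these two methods would be written to or removed from the repository, so unchanged tables generate no repository traffic and no watch events.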
The persistence backend is abstracted through the PersistRepository SPI:
Repository hierarchy:

    PersistRepository (SPI interface)
    +-- ZooKeeperRepository (cluster mode)
    +-- EtcdRepository (cluster mode)
    +-- FileRepository (standalone mode)

Persistence service hierarchy:

    DatabaseMetaDataPersistFacade
    +-- DatabaseMetaDataPersistService (database-level operations)
    +-- SchemaMetaDataPersistService (schema-level operations)
    +-- TableMetaDataPersistService (table-level operations)
    |   +-- TableMetaDataPersistEnabledService (versioned, cluster mode)
    |   +-- TableMetaDataPersistDisabledService (unversioned, standalone mode)
    +-- ViewMetaDataPersistService (view-level operations)
Key design decisions:
- Differential persistence: Only changed tables are written, minimizing I/O to the coordination service.
- Version tracking: Each write in cluster mode increments a version counter, enabling precise change detection.
- Event-driven propagation: Repository writes generate watch events that other nodes consume to update their in-memory state.
- Mode-aware service selection: The facade selects enabled or disabled table persistence based on whether cluster mode is active.
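The last design decision, mode-aware service selection, can be sketched as a facade that picks one of two implementations behind a common interface at construction time. All names below (`TablePersistSketch`, `VersionedTablePersistSketch`, `PlainTablePersistSketch`, `MetaDataFacadeSketch`) are hypothetical, and the boolean constructor flag is an assumption about how the mode is passed in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical common interface for table persistence.
interface TablePersistSketch {

    void persist(String tableName, String definition);

    String load(String tableName);
}

// Keeps every written definition, standing in for the versioned (cluster mode) service.
final class VersionedTablePersistSketch implements TablePersistSketch {

    private final Map<String, List<String>> history = new HashMap<>();

    @Override
    public void persist(String tableName, String definition) {
        history.computeIfAbsent(tableName, key -> new ArrayList<>()).add(definition);
    }

    @Override
    public String load(String tableName) {
        List<String> versions = history.get(tableName);
        return versions == null ? null : versions.get(versions.size() - 1);
    }

    int versionCount(String tableName) {
        return history.getOrDefault(tableName, List.of()).size();
    }
}

// Overwrites in place, standing in for the unversioned (standalone mode) service.
final class PlainTablePersistSketch implements TablePersistSketch {

    private final Map<String, String> store = new HashMap<>();

    @Override
    public void persist(String tableName, String definition) {
        store.put(tableName, definition);
    }

    @Override
    public String load(String tableName) {
        return store.get(tableName);
    }
}

// The facade decides once, at construction, which implementation callers will use.
final class MetaDataFacadeSketch {

    private final TablePersistSketch tableService;

    MetaDataFacadeSketch(boolean persistSchemasEnabled) {
        tableService = persistSchemasEnabled ? new VersionedTablePersistSketch() : new PlainTablePersistSketch();
    }

    TablePersistSketch getTableService() {
        return tableService;
    }
}
```

Callers depend only on the interface, so code that persists reloaded tables does not need to branch on the running mode.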