
Principle:Apache Shardingsphere Metadata Repository Persistence

From Leeroopedia
Revision as of 17:22, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Apache_Shardingsphere_Metadata_Repository_Persistence.md)


Knowledge Sources
Domains: Metadata_Management, DDL_Processing
Last Updated: 2026-02-10 00:00 GMT

Overview

After metadata is reloaded from the actual database, it must be persisted to a shared cluster repository so that change events propagate to all nodes and the metadata survives process restarts.

Description

The Metadata Repository Persistence principle governs how refreshed metadata transitions from a transient in-memory state to a durable, version-tracked representation in the cluster's governance repository (typically ZooKeeper, etcd, or a similar distributed coordination service).

This persistence layer serves two critical purposes:

  1. Durability: Metadata changes survive node restarts and failures. When a node starts or recovers, it reads the current metadata state from the repository.
  2. Change propagation: Writing to the repository generates change events (e.g., ZooKeeper watches) that notify other cluster nodes to update their in-memory metadata. This is the mechanism by which DDL changes on one node propagate to all nodes in the cluster.
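
As a rough illustration of these two purposes, the following Python sketch models a repository that notifies watchers on every write. It is not ShardingSphere code: InMemoryRepository, Node, and the key "/metadata/db0/t_order" are hypothetical names chosen for the example, and a plain dictionary stands in for ZooKeeper or etcd.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[str, str, str]  # (event type, key, value)


class InMemoryRepository:
    """Hypothetical stand-in for a coordination service such as ZooKeeper or etcd."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: List[Callable[[Event], None]] = []

    def watch(self, callback: Callable[[Event], None]) -> None:
        self._watchers.append(callback)

    def persist(self, key: str, value: str) -> None:
        event_type = "UPDATED" if key in self._data else "ADDED"
        self._data[key] = value
        for callback in self._watchers:
            callback((event_type, key, value))

    def delete(self, key: str) -> None:
        if key in self._data:
            value = self._data.pop(key)
            for callback in self._watchers:
                callback(("DELETED", key, value))

    def snapshot(self) -> Dict[str, str]:
        return dict(self._data)


class Node:
    """Hypothetical cluster node holding an in-memory metadata cache."""

    def __init__(self, repository: InMemoryRepository) -> None:
        # Durability: a starting node reads the current state from the repository.
        self.cache: Dict[str, str] = repository.snapshot()
        # Change propagation: later writes arrive as events.
        repository.watch(self._on_event)

    def _on_event(self, event: Event) -> None:
        event_type, key, value = event
        if event_type == "DELETED":
            self.cache.pop(key, None)
        else:
            self.cache[key] = value


repo = InMemoryRepository()
node_a = Node(repo)
node_b = Node(repo)
repo.persist("/metadata/db0/t_order", "id INT, amount DECIMAL")
late_node = Node(repo)  # simulates a node that starts after the write
```

In this sketch node_a and node_b learn about the table through the event, while late_node picks it up from the snapshot taken at start-up.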

The persistence facade organizes metadata into a hierarchical structure:

  • Database level: Database existence and metadata
  • Schema level: Schema creation, deletion, and renaming
  • Table level: Individual table metadata (columns, indexes, constraints)
  • View level: View definitions
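
One way to picture the hierarchy is as nested key paths in the repository. The Python sketch below assumes a path layout rooted at "/metadata"; that layout and the helper names are illustrative assumptions, not the layout ShardingSphere is confirmed to use.

```python
# Hypothetical path layout; the real repository keys may differ.

def database_path(database: str) -> str:
    return f"/metadata/{database}"


def schema_path(database: str, schema: str) -> str:
    return f"{database_path(database)}/schemas/{schema}"


def table_path(database: str, schema: str, table: str) -> str:
    return f"{schema_path(database, schema)}/tables/{table}"


def view_path(database: str, schema: str, view: str) -> str:
    return f"{schema_path(database, schema)}/views/{view}"


def level_of(path: str) -> str:
    """Classify a path as database, schema, table, or view level."""
    parts = path.strip("/").split("/")
    if len(parts) == 2 and parts[0] == "metadata":
        return "database"
    if len(parts) == 4 and parts[2] == "schemas":
        return "schema"
    if len(parts) == 6 and parts[4] in ("tables", "views"):
        return "table" if parts[4] == "tables" else "view"
    raise ValueError(f"unrecognized metadata path: {path}")
```

Because each level's path is a prefix of the levels beneath it, dropping a schema can be expressed as removing one subtree.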

The facade supports version tracking when schema persistence is enabled (cluster mode). Each metadata write creates a new version, allowing the system to detect and propagate incremental changes. In standalone mode (schema persistence disabled), a simpler, unversioned storage mechanism is used.
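
The difference between the versioned and unversioned paths might look like the following Python sketch. The class names, the create_table_store function, and the choice to number versions from 0 are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Union


class VersionedTableStore:
    """Hypothetical versioned store: every persist appends a new version."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def persist(self, table: str, definition: str) -> int:
        history = self._versions.setdefault(table, [])
        history.append(definition)
        return len(history) - 1  # version numbers start at 0 in this sketch

    def load(self, table: str, version: Optional[int] = None) -> str:
        history = self._versions[table]
        return history[-1] if version is None else history[version]

    def active_version(self, table: str) -> int:
        return len(self._versions[table]) - 1


class PlainTableStore:
    """Hypothetical unversioned store: each persist overwrites the last value."""

    def __init__(self) -> None:
        self._tables: Dict[str, str] = {}

    def persist(self, table: str, definition: str) -> None:
        self._tables[table] = definition

    def load(self, table: str) -> str:
        return self._tables[table]


def create_table_store(versioning_enabled: bool) -> Union[VersionedTableStore, PlainTableStore]:
    # Mirrors the idea of picking the enabled or disabled service once, up front.
    return VersionedTableStore() if versioning_enabled else PlainTableStore()
```

With the versioned store, a consumer that remembers the last version it applied can tell whether a table changed by comparing that number with active_version.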

Key operations include:

  • persistReloadDatabase: Uses GenericSchemaManager to compare the reloaded database against the current state, then persists the added tables and drops the removed ones instead of rewriting the whole database.
  • persistAlteredTables: Rebuilds table metadata from the database and persists only the changed tables.
  • renameSchema: Handles schema rename by persisting all tables/views under the new name and dropping the old schema.
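
The renameSchema step can be sketched as a copy followed by a drop. The Python below works on a nested dictionary rather than a real repository; the Metadata shape and the rename_schema function are hypothetical, and the error handling is an assumption added for the example.

```python
from typing import Dict

# metadata[schema_name] = {"tables": {name: definition}, "views": {name: definition}}
Metadata = Dict[str, Dict[str, Dict[str, str]]]


def rename_schema(metadata: Metadata, old_name: str, new_name: str) -> None:
    """Persist all tables and views under new_name, then drop old_name."""
    if old_name not in metadata:
        raise KeyError(f"schema not found: {old_name}")
    if new_name in metadata:
        raise ValueError(f"schema already exists: {new_name}")
    old = metadata[old_name]
    # Write the new schema first so its content is never absent from the store.
    metadata[new_name] = {
        "tables": dict(old.get("tables", {})),
        "views": dict(old.get("views", {})),
    }
    # Only then remove the old schema.
    del metadata[old_name]
```

Writing before dropping means a watcher sees the new schema appear before the old one disappears, at the cost of both existing briefly.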

Usage

Use this principle when:

  • Understanding how DDL changes on one proxy node become visible to other cluster nodes.
  • Investigating metadata synchronization issues in cluster mode.
  • Implementing new persistence operations for additional metadata types.
  • Configuring the persistence backend (ZooKeeper, etcd) for cluster mode.

Theoretical Basis

The persistence follows a differential update pattern:

function persistReloadDatabase(databaseName, reloadDatabase, currentDatabase):
    // Compare reloaded state against current state
    droppedSchemas = GenericSchemaManager.getToBeAlteredSchemasWithTablesDropped(
        reloadDatabase, currentDatabase)
    addedSchemas = GenericSchemaManager.getToBeAlteredSchemasWithTablesAdded(
        reloadDatabase, currentDatabase)

    // Persist additions (triggers ADDED events for watchers)
    for each schema in addedSchemas:
        tableService.persist(databaseName, schema.name, schema.allTables)

    // Persist removals (triggers DELETED events for watchers)
    for each schema in droppedSchemas:
        tableService.drop(databaseName, schema.name, schema.allTables)

function persistAlteredTables(databaseName, reloadMetaDataContexts, needReloadTables):
    database = reloadMetaDataContexts.getDatabase(databaseName)
    material = buildSchemaBuilderMaterial(database)

    // Rebuild from database
    schemas = GenericSchemaBuilder.build(needReloadTables, database.protocolType, material)

    // Persist only changed tables, collecting them per schema for the caller
    result = empty map
    for each (schemaName, schema) in schemas:
        addedTables = GenericSchemaManager.getToBeAddedTables(schema, database.getSchema(schemaName))
        tableService.persist(databaseName, schemaName, addedTables)
        result[schemaName] = addedTables

    return result
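
To make the differential pattern concrete, here is a small runnable Python sketch of the same idea on plain dictionaries. The function names echo the pseudocode but are not the GenericSchemaManager API, and treating a table whose definition changed as "added" is an assumption of this sketch.

```python
from typing import Dict

Schema = Dict[str, str]       # table name -> definition
Database = Dict[str, Schema]  # schema name -> schema


def tables_added(reloaded: Database, current: Database) -> Database:
    """Tables that are new or whose definition differs, grouped by schema."""
    result: Database = {}
    for schema_name, schema in reloaded.items():
        current_schema = current.get(schema_name, {})
        added = {name: d for name, d in schema.items() if current_schema.get(name) != d}
        if added:
            result[schema_name] = added
    return result


def tables_dropped(reloaded: Database, current: Database) -> Database:
    """Tables present in the current state but missing after the reload."""
    result: Database = {}
    for schema_name, schema in current.items():
        reloaded_schema = reloaded.get(schema_name, {})
        dropped = {name: d for name, d in schema.items() if name not in reloaded_schema}
        if dropped:
            result[schema_name] = dropped
    return result


def persist_reload_database(store: Database, reloaded: Database, current: Database) -> None:
    """Apply only the differences to the store (a dict standing in for the repository)."""
    for schema_name, tables in tables_added(reloaded, current).items():
        store.setdefault(schema_name, {}).update(tables)
    for schema_name, tables in tables_dropped(reloaded, current).items():
        for name in tables:
            store.get(schema_name, {}).pop(name, None)
```

For example, if the current state holds t_order and t_old and the reload finds a changed t_order plus a new t_new, only t_order and t_new are written and only t_old is removed; unchanged tables generate no writes and therefore no events.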

The persistence backend is abstracted through the PersistRepository SPI:

Repository hierarchy:
  PersistRepository (SPI interface)
    +-- ZooKeeperRepository (cluster mode)
    +-- EtcdRepository (cluster mode)
    +-- FileRepository (standalone mode)

Persistence service hierarchy:
  DatabaseMetaDataPersistFacade
    +-- DatabaseMetaDataPersistService (database-level operations)
    +-- SchemaMetaDataPersistService (schema-level operations)
    +-- TableMetaDataPersistService (table-level operations)
    |     +-- TableMetaDataPersistEnabledService (versioned, cluster mode)
    |     +-- TableMetaDataPersistDisabledService (unversioned, standalone mode)
    +-- ViewMetaDataPersistService (view-level operations)

Key design decisions:

  • Differential persistence: Only changed tables are written, minimizing I/O to the coordination service.
  • Version tracking: Each write in cluster mode increments a version counter, enabling precise change detection.
  • Event-driven propagation: Repository writes generate watch events that other nodes consume to update their in-memory state.
  • Mode-aware service selection: The facade selects enabled or disabled table persistence based on whether cluster mode is active.
