
Principle:Apache ShardingSphere Compute Node Registration

From Leeroopedia


Knowledge Sources
Domains Cluster_Mode, Distributed_Coordination
Last Updated 2026-02-10 00:00 GMT

Overview

Registering compute nodes in the cluster with unique worker IDs enables distributed key generation and cluster membership tracking in a multi-node database system.

Description

Compute Node Registration is the principle of establishing each ShardingSphere compute node as a recognized member of the cluster by:

  1. Generating a unique worker ID for the node via the distributed coordination repository
  2. Persisting the node's online status as ephemeral data in the repository
  3. Publishing the node's state and labels for discovery by other cluster members

This principle addresses two distinct but related concerns:

Worker ID Generation

In distributed systems that use Snowflake or similar algorithms for distributed key generation, each compute node must have a unique worker ID (an integer in the range 0-1023). The ClusterWorkerIdGenerator implements this by:

  • First checking whether the current instance already has a persisted worker ID (e.g., from a previous session) and reusing it if so.
  • Otherwise, scanning the set of all assigned worker IDs across the cluster, selecting the lowest available ID, and reserving it via an atomic operation in the repository.
  • The reservation is two-phase: first select a candidate from the available pool, then attempt to reserve it exclusively. If another node races and claims the same ID, the process retries.

Online Registration

The ClusterComputeNodePersistService.registerOnline() method performs three operations:

  • Persists the compute node data (database name, attributes, version) as an ephemeral node in the repository. The ephemeral nature means the node is automatically removed if the session expires (e.g., due to a crash or network partition).
  • Updates the node's state (e.g., OK) as ephemeral data.
  • Persists the node's labels for identification and routing purposes.

Together, these operations make the compute node visible to other nodes in the cluster and enable the cluster to track node health through the coordination service's session management.

Usage

Apply this principle during cluster-mode initialization, after the metadata contexts are created but before event listeners are registered. Worker ID generation runs during compute node instance context initialization, while online registration runs during the registerOnline() phase of ClusterContextManagerBuilder.

Theoretical Basis

The Compute Node Registration principle addresses several distributed systems challenges:

1. Distributed Unique ID Allocation

Worker IDs must be globally unique across the cluster. The allocation algorithm uses an optimistic claim-and-retry strategy:

FUNCTION generate(props):
    existingId = loadWorkerId(instanceId)
    IF existingId IS PRESENT:
        RETURN existingId

    REPEAT:
        assignedIds = loadAllAssignedWorkerIds()
        PRECONDITION: assignedIds.size <= MAX_WORKER_ID  (otherwise the ID pool is exhausted)
        availableIds = [0..MAX_WORKER_ID] - assignedIds
        candidate = availableIds.pollSmallest()
        reserved = reserveWorkerId(candidate, instanceId)
    UNTIL reserved IS PRESENT

    persistWorkerId(instanceId, reserved)
    RETURN reserved

The retry loop handles the race condition where multiple nodes simultaneously attempt to claim the same worker ID. The atomic reservation via ReservationPersistService ensures only one node succeeds.
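The retry loop above can be sketched with an in-memory map standing in for the coordination repository; `ConcurrentHashMap.putIfAbsent` plays the role of the atomic reservation. All names here are illustrative, not ShardingSphere's actual API:

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class WorkerIdSketch {

    static final int MAX_WORKER_ID = 1023;

    // In-memory stand-ins for the coordination repository (illustrative only).
    static final Map<Integer, String> reservations = new ConcurrentHashMap<>(); // worker ID -> instance ID
    static final Map<String, Integer> persisted = new ConcurrentHashMap<>();    // instance ID -> worker ID

    static int generate(String instanceId) {
        Integer existing = persisted.get(instanceId);
        if (existing != null) {
            return existing; // reuse the ID persisted in a previous session
        }
        while (true) {
            // Scan currently assigned IDs and pick the lowest free one.
            BitSet assigned = new BitSet(MAX_WORKER_ID + 1);
            reservations.keySet().forEach(assigned::set);
            int candidate = assigned.nextClearBit(0);
            if (candidate > MAX_WORKER_ID) {
                throw new IllegalStateException("worker ID pool exhausted");
            }
            // Atomic reservation: exactly one racing node wins putIfAbsent.
            if (reservations.putIfAbsent(candidate, instanceId) == null) {
                persisted.put(instanceId, candidate);
                return candidate;
            }
            // Lost the race; retry against a fresh view of assigned IDs.
        }
    }

    public static void main(String[] args) {
        System.out.println(generate("node-a")); // 0
        System.out.println(generate("node-b")); // 1
        System.out.println(generate("node-a")); // 0 again: persisted, not re-allocated
    }
}
```

In the real system the reservation is an atomic compare-and-set in the repository rather than a local map, but the win-or-retry shape is the same.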

2. Ephemeral Node Registration

Using ephemeral nodes in the coordination service (ZooKeeper ephemeral znodes or etcd leased keys) provides automatic cleanup of stale node registrations:

FUNCTION registerOnline(computeNodeInstance):
    // Ephemeral: auto-deleted when session expires
    persistEphemeral(onlinePath(instance), serializeNodeData(instance))
    persistEphemeral(statusPath(instanceId), instance.state)
    persistEphemeral(labelPath(instanceId), serialize(instance.labels))

If a compute node crashes or loses its connection to the coordination service, the ephemeral nodes are automatically removed after the session timeout, and other nodes can detect the departure.
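The session-bound cleanup can be illustrated with a toy repository that tracks which keys belong to which session; the class names and key paths below are hypothetical stand-ins for the real coordination service:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy repository modelling ephemeral keys: each key is bound to a session
// and disappears when that session expires.
final class EphemeralRepository {

    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Set<String>> keysBySession = new HashMap<>();

    void persistEphemeral(String sessionId, String key, String value) {
        data.put(key, value);
        keysBySession.computeIfAbsent(sessionId, s -> new HashSet<>()).add(key);
    }

    // Simulates a crash or network partition ending the session.
    void expireSession(String sessionId) {
        Set<String> keys = keysBySession.remove(sessionId);
        if (keys != null) {
            keys.forEach(data::remove);
        }
    }

    String get(String key) {
        return data.get(key);
    }
}

public final class RegisterOnlineSketch {
    public static void main(String[] args) {
        EphemeralRepository repo = new EphemeralRepository();
        String session = "session-1";
        // The three registerOnline() writes, all ephemeral (paths are illustrative):
        repo.persistEphemeral(session, "/compute_nodes/online/node-a", "{databaseName, attributes, version}");
        repo.persistEphemeral(session, "/compute_nodes/status/node-a", "OK");
        repo.persistEphemeral(session, "/compute_nodes/labels/node-a", "[reporting]");

        System.out.println(repo.get("/compute_nodes/status/node-a")); // OK

        // Session loss removes all three keys; peers observe the departure.
        repo.expireSession(session);
        System.out.println(repo.get("/compute_nodes/status/node-a")); // null
    }
}
```

ZooKeeper's ephemeral znodes and etcd's leased keys both provide this guarantee natively; the sketch only makes the session-to-key binding explicit.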

3. Cluster Membership Discovery

After registering itself, the node loads all existing online instances from the repository and populates its local cluster instance registry. This provides each node with a view of the current cluster membership:

FUNCTION registerOnline(instanceContext, param, contextManager):
    computeNodeService.registerOnline(instance)
    allInstances = computeNodeService.loadAllInstances()
    instanceContext.clusterInstanceRegistry.addAll(allInstances)
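A minimal sketch of this discovery step, with plain maps standing in for the repository's online-node listing and the local cluster instance registry (all names illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class MembershipSketch {

    // Stand-ins for the repository's online-node listing and the node's
    // local cluster instance registry (both illustrative).
    static final Map<String, String> repositoryOnline = new ConcurrentHashMap<>();
    static final Map<String, String> localRegistry = new ConcurrentHashMap<>();

    static void registerOnline(String instanceId, String nodeData) {
        repositoryOnline.put(instanceId, nodeData); // 1. register self
        localRegistry.putAll(repositoryOnline);     // 2. load all instances, 3. populate registry
    }

    public static void main(String[] args) {
        repositoryOnline.put("node-a", "{state: OK}"); // a peer that was already online
        registerOnline("node-b", "{state: OK}");
        System.out.println(localRegistry.size()); // 2: the new node sees itself and the peer
    }
}
```

Because registration happens before the load, the node's own entry is included in the listing, so the registry ends up containing every online instance, itself included.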

4. Configuration Override Warning

In cluster mode, worker IDs are managed by the cluster coordination mechanism. If a user explicitly configures a worker-id property, the generator logs a warning indicating that the manually configured value is ignored in favor of the system-assigned one.
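A hedged sketch of that override behaviour; the method name and warning wording are illustrative, not ShardingSphere's actual log output:

```java
public final class WorkerIdOverrideSketch {

    // In cluster mode the system-assigned worker ID always wins; a manually
    // configured value only triggers a warning (wording is illustrative).
    static int resolveWorkerId(Integer configured, int clusterAssigned) {
        if (configured != null && configured != clusterAssigned) {
            System.err.println("WARN: worker-id=" + configured
                    + " is configured, but cluster mode assigns worker IDs automatically; using "
                    + clusterAssigned);
        }
        return clusterAssigned;
    }

    public static void main(String[] args) {
        System.out.println(resolveWorkerId(7, 0)); // 0: the cluster-assigned ID wins
    }
}
```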

Related Pages

Implemented By

Uses Heuristic
