Principle: Apache ShardingSphere Compute Node Registration
| Knowledge Sources | |
|---|---|
| Domains | Cluster_Mode, Distributed_Coordination |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Registering compute nodes in the cluster with unique worker IDs enables distributed key generation and cluster membership tracking in a multi-node database system.
Description
Compute Node Registration is the principle of establishing each ShardingSphere compute node as a recognized member of the cluster by:
- Generating a unique worker ID for the node via the distributed coordination repository
- Persisting the node's online status as ephemeral data in the repository
- Publishing the node's state and labels for discovery by other cluster members
This principle addresses two distinct but related concerns:
Worker ID Generation
In distributed systems that use Snowflake or similar algorithms for distributed key generation, each compute node must have a unique worker ID (an integer in the range 0-1023). The ClusterWorkerIdGenerator implements this by:
- First checking if the current instance already has a persisted worker ID (e.g., from a previous session).
- If not, scanning the set of all assigned worker IDs across the cluster, selecting the lowest available ID, and reserving it via an atomic operation in the repository.
- The reservation uses a two-phase approach: first select a candidate from the available pool, then attempt to exclusively reserve it. If another node races and claims the same ID, the process retries.
Online Registration
The ClusterComputeNodePersistService.registerOnline() method performs three operations:
- Persists the compute node data (database name, attributes, version) as an ephemeral node in the repository. The ephemeral nature means the node is automatically removed if the session expires (e.g., due to a crash or network partition).
- Updates the node's state (e.g., OK) as ephemeral data.
- Persists the node's labels for identification and routing purposes.
Together, these operations make the compute node visible to other nodes in the cluster and enable the cluster to track node health through the coordination service's session management.
Usage
Use this principle during cluster-mode initialization, after the metadata contexts are created but before event listeners are registered. Worker ID generation is invoked during compute node instance context initialization, while online registration is invoked during the registerOnline() phase of ClusterContextManagerBuilder.
Theoretical Basis
The Compute Node Registration principle addresses several distributed systems challenges:
1. Distributed Unique ID Allocation
Worker IDs must be globally unique across the cluster. The allocation algorithm uses an optimistic strategy with retry (read the assigned set, pick a candidate, then attempt an atomic reservation):
FUNCTION generate(props):
    existingId = loadWorkerId(instanceId)
    IF existingId IS PRESENT:
        RETURN existingId
    REPEAT:
        assignedIds = loadAllAssignedWorkerIds()
        PRECONDITION: assignedIds.size <= MAX_WORKER_ID + 1
        availableIds = [0..MAX_WORKER_ID] - assignedIds
        candidate = availableIds.pollSmallest()
        reserved = reserveWorkerId(candidate, instanceId)
    UNTIL reserved IS PRESENT
    persistWorkerId(instanceId, reserved)
    RETURN reserved
The retry loop handles the race condition where multiple nodes simultaneously attempt to claim the same worker ID. The atomic reservation via ReservationPersistService ensures only one node succeeds.
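The allocation loop above can be sketched as an executable simulation. The in-memory repository below stands in for the coordination repository and ReservationPersistService; its "create if absent" reserve models the atomic reservation. All names here are illustrative, not ShardingSphere's actual API.

```python
MAX_WORKER_ID = 1023  # Snowflake worker IDs occupy 10 bits: 0..1023

class InMemoryRepository:
    """Illustrative stand-in for the distributed coordination repository."""
    def __init__(self):
        self.assigned = {}  # worker_id -> instance_id

    def load_all_assigned_worker_ids(self):
        return set(self.assigned)

    def reserve_worker_id(self, candidate, instance_id):
        # Atomic "create if absent": only one node can win a given ID.
        if candidate in self.assigned:
            return None  # another node raced us to this ID
        self.assigned[candidate] = instance_id
        return candidate

def generate_worker_id(repo, instance_id, persisted=None):
    if persisted is not None:  # reuse the ID from a previous session
        return persisted
    while True:
        assigned = repo.load_all_assigned_worker_ids()
        assert len(assigned) <= MAX_WORKER_ID + 1
        available = set(range(MAX_WORKER_ID + 1)) - assigned
        candidate = min(available)  # lowest available ID
        reserved = repo.reserve_worker_id(candidate, instance_id)
        if reserved is not None:  # lost the race -> loop and retry
            return reserved

repo = InMemoryRepository()
ids = [generate_worker_id(repo, f"node-{i}") for i in range(3)]
print(ids)  # [0, 1, 2]
```

Three nodes joining an empty cluster receive the three lowest IDs; a node that loses a reservation race simply rescans and picks the next lowest available ID.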
2. Ephemeral Node Registration
Using ephemeral nodes in the coordination service (ZooKeeper ephemeral znodes or etcd leased keys) provides automatic cleanup of stale node registrations:
FUNCTION registerOnline(instance):
    // Ephemeral: auto-deleted when session expires
    persistEphemeral(onlinePath(instance), serializeNodeData(instance))
    persistEphemeral(statusPath(instanceId), instance.state)
    persistEphemeral(labelPath(instanceId), serialize(instance.labels))
If a compute node crashes or loses its connection to the coordination service, the ephemeral nodes are automatically removed after the session timeout, and other nodes can detect the departure.
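This cleanup behavior can be modeled with a toy repository that ties each key to a client session, so that expiring the session removes everything the session wrote. The paths and names are illustrative only; real deployments rely on ZooKeeper ephemeral znodes or etcd leases for the same effect.

```python
class EphemeralRepository:
    """Toy model of ephemeral keys bound to a client session."""
    def __init__(self):
        self.data = {}  # path -> (session_id, value)

    def persist_ephemeral(self, session_id, path, value):
        self.data[path] = (session_id, value)

    def expire_session(self, session_id):
        # The coordination service drops every ephemeral key of a dead session.
        self.data = {p: (s, v) for p, (s, v) in self.data.items() if s != session_id}

    def children(self, prefix):
        return sorted(p for p in self.data if p.startswith(prefix))

repo = EphemeralRepository()
repo.persist_ephemeral("session-a", "/online/node-a", "{...}")
repo.persist_ephemeral("session-b", "/online/node-b", "{...}")
print(repo.children("/online"))  # ['/online/node-a', '/online/node-b']

repo.expire_session("session-a")  # node-a crashed; session timed out
print(repo.children("/online"))  # ['/online/node-b']
```

Other nodes watching the registration path observe the deletion and treat it as the departure of node-a; no manual deregistration is needed.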
3. Cluster Membership Discovery
After registering itself, the node loads all existing online instances from the repository and populates its local cluster instance registry. This provides each node with a view of the current cluster membership:
FUNCTION registerOnline(instanceContext, param, contextManager):
    computeNodeService.registerOnline(instance)
    allInstances = computeNodeService.loadAllInstances()
    instanceContext.clusterInstanceRegistry.addAll(allInstances)
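The register-then-load sequence can be sketched as follows; ComputeNodeService here is a hypothetical stand-in for ClusterComputeNodePersistService, backed by a shared dict rather than a real repository.

```python
class ComputeNodeService:
    """Illustrative stand-in for ClusterComputeNodePersistService."""
    def __init__(self, repo):
        self.repo = repo  # shared store: instance_id -> node data

    def register_online(self, instance_id, data):
        self.repo[instance_id] = data

    def load_all_instances(self):
        return dict(self.repo)

shared_repo = {}
service = ComputeNodeService(shared_repo)

# Two nodes come online; each registers itself first.
service.register_online("node-a", {"state": "OK", "labels": []})
service.register_online("node-b", {"state": "OK", "labels": ["analytics"]})

# After registering, a node loads the full membership into its local registry.
local_registry = {}
local_registry.update(service.load_all_instances())
print(sorted(local_registry))  # ['node-a', 'node-b']
```

Registering before loading ensures the node's own entry appears in its view, and loading after registering ensures it also sees every node that joined earlier.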
4. Configuration Override Warning
In cluster mode, worker IDs are managed by the cluster coordination mechanism. If a user explicitly configures a worker-id property, the generator logs a warning indicating that the manually configured value is ignored in favor of the system-assigned one.
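A minimal sketch of this precedence rule, assuming a hypothetical generate() helper and a generic warning message (not ShardingSphere's actual log text): the cluster-assigned ID always wins, and a user-supplied worker-id only triggers a warning.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("ClusterWorkerIdGenerator")

def generate(props, cluster_assigned_id):
    # In cluster mode the repository-assigned ID always wins; a manually
    # configured worker-id property is acknowledged only with a warning.
    if "worker-id" in props:
        log.warning("Ignoring configured worker-id=%s; using cluster-assigned worker ID %s",
                    props["worker-id"], cluster_assigned_id)
    return cluster_assigned_id

print(generate({"worker-id": 7}, cluster_assigned_id=42))  # 42
```

In standalone mode, by contrast, an explicitly configured worker-id would be honored; the warning only applies when the cluster coordination mechanism owns ID assignment.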