Implementation:Apache Dolphinscheduler RpcMethodRetryStrategy Pattern
| Knowledge Sources | |
|---|---|
| Domains | Distributed_Systems, Fault_Tolerance |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Concrete tool for configuring RPC retry behavior using @RpcMethodRetryStrategy annotation and NettyRemotingClient's channel recovery mechanisms.
Description
@RpcMethodRetryStrategy is an annotation embedded within @RpcMethod that configures retry behavior for individual RPC methods. NettyRemotingClient.sendSync() implements the retry loop, catching RemoteException and RemoteTimeoutException and retrying according to the strategy. On channel failure, onChannelInactive() removes the dead channel from cache, and the next retry attempt creates a fresh channel via getOrCreateChannel().
Usage
Add @RpcMethod(retry = @RpcMethodRetryStrategy(...)) to configure retry behavior on specific RPC methods. Channel recovery is automatic and requires no configuration.
Code Reference
Source Location
- Repository: dolphinscheduler
- File: dolphinscheduler-extract/dolphinscheduler-extract-base/src/main/java/org/apache/dolphinscheduler/extract/base/client/NettyRemotingClient.java (L115-158)
- File: dolphinscheduler-extract/dolphinscheduler-extract-base/src/main/java/org/apache/dolphinscheduler/extract/base/RpcMethod.java (L31-33)
Signature
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface RpcMethod {
long timeout() default -1;
RpcMethodRetryStrategy retry() default @RpcMethodRetryStrategy;
}
// Channel recovery methods in NettyRemotingClient
public Channel getOrCreateChannel(Host host); // reconnects if channel inactive
public void onChannelInactive(Host host); // removes dead channel from cache
// Metrics
ClientSyncDurationMetrics.record(duration);
ClientSyncExceptionMetrics.record(exception);
Import
import org.apache.dolphinscheduler.extract.base.RpcMethod;
import org.apache.dolphinscheduler.extract.base.RpcMethodRetryStrategy;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| retry annotation | @RpcMethodRetryStrategy | No | Retry configuration (uses defaults if not specified) |
| Failed RPC attempt | RemoteException/RemoteTimeoutException | Yes | Trigger for retry logic |
Outputs
| Name | Type | Description |
|---|---|---|
| Retried response | IRpcResponse | Successful response after retry |
| RemoteException | Exception | Final failure after all retries exhausted |
| Metrics | Counters/Histograms | Duration and exception metrics for observability |
Usage Examples
Configuring Retry on RPC Method
@RpcService
public interface IWorkflowControlClient {
@RpcMethod(timeout = 10000, retry = @RpcMethodRetryStrategy)
WorkflowManualTriggerResponse manualTriggerWorkflow(
WorkflowManualTriggerRequest request);
@RpcMethod // default timeout and retry
WorkflowInstanceStopResponse stopWorkflowInstance(
WorkflowInstanceStopRequest request);
}
Channel Recovery Behavior
// When a channel becomes inactive (server crash, network issue):
// 1. Netty fires channelInactive event
// 2. NettyRemotingClient.onChannelInactive(host) removes channel from cache
// 3. Next RPC call to same host triggers getOrCreateChannel(host)
// 4. createChannel(host) establishes new connection
// 5. RPC call proceeds on fresh channel