Heuristic:Openclaw Openclaw Retry With Exponential Backoff

Knowledge Sources	OpenClaw Production retry patterns
Domains	Reliability, Networking
Last Updated	2026-02-06 12:00 GMT

Overview

A retry strategy for all external API calls and provider interactions: exponential backoff with optional jitter, defaulting to 3 attempts with delays between 300ms and 30s.

Description

OpenClaw implements a centralized retry utility (`retryAsync`) that applies exponential backoff to any asynchronous operation. The default configuration uses 3 attempts, a minimum delay of 300ms, a maximum delay of 30 seconds, and no jitter. The delay starts at the minimum and doubles after each failed attempt (300ms, 600ms, 1200ms, ...), capped at the maximum. An optional `retryAfterMs` callback extracts vendor-specific retry hints from the error (e.g., the Retry-After header on an HTTP 429 response); when it returns a finite number, that value replaces the exponential delay but is never allowed to fall below the minimum delay. Jitter is off by default; when enabled, it scales each delay by a random factor within +/- the configured fraction, which prevents thundering herd effects when many clients retry simultaneously.
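
The behavior described above can be sketched as a small loop. This is a minimal illustration, not the code in `src/infra/retry.ts`: the name `retryAsyncSketch`, the `RetryOptions` shape, the `sleep` helper, the `shouldRetry` signature, and the choice to apply `maxDelayMs` before jitter are all assumptions.

```typescript
// Hypothetical option shape; field names follow the ones mentioned on this page.
interface RetryOptions {
  attempts?: number;
  minDelayMs?: number;
  maxDelayMs?: number;
  jitter?: number;
  shouldRetry?: (err: unknown, attempt: number) => boolean;
  retryAfterMs?: (err: unknown) => number | undefined;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function retryAsyncSketch<T>(
  fn: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const attempts = Math.max(1, options.attempts ?? 3);
  const minDelayMs = options.minDelayMs ?? 300;
  const maxDelayMs = options.maxDelayMs ?? 30_000;
  const jitter = options.jitter ?? 0;
  let lastErr: unknown;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      // Stop on the final attempt or when the caller marks the error permanent.
      const permanent = options.shouldRetry?.(err, attempt) === false;
      if (attempt >= attempts || permanent) break;

      // A finite server hint replaces the exponential delay (floored at minDelayMs).
      const hinted = options.retryAfterMs?.(err);
      const base =
        typeof hinted === "number" && Number.isFinite(hinted)
          ? Math.max(hinted, minDelayMs)
          : minDelayMs * 2 ** (attempt - 1);
      // Assumption: the cap is applied first, then jitter.
      const capped = Math.min(base, maxDelayMs);
      const offset = jitter > 0 ? (Math.random() * 2 - 1) * jitter : 0;
      await sleep(Math.max(0, Math.round(capped * (1 + offset))));
    }
  }
  throw lastErr;
}
```

In this sketch no sleep follows the final failed attempt, so 3 attempts involve at most 2 waits.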

Usage

Apply this heuristic when implementing any external API call (model providers, channel APIs, webhook delivery). The default 3-attempt configuration is suitable for most operations. Increase attempts for critical operations (e.g., message delivery); decrease for latency-sensitive operations (e.g., health checks).
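
One way to reason about "increase" versus "decrease" is to compare how long each configuration spends waiting when every attempt fails. The presets below are purely illustrative: only the `default` row comes from this page, while the `critical` and `latencySensitive` values, the `RETRY_PRESETS` name, and the `totalBackoffMs` helper are assumptions made for the example.

```typescript
type RetryPreset = { attempts: number; minDelayMs: number; maxDelayMs: number };

// Hypothetical presets; tune them per operation.
const RETRY_PRESETS: Record<"default" | "critical" | "latencySensitive", RetryPreset> = {
  default: { attempts: 3, minDelayMs: 300, maxDelayMs: 30_000 },
  critical: { attempts: 5, minDelayMs: 300, maxDelayMs: 30_000 },
  latencySensitive: { attempts: 2, minDelayMs: 100, maxDelayMs: 1_000 },
};

// Total time spent sleeping if every attempt fails, assuming no jitter,
// no Retry-After hint, and no sleep after the final attempt.
function totalBackoffMs(preset: RetryPreset): number {
  let total = 0;
  for (let attempt = 1; attempt < preset.attempts; attempt++) {
    total += Math.min(preset.minDelayMs * 2 ** (attempt - 1), preset.maxDelayMs);
  }
  return total;
}

console.log(totalBackoffMs(RETRY_PRESETS.default)); // 900
console.log(totalBackoffMs(RETRY_PRESETS.critical)); // 4500
console.log(totalBackoffMs(RETRY_PRESETS.latencySensitive)); // 100
```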

The Insight (Rule of Thumb)

Action: Wrap external calls with `retryAsync(fn, { attempts: 3, minDelayMs: 300, maxDelayMs: 30_000 })`.
Value: 3 attempts, 300ms base delay, 30s cap, exponential growth (delay = minDelayMs * 2^(attempt - 1)).
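Trade-off: Retries add latency. With the defaults and no Retry-After hint, backoff contributes about 0.9s of waiting across 3 attempts (300ms + 600ms, assuming no wait follows the final attempt), on top of the duration of the failed calls; a server-supplied Retry-After value can make an individual wait much longer. For real-time operations, reduce maxDelayMs or attempts.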
Jitter: Enable jitter (e.g., `jitter: 0.5`, which randomizes each delay by up to +/-50%) when multiple concurrent clients may retry the same endpoint, to avoid a thundering herd.
Vendor Hints: Use `retryAfterMs` callback to extract and honor HTTP Retry-After headers (e.g., 429 responses).
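
A `retryAfterMs` callback for HTTP errors might look like the sketch below. The `HttpLikeError` shape (a lowercase `retry-after` key in a plain `headers` object) and the function name are assumptions; real provider SDKs expose headers in different ways. Retry-After may carry either a number of seconds or an HTTP date, so the sketch handles both.

```typescript
// Hypothetical error shape: an error object carrying response headers.
interface HttpLikeError {
  status?: number;
  headers?: Record<string, string | undefined>;
}

// Returns the server-requested delay in milliseconds, or undefined if absent/unparseable.
function retryAfterMsFromError(err: unknown, nowMs: number = Date.now()): number | undefined {
  const raw = (err as HttpLikeError | null)?.headers?.["retry-after"];
  if (raw === undefined) return undefined;
  const trimmed = raw.trim();

  // Form 1: delay in whole seconds, e.g. "Retry-After: 2".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;

  // Form 2: an HTTP date, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT".
  const when = Date.parse(trimmed);
  return Number.isNaN(when) ? undefined : Math.max(0, when - nowMs);
}

console.log(retryAfterMsFromError({ status: 429, headers: { "retry-after": "2" } })); // 2000
```

Returning `undefined` when no usable hint is present lets the retry utility fall back to its exponential delay.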

Reasoning

Exponential backoff is the industry-standard approach for transient failures. The 300ms base delay avoids overwhelming services during brief outages while keeping recovery fast. The 30s cap prevents excessively long waits. The configurable `shouldRetry` predicate allows skipping retries for permanent errors (most 4xx status codes) while retrying transient ones (5xx, network errors, and rate-limit responses such as 429). The `retryAfterMs` integration ensures compliance with rate-limiting APIs like Microsoft Teams and Telegram.
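
A `shouldRetry` predicate along these lines separates permanent from transient failures. This is a sketch under assumptions: the `StatusError` shape and the function name are hypothetical, treating 408 as retryable is a choice made for the example, and treating errors without a status as network failures may be too permissive for some callers.

```typescript
// Hypothetical error shape: an error that may carry an HTTP status code.
interface StatusError {
  status?: number;
}

function shouldRetrySketch(err: unknown): boolean {
  const status = (err as StatusError | null)?.status;
  // No HTTP status: assume a network-level failure (timeout, reset) and retry.
  if (typeof status !== "number") return true;
  // Rate limiting and request timeouts are 4xx codes but still transient.
  if (status === 408 || status === 429) return true;
  // Server-side errors are treated as transient.
  if (status >= 500) return true;
  // Remaining 4xx (bad request, unauthorized, not found, ...) will not succeed on retry.
  return false;
}

console.log(shouldRetrySketch({ status: 503 })); // true
console.log(shouldRetrySketch({ status: 404 })); // false
console.log(shouldRetrySketch({ status: 429 })); // true
```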

Code Evidence from `src/infra/retry.ts:25-30`:

const DEFAULT_RETRY_CONFIG = {
  attempts: 3,
  minDelayMs: 300,
  maxDelayMs: 30_000,
  jitter: 0,
};

Jitter application from `src/infra/retry.ts:62-68`:

function applyJitter(delayMs: number, jitter: number): number {
  if (jitter <= 0) {
    return delayMs;
  }
  const offset = (Math.random() * 2 - 1) * jitter;
  return Math.max(0, Math.round(delayMs * (1 + offset)));
}
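
To make the effect of the jitter fraction concrete, the variant below takes the random source as a parameter so the extremes can be checked deterministically. The `applyJitterWith` and `jitterBounds` names and the injectable `random` parameter are additions for illustration only; the arithmetic mirrors the excerpt above.

```typescript
// Same arithmetic as the excerpt, with an injectable random source (hypothetical parameter).
function applyJitterWith(
  delayMs: number,
  jitter: number,
  random: () => number = Math.random,
): number {
  if (jitter <= 0) return delayMs;
  const offset = (random() * 2 - 1) * jitter; // in [-jitter, +jitter)
  return Math.max(0, Math.round(delayMs * (1 + offset)));
}

// Range a jittered delay can fall into.
function jitterBounds(delayMs: number, jitter: number): { min: number; max: number } {
  return {
    min: Math.max(0, Math.round(delayMs * (1 - jitter))),
    max: Math.round(delayMs * (1 + jitter)),
  };
}

const low = applyJitterWith(1000, 0.5, () => 0); // 500
const mid = applyJitterWith(1000, 0.5, () => 0.5); // 1000
console.log(low, mid, jitterBounds(300, 0.5)); // 500 1000 { min: 150, max: 450 }
```

With `jitter: 0.5`, a 300ms delay therefore lands somewhere between 150ms and 450ms, which spreads out clients that failed at the same moment.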

Retry-After integration from `src/infra/retry.ts:115-119`:

const retryAfterMs = options.retryAfterMs?.(err);
const hasRetryAfter = typeof retryAfterMs === "number" && Number.isFinite(retryAfterMs);
const baseDelay = hasRetryAfter
  ? Math.max(retryAfterMs, minDelayMs)
  : minDelayMs * 2 ** (attempt - 1);
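
The delay selection above can be exercised in isolation. The helper below is a sketch, not the source: `computeDelayMs` and `DelayOptions` are hypothetical names, and applying `maxDelayMs` to server hints as well as to the exponential delay is an assumption, since the excerpt does not show where the cap is applied.

```typescript
interface DelayOptions {
  minDelayMs: number;
  maxDelayMs: number;
}

// `attempt` is 1-based: the number of the attempt that just failed.
function computeDelayMs(
  attempt: number,
  { minDelayMs, maxDelayMs }: DelayOptions,
  retryAfterMs?: number,
): number {
  const baseDelay =
    typeof retryAfterMs === "number" && Number.isFinite(retryAfterMs)
      ? Math.max(retryAfterMs, minDelayMs) // server hint, floored at the minimum
      : minDelayMs * 2 ** (attempt - 1); // exponential backoff
  // Assumption: the cap bounds both branches.
  return Math.min(baseDelay, maxDelayMs);
}

const defaults: DelayOptions = { minDelayMs: 300, maxDelayMs: 30_000 };
const schedule = [1, 2, 3, 8].map((attempt) => computeDelayMs(attempt, defaults));
console.log(schedule); // [ 300, 600, 1200, 30000 ]
console.log(computeDelayMs(1, defaults, 5_000)); // 5000 (hint replaces backoff)
console.log(computeDelayMs(1, defaults, 100)); // 300 (hint floored at minDelayMs)
```

A non-finite hint such as `NaN` fails the `Number.isFinite` check and falls through to the exponential branch.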
