Heuristic:Openai Openai node Retry Backoff Configuration

Knowledge Sources	openai-node SDK source analysis
Domains	Reliability, Networking
Last Updated	2026-02-15 12:00 GMT

Overview

Exponential backoff with jitter: the SDK retries transient failures (408, 409, 429, 5xx) up to 2 times by default, with a pre-jitter delay that starts at 0.5 seconds, doubles on each retry, and is capped at 8 seconds.

Description

The OpenAI SDK implements automatic retry logic with exponential backoff and jitter for all API requests. When a request fails with a transient error (rate limit, timeout, server error), the SDK waits an increasing amount of time before retrying. The backoff formula is `min(0.5 * 2^n, 8.0)` seconds, with a random jitter that reduces the delay by up to 25%. The SDK also respects server-provided `Retry-After` and non-standard `retry-after-ms` headers, capping server-requested waits at 60 seconds.
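
As a worked illustration of the formula above, the following sketch restates it as a standalone function and derives the shortest and longest delay for a given retry number. The names `backoffMillis` and `backoffBoundsMillis`, and the injectable `rand` parameter, are hypothetical and exist only to make the jitter deterministic here; they are not part of the SDK.

```typescript
// Illustrative restatement of min(0.5 * 2^n, 8.0) seconds with up to 25% jitter.
// `rand` is a hypothetical injection point; it defaults to Math.random.
function backoffMillis(numRetries: number, rand: () => number = Math.random): number {
  const initialRetryDelay = 0.5; // seconds
  const maxRetryDelay = 8.0; // seconds

  // Exponential growth, capped at the maximum delay.
  const sleepSeconds = Math.min(initialRetryDelay * Math.pow(2, numRetries), maxRetryDelay);

  // Jitter multiplier between 0.75 and 1.0.
  const jitter = 1 - rand() * 0.25;

  return sleepSeconds * jitter * 1000;
}

// [lower bound, upper bound] of the delay in milliseconds for a retry number.
// Math.random() never returns exactly 1, so the lower bound is a limit, not a reachable value.
function backoffBoundsMillis(numRetries: number): [number, number] {
  return [backoffMillis(numRetries, () => 1), backoffMillis(numRetries, () => 0)];
}

// numRetries 0 -> [375, 500], 1 -> [750, 1000], 2 -> [1500, 2000],
// 3 -> [3000, 4000], 4 and above -> [6000, 8000]
```

With the default of 2 retries, only the first two rows apply, so the 8-second cap is reached only when `maxRetries` is raised to 5 or more.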

Usage

This heuristic applies to every API request made through the SDK. It is especially important when dealing with rate-limited endpoints or during periods of high API load. Understanding these defaults helps when tuning `maxRetries` and `timeout` for specific use cases (e.g., increasing retries for batch processing, reducing timeout for interactive UIs).
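
One way to picture this tuning is as a pair of option profiles. In the sketch below only the option names `maxRetries` and `timeout` and the default values come from this page; the `RetryTuning` type, the profile names, and the non-default numbers are hypothetical choices for illustration.

```typescript
// Hypothetical tuning profiles. Only `maxRetries` and `timeout` are SDK option names.
interface RetryTuning {
  maxRetries: number; // retries after the initial attempt
  timeout: number; // per-attempt timeout in milliseconds
}

// Defaults documented on this page.
const sdkDefaults: RetryTuning = { maxRetries: 2, timeout: 600000 };

// Batch job: tolerate more transient failures and keep the long timeout.
const batchProfile: RetryTuning = { ...sdkDefaults, maxRetries: 5 };

// Interactive UI: fail fast rather than keep a user waiting for minutes.
const interactiveProfile: RetryTuning = { maxRetries: 1, timeout: 15000 };

// Assumed usage (requires the `openai` package, which this sketch does not import):
//   const client = new OpenAI({ ...interactiveProfile });
```

The specific values are not recommendations; the right numbers depend on how long a caller can afford to wait and how costly a failed request is.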

The Insight (Rule of Thumb)

Default Retries: `maxRetries = 2` (total of 3 attempts including the initial request).
Default Timeout: `timeout = 600000` ms (10 minutes per individual request).
Backoff Formula: `min(0.5 * 2^numRetries, 8.0)` seconds, with 0-25% random jitter subtracted.
Retry-After Cap: Server-provided wait headers (`Retry-After`, `retry-after-ms`) are honored only when the requested wait is greater than zero and under 60 seconds; otherwise the calculated backoff is used.
Retried Status Codes: 408 (Request Timeout), 409 (Conflict, which the SDK treats as a lock timeout), 429 (Rate Limit), 5xx (Server Errors).
Custom Header: The SDK proactively supports a non-standard `retry-after-ms` header for millisecond-precision waits.
Trade-off: With 2 retries and a 10-minute timeout, the worst-case wait before failure is ~30 minutes (3 attempts x 10 min each, plus the short sleeps between attempts). Reduce `timeout` for latency-sensitive applications.
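
The worst-case figure in the trade-off can be reproduced with a few lines of arithmetic. This is a rough sketch under two assumptions: every attempt runs to its full timeout, and every sleep uses the un-jittered default backoff (a server-requested wait of just under 60 seconds per retry would push the total higher). The function name `worstCaseMillis` is hypothetical.

```typescript
// Rough upper estimate of the time spent before the final failure surfaces.
// Assumes each attempt hits the full timeout and ignores server-requested waits.
function worstCaseMillis(maxRetries: number, timeoutMs: number): number {
  const attempts = maxRetries + 1; // initial request plus retries

  // Sum of the longest default backoff sleeps between attempts.
  let sleepMs = 0;
  for (let n = 0; n < maxRetries; n++) {
    sleepMs += Math.min(0.5 * Math.pow(2, n), 8.0) * 1000;
  }

  return attempts * timeoutMs + sleepMs;
}

// worstCaseMillis(2, 600000) is 1801500 ms: three 10-minute attempts plus 1.5 s of backoff.
```

The backoff sleeps are negligible next to the default timeout, which is why lowering `timeout` has far more effect on tail latency than lowering `maxRetries`.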

Reasoning

Exponential backoff with jitter is a standard way to avoid thundering-herd problems. The 0.5-second initial delay and 8-second cap balance quick recovery from transient blips against putting extra load on the API server. Shortening each delay by a random amount of up to 25% keeps concurrent clients from retrying in lockstep. The 60-second cap on server-requested waits keeps a misbehaving upstream response from blocking the client for an arbitrarily long time.

The retry decision also respects a custom `x-should-retry` response header, allowing the OpenAI API to explicitly signal whether a request should be retried regardless of status code.
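
The retry decision described here and in the status-code rule above can be sketched as a small predicate. The function name `shouldRetry` and its signature are hypothetical, and the sketch assumes the `x-should-retry` header carries the literal strings `true` or `false`; any other value falls through to the status-code check.

```typescript
// Hedged sketch of the retry decision; not the SDK's actual method.
// Assumes `x-should-retry` is the literal string "true" or "false" when present.
function shouldRetry(status: number, xShouldRetry?: string | null): boolean {
  // An explicit server signal takes precedence over the status code.
  if (xShouldRetry === "true") return true;
  if (xShouldRetry === "false") return false;

  if (status === 408) return true; // request timeout
  if (status === 409) return true; // conflict, treated as a lock timeout
  if (status === 429) return true; // rate limit
  return status >= 500; // server errors
}
```

For example, `shouldRetry(500, "false")` is `false` even though 500 is normally retried, and `shouldRetry(400, "true")` is `true` even though 400 normally is not.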

Code Evidence

Backoff calculation from `src/client.ts:879-892`:

private calculateDefaultRetryTimeoutMillis(retriesRemaining: number, maxRetries: number): number {
    const initialRetryDelay = 0.5;
    const maxRetryDelay = 8.0;

    const numRetries = maxRetries - retriesRemaining;

    // Apply exponential backoff, but not more than the max.
    const sleepSeconds = Math.min(initialRetryDelay * Math.pow(2, numRetries), maxRetryDelay);

    // Apply some jitter, take up to at most 25 percent of the retry time.
    const jitter = 1 - Math.random() * 0.25;

    return sleepSeconds * jitter * 1000;
}

Retry-After header handling from `src/client.ts:848-873`:

// Note the `retry-after-ms` header may not be standard, but is a good idea
// and we'd like proactive support for it.
const retryAfterMillisHeader = responseHeaders?.get('retry-after-ms');
if (retryAfterMillisHeader) {
    const timeoutMs = parseFloat(retryAfterMillisHeader);
    if (!Number.isNaN(timeoutMs)) {
        timeoutMillis = timeoutMs;
    }
}

// If the API asks us to wait a certain amount of time (and it's a reasonable
// amount), just do what it says, but otherwise calculate a default
if (!(timeoutMillis && 0 <= timeoutMillis && timeoutMillis < 60 * 1000)) {
    const maxRetries = options.maxRetries ?? this.maxRetries;
    timeoutMillis = this.calculateDefaultRetryTimeoutMillis(retriesRemaining, maxRetries);
}
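
The excerpt above shows the `retry-after-ms` branch and the 60-second check. A self-contained sketch of the whole resolution step might look like the following. The parsing of the standard `Retry-After` header (a delay in seconds or an HTTP date) follows that header's general semantics and is an assumption here, since the excerpt does not show it; `HeaderLookup`, `serverRequestedMillis`, `resolveRetryDelayMillis`, and the `nowMs` parameter are hypothetical names.

```typescript
// Minimal header accessor so the sketch does not depend on a fetch implementation.
type HeaderLookup = { get(name: string): string | null | undefined };

// Returns the wait requested by the server in milliseconds, if any.
function serverRequestedMillis(headers: HeaderLookup, nowMs: number = Date.now()): number | undefined {
  // Non-standard millisecond header is checked first.
  const retryAfterMs = headers.get("retry-after-ms");
  if (retryAfterMs) {
    const parsed = parseFloat(retryAfterMs);
    if (!Number.isNaN(parsed)) return parsed;
  }

  // Assumed handling of the standard header: seconds, or an HTTP date.
  const retryAfter = headers.get("retry-after");
  if (retryAfter) {
    const seconds = parseFloat(retryAfter);
    if (!Number.isNaN(seconds)) return seconds * 1000;
    const dateMs = Date.parse(retryAfter);
    if (!Number.isNaN(dateMs)) return dateMs - nowMs;
  }

  return undefined;
}

// Uses the server's wait only if it is positive and under 60 seconds.
function resolveRetryDelayMillis(headers: HeaderLookup, fallbackMs: number, nowMs?: number): number {
  const requested = serverRequestedMillis(headers, nowMs);
  if (requested !== undefined && requested > 0 && requested < 60 * 1000) {
    return requested;
  }
  return fallbackMs; // e.g. the calculated exponential backoff
}
```

Passing the fallback in as a plain number keeps the header logic separate from the backoff formula, so each can be checked on its own.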

Default values from `src/client.ts:417,1004`:

this.maxRetries = options.maxRetries ?? 2;
// ...
static DEFAULT_TIMEOUT = 600000; // 10 minutes
