Heuristic:Openai Openai node Retry Backoff Configuration
| Knowledge Sources | |
|---|---|
| Domains | Reliability, Networking |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The SDK retries transient failures (408, 409, 429, 5xx) up to 2 times by default, using exponential backoff with jitter: the base delay starts at 0.5 seconds, doubles on each retry, and is capped at 8 seconds.
Description
The OpenAI SDK implements automatic retry logic with exponential backoff and jitter for all API requests. When a request fails with a transient error (rate limit, timeout, server error), the SDK waits an increasing amount of time before retrying. The base delay is `min(0.5 * 2^n, 8.0)` seconds, where `n` is the number of retries already made, and random jitter then reduces it by up to 25%. The SDK also honors the server-provided `Retry-After` header and the non-standard `retry-after-ms` header, but only when the requested wait is under 60 seconds; otherwise it falls back to the computed backoff.
Usage
This heuristic applies to every API request made through the SDK. It is especially important when dealing with rate-limited endpoints or during periods of high API load. Understanding these defaults helps when tuning `maxRetries` and `timeout` for specific use cases (e.g., increasing retries for batch processing, reducing timeout for interactive UIs).
The Insight (Rule of Thumb)
- Default Retries: `maxRetries = 2` (total of 3 attempts including the initial request).
- Default Timeout: `timeout = 600000` ms (10 minutes per individual request).
- Backoff Formula: `min(0.5 * 2^numRetries, 8.0)` seconds, with 0-25% random jitter subtracted.
- Retry-After Cap: Server-provided `Retry-After` headers are respected only if the requested wait is < 60 seconds; otherwise the calculated backoff is used.
- Retried Status Codes: 408 (Request Timeout), 409 (Conflict, treated as a lock timeout), 429 (Rate Limit), 5xx (Server Errors).
- Custom Header: The SDK proactively supports a non-standard `retry-after-ms` header for millisecond-precision waits.
- Trade-off: With 2 retries and a 10-minute timeout, the worst-case wait before failure is ~30 minutes (3 attempts × 10 minutes each, plus about 1.5 seconds of default backoff). Reduce `timeout` for latency-sensitive applications.
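The numbers in the list above can be checked with a small sketch. The function names (`baseDelayMs`, `delayBoundsMs`, `worstCaseMs`) are hypothetical helpers, not part of the SDK, and the worst-case figure assumes every attempt runs to the full timeout and no `Retry-After` header overrides the computed backoff.

```typescript
// Constants mirror the values quoted in the rule of thumb.
const INITIAL_DELAY_S = 0.5;
const MAX_DELAY_S = 8.0;
const MAX_JITTER_FRACTION = 0.25;

/** Base (pre-jitter) delay in ms before retry number `numRetries` (0-based). */
function baseDelayMs(numRetries: number): number {
  return Math.min(INITIAL_DELAY_S * Math.pow(2, numRetries), MAX_DELAY_S) * 1000;
}

/** Range of the jittered delay: jitter removes up to 25%, so min is 75% of max. */
function delayBoundsMs(numRetries: number): { min: number; max: number } {
  const max = baseDelayMs(numRetries);
  return { min: max * (1 - MAX_JITTER_FRACTION), max };
}

/** Upper bound on total wait if every attempt times out (assumption: no Retry-After). */
function worstCaseMs(maxRetries: number, timeoutMs: number): number {
  let total = (maxRetries + 1) * timeoutMs;
  for (let n = 0; n < maxRetries; n++) total += baseDelayMs(n);
  return total;
}
```

With the defaults, `worstCaseMs(2, 600000)` gives 1,801,500 ms (about 30 minutes), while lowering the timeout to 10 seconds, `worstCaseMs(2, 10000)`, gives 31,500 ms. This suggests that `timeout`, not the backoff, dominates the worst case.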
Reasoning
Exponential backoff with jitter is the standard approach for avoiding thundering herd problems. The 0.5s initial delay and 8s cap provide a balance between quick recovery from transient blips and avoiding excessive load on the API server. The 25% jitter keeps concurrent clients from retrying in lockstep. Ignoring server-requested waits of 60 seconds or more prevents a misbehaving upstream response from blocking the client indefinitely.
The retry decision also respects a custom `x-should-retry` response header, allowing the OpenAI API to explicitly signal whether a request should be retried regardless of status code.
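As a rough illustration of how the header and the status codes might combine, here is a minimal sketch. The helper `shouldRetry` is hypothetical, and the assumption that `x-should-retry` carries the literal strings `true` or `false` is not confirmed by the text above.

```typescript
/**
 * Hypothetical helper: decide whether a response should be retried.
 * `shouldRetryHeader` is the raw `x-should-retry` value, or null if absent.
 * Assumption: the header carries the literal strings "true" or "false".
 */
function shouldRetry(status: number, shouldRetryHeader: string | null): boolean {
  // An explicit server signal takes precedence over the status code.
  if (shouldRetryHeader === 'true') return true;
  if (shouldRetryHeader === 'false') return false;

  if (status === 408) return true; // request timeout
  if (status === 409) return true; // lock timeout
  if (status === 429) return true; // rate limit
  if (status >= 500 && status <= 599) return true; // server errors
  return false;
}
```

Under these assumptions, a 400 response with `x-should-retry: true` would be retried, and a 503 with `x-should-retry: false` would not.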
Code Evidence
Backoff calculation from `src/client.ts:879-892`:
private calculateDefaultRetryTimeoutMillis(retriesRemaining: number, maxRetries: number): number {
  const initialRetryDelay = 0.5;
  const maxRetryDelay = 8.0;

  const numRetries = maxRetries - retriesRemaining;

  // Apply exponential backoff, but not more than the max.
  const sleepSeconds = Math.min(initialRetryDelay * Math.pow(2, numRetries), maxRetryDelay);

  // Apply some jitter, take up to at most 25 percent of the retry time.
  const jitter = 1 - Math.random() * 0.25;

  return sleepSeconds * jitter * 1000;
}
Retry-After header handling from `src/client.ts:848-873`:
// Note the `retry-after-ms` header may not be standard, but is a good idea
// and we'd like proactive support for it.
const retryAfterMillisHeader = responseHeaders?.get('retry-after-ms');
if (retryAfterMillisHeader) {
  const timeoutMs = parseFloat(retryAfterMillisHeader);
  if (!Number.isNaN(timeoutMs)) {
    timeoutMillis = timeoutMs;
  }
}

// If the API asks us to wait a certain amount of time (and it's a reasonable
// amount), just do what it says, but otherwise calculate a default
if (!(timeoutMillis && 0 <= timeoutMillis && timeoutMillis < 60 * 1000)) {
  const maxRetries = options.maxRetries ?? this.maxRetries;
  timeoutMillis = this.calculateDefaultRetryTimeoutMillis(retriesRemaining, maxRetries);
}
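The excerpt above covers only `retry-after-ms`. For the standard `Retry-After` header, which HTTP defines as either a number of seconds or an HTTP date, a standalone sketch of the same accept-or-fall-back decision might look like the following. Both `parseRetryAfterMs` and `pickDelayMs` are hypothetical names written for illustration and are not taken from `src/client.ts`.

```typescript
/**
 * Hypothetical parser for a standard `Retry-After` value.
 * Returns the requested wait in ms, or undefined if the value is unparseable.
 */
function parseRetryAfterMs(value: string, nowMs: number): number | undefined {
  // Form 1: delay in seconds, e.g. "120".
  const seconds = Number(value);
  if (value.trim() !== '' && !Number.isNaN(seconds)) return seconds * 1000;

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) return dateMs - nowMs;

  return undefined;
}

/** Accept the server's wait only if it is positive and under 60 s; else fall back. */
function pickDelayMs(serverMs: number | undefined, fallbackMs: number): number {
  if (serverMs !== undefined && serverMs > 0 && serverMs < 60 * 1000) return serverMs;
  return fallbackMs;
}
```

For example, `pickDelayMs(parseRetryAfterMs('2', Date.now()), 500)` yields 2000, whereas a requested wait of 120 seconds is rejected and the fallback of 500 is used instead.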
Default values from `src/client.ts:417,1004`:
this.maxRetries = options.maxRetries ?? 2;
// ...
static DEFAULT_TIMEOUT = 600000; // 10 minutes