Heuristic:Promptfoo Promptfoo Transient Error Classification

From Leeroopedia
Knowledge Sources
Domains: Error_Handling, Network
Last Updated: 2026-02-14 08:00 GMT

Overview

Error classification strategy that distinguishes transient connection errors (worth retrying) from permanent configuration errors (will never succeed) by checking `error.code` first, then message patterns.

Description

When an HTTP request fails, promptfoo must decide whether to retry or fail immediately. The `isTransientConnectionError()` function classifies errors into two categories: transient failures that may succeed on retry (stale connections, mid-stream resets) and permanent failures that indicate misconfiguration (wrong certificates, HTTPS-to-HTTP mismatch). This classification is critical because retrying permanent errors wastes time and obscures the root cause.

The key insight is to check `error.code` before parsing error messages. Node.js system errors always set `.code`, so the code check is more robust across Node.js versions than matching on message text.
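As a minimal sketch of the code-first idea, the snippet below reads `.code` defensively and shows when a fallback to the message would be needed. The helpers `getErrorCode` and `makeSystemError` are hypothetical names introduced for illustration and are not part of promptfoo; the sample errors are constructed by hand rather than produced by a real socket.

```typescript
// Hypothetical helper (not from promptfoo): safely read a string `.code`
// from any thrown value, including non-Error values and undefined.
function getErrorCode(err: unknown): string | undefined {
  if (typeof err !== 'object' || err === null) {
    return undefined;
  }
  const code = (err as { code?: unknown }).code;
  return typeof code === 'string' ? code : undefined;
}

// Hypothetical helper: build an Error shaped like a Node.js system error.
function makeSystemError(message: string, code: string): Error & { code?: string } {
  const err: Error & { code?: string } = new Error(message);
  err.code = code;
  return err;
}

// A hand-built error that carries a code, and one that (in this sketch) does not.
const resetError = makeSystemError('read ECONNRESET', 'ECONNRESET');
const hangUpError = new Error('socket hang up');

// Code is present: classify on the code alone.
const viaCode = getErrorCode(resetError);
// Code is absent: the caller would fall back to message matching.
const viaCodeMissing = getErrorCode(hangUpError);
```

Reading `.code` through an `unknown`-typed parameter keeps the helper safe for values that are not `Error` instances, which is one way to avoid the cast on a possibly undefined value.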

Usage

This heuristic applies to all network operations in promptfoo, including LLM API calls, webhook callbacks, and cache fetches. It is built into the retry policy and should not need manual configuration.

The Insight (Rule of Thumb)

  • Action: Check the `error.code` property first for classification; fall back to message parsing only for errors without a `.code` property.
  • Retryable (transient):
    • `ECONNRESET` - Connection reset by peer
    • `EPIPE` - Broken pipe (write to closed connection)
    • `EPROTO` - Protocol error (retryable unless the message also contains a permanent TLS misconfiguration phrase)
    • `socket hang up` - Connection dropped mid-transfer
    • `bad record mac` - TLS record corruption (transient)
  • NOT retryable (permanent):
    • `self signed certificate` - Server cert not trusted
    • `unable to verify` / `unknown ca` - CA chain broken
    • `wrong version number` - HTTPS -> HTTP protocol mismatch
    • Any `EPROTO` paired with certificate-related messages
  • Trade-off: Conservative classification means some borderline errors fail instead of retrying, but this prevents minutes of futile retries on misconfigured endpoints.
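The rule of thumb above can be restated as a small data-driven sketch. This is an illustration under stated assumptions, not promptfoo's implementation: the names `Verdict`, `classify`, and the three lookup arrays are hypothetical, the three-way verdict is an illustrative choice (the actual function returns a boolean), and permanent phrases are checked here regardless of whether `EPROTO` appears in the message.

```typescript
// Hypothetical three-way verdict; the real classifier returns a boolean.
type Verdict = 'transient' | 'permanent' | 'unknown';

// Lists taken from the rule of thumb above.
const TRANSIENT_CODES = ['ECONNRESET', 'EPIPE'];
const PERMANENT_PHRASES = [
  'self signed',
  'unable to verify',
  'unknown ca',
  'wrong version number',
  'cert',
];
const TRANSIENT_PHRASES = ['bad record mac', 'eproto', 'econnreset', 'socket hang up'];

function classify(err: { message?: string; code?: string } | undefined): Verdict {
  if (!err) {
    return 'unknown';
  }
  // Step 1: the code is checked first.
  if (TRANSIENT_CODES.some((c) => c === err.code)) {
    return 'transient';
  }
  // Step 2: fall back to lowercase message matching.
  const message = (err.message ?? '').toLowerCase();
  // Permanent phrases win, so an EPROTO that wraps a TLS misconfig is not retried.
  if (PERMANENT_PHRASES.some((p) => message.includes(p))) {
    return 'permanent';
  }
  if (TRANSIENT_PHRASES.some((p) => message.includes(p))) {
    return 'transient';
  }
  return 'unknown';
}

// Sample messages are simplified, hand-written stand-ins for real error text.
const samples = {
  reset: classify({ code: 'ECONNRESET', message: 'read ECONNRESET' }),
  protoMismatch: classify({ message: 'write EPROTO: wrong version number' }),
  protoOther: classify({ message: 'write EPROTO: handshake failure' }),
  selfSigned: classify({ message: 'self signed certificate' }),
  hangUp: classify({ message: 'socket hang up' }),
  unrelated: classify({ message: 'Invalid API key' }),
};
```

Checking the permanent phrases before the transient ones mirrors the conservative trade-off: when a message matches both lists, the sketch declines to retry.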

Reasoning

From `src/util/fetch/errors.ts:16-47`:

/**
 * Detect transient connection errors distinct from rate limits or permanent
 * certificate/config errors. Only matches errors that are likely to succeed
 * on retry (stale connections, mid-stream resets). Permanent failures like
 * "self signed certificate", "unable to verify", "unknown ca", or
 * "wrong version number" (HTTPS->HTTP mismatch) are intentionally excluded.
 */
export function isTransientConnectionError(error: Error | undefined): boolean {
  // Check error.code first — more robust across Node.js versions than
  // parsing error messages, since system errors always set .code.
  const code = (error as SystemError | undefined)?.code;
  if (code === 'ECONNRESET' || code === 'EPIPE') {
    return true;
  }

  // `error` may be undefined per the signature, so guard the access.
  const message = (error?.message ?? '').toLowerCase();
  // EPROTO can wrap permanent TLS misconfigs. Exclude when paired with
  // known permanent error phrases to avoid futile retries.
  if (message.includes('eproto') &&
      (message.includes('wrong version number') ||
       message.includes('self signed') ||
       message.includes('unable to verify') ||
       message.includes('unknown ca') ||
       message.includes('cert'))) {
    return false;
  }
  return (
    message.includes('bad record mac') ||
    message.includes('eproto') ||
    message.includes('econnreset') ||
    message.includes('socket hang up')
  );
}

The retry policy in `src/scheduler/retryPolicy.ts:65-76` adds higher-level transient errors:

return (
  isTransientConnectionError(error) ||
  message.includes('timeout') ||
  message.includes('econnrefused') ||
  message.includes('network') ||
  message.includes('503') ||
  message.includes('502') ||
  message.includes('504')
);
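To show how such a predicate might drive a retry decision, here is a hedged sketch. Only the list of message patterns comes from the excerpt above; everything else is assumed for illustration: the function names `isTransientConnectionErrorSketch`, `isRetryableSketch`, and `planRetry`, the way `message` is derived from the error, the attempt limit, and the exponential backoff. The stand-in classifier is deliberately simplified and is not the function quoted earlier.

```typescript
// Simplified stand-in for the connection-level classifier (assumption).
function isTransientConnectionErrorSketch(error: Error | undefined): boolean {
  const message = (error?.message ?? '').toLowerCase();
  return message.includes('econnreset') || message.includes('socket hang up');
}

// Assumed wrapper: only the returned expression follows the excerpt.
function isRetryableSketch(error: Error | undefined): boolean {
  const message = (error?.message ?? '').toLowerCase();
  return (
    isTransientConnectionErrorSketch(error) ||
    message.includes('timeout') ||
    message.includes('econnrefused') ||
    message.includes('network') ||
    message.includes('503') ||
    message.includes('502') ||
    message.includes('504')
  );
}

// Hypothetical decision helper: returns the delay in milliseconds before the
// next attempt, or null when the caller should give up immediately.
// `attempt` is 1-based; the backoff schedule is an illustrative assumption.
function planRetry(
  error: Error | undefined,
  attempt: number,
  maxAttempts: number = 3,
  baseDelayMs: number = 500,
): number | null {
  if (attempt >= maxAttempts || !isRetryableSketch(error)) {
    return null;
  }
  return baseDelayMs * 2 ** (attempt - 1);
}

const firstGatewayFailure = planRetry(new Error('HTTP 503 Service Unavailable'), 1);
const secondHangUp = planRetry(new Error('socket hang up'), 2);
const certFailure = planRetry(new Error('self signed certificate'), 1);
const exhausted = planRetry(new Error('request timeout'), 3);
```

Returning `null` for a non-retryable error lets the caller fail on the first attempt, which is the behavior the classification is meant to enable for misconfigured endpoints.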
