Heuristic:Microsoft Semantic Kernel Custom HttpClient Retry Policy

Knowledge Sources	Semantic Kernel
Domains	Resilience, Networking
Last Updated	2026-02-11 20:00 GMT

Overview

When providing a custom HttpClient to OpenAI or Azure OpenAI connectors, Semantic Kernel automatically disables its built-in retry policy and default timeout to avoid conflicts with external resilience handlers.

Description

The OpenAI and Azure OpenAI connectors in Semantic Kernel inherit built-in retry and timeout policies from the underlying OpenAI client library. When a developer supplies a custom HttpClient (typically configured with Microsoft.Extensions.Http.Resilience or Polly policies), those built-in policies would conflict with the external resilience configuration. To prevent double retries and competing timeouts, Semantic Kernel sets RetryPolicy to ClientRetryPolicy(maxRetries: 0) and NetworkTimeout to Timeout.InfiniteTimeSpan on the client options whenever a custom HttpClient is passed in.

Usage

Use this heuristic when providing a custom HttpClient to any OpenAI or Azure OpenAI connector registration method. This applies to scenarios using DI-registered HttpClients, named HttpClients with Polly policies, or any resilience middleware. Failing to understand this behavior can result in unexpected timeout or retry characteristics.

The Insight (Rule of Thumb)

Action: When providing a custom HttpClient, configure all retry and timeout policies on that HttpClient. The connector's built-in policies will be disabled automatically.
Value: maxRetries: 0, NetworkTimeout: InfiniteTimeSpan
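Trade-off: Full control over resilience behavior, but you must ensure your HttpClient has appropriate retry and timeout policies configured. Without them, transient failures are never retried, and the only remaining time limit is the HttpClient's own Timeout property (100 seconds by default), so requests may wait far longer than intended, or indefinitely if that property is also set to infinite.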
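
As an illustration of the rule above, here is a minimal sketch of a custom HttpClient that carries its own retry and timeout policies and is then handed to the OpenAI connector. It assumes the Microsoft.Extensions.Http.Resilience and Microsoft.SemanticKernel NuGet packages are referenced. The client name "openai", the model id, the OPENAI_API_KEY environment variable, and all timeout values are placeholders chosen for this example, not values taken from Semantic Kernel.

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;

var services = new ServiceCollection();

// Named client ("openai" is a hypothetical name) that owns all retry and timeout behavior.
services.AddHttpClient("openai")
    .AddStandardResilienceHandler(options =>
    {
        // Retries performed after the first attempt.
        options.Retry.MaxRetryAttempts = 3;
        // Time limit for each individual attempt (illustrative value).
        options.AttemptTimeout.Timeout = TimeSpan.FromSeconds(30);
        // Options validation expects the sampling window to be at least twice the attempt timeout.
        options.CircuitBreaker.SamplingDuration = TimeSpan.FromSeconds(60);
        // Time limit across all attempts; kept below HttpClient.Timeout's 100-second default.
        options.TotalRequestTimeout.Timeout = TimeSpan.FromSeconds(90);
    });

using var provider = services.BuildServiceProvider();
HttpClient httpClient = provider
    .GetRequiredService<IHttpClientFactory>()
    .CreateClient("openai");

// Passing httpClient is what makes the connector switch off its built-in retry and timeout.
Kernel kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "gpt-4o-mini", // placeholder model id
        apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")
            ?? throw new InvalidOperationException("OPENAI_API_KEY is not set."),
        httpClient: httpClient)
    .Build();
```

With this setup, the resilience handler is the single place where retries and timeouts are decided. Dropping the AddStandardResilienceHandler call would leave the client with no retries at all.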

Reasoning

Double-retry is a well-known anti-pattern in distributed systems. If both the inner client library and the outer HttpClient handler retry on failures, the attempt counts multiply: with 3 retries at each layer, every layer makes up to 4 attempts, so a persistently failing request can be sent up to 4 × 4 = 16 times. This puts excessive load on the AI service and can trigger rate limiting. Similarly, competing timeouts can cause premature cancellation (if the inner timeout fires before the outer resilience policy has a chance to retry) or delayed failure detection.
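
The multiplication can be checked with a small simulation. The sketch below is not Semantic Kernel code. It assumes a service that fails every time and wraps one hand-written retry loop inside another, counting the initial attempt at each layer as well as the retries. All names in it are hypothetical.

```csharp
using System;

public static class DoubleRetryDemo
{
    // Runs the operation once, then retries up to maxRetries more times while it keeps failing.
    public static bool WithRetries(int maxRetries, Func<bool> operation)
    {
        for (int attempt = 0; attempt <= maxRetries; attempt++)
        {
            if (operation())
            {
                return true;
            }
        }
        return false;
    }

    // Counts the requests that reach the service when two retry layers are nested
    // and the service fails every time (the worst case).
    public static int CountWorstCaseRequests(int innerRetries, int outerRetries)
    {
        int requestsSent = 0;
        Func<bool> sendRequest = () =>
        {
            requestsSent++;
            return false; // simulated persistent failure
        };

        // One retry layer wraps the other; each outer attempt runs the full inner loop.
        WithRetries(outerRetries, () => WithRetries(innerRetries, sendRequest));
        return requestsSent;
    }

    public static void Main()
    {
        // Both layers retry 3 times: (1 + 3) * (1 + 3) requests.
        Console.WriteLine(CountWorstCaseRequests(innerRetries: 3, outerRetries: 3)); // prints 16
        // One layer disabled (maxRetries: 0): only the other layer's attempts remain.
        Console.WriteLine(CountWorstCaseRequests(innerRetries: 0, outerRetries: 3)); // prints 4
    }
}
```

The second call mirrors what the connector does when it sets maxRetries to 0: the request count drops to whatever the remaining layer allows.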

By disabling built-in policies when a custom HttpClient is present, Semantic Kernel ensures a single layer of resilience control, which is the standard .NET pattern for HttpClient factories and DI-configured clients.

Code Evidence

Retry and timeout disable logic from dotnet/src/Connectors/Connectors.OpenAI/Core/ClientCore.cs:214-215:

options.RetryPolicy = new ClientRetryPolicy(maxRetries: 0); // Disable retry policy if and only if a custom HttpClient is provided.
options.NetworkTimeout = Timeout.InfiniteTimeSpan; // Disable default timeout

Same pattern in Azure OpenAI connector from dotnet/src/Connectors/Connectors.AzureOpenAI/Core/AzureClientCore.cs:158-159:

options.RetryPolicy = new ClientRetryPolicy(maxRetries: 0);
options.NetworkTimeout = Timeout.InfiniteTimeSpan;
