Principle: Arize AI Phoenix LLM Provider Configuration
| Knowledge Sources | |
|---|---|
| Domains | LLM Evaluation, AI Observability, Provider Abstraction |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
LLM provider configuration is the practice of abstracting the initialization and management of large language model clients behind a unified interface, enabling evaluations to be executed against any supported provider without changes to downstream logic.
Description
When building LLM evaluation pipelines, teams frequently need to swap between providers (OpenAI, Anthropic, Google, Azure OpenAI, and others) for reasons such as cost optimization, latency requirements, model capability differences, or organizational compliance mandates. Without a consistent abstraction layer, every provider change requires modifications across prompt construction, API call patterns, error handling, and rate-limiting code.
LLM provider configuration solves this problem by introducing a single entry point that:
- Discovers available SDK adapters at runtime through a provider registry, so only installed SDK packages are offered.
- Normalizes the generation interface across providers, exposing uniform methods for text generation, structured object generation, and classification tasks.
- Encapsulates rate limiting by automatically creating rate limiter instances tuned to each provider's known throttling error types.
- Separates sync and async client construction, allowing callers to pass provider-specific options (timeouts, HTTP clients, base URLs) to each path independently.
The net effect is that evaluation code can be written once and re-targeted at any provider simply by changing the provider and model strings at initialization time.
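As a minimal sketch of that re-targeting (the `LLM` name and `phoenix.evals.llm` import path are assumptions to be checked against the installed Phoenix version), switching providers comes down to changing two strings:

```python
# Two equivalent initialization targets; the evaluator code that consumes
# the resulting LLM object is identical in both cases.
openai_target = {"provider": "openai", "model": "gpt-4o-mini"}
anthropic_target = {"provider": "anthropic", "model": "claude-3-5-sonnet-latest"}

# from phoenix.evals.llm import LLM   # import path is an assumption
# llm = LLM(**openai_target)          # or: LLM(**anthropic_target)
```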
Usage
Use LLM provider configuration whenever you need to:
- Initialize an LLM for use with Phoenix evaluation pipelines.
- Switch between providers (e.g., moving from OpenAI to Anthropic) without rewriting evaluator logic.
- Control request throughput by specifying an initial per-second request rate.
- Pass provider-specific SDK parameters such as API keys, base URLs, or timeout values.
- Run evaluations in both synchronous and asynchronous modes with distinct client options per mode.
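Putting those parameters together, a hedged sketch of an initialization call follows. The parameter names are taken from this document; the exact Phoenix signature and import path should be verified against the installed version:

```python
# Keyword arguments for the provider-agnostic LLM entry point.
llm_config = dict(
    provider="openai",                     # which registry entry to use
    model="gpt-4o-mini",                   # provider-specific model string
    initial_per_second_request_rate=5.0,   # starting throughput for the rate limiter
    sync_client_kwargs={"timeout": 30.0},  # overrides for the sync SDK client
    async_client_kwargs={"timeout": 60.0}, # overrides for the async SDK client
)
# from phoenix.evals.llm import LLM   # import path is an assumption
# llm = LLM(**llm_config)
```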
Theoretical Basis
Provider Registry Pattern
The configuration follows the registry pattern: a central data structure (the provider registry) maps provider name strings to factory functions and adapter classes. At initialization time the registry is queried to discover:
```
provider_name --> [ProviderRegistration_1, ProviderRegistration_2, ...]
                      |
                      +-- client_factory(model, is_async, **kwargs) -> SDK client
                      +-- adapter_class(client, model)              -> Adapter
                      +-- get_rate_limit_errors()                   -> [Exception types]
```
This decouples the high-level evaluation API from the low-level SDK details of any single provider.
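The registry pattern can be sketched in a few lines. Everything below is illustrative (the `Fake*` classes, `ProviderRegistration` fields, and `build_adapter` helper are stand-ins, not Phoenix's actual internals), but it shows the shape: a mapping from provider names to registrations that bundle a client factory, an adapter class, and the provider's rate-limit error types.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Type

class FakeOpenAIError(Exception):
    """Stands in for a provider SDK's throttling error type."""

class FakeOpenAIClient:
    """Stands in for an SDK client such as openai.OpenAI."""
    def __init__(self, model, is_async=False, **kwargs):
        self.model, self.is_async, self.kwargs = model, is_async, kwargs

class FakeOpenAIAdapter:
    """Normalizes the SDK client behind a uniform generation interface."""
    def __init__(self, client, model):
        self.client, self.model = client, model

    def generate_text(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"

@dataclass
class ProviderRegistration:
    client_factory: Callable
    adapter_class: Type
    rate_limit_errors: List[Type[BaseException]] = field(default_factory=list)

    def get_rate_limit_errors(self):
        return self.rate_limit_errors

# Central registry: provider name -> list of registrations. In a real system,
# only registrations whose SDK package is installed would appear here.
PROVIDER_REGISTRY = {
    "openai": [ProviderRegistration(FakeOpenAIClient, FakeOpenAIAdapter,
                                    [FakeOpenAIError])],
}

def build_adapter(provider: str, model: str, **kwargs):
    """Query the registry, build the SDK client, wrap it in its adapter."""
    registration = PROVIDER_REGISTRY[provider][0]
    client = registration.client_factory(model, is_async=False, **kwargs)
    return registration.adapter_class(client, model)

adapter = build_adapter("openai", "gpt-4o-mini")
```

Because callers only ever see the adapter's uniform interface, adding a new provider means adding one registry entry, with no changes to evaluation code.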
Adaptive Rate Limiting
Each provider registration declares its rate-limit error types. During initialization, a RateLimiter instance is created for each declared error type. The rate limiters wrap generation methods and apply exponential backoff when a rate-limit error is caught, adjusting the permitted per-second request rate adaptively. The initial rate can be explicitly set via the initial_per_second_request_rate parameter.
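A minimal sketch of that adaptive behavior, under stated assumptions (the `RateLimiter` below is a simplified stand-in, not Phoenix's implementation; the halving policy and retry cap are illustrative choices):

```python
import time

class FakeRateLimitError(Exception):
    """Stands in for a provider SDK's throttling error type."""

class RateLimiter:
    """Wraps a callable; on the declared rate-limit error, it backs off
    exponentially and lowers the permitted per-second request rate."""

    def __init__(self, rate_limit_error, initial_per_second_request_rate=5.0,
                 max_retries=3):
        self.rate_limit_error = rate_limit_error
        self.rate = initial_per_second_request_rate
        self.max_retries = max_retries

    def call(self, fn, *args, **kwargs):
        delay = 1.0 / self.rate
        for attempt in range(self.max_retries + 1):
            try:
                return fn(*args, **kwargs)
            except self.rate_limit_error:
                if attempt == self.max_retries:
                    raise
                self.rate /= 2   # adapt: halve the permitted request rate
                time.sleep(delay)
                delay *= 2       # exponential backoff before retrying

# Demo: the first call is throttled, the retry succeeds at a reduced rate.
calls = {"n": 0}
def flaky_generate(prompt):
    calls["n"] += 1
    if calls["n"] == 1:
        raise FakeRateLimitError()
    return f"ok: {prompt}"

limiter = RateLimiter(FakeRateLimitError, initial_per_second_request_rate=4.0)
result = limiter.call(flaky_generate, "hello")
```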
Sync/Async Client Duality
Many LLM SDKs maintain separate synchronous and asynchronous client classes (e.g., openai.OpenAI vs. openai.AsyncOpenAI). The configuration instantiates both at construction time. Shared keyword arguments are forwarded to both constructors, while sync_client_kwargs and async_client_kwargs allow caller-specific overrides (such as different timeout values) for each path. This enables evaluation pipelines to choose sync or async execution without re-initializing the LLM instance.