
Principle:Explodinggradients Ragas LLM Provider Abstraction

From Leeroopedia


Knowledge Sources: explodinggradients/ragas
Domains: LLM Integration, Factory Pattern
Last Updated: 2026-02-10

Overview

LLM Provider Abstraction is the principle of exposing a single factory interface for accessing different LLM providers, combining automatic adapter selection with structured output generation.

Description

LLM evaluation toolkits must interact with many different LLM providers (OpenAI, Anthropic, Google, Groq, Mistral, and others), each with their own client libraries, API conventions, and parameter requirements. LLM Provider Abstraction addresses this fragmentation by establishing a single entry point that resolves provider differences behind a common interface.

Factory Pattern: A single factory function accepts a model name, provider identifier, and a pre-initialized client instance, then returns a unified LLM object. This eliminates the need for users to learn provider-specific wrapper classes or configuration patterns. The factory handles all the wiring internally.
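A minimal sketch of this factory shape, using a stand-in client (the class and method names here are illustrative, not the actual Ragas API):

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class UnifiedLLM:
    """Provider-agnostic handle returned by the factory (illustrative)."""
    model: str
    provider: str
    client: Any

    def generate(self, prompt: str) -> str:
        # A real adapter would invoke the provider's client library here;
        # this sketch delegates to a `complete` method on a fake client.
        return self.client.complete(self.model, prompt)

def llm_factory(model: str, provider: str, client: Any) -> UnifiedLLM:
    """Single entry point: validate inputs, return a unified LLM object."""
    if client is None:
        raise ValueError("a pre-initialized provider client is required")
    if not model:
        raise ValueError("model name must be non-empty")
    return UnifiedLLM(model=model, provider=provider, client=client)

# Usage with a stand-in client:
class FakeClient:
    def complete(self, model: str, prompt: str) -> str:
        return f"[{model}] echo: {prompt}"

llm = llm_factory("gpt-4o-mini", "openai", FakeClient())
print(llm.generate("hello"))  # [gpt-4o-mini] echo: hello
```

The caller never touches provider-specific wrapper classes; swapping providers means passing a different client and provider string to the same factory call.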

Structured Output via Instructor: The abstraction leverages the Instructor library to enable structured output generation. Instead of parsing raw text responses, users pass Pydantic models as response schemas, and the LLM returns validated, typed objects. The factory patches the provider client with the appropriate Instructor integration automatically.
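The core idea can be shown with stdlib dataclasses standing in for Pydantic models (Instructor additionally retries on validation failures; this sketch only shows the parse-and-validate step):

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Verdict:
    """Stand-in response schema (a Pydantic model in the real library)."""
    verdict: str
    score: float

def generate_structured(raw_response: str, response_model):
    """Parse a raw JSON completion into a validated, typed object."""
    data = json.loads(raw_response)
    allowed = {f.name for f in fields(response_model)}
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unexpected fields: {unknown}")
    return response_model(**data)

result = generate_structured('{"verdict": "pass", "score": 0.92}', Verdict)
print(result.score)  # 0.92
```

Downstream evaluation code then works with `result.verdict` and `result.score` as typed attributes rather than scraping fields out of raw text.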

Adapter Auto-Discovery: The factory supports an "auto" adapter mode that inspects the client and provider to select the best structured output adapter. For most providers, this defaults to the Instructor adapter. For Google Gemini models, it automatically selects the LiteLLM adapter. Users can override this with explicit adapter selection when needed.
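The selection rule described above amounts to a small dispatch function; adapter names and provider strings below are illustrative:

```python
def resolve_adapter(provider: str, adapter: str = "auto") -> str:
    """Pick a structured-output adapter for the given provider (sketch)."""
    if adapter != "auto":
        return adapter          # explicit override always wins
    if provider.lower() in {"google", "gemini"}:
        return "litellm"        # Gemini path goes through LiteLLM
    return "instructor"         # default for most providers

assert resolve_adapter("openai") == "instructor"
assert resolve_adapter("google") == "litellm"
assert resolve_adapter("google", adapter="instructor") == "instructor"
```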

Provider-Specific Parameter Mapping: Different providers have different parameter conventions. For example, Google models require parameters wrapped in a generation_config dictionary with max_output_tokens instead of max_tokens. OpenAI reasoning models (o-series, GPT-5+) require max_completion_tokens and fixed temperature. The abstraction handles these mappings transparently.
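These mappings can be sketched as a pure function over a parameter dictionary (the reasoning-model heuristic and function names here are assumptions for illustration):

```python
def is_reasoning_model(model: str) -> bool:
    # Heuristic: o-series (o1, o3, o4-mini, ...) and gpt-5 families.
    return model.startswith(("o1", "o3", "o4", "gpt-5"))

def map_params(provider: str, model: str, params: dict) -> dict:
    """Translate generic call parameters into provider conventions (sketch)."""
    p = dict(params)
    if provider == "google":
        if "max_tokens" in p:
            p["max_output_tokens"] = p.pop("max_tokens")
        return {"generation_config": p}
    if provider == "openai" and is_reasoning_model(model):
        if "max_tokens" in p:
            p["max_completion_tokens"] = p.pop("max_tokens")
        p["temperature"] = 1.0     # reasoning models only accept the default
        p.pop("top_p", None)       # unsupported for reasoning models
    return p

print(map_params("google", "gemini-2.0-flash", {"max_tokens": 256}))
# {'generation_config': {'max_output_tokens': 256}}
```

Because the mapping happens inside the abstraction, callers always write `max_tokens` and the factory translates per provider.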

Sync and Async Support: The returned LLM object provides both generate() and agenerate() methods. Async client detection happens at initialization time, so the correct execution path is chosen automatically. For sync callers with async clients, the system handles event loop management including Jupyter notebook compatibility.
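A minimal sketch of the dual sync/async surface, assuming a single async code path underneath (the real implementation handles the running-loop case with nest_asyncio rather than raising):

```python
import asyncio

class DualModeLLM:
    """Expose both generate() and agenerate() over one async code path."""

    async def agenerate(self, prompt: str) -> str:
        await asyncio.sleep(0)  # stands in for an awaited client call
        return prompt.upper()

    def generate(self, prompt: str) -> str:
        try:
            asyncio.get_running_loop()
        except RuntimeError:
            # No loop running (plain scripts): safe to start one.
            return asyncio.run(self.agenerate(prompt))
        # A loop is already running (e.g. Jupyter). The real system applies
        # nest_asyncio or schedules onto the existing loop; this sketch bails.
        raise RuntimeError("call agenerate() from async contexts")

print(DualModeLLM().generate("hello"))  # HELLO
```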

Usage

Use the LLM Provider Abstraction principle when:

  • Integrating any LLM provider into a Ragas evaluation pipeline
  • Needing structured (Pydantic model) outputs from LLM calls
  • Switching between providers without changing evaluation code
  • Working with both synchronous and asynchronous calling patterns
  • Caching LLM responses to reduce costs during repeated evaluations

Theoretical Basis

The theoretical foundation combines the Factory Method Pattern with the Adapter Pattern:

PROCEDURE llm_factory(model, provider, client, adapter, cache):
    1. Validate inputs:
       REQUIRE client is not None
       REQUIRE model is not empty

    2. Auto-detect adapter if adapter == "auto":
       Inspect the client type and provider name
       IF provider is Google/Gemini:
           Select LiteLLM adapter
       ELSE:
           Select Instructor adapter

    3. Create the LLM instance via the selected adapter:
       The adapter patches the client for structured output
       The adapter wraps the client in a unified interface

    4. Provider parameter mapping occurs at call time:
       CASE provider is "google":
           Wrap parameters in generation_config
           Rename max_tokens to max_output_tokens
       CASE provider is "openai" AND model is reasoning model:
           Rename max_tokens to max_completion_tokens
           Force temperature to 1.0
           Remove unsupported top_p parameter
       DEFAULT:
           Pass parameters through unchanged

    5. Return the LLM instance with:
       - generate(prompt, response_model) for sync calls
       - agenerate(prompt, response_model) for async calls

This design ensures that provider-specific complexity is contained within the factory and adapter layer, keeping user-facing evaluation code provider-agnostic.
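The provider-agnostic payoff can be demonstrated end to end with two stand-in clients whose native call conventions differ (all names below are illustrative, not the Ragas API):

```python
# Two fake clients with different native conventions:
class OpenAIStyleClient:
    def create(self, model, messages, **kwargs):
        return f"openai:{model}:{messages[-1]['content']}"

class GoogleStyleClient:
    def generate_content(self, model, contents, generation_config=None):
        return f"google:{model}:{contents}"

class UnifiedLLM:
    def __init__(self, gen):
        self._gen = gen
    def generate(self, prompt):
        return self._gen(prompt)

def llm_factory(model, provider, client):
    """Condensed factory: hide provider differences behind one generate()."""
    if provider == "google":
        def gen(prompt):
            return client.generate_content(
                model, prompt, generation_config={"max_output_tokens": 256})
    else:
        def gen(prompt):
            return client.create(
                model, [{"role": "user", "content": prompt}], max_tokens=256)
    return UnifiedLLM(gen)

# Caller code is identical regardless of provider:
for llm in (llm_factory("gpt-4o-mini", "openai", OpenAIStyleClient()),
            llm_factory("gemini-2.0-flash", "google", GoogleStyleClient())):
    print(llm.generate("score this answer"))
```

The loop at the bottom is the point: evaluation code calls `generate()` the same way for both providers, and only the factory knows about message lists versus `generation_config`.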
