Implementation:Explodinggradients_Ragas_LLM_Factory
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| explodinggradients/ragas | LLM Integration, Factory Pattern | 2026-02-10 |
Overview
The llm_factory() function creates unified LLM instances from pre-initialized provider clients, automatically selecting the appropriate adapter for structured output generation.
Description
The llm_factory() function (lines 606-748 of src/ragas/llms/base.py) is the primary entry point for creating LLM instances in Ragas. It accepts a model name, provider string, and a pre-built client object, then returns an InstructorBaseRagasLLM instance with generate() and agenerate() methods that accept Pydantic response models for structured output. The factory delegates client patching to the Instructor library via provider-specific integration functions (instructor.from_openai(), instructor.from_anthropic(), instructor.from_genai(), etc.) and handles adapter auto-detection, provider parameter mapping, and usage analytics tracking.
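The provider-dispatch behavior described above can be pictured with a small stdlib-only sketch. All names here (`fake_llm_factory`, `_patch_openai`, `FakeLLM`) are illustrative stand-ins, not the actual Ragas internals; the `_PATCHERS` dict plays the role of the provider-specific Instructor integration functions:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


# Hypothetical stand-ins for the provider-specific Instructor patchers
# (instructor.from_openai, instructor.from_anthropic, ...). Illustrative only.
def _patch_openai(client: Any) -> str:
    return f"patched-openai:{client}"


def _patch_anthropic(client: Any) -> str:
    return f"patched-anthropic:{client}"


# Provider string -> patcher function, mirroring the factory's dispatch step.
_PATCHERS: Dict[str, Callable[[Any], str]] = {
    "openai": _patch_openai,
    "anthropic": _patch_anthropic,
}


@dataclass
class FakeLLM:
    """Toy result object standing in for InstructorBaseRagasLLM."""
    model: str
    patched_client: str


def fake_llm_factory(model: str, provider: str = "openai", client: Any = None) -> FakeLLM:
    # Validate inputs, then dispatch on the provider string.
    if client is None:
        raise ValueError("a pre-initialized client is required")
    try:
        patcher = _PATCHERS[provider]
    except KeyError:
        raise ValueError(f"unsupported provider: {provider!r}")
    return FakeLLM(model=model, patched_client=patcher(client))
```

The real factory returns an adapter-wrapped client rather than a string, but the shape of the dispatch is the same: validate, look up the provider, patch the client.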
Usage
Use llm_factory() when:
- Creating an LLM instance for use with Ragas metrics or experiments
- Working with any supported LLM provider (OpenAI, Anthropic, Google, Groq, Mistral, etc.)
- Needing structured Pydantic model outputs from LLM calls
- Wanting to cache LLM responses for cost reduction
Code Reference
Source Location: src/ragas/llms/base.py, lines 606-748
Signature:
```python
def llm_factory(
    model: str,
    provider: str = "openai",
    client: Optional[Any] = None,
    adapter: str = "auto",
    cache: Optional[CacheInterface] = None,
    mode: Optional[instructor.Mode] = None,
    **kwargs: Any,
) -> InstructorBaseRagasLLM
```
Import:
```python
from ragas.llms import llm_factory
```
Return Type Methods:
| Method | Signature | Description |
|---|---|---|
| generate | (prompt: str, response_model: Type[T]) -> T | Synchronous structured output generation |
| agenerate | (prompt: str, response_model: Type[T]) -> T | Asynchronous structured output generation |
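The sync/async pairing in the table above can be mirrored with a minimal stdlib sketch. `SketchLLM` and `Answer` are illustrative names, not Ragas classes; a real implementation would call the provider instead of echoing:

```python
import asyncio
from dataclasses import dataclass
from typing import Type, TypeVar

T = TypeVar("T")


@dataclass
class Answer:
    """Toy response model standing in for a Pydantic BaseModel."""
    text: str


class SketchLLM:
    """Illustrative stand-in: generate() is the blocking twin of agenerate()."""

    def generate(self, prompt: str, response_model: Type[T]) -> T:
        # A real implementation would call the provider; here we echo the prompt.
        return response_model(text=f"echo: {prompt}")

    async def agenerate(self, prompt: str, response_model: Type[T]) -> T:
        await asyncio.sleep(0)  # yield control, as a real async call would
        return response_model(text=f"echo: {prompt}")


llm = SketchLLM()
sync_result = llm.generate("hi", Answer)
async_result = asyncio.run(llm.agenerate("hi", Answer))
```

Both methods accept the same `(prompt, response_model)` pair and return an instance of the response model; only the calling convention differs.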
I/O Contract
Inputs:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | Model name (e.g., "gpt-4o", "claude-3-sonnet", "gemini-2.0-flash") |
| provider | str | No | LLM provider identifier (default: "openai"). Supported: openai, anthropic, google, groq, mistral, cohere, xai, bedrock, deepseek, litellm, perplexity |
| client | Optional[Any] | Yes | Pre-initialized provider client (e.g., OpenAI(...), AsyncOpenAI(...), Anthropic(...)) |
| adapter | str | No | Structured output adapter (default: "auto"). Options: "auto", "instructor", "litellm" |
| cache | Optional[CacheInterface] | No | Cache backend for caching LLM responses (e.g., DiskCacheBackend()) |
| mode | Optional[instructor.Mode] | No | Instructor mode for structured outputs (default: Mode.JSON). Options include Mode.MD_JSON, Mode.TOOLS, Mode.JSON_SCHEMA |
| **kwargs | Any | No | Additional model arguments (temperature, max_tokens, top_p, etc.) |
Outputs:
| Output | Type | Description |
|---|---|---|
| LLM instance | InstructorBaseRagasLLM | Unified LLM with generate() and agenerate() methods accepting Pydantic response models |
Raises:
| Exception | Condition |
|---|---|
| ValueError | Client is None, model is empty, provider is unsupported, or adapter initialization fails |
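The ValueError conditions above can be sketched as a standalone validation helper. This is a hypothetical mirror of the documented checks, not the factory's actual code (the adapter-initialization failure is omitted since it depends on provider internals):

```python
# Provider identifiers from the inputs table above.
SUPPORTED_PROVIDERS = {
    "openai", "anthropic", "google", "groq", "mistral",
    "cohere", "xai", "bedrock", "deepseek", "litellm", "perplexity",
}


def validate_factory_args(model, provider, client):
    """Raise ValueError for the failure modes listed in the Raises table."""
    if client is None:
        raise ValueError("llm_factory requires a pre-initialized client")
    if not model:
        raise ValueError("model name must be a non-empty string")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
```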
Usage Examples
Basic OpenAI usage:
```python
from openai import OpenAI
from pydantic import BaseModel

from ragas.llms import llm_factory

client = OpenAI(api_key="your-api-key")
llm = llm_factory("gpt-4o-mini", client=client)

class QAEval(BaseModel):
    is_correct: bool
    reason: str

result = llm.generate(
    "Is this answer correct? Q: What is 2+2? A: 4",
    response_model=QAEval,
)
print(result.is_correct)  # True
print(result.reason)      # e.g. "The answer 4 is mathematically correct."
```
Anthropic provider:
```python
from anthropic import Anthropic

from ragas.llms import llm_factory

client = Anthropic(api_key="your-api-key")
llm = llm_factory("claude-3-sonnet", provider="anthropic", client=client)
```
Async usage with caching:
```python
from openai import AsyncOpenAI
from pydantic import BaseModel

from ragas.cache import DiskCacheBackend
from ragas.llms import llm_factory

class EvalModel(BaseModel):  # illustrative response model
    verdict: str

client = AsyncOpenAI(api_key="your-api-key")
cache = DiskCacheBackend()
llm = llm_factory("gpt-4o-mini", client=client, cache=cache)

# Async call (run inside an async function / event loop)
result = await llm.agenerate("Evaluate this response...", response_model=EvalModel)
```
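The effect of the cache can be illustrated with a dict-backed memoizing wrapper. This is illustrative only (`CountingLLM` and `CachedLLM` are invented names); Ragas's DiskCacheBackend persists responses to disk rather than an in-memory dict:

```python
class CountingLLM:
    """Stand-in LLM that counts how often the backend is actually hit."""

    def __init__(self):
        self.calls = 0

    def generate(self, prompt: str) -> str:
        self.calls += 1
        return f"answer to: {prompt}"


class CachedLLM:
    """Memoize generate() results keyed by prompt, as a response cache would."""

    def __init__(self, inner):
        self.inner = inner
        self._cache = {}

    def generate(self, prompt: str) -> str:
        if prompt not in self._cache:
            self._cache[prompt] = self.inner.generate(prompt)
        return self._cache[prompt]


backend = CountingLLM()
llm = CachedLLM(backend)
first = llm.generate("Evaluate this response...")
second = llm.generate("Evaluate this response...")  # served from cache
```

Repeated identical prompts hit the backend once, which is where the cost reduction mentioned under Usage comes from.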
Custom Instructor mode for non-standard backends:
```python
import instructor
from openai import OpenAI

from ragas.llms import llm_factory

client = OpenAI(api_key="your-key", base_url="https://custom-backend")
llm = llm_factory(
    "custom-model",
    client=client,
    mode=instructor.Mode.MD_JSON,
)
```
Related Pages
- Principle:Explodinggradients_Ragas_LLM_Provider_Abstraction
- Environment:Explodinggradients_Ragas_Python_Runtime_Environment
- Environment:Explodinggradients_Ragas_LLM_Provider_Environment
- Heuristic:Explodinggradients_Ragas_Retry_And_Backoff_Configuration
- Heuristic:Explodinggradients_Ragas_LLM_Temperature_Defaults
- Heuristic:Explodinggradients_Ragas_Embedding_Batch_Size_Tuning