Implementation:Explodinggradients_Ragas_LLM_Factory
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| explodinggradients/ragas | LLM Integration, Factory Pattern | 2026-02-10 |
Overview
The llm_factory() function creates unified LLM instances from pre-initialized provider clients, automatically selecting the appropriate adapter for structured output generation.
Description
The llm_factory() function (lines 606-748 of src/ragas/llms/base.py) is the primary entry point for creating LLM instances in Ragas. It accepts a model name, provider string, and a pre-built client object, then returns an InstructorBaseRagasLLM instance with generate() and agenerate() methods that accept Pydantic response models for structured output. The factory delegates client patching to the Instructor library via provider-specific integration functions (instructor.from_openai(), instructor.from_anthropic(), instructor.from_genai(), etc.) and handles adapter auto-detection, provider parameter mapping, and usage analytics tracking.
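The provider-dispatch behavior described above can be pictured with a small stdlib-only sketch. All names here (`fake_llm_factory`, `_patch_openai`, `FakeLLM`) are illustrative stand-ins, not the actual Ragas internals; the `_PATCHERS` dict plays the role of the provider-specific Instructor integration functions:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


# Hypothetical stand-ins for the provider-specific Instructor patchers
# (instructor.from_openai, instructor.from_anthropic, ...). Illustrative only.
def _patch_openai(client: Any) -> str:
    return f"patched-openai:{client}"


def _patch_anthropic(client: Any) -> str:
    return f"patched-anthropic:{client}"


# Provider string -> patcher function, mirroring the factory's dispatch step.
_PATCHERS: Dict[str, Callable[[Any], str]] = {
    "openai": _patch_openai,
    "anthropic": _patch_anthropic,
}


@dataclass
class FakeLLM:
    """Toy result object standing in for InstructorBaseRagasLLM."""
    model: str
    patched_client: str


def fake_llm_factory(model: str, provider: str = "openai", client: Any = None) -> FakeLLM:
    # Validate inputs, then dispatch on the provider string.
    if client is None:
        raise ValueError("a pre-initialized client is required")
    try:
        patcher = _PATCHERS[provider]
    except KeyError:
        raise ValueError(f"unsupported provider: {provider!r}")
    return FakeLLM(model=model, patched_client=patcher(client))
```

The real factory returns an adapter-wrapped client rather than a string, but the shape of the dispatch is the same: validate, look up the provider, patch the client.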
Usage
Use llm_factory() when:
- Creating an LLM instance for use with Ragas metrics or experiments
- Working with any supported LLM provider (OpenAI, Anthropic, Google, Groq, Mistral, etc.)
- Needing structured Pydantic model outputs from LLM calls
- Wanting to cache LLM responses for cost reduction
Code Reference
Source Location: src/ragas/llms/base.py, lines 606-748
Signature:
```python
def llm_factory(
    model: str,
    provider: str = "openai",
    client: Optional[Any] = None,
    adapter: str = "auto",
    cache: Optional[CacheInterface] = None,
    mode: Optional[instructor.Mode] = None,
    **kwargs: Any,
) -> InstructorBaseRagasLLM
```
Import:
```python
from ragas.llms import llm_factory
```
Return Type Methods:
| Method | Signature | Description |
|---|---|---|
| generate | (prompt: str, response_model: Type[T]) -> T | Synchronous structured output generation |
| agenerate | (prompt: str, response_model: Type[T]) -> T | Asynchronous structured output generation |
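The sync/async pairing in the table above can be mirrored with a minimal stdlib sketch. `SketchLLM` and `Answer` are illustrative names, not Ragas classes; a real implementation would call the provider instead of echoing:

```python
import asyncio
from dataclasses import dataclass
from typing import Type, TypeVar

T = TypeVar("T")


@dataclass
class Answer:
    """Toy response model standing in for a Pydantic BaseModel."""
    text: str


class SketchLLM:
    """Illustrative stand-in: generate() is the blocking twin of agenerate()."""

    def generate(self, prompt: str, response_model: Type[T]) -> T:
        # A real implementation would call the provider; here we echo the prompt.
        return response_model(text=f"echo: {prompt}")

    async def agenerate(self, prompt: str, response_model: Type[T]) -> T:
        await asyncio.sleep(0)  # yield control, as a real async call would
        return response_model(text=f"echo: {prompt}")


llm = SketchLLM()
sync_result = llm.generate("hi", Answer)
async_result = asyncio.run(llm.agenerate("hi", Answer))
```

Both methods accept the same `(prompt, response_model)` pair and return an instance of the response model; only the calling convention differs.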
I/O Contract
Inputs:
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | Model name (e.g., "gpt-4o", "claude-3-sonnet", "gemini-2.0-flash") |
| provider | str | No | LLM provider identifier (default: "openai"). Supported: openai, anthropic, google, groq, mistral, cohere, xai, bedrock, deepseek, litellm, perplexity |
| client | Optional[Any] | Yes | Pre-initialized provider client (e.g., OpenAI(...), AsyncOpenAI(...), Anthropic(...)) |
| adapter | str | No | Structured output adapter (default: "auto"). Options: "auto", "instructor", "litellm" |
| cache | Optional[CacheInterface] | No | Cache backend for caching LLM responses (e.g., DiskCacheBackend()) |
| mode | Optional[instructor.Mode] | No | Instructor mode for structured outputs (default: Mode.JSON). Options include Mode.MD_JSON, Mode.TOOLS, Mode.JSON_SCHEMA |
| **kwargs | Any | No | Additional model arguments (temperature, max_tokens, top_p, etc.) |
Outputs:
| Output | Type | Description |
|---|---|---|
| LLM instance | InstructorBaseRagasLLM | Unified LLM with generate() and agenerate() methods accepting Pydantic response models |
Raises:
| Exception | Condition |
|---|---|
| ValueError | Client is None, model is empty, provider is unsupported, or adapter initialization fails |
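The ValueError conditions above can be sketched as a standalone validation helper. This is a hypothetical mirror of the documented checks, not the factory's actual code (the adapter-initialization failure is omitted since it depends on provider internals):

```python
# Provider identifiers from the inputs table above.
SUPPORTED_PROVIDERS = {
    "openai", "anthropic", "google", "groq", "mistral",
    "cohere", "xai", "bedrock", "deepseek", "litellm", "perplexity",
}


def validate_factory_args(model, provider, client):
    """Raise ValueError for the failure modes listed in the Raises table."""
    if client is None:
        raise ValueError("llm_factory requires a pre-initialized client")
    if not model:
        raise ValueError("model name must be a non-empty string")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
```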
Usage Examples
Basic OpenAI usage:
```python
from openai import OpenAI
from pydantic import BaseModel

from ragas.llms import llm_factory

client = OpenAI(api_key="your-api-key")
llm = llm_factory("gpt-4o-mini", client=client)

class QAEval(BaseModel):
    is_correct: bool
    reason: str

result = llm.generate(
    "Is this answer correct? Q: What is 2+2? A: 4",
    response_model=QAEval,
)
print(result.is_correct)  # True
print(result.reason)      # e.g. "The answer 4 is mathematically correct."
```
Anthropic provider:
```python
from anthropic import Anthropic

from ragas.llms import llm_factory

client = Anthropic(api_key="your-api-key")
llm = llm_factory("claude-3-sonnet", provider="anthropic", client=client)
```
Async usage with caching:
```python
from openai import AsyncOpenAI
from pydantic import BaseModel

from ragas.cache import DiskCacheBackend
from ragas.llms import llm_factory

class EvalModel(BaseModel):  # illustrative response model
    verdict: str

client = AsyncOpenAI(api_key="your-api-key")
cache = DiskCacheBackend()
llm = llm_factory("gpt-4o-mini", client=client, cache=cache)

# Async call (run inside an async function / event loop)
result = await llm.agenerate("Evaluate this response...", response_model=EvalModel)
```
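The effect of the cache can be illustrated with a dict-backed memoizing wrapper. This is illustrative only (`CountingLLM` and `CachedLLM` are invented names); Ragas's DiskCacheBackend persists responses to disk rather than an in-memory dict:

```python
class CountingLLM:
    """Stand-in LLM that counts how often the backend is actually hit."""

    def __init__(self):
        self.calls = 0

    def generate(self, prompt: str) -> str:
        self.calls += 1
        return f"answer to: {prompt}"


class CachedLLM:
    """Memoize generate() results keyed by prompt, as a response cache would."""

    def __init__(self, inner):
        self.inner = inner
        self._cache = {}

    def generate(self, prompt: str) -> str:
        if prompt not in self._cache:
            self._cache[prompt] = self.inner.generate(prompt)
        return self._cache[prompt]


backend = CountingLLM()
llm = CachedLLM(backend)
first = llm.generate("Evaluate this response...")
second = llm.generate("Evaluate this response...")  # served from cache
```

Repeated identical prompts hit the backend once, which is where the cost reduction mentioned under Usage comes from.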
Custom Instructor mode for non-standard backends:
```python
import instructor
from openai import OpenAI

from ragas.llms import llm_factory

client = OpenAI(api_key="your-key", base_url="https://custom-backend")
llm = llm_factory(
    "custom-model",
    client=client,
    mode=instructor.Mode.MD_JSON,
)
```
Related Pages
- Principle:Explodinggradients_Ragas_LLM_Provider_Abstraction
- Environment:Explodinggradients_Ragas_Python_Runtime_Environment
- Environment:Explodinggradients_Ragas_LLM_Provider_Environment
- Heuristic:Explodinggradients_Ragas_Retry_And_Backoff_Configuration
- Heuristic:Explodinggradients_Ragas_LLM_Temperature_Defaults
- Heuristic:Explodinggradients_Ragas_Embedding_Batch_Size_Tuning