
Implementation:Explodinggradients Ragas Llm Factory

From Leeroopedia


Knowledge Sources: explodinggradients/ragas
Domains: LLM Integration, Factory Pattern
Last Updated: 2026-02-10

Overview

The llm_factory() function creates unified LLM instances from pre-initialized provider clients, automatically selecting the appropriate adapter for structured output generation.

Description

The llm_factory() function (lines 606-748 of src/ragas/llms/base.py) is the primary entry point for creating LLM instances in Ragas. It accepts a model name, provider string, and a pre-built client object, then returns an InstructorBaseRagasLLM instance with generate() and agenerate() methods that accept Pydantic response models for structured output. The factory delegates client patching to the Instructor library via provider-specific integration functions (instructor.from_openai(), instructor.from_anthropic(), instructor.from_genai(), etc.) and handles adapter auto-detection, provider parameter mapping, and usage analytics tracking.
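The provider dispatch described above can be sketched in a few lines. This is an illustrative stand-in only: the names sketch_llm_factory, PATCHERS, and the _patch_* helpers are hypothetical and not part of the Ragas API, and the stubs stand in for the Instructor integration functions (instructor.from_openai(), instructor.from_anthropic(), etc.):

```python
from typing import Any, Callable, Dict

def _patch_openai(client: Any) -> Any:
    # The real factory would call instructor.from_openai(client) here.
    return ("openai", client)

def _patch_anthropic(client: Any) -> Any:
    # The real factory would call instructor.from_anthropic(client) here.
    return ("anthropic", client)

# Provider string -> client-patching function (illustrative subset).
PATCHERS: Dict[str, Callable[[Any], Any]] = {
    "openai": _patch_openai,
    "anthropic": _patch_anthropic,
}

def sketch_llm_factory(model: str, provider: str = "openai", client: Any = None) -> Any:
    # Mirrors the documented ValueError conditions.
    if client is None:
        raise ValueError("llm_factory requires a pre-initialized client")
    if not model:
        raise ValueError("model must be a non-empty string")
    if provider not in PATCHERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return PATCHERS[provider](client)
```

The point of the pattern is that the caller owns client construction (API keys, base URLs, timeouts), while the factory only selects and applies the matching adapter.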

Usage

Use llm_factory() when:

  • You are creating an LLM instance for use with Ragas metrics or experiments
  • You are working with any supported LLM provider (OpenAI, Anthropic, Google, Groq, Mistral, etc.)
  • You need structured Pydantic model outputs from LLM calls
  • You want to cache LLM responses to reduce cost

Code Reference

Source Location: src/ragas/llms/base.py, lines 606-748

Signature:

def llm_factory(
    model: str,
    provider: str = "openai",
    client: Optional[Any] = None,
    adapter: str = "auto",
    cache: Optional[CacheInterface] = None,
    mode: Optional[instructor.Mode] = None,
    **kwargs: Any,
) -> InstructorBaseRagasLLM

Import:

from ragas.llms import llm_factory

Return Type Methods:

generate(prompt: str, response_model: Type[T]) -> T
    Synchronous structured output generation
agenerate(prompt: str, response_model: Type[T]) -> T
    Asynchronous structured output generation (returns an awaitable)
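To make the sync/async method pairing concrete, here is a minimal stand-in. The class and model names (SketchLLM, StubEval) are hypothetical and do not reproduce the Ragas implementation; the real methods send the prompt through the Instructor-patched client:

```python
import asyncio
from typing import Type, TypeVar

T = TypeVar("T")

class StubEval:
    """Stand-in for a Pydantic response model."""
    def __init__(self) -> None:
        self.is_correct = True

class SketchLLM:
    def generate(self, prompt: str, response_model: Type[T]) -> T:
        # The real method calls the patched client and parses the reply
        # into response_model; here we just instantiate the model.
        return response_model()

    async def agenerate(self, prompt: str, response_model: Type[T]) -> T:
        # Same contract as generate(), but awaitable.
        return self.generate(prompt, response_model)

llm = SketchLLM()
sync_result = llm.generate("check 2+2", StubEval)
async_result = asyncio.run(llm.agenerate("check 2+2", StubEval))
```

Both methods share one signature, so metric code can stay identical whether it runs synchronously or inside an event loop.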

I/O Contract

Inputs:

  • model (str, required): Model name (e.g., "gpt-4o", "claude-3-sonnet", "gemini-2.0-flash")
  • provider (str, optional, default "openai"): LLM provider identifier. Supported: openai, anthropic, google, groq, mistral, cohere, xai, bedrock, deepseek, litellm, perplexity
  • client (Optional[Any] in the signature, but required in practice: passing None raises ValueError): Pre-initialized provider client (e.g., OpenAI(...), AsyncOpenAI(...), Anthropic(...))
  • adapter (str, optional, default "auto"): Structured output adapter. Options: "auto", "instructor", "litellm"
  • cache (Optional[CacheInterface], optional): Cache backend for caching LLM responses (e.g., DiskCacheBackend())
  • mode (Optional[instructor.Mode], optional, default Mode.JSON): Instructor mode for structured outputs. Options include Mode.MD_JSON, Mode.TOOLS, Mode.JSON_SCHEMA
  • **kwargs (Any, optional): Additional model arguments (temperature, max_tokens, top_p, etc.)
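The response_model contract is plain Pydantic: the adapter validates the model's reply against the schema, and a validation failure is the signal structured-output layers like Instructor use to retry or raise. A minimal illustration with no API call involved (the QAEval fields here are illustrative):

```python
from pydantic import BaseModel, ValidationError

class QAEval(BaseModel):
    is_correct: bool
    reason: str

# Well-formed JSON parses into a typed object.
parsed = QAEval.model_validate_json('{"is_correct": true, "reason": "4 is correct"}')

# Malformed output (wrong type, missing field) raises ValidationError.
parse_failed = False
try:
    QAEval.model_validate_json('{"is_correct": "maybe"}')
except ValidationError:
    parse_failed = True
```

This is why generate() and agenerate() return typed objects rather than raw strings: downstream metric code can rely on the schema.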

Outputs:

InstructorBaseRagasLLM instance: a unified LLM with generate() and agenerate() methods accepting Pydantic response models

Raises:

ValueError: raised when the client is None, the model name is empty, the provider is unsupported, or adapter initialization fails

Usage Examples

Basic OpenAI usage:

from openai import OpenAI
from pydantic import BaseModel
from ragas.llms import llm_factory

client = OpenAI(api_key="your-api-key")
llm = llm_factory("gpt-4o-mini", client=client)

class QAEval(BaseModel):
    is_correct: bool
    reason: str

result = llm.generate(
    "Is this answer correct? Q: What is 2+2? A: 4",
    response_model=QAEval,
)
print(result.is_correct)  # e.g. True
print(result.reason)      # e.g. "The answer 4 is mathematically correct."

Anthropic provider:

from anthropic import Anthropic
from ragas.llms import llm_factory

client = Anthropic(api_key="your-api-key")
llm = llm_factory("claude-3-sonnet", provider="anthropic", client=client)

Async usage with caching:

import asyncio

from openai import AsyncOpenAI
from pydantic import BaseModel
from ragas.llms import llm_factory
from ragas.cache import DiskCacheBackend

class EvalModel(BaseModel):
    # Any Pydantic response model works; this field is illustrative.
    verdict: str

async def main():
    client = AsyncOpenAI(api_key="your-api-key")
    cache = DiskCacheBackend()
    llm = llm_factory("gpt-4o-mini", client=client, cache=cache)
    # await is only valid inside an async function
    result = await llm.agenerate("Evaluate this response...", response_model=EvalModel)

asyncio.run(main())

Custom Instructor mode for non-standard backends:

import instructor
from openai import OpenAI
from ragas.llms import llm_factory

client = OpenAI(api_key="your-key", base_url="https://custom-backend")
llm = llm_factory(
    "custom-model",
    client=client,
    mode=instructor.Mode.MD_JSON,
)
