Implementation:Vibrantlabsai Ragas LiteLLMEmbeddings
| Knowledge Sources | |
|---|---|
| Domains | Embeddings, LiteLLM, LLM Evaluation, Multi-Provider |
| Last Updated | 2026-02-12 00:00 GMT |
Overview
LiteLLMEmbeddings provides a universal embedding interface through the LiteLLM library, supporting over 100 embedding models across OpenAI, Azure, Google, Cohere, and other providers.
Description
The LiteLLMEmbeddings class extends BaseRagasEmbedding and uses the LiteLLM library as a universal proxy for embedding API calls. This allows users to switch between embedding providers (OpenAI, Azure, Google, Cohere, and more) without changing their Ragas evaluation code.
Key features include:
- Universal provider support - Access any LiteLLM-supported embedding model through a unified interface using provider-prefixed model names (e.g., "openai/text-embedding-3-small", "cohere/embed-english-v3.0")
- Intelligent batching - Automatically determines optimal batch sizes via the get_optimal_batch_size utility, with manual override capability
- Configurable retry logic - Built-in retry support with configurable max_retries (default 3) and timeout (default 600 seconds)
- Provider-specific parameters - Supports api_key, api_base, and api_version for direct provider configuration, plus arbitrary litellm_params for advanced use cases
- Native async support - The sync methods call litellm.embedding and the async methods call litellm.aembedding directly
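The batching behaviour in the list above can be pictured with a small, library-free sketch. It is not the Ragas implementation: `chunked`, `embed_in_batches`, and `fake_embed_batch` are hypothetical names, and the fake provider call stands in for a real embedding request. The sketch assumes the class splits the input into consecutive slices of at most `batch_size` texts and concatenates the results in input order.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int,
) -> List[List[float]]:
    """Embed `texts` one batch at a time, preserving the input order."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for a provider call: one 1-d vector per text."""
    return [[float(len(text))] for text in batch]


result = embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
print(result)  # [[1.0], [2.0], [3.0]]
```

With `batch_size=2`, the three texts are sent as two requests (two texts, then one), and the caller still receives one vector per input text in the original order.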
The _prepare_kwargs helper method centralizes parameter assembly, merging provider configuration, timeout settings, retry logic, and user-supplied keyword arguments into a single call dictionary.
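One way to picture this kind of parameter assembly is the standalone sketch below. It is an illustration under stated assumptions, not the source of `_prepare_kwargs`: the function name `prepare_kwargs`, the key `num_retries` used to forward `max_retries`, and the precedence order (defaults, then provider settings, then `litellm_params`, then per-call keyword arguments) are all assumptions that should be checked against the actual file.

```python
from typing import Any, Dict, Optional


def prepare_kwargs(
    model: str,
    timeout: int = 600,
    max_retries: int = 3,
    api_key: Optional[str] = None,
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
    litellm_params: Optional[Dict[str, Any]] = None,
    **call_kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical stand-in for _prepare_kwargs; not the Ragas source."""
    # Assumption: retries are forwarded under the key "num_retries".
    merged: Dict[str, Any] = {
        "model": model,
        "timeout": timeout,
        "num_retries": max_retries,
    }
    # Provider settings are included only when they were actually supplied.
    provider_settings = {
        "api_key": api_key,
        "api_base": api_base,
        "api_version": api_version,
    }
    for key, value in provider_settings.items():
        if value is not None:
            merged[key] = value
    # Assumed precedence: later updates win over earlier values.
    merged.update(litellm_params or {})
    merged.update(call_kwargs)
    return merged


kwargs = prepare_kwargs(
    "azure/my-embedding-deployment",
    api_version="2024-02-01",
    litellm_params={"timeout": 120},
    input=["hello"],
)
print(kwargs)
```

In this sketch the unset `api_key` and `api_base` never reach the call dictionary, and the `timeout` from `litellm_params` replaces the 600-second default.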
Usage
Use this class when you need a single embedding interface that can work across multiple providers, or when you want to quickly swap between different embedding providers during evaluation experiments. It is ideal for teams that use multiple LLM providers and want consistent Ragas evaluation regardless of the underlying embedding service.
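The provider-swapping workflow can be sketched as a loop over model identifiers. Everything here is illustrative: `TextEmbedder`, `embedding_dims`, and `StubEmbedder` are hypothetical names, and the stub returns fixed-width zero vectors so the sketch runs offline without credentials. In a real experiment the factory would be something like `lambda name: LiteLLMEmbeddings(model=name)`.

```python
from typing import Callable, Dict, List, Protocol, Sequence


class TextEmbedder(Protocol):
    """Minimal interface the loop relies on (assumed to match embed_texts)."""

    def embed_texts(self, texts: Sequence[str]) -> List[List[float]]: ...


def embedding_dims(
    model_names: Sequence[str],
    make_embedder: Callable[[str], TextEmbedder],
    texts: Sequence[str],
) -> Dict[str, int]:
    """Embed the same texts with each model and record the vector width."""
    dims: Dict[str, int] = {}
    for name in model_names:
        vectors = make_embedder(name).embed_texts(texts)
        dims[name] = len(vectors[0])
    return dims


class StubEmbedder:
    """Offline stand-in for LiteLLMEmbeddings; widths are made up."""

    def __init__(self, model: str) -> None:
        self.model = model

    def embed_texts(self, texts: Sequence[str]) -> List[List[float]]:
        width = 3 if self.model.startswith("openai/") else 4
        return [[0.0] * width for _ in texts]


dims = embedding_dims(
    ["openai/text-embedding-3-small", "cohere/embed-english-v3.0"],
    StubEmbedder,
    ["first text", "second text"],
)
print(dims)
```

Because only the model string changes between iterations, the evaluation code around the loop stays the same for every provider.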
Code Reference
Source Location
- Repository: Vibrantlabsai_Ragas
- File: src/ragas/embeddings/litellm_provider.py
Signature
class LiteLLMEmbeddings(BaseRagasEmbedding):
    PROVIDER_NAME = "litellm"
    REQUIRES_MODEL = True
    def __init__(
        self,
        model: str,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        timeout: int = 600,
        max_retries: int = 3,
        batch_size: Optional[int] = None,
        cache: Optional[CacheInterface] = None,
        **litellm_params: Any,
    ): ...
Import
from ragas.embeddings.litellm_provider import LiteLLMEmbeddings
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | The LiteLLM model identifier, typically provider-prefixed (e.g., "openai/text-embedding-3-small") |
| api_key | Optional[str] | No | API key for the embedding provider |
| api_base | Optional[str] | No | Custom API base URL for self-hosted or proxy endpoints |
| api_version | Optional[str] | No | API version string, used by providers like Azure |
| timeout | int | No | Timeout in seconds for API calls; defaults to 600 |
| max_retries | int | No | Maximum number of retry attempts for failed API calls; defaults to 3 |
| batch_size | Optional[int] | No | Number of texts to process per batch; auto-determined if not specified |
| cache | Optional[CacheInterface] | No | Cache backend for storing and retrieving embedding results |
| **litellm_params | Any | No | Additional keyword arguments passed directly to LiteLLM embedding calls |
Outputs
embed_text / aembed_text
| Name | Type | Description |
|---|---|---|
| return | List[float] | A list of floats representing the embedding vector for a single text |
embed_texts / aembed_texts
| Name | Type | Description |
|---|---|---|
| return | List[List[float]] | A list of embedding vectors, one per input text |
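The output contract above (one vector per input text) can be checked with a small helper. This is a hypothetical utility, not part of Ragas; `check_embeddings` additionally assumes that all vectors from one model share the same dimension.

```python
from typing import List, Sequence


def check_embeddings(texts: Sequence[str], vectors: List[List[float]]) -> int:
    """Validate the embed_texts contract and return the shared dimension."""
    if len(vectors) != len(texts):
        raise ValueError(
            f"expected {len(texts)} vectors, got {len(vectors)}"
        )
    if not vectors:
        return 0
    dim = len(vectors[0])
    for index, vector in enumerate(vectors):
        if len(vector) != dim:
            raise ValueError(
                f"vector {index} has length {len(vector)}, expected {dim}"
            )
    return dim


dim = check_embeddings(["a", "b"], [[0.1, 0.2], [0.3, 0.4]])
print(dim)  # 2
```

Running such a check right after `embed_texts` or `aembed_texts` surfaces a truncated or mismatched response before it reaches downstream metric code.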
Usage Examples
Basic Usage with OpenAI
from ragas.embeddings.litellm_provider import LiteLLMEmbeddings
# Use OpenAI embeddings through LiteLLM
embeddings = LiteLLMEmbeddings(
    model="openai/text-embedding-3-small",
    api_key="sk-your-key-here",
)
vector = embeddings.embed_text("What is retrieval-augmented generation?")
print(len(vector))  # 1536
Azure OpenAI Usage
from ragas.embeddings.litellm_provider import LiteLLMEmbeddings
embeddings = LiteLLMEmbeddings(
    model="azure/my-embedding-deployment",
    api_key="your-azure-key",
    api_base="https://your-resource.openai.azure.com/",
    api_version="2024-02-01",
)
vectors = embeddings.embed_texts([
    "Document about machine learning.",
    "Document about natural language processing.",
])
Async Batch Embedding
import asyncio
from ragas.embeddings.litellm_provider import LiteLLMEmbeddings
embeddings = LiteLLMEmbeddings(
    model="openai/text-embedding-3-small",
    batch_size=50,
    timeout=300,
    max_retries=5,
)
async def embed_corpus(texts):
    return await embeddings.aembed_texts(texts)
texts = ["Text 1", "Text 2", "Text 3"]
result = asyncio.run(embed_corpus(texts))