Implementation:Vibrantlabsai Ragas OpenAIEmbeddings

Knowledge Sources	Vibrantlabsai_Ragas
Domains	Embeddings, OpenAI, LLM Evaluation
Last Updated	2026-02-12 00:00 GMT

Overview

OpenAIEmbeddings provides a direct OpenAI embedding implementation with automatic sync/async client detection, native batch optimization, and built-in usage analytics tracking.

Description

The OpenAIEmbeddings class extends BaseRagasEmbedding and wraps the official OpenAI Python client for embedding generation. It accepts either a synchronous (openai.OpenAI) or asynchronous (openai.AsyncOpenAI) client instance and automatically detects the client type at initialization via the _check_client_async method inherited from the base class.

Key features include:

Automatic client type detection - The is_async flag is set at initialization, routing calls to the appropriate sync or async code path
Native batch embedding - The embed_texts method leverages OpenAI's ability to accept multiple texts in a single API call, reducing HTTP overhead compared to per-text calls
Usage analytics tracking - Every embedding operation dispatches an EmbeddingUsageEvent via the Ragas analytics system, recording provider, model, request count, and async status
Sync/async bridging - When a sync method is called on an instance that holds an async client, it uses _run_async_in_current_loop to run the async call to completion; conversely, calling an async method on an instance that holds a sync client raises a TypeError with a clear message
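
The client detection and sync/async routing described above can be sketched roughly as follows. This is an illustration only, not the Ragas source: the stub clients, the looks_async helper, and the SketchEmbeddings class are hypothetical names, the coroutine check is just one plausible heuristic for what _check_client_async does, and asyncio.run stands in for _run_async_in_current_loop (it assumes no event loop is already running).

```python
import asyncio
import inspect
from typing import Any, List


class _SyncEmbeddingsAPI:
    """Hypothetical stand-in for the embeddings resource of a sync client."""

    def create(self, input: List[str], model: str) -> List[List[float]]:
        # Fake "embedding": one number per text, its character count.
        return [[float(len(text))] for text in input]


class _AsyncEmbeddingsAPI:
    """Hypothetical stand-in for the embeddings resource of an async client."""

    async def create(self, input: List[str], model: str) -> List[List[float]]:
        return [[float(len(text))] for text in input]


class FakeSyncClient:
    def __init__(self) -> None:
        self.embeddings = _SyncEmbeddingsAPI()


class FakeAsyncClient:
    def __init__(self) -> None:
        self.embeddings = _AsyncEmbeddingsAPI()


def looks_async(client: Any) -> bool:
    """Assumed heuristic: the client is async if embeddings.create is a coroutine function."""
    return inspect.iscoroutinefunction(client.embeddings.create)


class SketchEmbeddings:
    """Illustrative only; mirrors the routing described in the text."""

    def __init__(self, client: Any, model: str = "text-embedding-3-small") -> None:
        self.client = client
        self.model = model
        self.is_async = looks_async(client)  # decided once, at initialization

    def embed_texts(self, texts: List[str]) -> List[List[float]]:
        if self.is_async:
            # Bridge: run the coroutine to completion from synchronous code.
            return asyncio.run(self.aembed_texts(texts))
        return self.client.embeddings.create(input=texts, model=self.model)

    async def aembed_texts(self, texts: List[str]) -> List[List[float]]:
        if not self.is_async:
            raise TypeError("Async methods require an async client; use embed_texts instead.")
        return await self.client.embeddings.create(input=texts, model=self.model)
```

With this sketch, SketchEmbeddings(FakeAsyncClient()).embed_texts([...]) still works from synchronous code, while awaiting aembed_texts on an instance built with FakeSyncClient raises TypeError.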

The default model is text-embedding-3-small, which provides a good balance between quality and cost for most evaluation scenarios.

Usage

Use this class when you want direct OpenAI embedding integration without the overhead of a universal proxy layer such as LiteLLM. It is the preferred choice when you use OpenAI embeddings exclusively and want embed_texts to send multiple texts in a single embeddings request for higher throughput. Pass your existing OpenAI client instance to avoid creating redundant connections.
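
To illustrate why sending several texts in one request reduces HTTP overhead, the following sketch counts requests against a fake client. CountingClient, embed_one_by_one, and embed_in_one_batch are hypothetical names used only for this comparison; no real OpenAI or Ragas call is made.

```python
from typing import List


class CountingClient:
    """Hypothetical stand-in for an embeddings client that counts requests."""

    def __init__(self) -> None:
        self.request_count = 0

    def create(self, input: List[str], model: str) -> List[List[float]]:
        # Each call represents one HTTP round trip, regardless of len(input).
        self.request_count += 1
        return [[float(len(text)), 0.0] for text in input]


def embed_one_by_one(client: CountingClient, texts: List[str], model: str) -> List[List[float]]:
    """Per-text calls: one request for every input text."""
    return [client.create(input=[text], model=model)[0] for text in texts]


def embed_in_one_batch(client: CountingClient, texts: List[str], model: str) -> List[List[float]]:
    """Batched call: all input texts travel in a single request."""
    return client.create(input=texts, model=model)


texts = ["Machine learning basics", "Deep learning", "NLP"]

per_text_client = CountingClient()
per_text_vectors = embed_one_by_one(per_text_client, texts, "text-embedding-3-small")

batch_client = CountingClient()
batch_vectors = embed_in_one_batch(batch_client, texts, "text-embedding-3-small")

print(per_text_client.request_count, batch_client.request_count)
```

Both paths return the same vectors in the same order; only the number of round trips differs.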

Code Reference

Source Location

Repository: Vibrantlabsai_Ragas
File: src/ragas/embeddings/openai_provider.py

Signature

class OpenAIEmbeddings(BaseRagasEmbedding):
    PROVIDER_NAME = "openai"
    REQUIRES_CLIENT = True
    DEFAULT_MODEL = "text-embedding-3-small"

    def __init__(
        self,
        client: Any,
        model: str = "text-embedding-3-small",
        cache: Optional[CacheInterface] = None,
    ): ...

Import

from ragas.embeddings.openai_provider import OpenAIEmbeddings

I/O Contract

Inputs

Name	Type	Required	Description
client	Any (OpenAI or AsyncOpenAI)	Yes	An instance of the OpenAI or AsyncOpenAI client from the openai library
model	str	No	The OpenAI embedding model name; defaults to "text-embedding-3-small"
cache	Optional[CacheInterface]	No	Cache backend for storing and retrieving embedding results

Outputs

embed_text / aembed_text

Name	Type	Description
return	List[float]	A list of floats representing the embedding vector for a single text

embed_texts / aembed_texts

Name	Type	Description
return	List[List[float]]	A list of embedding vectors, one per input text
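
Because the outputs are plain Python lists of floats, they can be compared without extra dependencies. The helper below is a minimal sketch of cosine similarity over two such vectors; cosine_similarity is a hypothetical name and is not part of OpenAIEmbeddings.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)
```

For example, two vectors returned by embed_texts for the same model can be passed directly as a and b.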

Usage Examples

Synchronous Client Usage

from openai import OpenAI
from ragas.embeddings.openai_provider import OpenAIEmbeddings

client = OpenAI(api_key="sk-your-key-here")
embeddings = OpenAIEmbeddings(client=client)

# Embed a single text
vector = embeddings.embed_text("What is retrieval-augmented generation?")
print(len(vector))  # 1536

# Embed multiple texts in a single embeddings request
vectors = embeddings.embed_texts([
    "Machine learning basics",
    "Deep learning with transformers",
    "Natural language processing",
])
print(len(vectors))  # 3

Asynchronous Client Usage

import asyncio
from openai import AsyncOpenAI
from ragas.embeddings.openai_provider import OpenAIEmbeddings

async_client = AsyncOpenAI(api_key="sk-your-key-here")
embeddings = OpenAIEmbeddings(client=async_client)

async def main():
    vector = await embeddings.aembed_text("Async embedding example")
    vectors = await embeddings.aembed_texts(["Text 1", "Text 2"])
    return vector, vectors

result = asyncio.run(main())

Using a Different Model

from openai import OpenAI
from ragas.embeddings.openai_provider import OpenAIEmbeddings

client = OpenAI()
embeddings = OpenAIEmbeddings(
    client=client,
    model="text-embedding-3-large",
)

vector = embeddings.embed_text("Higher dimension embedding")
print(len(vector))  # 3072

Related Pages

Environment:Vibrantlabsai_Ragas_Python_3_9_Core_Environment
