Implementation:Vibrantlabsai Ragas OpenAIEmbeddings

From Leeroopedia
Knowledge Sources
Domains: Embeddings, OpenAI, LLM Evaluation
Last Updated: 2026-02-12 00:00 GMT

Overview

OpenAIEmbeddings provides a direct OpenAI embedding implementation with automatic sync/async client detection, native batch optimization, and built-in usage analytics tracking.

Description

The OpenAIEmbeddings class extends BaseRagasEmbedding and wraps the official OpenAI Python client for embedding generation. It accepts either a synchronous (openai.OpenAI) or asynchronous (openai.AsyncOpenAI) client instance and automatically detects the client type at initialization via the _check_client_async method inherited from the base class.

Key features include:

  • Automatic client type detection - The is_async flag is set at initialization, routing calls to the appropriate sync or async code path
  • Native batch embedding - The embed_texts method leverages OpenAI's ability to accept multiple texts in a single API call, reducing HTTP overhead compared to per-text calls
  • Usage analytics tracking - Every embedding operation dispatches an EmbeddingUsageEvent via the Ragas analytics system, recording provider, model, request count, and async status
  • Graceful sync-from-async fallback - When a sync method is called with an async client, it uses _run_async_in_current_loop to bridge the gap; conversely, calling async methods with a sync client raises a TypeError with a clear message
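The detection and routing behavior listed above can be illustrated with a minimal, self-contained sketch. Everything here is an assumption made for illustration: the `Fake*` clients, `check_client_async`, and `SketchEmbedder` are hypothetical stand-ins, not the Ragas classes, and the detection rule (checking whether `embeddings.create` is a coroutine function) may differ from what `_check_client_async` actually does. The sync bridge uses plain `asyncio.run`, which only works when no event loop is already running; the real `_run_async_in_current_loop` helper is not reproduced here.

```python
import asyncio
import inspect
from typing import Any, List


class FakeSyncEmbeddings:
    """Hypothetical stand-in for a synchronous client's `embeddings` resource."""

    def create(self, input: List[str], model: str) -> List[List[float]]:
        return [[float(len(text))] for text in input]


class FakeAsyncEmbeddings:
    """Hypothetical stand-in for an asynchronous client's `embeddings` resource."""

    async def create(self, input: List[str], model: str) -> List[List[float]]:
        return [[float(len(text))] for text in input]


class FakeSyncClient:
    def __init__(self) -> None:
        self.embeddings = FakeSyncEmbeddings()


class FakeAsyncClient:
    def __init__(self) -> None:
        self.embeddings = FakeAsyncEmbeddings()


def check_client_async(client: Any) -> bool:
    """Assumed rule: the client is async if `embeddings.create` is a coroutine function."""
    return inspect.iscoroutinefunction(client.embeddings.create)


class SketchEmbedder:
    """Illustrative only; mirrors the routing described above, not the Ragas source."""

    def __init__(self, client: Any, model: str = "text-embedding-3-small") -> None:
        self.client = client
        self.model = model
        self.is_async = check_client_async(client)  # decided once, at initialization

    def embed_text(self, text: str) -> List[float]:
        if self.is_async:
            # Simplified bridge: run the coroutine to completion from sync code.
            # This fails if an event loop is already running in this thread.
            return asyncio.run(self.aembed_text(text))
        return self.client.embeddings.create(input=[text], model=self.model)[0]

    async def aembed_text(self, text: str) -> List[float]:
        if not self.is_async:
            raise TypeError("Cannot call aembed_text() with a synchronous client.")
        vectors = await self.client.embeddings.create(input=[text], model=self.model)
        return vectors[0]
```

With these stand-ins, `SketchEmbedder(FakeAsyncClient()).embed_text("abcd")` goes through the async path and still returns a plain list, while `aembed_text` on a sync-backed instance raises `TypeError`.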

The default model is text-embedding-3-small, which provides a good balance between quality and cost for most evaluation scenarios.

Usage

Use this class when you want direct OpenAI embedding integration without the overhead of a universal proxy layer like LiteLLM. It is the preferred choice when you use OpenAI embeddings exclusively and want embed_texts to send many texts in a single embeddings request rather than one request per text. Pass your existing OpenAI client instance to avoid creating redundant connections.
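To make the batching benefit concrete, the sketch below counts simulated requests for per-text calls versus one multi-input call. `CountingClient` and both helper functions are hypothetical and exist only for this illustration; no network call is made and the returned vectors are toy values, not real embeddings.

```python
from typing import List


class CountingClient:
    """Hypothetical stand-in that counts simulated HTTP requests."""

    def __init__(self) -> None:
        self.request_count = 0

    def create_embeddings(self, inputs: List[str]) -> List[List[float]]:
        self.request_count += 1  # one simulated request per call, regardless of batch size
        # Toy vectors: [character count, space count]
        return [[float(len(text)), float(text.count(" "))] for text in inputs]


def embed_one_by_one(client: CountingClient, texts: List[str]) -> List[List[float]]:
    """One request per text."""
    return [client.create_embeddings([text])[0] for text in texts]


def embed_batched(client: CountingClient, texts: List[str]) -> List[List[float]]:
    """All texts in a single request."""
    return client.create_embeddings(texts)


texts = [
    "Machine learning basics",
    "Deep learning with transformers",
    "Natural language processing",
]

per_text_client = CountingClient()
batched_client = CountingClient()

per_text_vectors = embed_one_by_one(per_text_client, texts)
batched_vectors = embed_batched(batched_client, texts)

print(per_text_client.request_count, batched_client.request_count)
```

Both paths return the same vectors in the same order; only the number of round trips differs, which is where the reduced HTTP overhead comes from.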

Code Reference

Source Location

Signature

class OpenAIEmbeddings(BaseRagasEmbedding):
    PROVIDER_NAME = "openai"
    REQUIRES_CLIENT = True
    DEFAULT_MODEL = "text-embedding-3-small"

    def __init__(
        self,
        client: Any,
        model: str = "text-embedding-3-small",
        cache: Optional[CacheInterface] = None,
    ): ...

Import

from ragas.embeddings.openai_provider import OpenAIEmbeddings

I/O Contract

Inputs

Name | Type | Required | Description
client | Any (OpenAI or AsyncOpenAI) | Yes | An instance of the OpenAI or AsyncOpenAI client from the openai library
model | str | No | The OpenAI embedding model name; defaults to "text-embedding-3-small"
cache | Optional[CacheInterface] | No | Cache backend for storing and retrieving embedding results
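The role of the cache parameter can be sketched as a lookup keyed on model and text that skips the embedding call on a hit. This is only an illustration of the general idea: `DictCache`, `make_key`, and `cached_embed` are hypothetical names, and the method set shown here is an assumption rather than the actual CacheInterface contract or the way Ragas builds cache keys.

```python
import hashlib
import json
from typing import Callable, Dict, List


class DictCache:
    """Hypothetical in-memory cache; not the Ragas CacheInterface."""

    def __init__(self) -> None:
        self._store: Dict[str, List[float]] = {}

    def has_key(self, key: str) -> bool:
        return key in self._store

    def get(self, key: str) -> List[float]:
        return self._store[key]

    def set(self, key: str, value: List[float]) -> None:
        self._store[key] = value


def make_key(model: str, text: str) -> str:
    """Build a stable key so the same (model, text) pair always maps to one entry."""
    payload = json.dumps({"model": model, "text": text}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_embed(
    text: str,
    model: str,
    cache: DictCache,
    embed_fn: Callable[[str], List[float]],
) -> List[float]:
    """Return the cached vector if present; otherwise compute and store it."""
    key = make_key(model, text)
    if cache.has_key(key):
        return cache.get(key)
    vector = embed_fn(text)
    cache.set(key, vector)
    return vector
```

Including the model name in the key matters because the same text yields different vectors under different models.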

Outputs

embed_text / aembed_text

Name | Type | Description
return | List[float] | A list of floats representing the embedding vector for a single text

embed_texts / aembed_texts

Name | Type | Description
return | List[List[float]] | A list of embedding vectors, one per input text

Usage Examples

Synchronous Client Usage

from openai import OpenAI
from ragas.embeddings.openai_provider import OpenAIEmbeddings

client = OpenAI(api_key="sk-your-key-here")
embeddings = OpenAIEmbeddings(client=client)

# Embed a single text
vector = embeddings.embed_text("What is retrieval-augmented generation?")
print(len(vector))  # 1536 for text-embedding-3-small

# Embed multiple texts in a single embeddings request
vectors = embeddings.embed_texts([
    "Machine learning basics",
    "Deep learning with transformers",
    "Natural language processing",
])
print(len(vectors))  # 3

Asynchronous Client Usage

import asyncio
from openai import AsyncOpenAI
from ragas.embeddings.openai_provider import OpenAIEmbeddings

async_client = AsyncOpenAI(api_key="sk-your-key-here")
embeddings = OpenAIEmbeddings(client=async_client)

async def main():
    vector = await embeddings.aembed_text("Async embedding example")
    vectors = await embeddings.aembed_texts(["Text 1", "Text 2"])
    return vector, vectors

result = asyncio.run(main())

Using a Different Model

from openai import OpenAI
from ragas.embeddings.openai_provider import OpenAIEmbeddings

client = OpenAI()
embeddings = OpenAIEmbeddings(
    client=client,
    model="text-embedding-3-large",
)

vector = embeddings.embed_text("Higher dimension embedding")
print(len(vector))  # 3072 for text-embedding-3-large
