
Implementation:Openai Openai python Embeddings Create

From Leeroopedia
Knowledge Sources
Domains NLP, Embeddings, Semantic_Search
Last Updated 2026-02-15 00:00 GMT

Overview

Concrete tool for generating text embedding vectors with configurable dimensions provided by the OpenAI Python SDK.

Description

The Embeddings resource provides a create() method that generates dense vector representations of text. It supports batch embedding (multiple texts in one call), configurable output dimensions (text-embedding-3-* models), and optional base64 encoding. When encoding_format is omitted and numpy is installed, the SDK requests base64 transfer for efficiency and transparently decodes the vectors back into float lists; passing encoding_format="base64" explicitly returns the raw base64 strings unchanged.
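For decoding base64 payloads manually (e.g., without numpy, or after an explicit encoding_format="base64" request), each vector arrives as a base64 string of little-endian float32 values. A minimal decoding sketch; the sample payload below is fabricated locally for illustration, not a real API response:

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded embedding into a list of float32 values."""
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # each float32 is 4 bytes, little-endian
    return list(struct.unpack(f"<{count}f", raw))

# Simulated payload: what the API would send for a 3-dimensional vector.
sample = base64.b64encode(struct.pack("<3f", 0.25, -0.5, 1.0)).decode()
print(decode_base64_embedding(sample))  # [0.25, -0.5, 1.0]
```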

Usage

Call client.embeddings.create() with text input and a model selection. Access vectors via response.data[i].embedding.

Code Reference

Source Location

  • Repository: openai-python
  • File: src/openai/resources/embeddings.py
  • Lines: L1-298

Signature

class Embeddings(SyncAPIResource):
    def create(
        self,
        *,
        input: Union[str, List[str], Iterable[int], Iterable[Iterable[int]]],
        model: Union[str, EmbeddingModel],
        dimensions: int | NotGiven = NOT_GIVEN,
        encoding_format: Literal["float", "base64"] | NotGiven = NOT_GIVEN,
        user: str | NotGiven = NOT_GIVEN,
    ) -> CreateEmbeddingResponse:
        """
        Creates an embedding vector representing the input text.

        Args:
            input: Text to embed (string, list of strings, or token arrays).
            model: Model ID (text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002).
            dimensions: Number of dimensions for the output embeddings (text-embedding-3 and later models only).
            encoding_format: "float" (default) or "base64".
            user: End-user identifier for abuse monitoring.
        """

Import

from openai import OpenAI
# Access via client.embeddings.create()

I/O Contract

Inputs

  • input (str | list[str] | Iterable[int] | Iterable[Iterable[int]], required): text or token input to embed
  • model (str | EmbeddingModel, required): text-embedding-3-small, text-embedding-3-large, or text-embedding-ada-002
  • dimensions (int, optional): output vector dimensions (text-embedding-3-* models only)
  • encoding_format (str, optional): "float" (default) or "base64"
  • user (str, optional): end-user identifier for abuse monitoring

Outputs

  • response.data (list[Embedding]): list of embedding objects, one per input
  • response.data[i].embedding (list[float]): the embedding vector
  • response.data[i].index (int): position matching input order
  • response.model (str): model used
  • response.usage (Usage): token usage (prompt_tokens, total_tokens)
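Because each item carries an index matching input order, results can be paired back with the originating texts even if the list is traversed out of order. A sketch using a stand-in for the response objects (EmbeddingStub is hypothetical, not an SDK type):

```python
from dataclasses import dataclass

@dataclass
class EmbeddingStub:
    """Stand-in for openai.types.Embedding: just index + vector."""
    index: int
    embedding: list[float]

texts = ["alpha", "beta"]
# Simulated response data, deliberately out of order here.
data = [EmbeddingStub(1, [0.2, 0.8]), EmbeddingStub(0, [0.9, 0.1])]

# item.index points back into the original input list.
pairs = {texts[item.index]: item.embedding for item in data}
print(pairs["alpha"])  # [0.9, 0.1]
```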

Usage Examples

Single Embedding

from openai import OpenAI

client = OpenAI()
response = client.embeddings.create(
    input="The food was delicious and the service was great.",
    model="text-embedding-3-small",
)
embedding = response.data[0].embedding
print(f"Vector dimension: {len(embedding)}")
print(f"Tokens used: {response.usage.total_tokens}")

Batch Embeddings

texts = [
    "Machine learning is a subset of AI.",
    "Deep learning uses neural networks.",
    "Natural language processing handles text.",
]

response = client.embeddings.create(
    input=texts,
    model="text-embedding-3-small",
)

for item in response.data:
    print(f"Text {item.index}: vector[0:3] = {item.embedding[:3]}")

Cosine Similarity

import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

response = client.embeddings.create(
    input=["cat", "dog", "car"],
    model="text-embedding-3-small",
)

vectors = [item.embedding for item in response.data]
print(f"cat-dog similarity: {cosine_similarity(vectors[0], vectors[1]):.3f}")
print(f"cat-car similarity: {cosine_similarity(vectors[0], vectors[2]):.3f}")
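OpenAI's documentation notes that its embedding vectors are normalized to length 1, so the denominator in the helper above is effectively 1 and a plain dot product yields the same similarity. A small offline check with synthetic unit vectors (no API call involved):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=3)
a, b = a / np.linalg.norm(a), b / np.linalg.norm(b)  # unit-normalize

# For unit-length vectors, dot product == cosine similarity.
print(abs(float(np.dot(a, b)) - cosine_similarity(a, b)) < 1e-9)  # True
```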

Dimension Reduction

# text-embedding-3-large default: 3072 dimensions
# Reduce to 256 for storage efficiency
response = client.embeddings.create(
    input="Some text",
    model="text-embedding-3-large",
    dimensions=256,
)
print(f"Vector length: {len(response.data[0].embedding)}")  # 256
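The dimensions parameter shortens the vector server-side; OpenAI describes an equivalent client-side approach for the text-embedding-3 models of truncating a full-length vector and L2-renormalizing it. An offline sketch with a synthetic stand-in vector:

```python
import numpy as np

def shorten(vec: list[float], dim: int) -> list[float]:
    """Truncate an embedding to `dim` entries and L2-renormalize."""
    v = np.asarray(vec[:dim], dtype=np.float64)
    return (v / np.linalg.norm(v)).tolist()

full = [0.6, 0.8, 0.0, 0.0]  # synthetic stand-in for a full embedding
short = shorten(full, 2)
print(len(short))                               # 2
print(round(float(np.linalg.norm(short)), 6))   # 1.0
```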

Related Pages

Implements Principle

Requires Environment
