Implementation:Togethercomputer Together python Embeddings Create

From Leeroopedia

Overview

Embeddings Create implements the Principle:Togethercomputer_Together_python_Embedding_Generation principle by providing the Embeddings.create() method for generating dense vector embeddings from text using the Together API.

API

Sync: Embeddings.create(*, input, model, **kwargs) -> EmbeddingResponse

Async: AsyncEmbeddings.create(*, input, model, **kwargs) -> EmbeddingResponse

Source

  • Sync implementation: src/together/resources/embeddings.py:L19-58
  • Async implementation: src/together/resources/embeddings.py:L65-104
  • Request type: src/together/types/embeddings.py:L11-15
  • Response types: src/together/types/embeddings.py:L18-35

Import

from together import Together

client = Together()
response = client.embeddings.create(
    input="Hello world",
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

Key Parameters

Parameter   Type               Default      Description
input       str | List[str]    (required)   A string or list of strings to embed
model       str                (required)   The name of the embedding model to use

Inputs and Outputs

Input Type: EmbeddingRequest

class EmbeddingRequest(BaseModel):
    # input text or list of input texts
    input: str | List[str]
    # model to query
    model: str

Output Type: EmbeddingResponse

class EmbeddingResponse(BaseModel):
    # job id
    id: str | None = None
    # query model
    model: str | None = None
    # object type (always "list")
    object: Literal["list"] | None = None
    # list of embedding choices
    data: List[EmbeddingChoicesData] | None = None

Embedding Choice: EmbeddingChoicesData

class EmbeddingChoicesData(BaseModel):
    # response index (position in input list)
    index: int
    # object type (ObjectType.Embedding)
    object: ObjectType
    # embedding vector as list of floats
    embedding: List[float] | None = None
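
Because data and embedding are both optional, and each item carries its own index, callers may want to validate a response and restore input order before using the vectors. The following is a minimal sketch under those assumptions: embeddings_in_input_order is a hypothetical helper, not part of the together package, and SimpleNamespace objects stand in for a real EmbeddingResponse.

```python
from types import SimpleNamespace
from typing import Any, List


def embeddings_in_input_order(response: Any) -> List[List[float]]:
    """Return embedding vectors sorted by their `index` field.

    Hypothetical helper (not part of the together SDK). It relies only on
    the `data`, `index`, and `embedding` attributes documented above.
    """
    if response.data is None:
        raise ValueError("response contains no embedding data")
    # Sort by index so that position i corresponds to input text i.
    ordered = sorted(response.data, key=lambda item: item.index)
    vectors = []
    for item in ordered:
        if item.embedding is None:
            raise ValueError(f"item {item.index} has no embedding vector")
        vectors.append(list(item.embedding))
    return vectors


# Stand-in object that mimics the documented response shape.
fake_response = SimpleNamespace(
    data=[
        SimpleNamespace(index=1, embedding=[0.3, 0.4]),
        SimpleNamespace(index=0, embedding=[0.1, 0.2]),
    ]
)
print(embeddings_in_input_order(fake_response))
```

With a real client, the object returned by client.embeddings.create(...) could be passed in place of fake_response.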

Internal Flow

The create() method follows this sequence:

  1. Constructs an EmbeddingRequest from the provided input and model parameters
  2. Serializes the request using .model_dump(exclude_none=True) to produce the API payload
  3. Creates an APIRequestor with the client configuration
  4. Sends a POST request to the embeddings endpoint via requestor.request() (sync) or requestor.arequest() (async)
  5. Deserializes the raw TogetherResponse into an EmbeddingResponse object
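
The sequence above can be illustrated with a self-contained sketch. This is not the library's code: SketchRequest, SketchResponse, create_embeddings, and fake_post are hypothetical stand-ins (plain dataclasses rather than pydantic models, and a fake transport rather than APIRequestor), chosen so the example runs without network access.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List, Optional, Union


@dataclass
class SketchRequest:
    # Stand-in for EmbeddingRequest (the real class is a pydantic model).
    input: Union[str, List[str]]
    model: str


@dataclass
class SketchItem:
    index: int
    embedding: List[float]


@dataclass
class SketchResponse:
    # Stand-in for EmbeddingResponse.
    model: Optional[str]
    data: List[SketchItem]


def create_embeddings(
    input: Union[str, List[str]],
    model: str,
    post: Callable[[str, Dict], Dict],
) -> SketchResponse:
    # Step 1: build the request object from the keyword arguments.
    request = SketchRequest(input=input, model=model)
    # Step 2: serialize, dropping None values (mirrors exclude_none=True).
    payload = {k: v for k, v in asdict(request).items() if v is not None}
    # Steps 3-4: hand the payload to a transport; `post` is a hypothetical
    # callable standing in for the requestor's POST call.
    raw = post("embeddings", payload)
    # Step 5: deserialize the raw dictionary into a typed response.
    items = [SketchItem(d["index"], d["embedding"]) for d in raw["data"]]
    return SketchResponse(model=raw.get("model"), data=items)


def fake_post(url: str, payload: Dict) -> Dict:
    # Fake transport: returns one two-dimensional vector per input text.
    texts = payload["input"]
    if isinstance(texts, str):
        texts = [texts]
    data = [
        {"index": i, "embedding": [float(len(t)), 0.0]}
        for i, t in enumerate(texts)
    ]
    return {"model": payload["model"], "data": data}


result = create_embeddings(["ab", "abc"], "example-model", fake_post)
print([item.embedding for item in result.data])
```

Passing the transport in as a callable keeps the request-building and response-parsing steps testable in isolation, which is the main point of the sketch.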

Usage Examples

Single Text Embedding

from together import Together

client = Together()

response = client.embeddings.create(
    input="What is retrieval-augmented generation?",
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Access the embedding vector
embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")

Batch Embedding

from together import Together

client = Together()

texts = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with many layers.",
    "Natural language processing deals with text and speech.",
]

response = client.embeddings.create(
    input=texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Access all embeddings (ordered by index)
for item in response.data:
    print(f"Index {item.index}: dimension={len(item.embedding)}")

Cosine Similarity Computation

from together import Together
import numpy as np

client = Together()

query = "How does RAG work?"
documents = [
    "RAG retrieves relevant documents and uses them as context for generation.",
    "The weather in Paris is typically mild in spring.",
    "Retrieval-augmented generation improves LLM accuracy with external knowledge.",
]

# Embed query and documents together
all_texts = [query] + documents
response = client.embeddings.create(
    input=all_texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Extract vectors
vectors = [np.array(item.embedding) for item in response.data]
query_vec = vectors[0]
doc_vecs = vectors[1:]

# Compute cosine similarities
for i, doc_vec in enumerate(doc_vecs):
    similarity = np.dot(query_vec, doc_vec) / (
        np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)
    )
    print(f"Document {i}: similarity={similarity:.4f}")
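
For retrieval it is often more useful to rank documents than to print raw scores. The sketch below assumes the vectors have already been extracted as plain lists of floats. cosine_similarity and rank_documents are hypothetical helper names, written in pure Python so the example has no dependencies, and returning 0.0 for a zero-length vector is a design choice of this sketch rather than documented library behavior.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption of this sketch: treat zero vectors as dissimilar
        # instead of dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scores = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(doc_vecs)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy two-dimensional vectors standing in for real embeddings.
ranking = rank_documents([1.0, 0.0], [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]])
print([index for index, _ in ranking])
```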

Async Embedding

import asyncio
from together import AsyncTogether

async def embed_texts():
    client = AsyncTogether()

    response = await client.embeddings.create(
        input=["async embedding example"],
        model="togethercomputer/m2-bert-80M-8k-retrieval",
    )

    print(f"Embedding dimension: {len(response.data[0].embedding)}")

asyncio.run(embed_texts())
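
One reason to use the async client is to embed several batches concurrently. The following sketch shows the pattern without touching the network: chunk, embed_in_batches, and fake_embed are hypothetical names, and the batch size of 2 is arbitrary. In real use, embed_fn could wrap a call such as await client.embeddings.create(input=batch, model=...) and return the vectors from response.data.

```python
import asyncio
from typing import Awaitable, Callable, List

# Type of the embedding coroutine supplied by the caller (an assumption
# of this sketch, not an SDK type).
EmbedFn = Callable[[List[str]], Awaitable[List[List[float]]]]


def chunk(texts: List[str], size: int) -> List[List[str]]:
    """Split texts into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [texts[i:i + size] for i in range(0, len(texts), size)]


async def embed_in_batches(
    texts: List[str],
    embed_fn: EmbedFn,
    batch_size: int = 2,
) -> List[List[float]]:
    batches = chunk(texts, batch_size)
    # asyncio.gather returns results in the order the awaitables were
    # passed, so the flattened output lines up with `texts`.
    results = await asyncio.gather(*(embed_fn(batch) for batch in batches))
    return [vector for batch in results for vector in batch]


async def fake_embed(batch: List[str]) -> List[List[float]]:
    # Fake embedder: yields control once, then returns 1-D vectors.
    await asyncio.sleep(0)
    return [[float(len(text))] for text in batch]


vectors = asyncio.run(embed_in_batches(["a", "bb", "ccc"], fake_embed))
print(vectors)
```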

Metadata

Property         Value
Implementation   Embeddings Create
API              Embeddings.create() / AsyncEmbeddings.create()
Source           src/together/resources/embeddings.py:L19-58 (sync), L65-104 (async)
HTTP Method      POST
Endpoint         embeddings
Domain           NLP, Information_Retrieval, RAG
Workflow         Embeddings_And_Reranking
Principle        Principle:Togethercomputer_Together_python_Embedding_Generation

Knowledge Sources

2026-02-15 16:00 GMT
