Implementation:Togethercomputer Together python Embeddings Create

Overview

Embeddings Create implements the Principle:Togethercomputer_Together_python_Embedding_Generation principle through the Embeddings.create() method, which generates dense vector embeddings from text via the Together API.

API

Sync: Embeddings.create(*, input, model, **kwargs) -> EmbeddingResponse

Async: AsyncEmbeddings.create(*, input, model, **kwargs) -> EmbeddingResponse

Source

Sync implementation: src/together/resources/embeddings.py:L19-58
Async implementation: src/together/resources/embeddings.py:L65-104
Request type: src/together/types/embeddings.py:L11-15
Response types: src/together/types/embeddings.py:L18-35

Import

from together import Together

client = Together()
response = client.embeddings.create(
    input="Hello world",
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

Key Parameters

Parameter	Type	Default	Description
`input`	`str | List[str]`	(required)	A string or list of strings to embed
`model`	`str`	(required)	The name of the embedding model to use

Inputs and Outputs

Input Type: EmbeddingRequest

class EmbeddingRequest(BaseModel):
    # input text or list of input texts
    input: str | List[str]
    # model to query
    model: str

Output Type: EmbeddingResponse

class EmbeddingResponse(BaseModel):
    # job id
    id: str | None = None
    # query model
    model: str | None = None
    # object type (always "list")
    object: Literal["list"] | None = None
    # list of embedding choices
    data: List[EmbeddingChoicesData] | None = None

Embedding Choice: EmbeddingChoicesData

class EmbeddingChoicesData(BaseModel):
    # response index (position in input list)
    index: int
    # object type (ObjectType.Embedding)
    object: ObjectType
    # embedding vector as list of floats
    embedding: List[float] | None = None
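
Because both `data` and `embedding` are declared optional in the types above, code that consumes a response may want to check for None before indexing. The following is a minimal sketch of one way to do that; `extract_vectors` is a hypothetical helper, not part of the library, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute names of EmbeddingResponse and EmbeddingChoicesData.

```python
from types import SimpleNamespace
from typing import Any, List


def extract_vectors(response: Any) -> List[List[float]]:
    """Return embedding vectors ordered by their `index` field."""
    # `data` is Optional in the response type, so fail loudly if it is missing.
    if response.data is None:
        raise ValueError("response contains no embedding data")
    vectors: List[List[float]] = []
    for item in sorted(response.data, key=lambda d: d.index):
        # `embedding` is Optional as well.
        if item.embedding is None:
            raise ValueError(f"item {item.index} has no embedding")
        vectors.append(list(item.embedding))
    return vectors


# Stand-in object (not a real API response), deliberately out of order.
fake_response = SimpleNamespace(
    data=[
        SimpleNamespace(index=1, embedding=[0.3, 0.4]),
        SimpleNamespace(index=0, embedding=[0.1, 0.2]),
    ]
)
print(extract_vectors(fake_response))  # [[0.1, 0.2], [0.3, 0.4]]
```

Sorting by `index` is a defensive choice here; it costs little and keeps the vectors aligned with the input list regardless of the order in which items arrive.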

Internal Flow

The create() method follows this sequence:

1. Constructs an EmbeddingRequest from the provided input and model parameters
2. Serializes the request using .model_dump(exclude_none=True) to produce the API payload
3. Creates an APIRequestor with the client configuration
4. Sends a POST request to the embeddings endpoint via requestor.request() (sync) or requestor.arequest() (async)
5. Deserializes the raw TogetherResponse into an EmbeddingResponse object
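
The sequence above can be illustrated with a self-contained sketch. This is not the library's code: `SketchRequest`, `SketchResponse`, `sketch_create`, and `fake_post` are hypothetical names, dataclasses stand in for the pydantic models, the `note` field exists only to show how None values are dropped, and an injected `post` callable stands in for the APIRequestor so that no network call is made.

```python
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict, List, Optional, Union


@dataclass
class SketchRequest:
    input: Union[str, List[str]]
    model: str
    note: Optional[str] = None  # hypothetical field, only to show None-dropping


@dataclass
class SketchResponse:
    model: Optional[str]
    data: List[Dict[str, Any]]


def sketch_create(
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    *,
    input: Union[str, List[str]],
    model: str,
) -> SketchResponse:
    # Step 1: build the request object from the keyword arguments.
    request = SketchRequest(input=input, model=model)
    # Step 2: serialize, dropping None fields (the role of exclude_none=True).
    payload = {k: v for k, v in asdict(request).items() if v is not None}
    # Steps 3-4: POST the payload to the "embeddings" endpoint via the transport.
    raw = post("embeddings", payload)
    # Step 5: turn the raw dict into a typed response object.
    return SketchResponse(model=raw.get("model"), data=raw.get("data", []))


def fake_post(url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in transport: 'embeds' each text as a one-element vector of its length."""
    texts = payload["input"] if isinstance(payload["input"], list) else [payload["input"]]
    return {
        "model": payload["model"],
        "data": [{"index": i, "embedding": [float(len(t))]} for i, t in enumerate(texts)],
    }


result = sketch_create(fake_post, input=["ab", "cde"], model="demo-model")
print(result.data)
```

Injecting the transport keeps the sketch testable offline; the real method instead builds its requestor from the client configuration, as described in the steps.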

Usage Examples

Single Text Embedding

from together import Together

client = Together()

response = client.embeddings.create(
    input="What is retrieval-augmented generation?",
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Access the embedding vector
embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")

Batch Embedding

from together import Together

client = Together()

texts = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with many layers.",
    "Natural language processing deals with text and speech.",
]

response = client.embeddings.create(
    input=texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Access all embeddings (ordered by index)
for item in response.data:
    print(f"Index {item.index}: dimension={len(item.embedding)}")
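
When the list of texts is long, one option is to split it into smaller requests and concatenate the results. The sketch below assumes nothing about actual API limits: `embed_in_batches` is a hypothetical helper, the default `batch_size` of 32 is an arbitrary illustrative value, and `embed_fn` is any callable that maps a list of texts to a list of vectors (for example, a small wrapper around `client.embeddings.create`).

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    embed_fn: Callable[[List[str]], List[List[float]]],
    texts: Sequence[str],
    batch_size: int = 32,  # assumed value, not a documented limit
) -> List[List[float]]:
    """Embed `texts` chunk by chunk, preserving the input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        chunk = list(texts[start:start + batch_size])
        vectors.extend(embed_fn(chunk))
    return vectors


# Stand-in embedder so the example runs offline: one number per text.
calls: List[int] = []


def fake_embed(chunk: List[str]) -> List[List[float]]:
    calls.append(len(chunk))
    return [[float(len(text))] for text in chunk]


vectors = embed_in_batches(fake_embed, ["a", "bb", "ccc", "dddd", "eeeee"], batch_size=2)
print(calls)    # [2, 2, 1]
print(vectors)  # [[1.0], [2.0], [3.0], [4.0], [5.0]]
```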

Cosine Similarity Computation

from together import Together
import numpy as np

client = Together()

query = "How does RAG work?"
documents = [
    "RAG retrieves relevant documents and uses them as context for generation.",
    "The weather in Paris is typically mild in spring.",
    "Retrieval-augmented generation improves LLM accuracy with external knowledge.",
]

# Embed query and documents together
all_texts = [query] + documents
response = client.embeddings.create(
    input=all_texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Extract vectors
vectors = [np.array(item.embedding) for item in response.data]
query_vec = vectors[0]
doc_vecs = vectors[1:]

# Compute cosine similarities
for i, doc_vec in enumerate(doc_vecs):
    similarity = np.dot(query_vec, doc_vec) / (
        np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)
    )
    print(f"Document {i}: similarity={similarity:.4f}")
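
The same similarity computation can be turned into a ranking step, which is how it is typically used in retrieval. The following is a dependency-free sketch that uses toy vectors instead of real embeddings; `cosine_similarity` and `rank_documents` are hypothetical helper names, and returning 0.0 for a zero-length vector is a convention chosen here to avoid division by zero.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-D vectors standing in for real embeddings.
ranking = rank_documents([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print([index for index, _ in ranking])  # [1, 2, 0]
```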

Async Embedding

import asyncio
from together import AsyncTogether

async def embed_texts():
    client = AsyncTogether()

    response = await client.embeddings.create(
        input=["async embedding example"],
        model="togethercomputer/m2-bert-80M-8k-retrieval",
    )

    print(f"Embedding dimension: {len(response.data[0].embedding)}")

asyncio.run(embed_texts())
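
The async client is most useful when several requests run concurrently. The sketch below shows one common pattern, `asyncio.gather` with a semaphore to cap the number of in-flight calls. `embed_concurrently` is a hypothetical helper, `max_concurrency=4` is an arbitrary illustrative value, and `embed_one` is any coroutine function that maps one text to a vector (for example, a wrapper around `AsyncEmbeddings.create`); a stand-in is used here so the example runs offline.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def embed_concurrently(
    embed_one: Callable[[str], Awaitable[List[float]]],
    texts: Sequence[str],
    max_concurrency: int = 4,  # assumed value, tune for your workload
) -> List[List[float]]:
    """Embed each text with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(text: str) -> List[float]:
        async with semaphore:
            return await embed_one(text)

    # gather returns results in the order of its arguments, not completion order.
    return await asyncio.gather(*(guarded(text) for text in texts))


async def fake_embed(text: str) -> List[float]:
    """Stand-in for a real async embedding call."""
    await asyncio.sleep(0)
    return [float(len(text))]


vectors = asyncio.run(embed_concurrently(fake_embed, ["a", "bcd", "ef"]))
print(vectors)  # [[1.0], [3.0], [2.0]]
```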

Metadata

Property	Value
Implementation	Embeddings Create
API	`Embeddings.create()` / `AsyncEmbeddings.create()`
Source	`src/together/resources/embeddings.py:L19-58` (sync), `L65-104` (async)
HTTP Method	POST
Endpoint	`embeddings`
Domain	NLP, Information_Retrieval, RAG
Workflow	Embeddings_And_Reranking
Principle	Principle:Togethercomputer_Together_python_Embedding_Generation

Knowledge Sources

2026-02-15 16:00 GMT
