Implementation:Togethercomputer Together python Embeddings Create
Overview
Embeddings Create implements the Principle:Togethercomputer_Together_python_Embedding_Generation principle by providing the Embeddings.create() method for generating dense vector embeddings from text using the Together API.
API
Sync: Embeddings.create(*, input, model, **kwargs) -> EmbeddingResponse
Async: AsyncEmbeddings.create(*, input, model, **kwargs) -> EmbeddingResponse
Source
- Sync implementation: src/together/resources/embeddings.py:L19-58
- Async implementation: src/together/resources/embeddings.py:L65-104
- Request type: src/together/types/embeddings.py:L11-15
- Response types: src/together/types/embeddings.py:L18-35
Import
from together import Together

client = Together()
response = client.embeddings.create(
    input="Hello world",
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| input | str or List[str] | (required) | A string or list of strings to embed |
| model | str | (required) | The name of the embedding model to use |
Inputs and Outputs
Input Type: EmbeddingRequest
class EmbeddingRequest(BaseModel):
    # input text or list of input texts
    input: str | List[str]
    # model to query
    model: str
Output Type: EmbeddingResponse
class EmbeddingResponse(BaseModel):
    # job id
    id: str | None = None
    # query model
    model: str | None = None
    # object type (always "list")
    object: Literal["list"] | None = None
    # list of embedding choices
    data: List[EmbeddingChoicesData] | None = None
Embedding Choice: EmbeddingChoicesData
class EmbeddingChoicesData(BaseModel):
    # response index (position in input list)
    index: int
    # object type (ObjectType.Embedding)
    object: ObjectType
    # embedding vector as list of floats
    embedding: List[float] | None = None
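Because `data` and `embedding` are both declared optional in the types above, calling code may want to check for missing values before indexing into the response. The following is a minimal sketch of such a check. `FakeChoice`, `FakeResponse`, and `extract_vectors` are hypothetical names introduced only for illustration; the stand-in dataclasses mirror just the documented fields so the example runs without a network call.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical stand-ins that mirror only the fields documented above;
# the real classes are pydantic models in src/together/types/embeddings.py.
@dataclass
class FakeChoice:
    index: int
    embedding: Optional[List[float]] = None


@dataclass
class FakeResponse:
    data: Optional[List[FakeChoice]] = None


def extract_vectors(response) -> List[List[float]]:
    """Return embeddings ordered by `index`, failing loudly on missing data."""
    if response.data is None:
        raise ValueError("response contains no data")
    vectors = []
    for choice in sorted(response.data, key=lambda c: c.index):
        if choice.embedding is None:
            raise ValueError(f"no embedding for input {choice.index}")
        vectors.append(choice.embedding)
    return vectors


# Items deliberately out of order to show the sort by index.
demo = FakeResponse(data=[FakeChoice(1, [0.3, 0.4]), FakeChoice(0, [0.1, 0.2])])
ordered = extract_vectors(demo)
```

The helper only reads the `data`, `index`, and `embedding` attributes, so under that assumption it should accept a real `EmbeddingResponse` as well.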
Internal Flow
The create() method follows this sequence:
- Constructs an EmbeddingRequest from the provided input and model parameters
- Serializes the request using .model_dump(exclude_none=True) to produce the API payload
- Creates an APIRequestor with the client configuration
- Sends a POST request to the embeddings endpoint via requestor.request() (sync) or requestor.arequest() (async)
- Deserializes the raw TogetherResponse into an EmbeddingResponse object
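The sequence above can be imitated offline to make the data flow concrete. This is a rough sketch under stated assumptions, not the library's code: `SketchRequest`, `dump_exclude_none`, `fake_post`, and `create_sketch` are hypothetical names, the `extra_option` field exists only to show how unset values are dropped, and the transport returns canned vectors instead of calling the API.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class SketchRequest:
    """Stand-in for EmbeddingRequest (the real one is a pydantic model)."""
    input: Union[str, List[str]]
    model: str
    extra_option: Optional[str] = None  # hypothetical field, only to show None-dropping


def dump_exclude_none(request: SketchRequest) -> Dict[str, Any]:
    """Rough analogue of .model_dump(exclude_none=True): drop unset fields."""
    return {key: value for key, value in asdict(request).items() if value is not None}


def fake_post(endpoint: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical transport: returns one canned 1-d vector per input text."""
    raw_input = payload["input"]
    texts = [raw_input] if isinstance(raw_input, str) else raw_input
    return {
        "model": payload["model"],
        "object": "list",
        "data": [
            {"index": i, "object": "embedding", "embedding": [float(len(text))]}
            for i, text in enumerate(texts)
        ],
    }


def create_sketch(*, input: Union[str, List[str]], model: str) -> Dict[str, Any]:
    request = SketchRequest(input=input, model=model)  # build the request object
    payload = dump_exclude_none(request)               # serialize, dropping None fields
    raw = fake_post("embeddings", payload)             # stands in for the POST request
    return raw  # the real client wraps this in an EmbeddingResponse


result = create_sketch(input=["ab", "cde"], model="example-model")
```

The sketch skips the requestor construction and the final deserialization step, since both depend on client internals that this page does not show.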
Usage Examples
Single Text Embedding
from together import Together

client = Together()
response = client.embeddings.create(
    input="What is retrieval-augmented generation?",
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Access the embedding vector
embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
Batch Embedding
from together import Together

client = Together()
texts = [
    "Machine learning is a subset of artificial intelligence.",
    "Deep learning uses neural networks with many layers.",
    "Natural language processing deals with text and speech.",
]
response = client.embeddings.create(
    input=texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Access all embeddings (ordered by index)
for item in response.data:
    print(f"Index {item.index}: dimension={len(item.embedding)}")
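When the input list is large, one option is to split it into smaller requests and concatenate the results. The sketch below shows that pattern. This page does not state any request size limit, so the batch size is purely an assumption; `chunked`, `embed_in_batches`, and `fake_embed_batch` are hypothetical helpers, and the fake embedder replaces the API call so the example runs offline.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 2,  # assumed value; no limit is documented on this page
) -> List[List[float]]:
    """Embed `texts` batch by batch, preserving the original order."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    """Hypothetical embedder: one 1-d vector per text, used instead of the API."""
    return [[float(len(text))] for text in batch]


all_vectors = embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
```

In real use, `embed_batch` could wrap `client.embeddings.create(input=batch, model=...)` and return the `embedding` of each item in `response.data`.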
Cosine Similarity Computation
from together import Together
import numpy as np

client = Together()
query = "How does RAG work?"
documents = [
    "RAG retrieves relevant documents and uses them as context for generation.",
    "The weather in Paris is typically mild in spring.",
    "Retrieval-augmented generation improves LLM accuracy with external knowledge.",
]

# Embed query and documents together
all_texts = [query] + documents
response = client.embeddings.create(
    input=all_texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)

# Extract vectors
vectors = [np.array(item.embedding) for item in response.data]
query_vec = vectors[0]
doc_vecs = vectors[1:]

# Compute cosine similarities
for i, doc_vec in enumerate(doc_vecs):
    similarity = np.dot(query_vec, doc_vec) / (
        np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)
    )
    print(f"Document {i}: similarity={similarity:.4f}")
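The same cosine computation can be written with the standard library alone and extended to rank documents by score, which is the usual next step in a retrieval pipeline. This is an illustrative sketch: `cosine_similarity` and `rank_documents` are hypothetical helpers, the toy 2-d vectors stand in for real embeddings, and returning 0.0 for a zero-length vector is a design choice made here, not something the source specifies.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for zero vectors, avoids division by zero
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors: document 1 points the same way as the query, document 0 is orthogonal.
ranked = rank_documents([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```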
Async Embedding
import asyncio
from together import AsyncTogether

async def embed_texts():
    client = AsyncTogether()
    response = await client.embeddings.create(
        input=["async embedding example"],
        model="togethercomputer/m2-bert-80M-8k-retrieval",
    )
    print(f"Embedding dimension: {len(response.data[0].embedding)}")

asyncio.run(embed_texts())
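The async client is most useful when several requests run concurrently. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that caps the number of in-flight requests. `embed_concurrently` and `fake_embed` are hypothetical names, the concurrency cap of 2 is an arbitrary assumption, and the stub coroutine replaces the API call so the example runs offline.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fake_embed(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for an awaited embeddings call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [[float(len(text))] for text in batch]


async def embed_concurrently(
    batches: List[List[str]],
    embed_batch: Callable[[List[str]], Awaitable[List[List[float]]]],
    max_concurrency: int = 2,  # assumed cap; tune to your own rate limits
) -> List[List[float]]:
    """Embed all batches concurrently and return vectors in input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run(batch: List[str]) -> List[List[float]]:
        async with semaphore:
            return await embed_batch(batch)

    # gather returns results in the order the awaitables were passed in.
    results = await asyncio.gather(*(run(batch) for batch in batches))
    return [vector for batch_result in results for vector in batch_result]


concurrent_vectors = asyncio.run(
    embed_concurrently([["a", "bb"], ["ccc"]], fake_embed)
)
```

With the real client, `embed_batch` could be a coroutine that awaits `client.embeddings.create(...)` on an `AsyncTogether` instance and returns the embedding of each item.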
Metadata
| Property | Value |
|---|---|
| Implementation | Embeddings Create |
| API | Embeddings.create() / AsyncEmbeddings.create() |
| Source | src/together/resources/embeddings.py:L19-58 (sync), L65-104 (async) |
| HTTP Method | POST |
| Endpoint | embeddings |
| Domain | NLP, Information_Retrieval, RAG |
| Workflow | Embeddings_And_Reranking |
| Principle | Principle:Togethercomputer_Together_python_Embedding_Generation |