Implementation:Togethercomputer Together python Rerank Create
Overview
Rerank Create implements the Principle:Togethercomputer_Together_python_Document_Reranking principle. It provides the Rerank.create() method, which reorders candidate documents by relevance to a query using a cross-encoder reranking model served through the Together API.
API
Sync: Rerank.create(*, model, query, documents, top_n=None, return_documents=False, rank_fields=None, **kwargs) -> RerankResponse
Async: AsyncRerank.create(*, model, query, documents, top_n=None, return_documents=False, rank_fields=None, **kwargs) -> RerankResponse
Source
- Sync implementation: src/together/resources/rerank.py:L19-70
- Async implementation: src/together/resources/rerank.py:L77-128
- Request type: src/together/types/rerank.py:L9-21
- Response types: src/together/types/rerank.py:L24-43
Import
from together import Together

client = Together()
response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query="What is machine learning?",
    documents=["ML is a subset of AI.", "The weather is nice today."],
)
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | (required) | The name of the reranking model to use |
| query | str | (required) | The query string to rank documents against |
| documents | List[str] or List[Dict[str, Any]] | (required) | List of documents to rerank (plain strings or structured dicts) |
| top_n | int or None | None | Number of top results to return (None returns all) |
| return_documents | bool | False | Whether to include document text in the response |
| rank_fields | List[str] or None | None | Fields to use for ranking when documents are dicts |
Inputs and Outputs
Input Type: RerankRequest
class RerankRequest(BaseModel):
    # model to query
    model: str
    # input query string
    query: str
    # list of documents (strings or dicts)
    documents: List[str] | List[Dict[str, Any]]
    # return top_n results
    top_n: int | None = None
    # boolean to return documents in response
    return_documents: bool = False
    # field selector for dict documents
    rank_fields: List[str] | None = None
Output Type: RerankResponse
class RerankResponse(BaseModel):
    # job id
    id: str | None = None
    # object type (always "rerank")
    object: Literal["rerank"] | None = None
    # query model
    model: str | None = None
    # list of reranked results (sorted by relevance_score descending)
    results: List[RerankChoicesData] | None = None
    # usage statistics
    usage: UsageData | None = None
Rerank Choice: RerankChoicesData
class RerankChoicesData(BaseModel):
    # original index of the document in the input list
    index: int
    # relevance score (higher is more relevant)
    relevance_score: float
    # document content (only present if return_documents=True)
    document: Dict[str, Any] | None = None
Usage Data: UsageData
class UsageData(BaseModel):
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
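Because `results` is optional and `document` is only populated when `return_documents=True`, callers often map each result back to the input list through `index`. The following is a minimal sketch of that lookup, not library code: `ResultSketch`, `pair_with_documents`, and the `min_score` cutoff are hypothetical names introduced for illustration, and the scores are made up.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence, Tuple


@dataclass
class ResultSketch:
    """Hypothetical stand-in with the two fields used from RerankChoicesData."""
    index: int
    relevance_score: float


def pair_with_documents(
    results: Optional[Sequence[Any]],
    documents: Sequence[Any],
    min_score: float = float("-inf"),
) -> List[Tuple[Any, float]]:
    """Return (document, score) pairs for results scoring at least min_score."""
    pairs = []
    for result in results or []:  # results may be None on the response
        if result.relevance_score >= min_score:
            pairs.append((documents[result.index], result.relevance_score))
    return pairs


docs = ["ML is a subset of AI.", "The weather is nice today."]
# Made-up scores, standing in for response.results
fake_results = [
    ResultSketch(index=0, relevance_score=0.9),
    ResultSketch(index=1, relevance_score=0.1),
]
print(pair_with_documents(fake_results, docs, min_score=0.5))
```

With a real response, the call would presumably be `pair_with_documents(response.results, documents)`.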
Internal Flow
The create() method follows this sequence:
- Constructs a RerankRequest from the provided parameters (model, query, documents, top_n, return_documents, rank_fields)
- Serializes the request using .model_dump(exclude_none=True) to produce the API payload
- Creates an APIRequestor with the client configuration
- Sends a POST request to the rerank endpoint via requestor.request() (sync) or requestor.arequest() (async)
- Deserializes the raw TogetherResponse into a RerankResponse object
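The first two steps of this sequence can be illustrated without the SDK. The sketch below is a stdlib-only stand-in, not the library's implementation: `RerankRequestSketch` and `to_payload` are hypothetical names, and the dict comprehension only imitates the effect of `model_dump(exclude_none=True)` on top-level fields.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class RerankRequestSketch:
    """Hypothetical stand-in mirroring the fields of RerankRequest."""
    model: str
    query: str
    documents: Union[List[str], List[Dict[str, Any]]]
    top_n: Optional[int] = None
    return_documents: bool = False
    rank_fields: Optional[List[str]] = None


def to_payload(request: RerankRequestSketch) -> Dict[str, Any]:
    """Drop top-level fields set to None, imitating exclude_none=True."""
    return {key: value for key, value in asdict(request).items() if value is not None}


payload = to_payload(
    RerankRequestSketch(
        model="Salesforce/Llama-Rank-V1",
        query="What is machine learning?",
        documents=["ML is a subset of AI.", "The weather is nice today."],
    )
)
print(sorted(payload))
```

Since `top_n` and `rank_fields` default to None, they are left out of the payload unless set, while `return_documents=False` is still sent because False is not None.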
Usage Examples
Basic Reranking with String Documents
from together import Together

client = Together()

query = "What is retrieval-augmented generation?"
documents = [
    "The weather forecast predicts rain tomorrow.",
    "RAG combines retrieval with generation for more accurate LLM responses.",
    "Python is a popular programming language.",
    "Retrieval-augmented generation uses external knowledge to improve LLM outputs.",
    "The stock market closed higher today.",
]

response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query=query,
    documents=documents,
    top_n=3,
)

for result in response.results:
    print(f"Index: {result.index}, Score: {result.relevance_score:.4f}")
    print(f"  Document: {documents[result.index]}")
Reranking with Structured Documents
from together import Together

client = Together()

query = "deep learning frameworks"
documents = [
    {"title": "PyTorch Guide", "body": "PyTorch is an open-source deep learning framework."},
    {"title": "Cooking Recipes", "body": "Learn to make delicious pasta at home."},
    {"title": "TensorFlow Tutorial", "body": "TensorFlow provides tools for building neural networks."},
]

response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query=query,
    documents=documents,
    rank_fields=["title", "body"],
    return_documents=True,
    top_n=2,
)

for result in response.results:
    print(f"Score: {result.relevance_score:.4f}")
    print(f"  Document: {result.document}")
Retrieve-Then-Rerank Pipeline
from together import Together
import numpy as np

client = Together()

# Step 1: Embed query and corpus
query = "How does attention work in transformers?"
corpus = [
    "Attention mechanisms allow models to focus on relevant parts of the input.",
    "Transformers use self-attention to process sequences in parallel.",
    "Convolutional neural networks use filters for feature extraction.",
    "The attention mechanism computes weighted sums of value vectors.",
    "Recurrent neural networks process sequences one step at a time.",
]

# Embed everything
all_texts = [query] + corpus
embed_response = client.embeddings.create(
    input=all_texts,
    model="togethercomputer/m2-bert-80M-8k-retrieval",
)
vectors = [np.array(item.embedding) for item in embed_response.data]
query_vec = vectors[0]
doc_vecs = vectors[1:]

# Step 2: Retrieve top candidates by cosine similarity
similarities = [
    np.dot(query_vec, dv) / (np.linalg.norm(query_vec) * np.linalg.norm(dv))
    for dv in doc_vecs
]
top_indices = np.argsort(similarities)[::-1][:3]
candidates = [corpus[i] for i in top_indices]

# Step 3: Rerank candidates
rerank_response = client.rerank.create(
    model="Salesforce/Llama-Rank-V1",
    query=query,
    documents=candidates,
    top_n=2,
)

for result in rerank_response.results:
    print(f"Score: {result.relevance_score:.4f} -> {candidates[result.index]}")
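Note that `result.index` in the pipeline above refers to a position in `candidates`, not in `corpus`. If the original corpus position is needed (for example, to look up metadata), it can be recovered through the retrieval indices. The helper below is a hypothetical sketch; `to_corpus_indices` is not a library function, and the index values are made up.

```python
from typing import List, Sequence


def to_corpus_indices(
    reranked_indices: Sequence[int],
    retrieved_indices: Sequence[int],
) -> List[int]:
    """Translate positions in the candidate list into positions in the corpus.

    retrieved_indices[i] is the corpus position of candidate i, as produced by
    the retrieval step (top_indices in the pipeline example).
    """
    return [int(retrieved_indices[i]) for i in reranked_indices]


# Made-up values: retrieval picked corpus items 1, 3, 0; the reranker then
# placed candidates 2 and 0 first.
print(to_corpus_indices([2, 0], [1, 3, 0]))  # [0, 1]
```

In the pipeline this would presumably be called as `to_corpus_indices([r.index for r in rerank_response.results], top_indices)`.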
Async Reranking
import asyncio
from together import AsyncTogether

async def rerank_documents():
    client = AsyncTogether()
    response = await client.rerank.create(
        model="Salesforce/Llama-Rank-V1",
        query="machine learning optimization",
        documents=[
            "Gradient descent is used to minimize loss functions.",
            "Tropical fish need warm water to survive.",
            "Adam optimizer adapts learning rates for each parameter.",
        ],
        top_n=2,
    )
    for result in response.results:
        print(f"Index: {result.index}, Score: {result.relevance_score:.4f}")

asyncio.run(rerank_documents())
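One reason to use the async client is to rerank the same documents against several queries concurrently. The sketch below shows one way this might look with `asyncio.gather`; `rerank_many` is a hypothetical helper, not part of the library, and it assumes only that `client.rerank.create` is awaitable and accepts the keyword arguments documented above.

```python
import asyncio
from typing import Any, List, Sequence


async def rerank_many(
    client: Any,
    model: str,
    queries: Sequence[str],
    documents: Sequence[Any],
    top_n: int = 2,
) -> List[Any]:
    """Issue one rerank request per query concurrently.

    asyncio.gather returns the responses in the same order as `queries`.
    """
    tasks = [
        client.rerank.create(
            model=model, query=query, documents=list(documents), top_n=top_n
        )
        for query in queries
    ]
    return await asyncio.gather(*tasks)


# Example call (requires network access and an API key):
# responses = asyncio.run(
#     rerank_many(AsyncTogether(), "Salesforce/Llama-Rank-V1", queries, documents)
# )
```

Concurrency here is unbounded; for long query lists, wrapping each call in an `asyncio.Semaphore` may be a sensible way to limit the number of requests in flight.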
Metadata
| Property | Value |
|---|---|
| Implementation | Rerank Create |
| API | Rerank.create() / AsyncRerank.create() |
| Source | src/together/resources/rerank.py:L19-70 (sync), L77-128 (async) |
| HTTP Method | POST |
| Endpoint | rerank |
| Domain | NLP, Information_Retrieval, RAG |
| Workflow | Embeddings_And_Reranking |
| Principle | Principle:Togethercomputer_Together_python_Document_Reranking |