Implementation:Cohere ai Cohere python V2Client Rerank
| Field | Value |
|---|---|
| Type | Implementation |
| Source | Cohere Python SDK |
| Domain | NLP Information Retrieval Reranking |
| Last Updated | 2026-02-15 |
| Implements | Principle:Cohere_ai_Cohere_python_Semantic_Reranking |
Overview
Concrete method for reranking documents by semantic relevance using Cohere's cross-encoder models.
Description
V2Client.rerank takes a query string and list of document strings, sends them to the Cohere rerank API, and returns a V2RerankResponse with documents ordered by relevance score. Supports top_n filtering and max_tokens_per_doc truncation. Long documents are automatically truncated to max_tokens_per_doc (default 4096).
Code Reference
src/cohere/v2/client.py Lines L492-569
Signature
def rerank(
self, *, model: str, query: str, documents: typing.Sequence[str],
top_n: typing.Optional[int] = OMIT, max_tokens_per_doc: typing.Optional[int] = OMIT,
priority: typing.Optional[int] = OMIT, request_options: typing.Optional[RequestOptions] = None,
) -> V2RerankResponse:
Import
from cohere import ClientV2 (access via client.rerank())
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | The rerank model to use |
| query | str | Yes | The query string to rank against |
| documents | Sequence[str] | Yes | List of candidate document strings |
| top_n | Optional[int] | No | Number of top results to return |
| max_tokens_per_doc | Optional[int] | No | Max tokens per document (default 4096) |
Outputs
V2RerankResponse with results list containing index, relevance_score (0-1).
Example
from cohere import ClientV2
client = ClientV2()
response = client.rerank(
model="rerank-v4.0-pro",
query="What is the capital of the United States?",
documents=["Washington D.C. is the capital.", "Paris is in France.", "London is in England."],
top_n=2,
)
for result in response.results:
print(f"Document {result.index}: score={result.relevance_score:.4f}")