| Property | Value |
| sources | litellm/rerank_api/main.py |
| domains | Reranking, Information Retrieval, Search |
| last_updated | 2026-02-15 16:00 GMT |
Overview
The Rerank API module provides a unified interface for reranking a list of documents by relevance to a query. It supports more than 12 providers, including Cohere, Together AI, Azure AI, Bedrock, and Jina AI.
Description
This module implements the reranking API through a single rerank/arerank function pair decorated with @client. The sync function contains all the core logic: it resolves the provider via litellm.get_llm_provider(), loads the provider-specific BaseRerankConfig, maps optional parameters, and dispatches to the appropriate handler. Most providers use base_llm_http_handler.rerank(), while Together AI and Bedrock have dedicated handler classes (TogetherAIRerank and BedrockRerankHandler). Provider-specific API key and base URL resolution is handled inline with fallback to environment variables. The module supports configurable parameters including top_n, rank_fields, return_documents, max_chunks_per_doc, and max_tokens_per_doc.
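The resolve, map, and dispatch flow described above can be sketched in a few lines. This is a simplified illustration rather than the module's actual code: `split_provider`, `dispatch_rerank`, `generic_http_rerank`, `together_ai_rerank`, and the `DEDICATED_HANDLERS` table are hypothetical stand-ins, and the real provider resolution in litellm.get_llm_provider() handles more cases than a prefix split.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Handler = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def generic_http_rerank(provider: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the shared HTTP handler path."""
    return {"handler": "base_http", "provider": provider, "payload": payload}


def together_ai_rerank(provider: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for a dedicated provider handler."""
    return {"handler": "together_ai", "provider": provider, "payload": payload}


# Providers with a dedicated handler; everything else uses the generic path.
DEDICATED_HANDLERS: Dict[str, Handler] = {"together_ai": together_ai_rerank}


def split_provider(model: str, custom_llm_provider: Optional[str] = None) -> Tuple[str, str]:
    """Return (provider, model_name); an explicit provider wins over the prefix."""
    if custom_llm_provider is not None:
        prefix = custom_llm_provider + "/"
        name = model[len(prefix):] if model.startswith(prefix) else model
        return custom_llm_provider, name
    provider, sep, name = model.partition("/")
    if not sep:
        raise ValueError(f"cannot infer provider from model {model!r}")
    return provider, name


def dispatch_rerank(
    model: str,
    query: str,
    documents: List[Any],
    custom_llm_provider: Optional[str] = None,
    **optional: Any,
) -> Dict[str, Any]:
    provider, name = split_provider(model, custom_llm_provider)
    payload: Dict[str, Any] = {"model": name, "query": query, "documents": documents}
    # Forward only the optional parameters the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    handler = DEDICATED_HANDLERS.get(provider, generic_http_rerank)
    return handler(provider, payload)
```

Keeping the dedicated handlers in a lookup table, with the generic handler as the default, means a new provider only needs an entry when the shared HTTP path does not fit.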
Usage
Import this module when you need to reorder a set of documents by their relevance to a search query. It is typically used in RAG (Retrieval-Augmented Generation) pipelines between the retrieval and generation stages.
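To show where reranking sits in such a pipeline, here is a minimal offline sketch. The `overlap_reranker` function is a hypothetical stand-in that scores by shared tokens so the example runs without network access; `build_context` and the `Reranker` type are likewise illustrative names, not part of the module.

```python
from typing import Callable, List, Sequence

# A reranker takes (query, documents, top_n) and returns the indices to keep,
# best first.
Reranker = Callable[[str, Sequence[str], int], List[int]]


def overlap_reranker(query: str, documents: Sequence[str], top_n: int) -> List[int]:
    """Offline stand-in: rank by the number of shared lowercase tokens."""
    query_tokens = set(query.lower().split())
    scores = [len(query_tokens & set(doc.lower().split())) for doc in documents]
    order = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return order[:top_n]


def build_context(
    query: str, candidates: Sequence[str], reranker: Reranker, top_n: int = 2
) -> str:
    """Retrieval has already produced `candidates`; keep only the best top_n
    and join them into the context handed to the generation step."""
    keep = reranker(query, candidates, top_n)
    return "\n".join(candidates[i] for i in keep)
```

In a real pipeline the `reranker` argument would wrap a call to the rerank function and return the index of each result in ranked order.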
Code Reference
Source Location
litellm/rerank_api/main.py
Signature
@client
def rerank(
    model: str,
    query: str,
    documents: List[Union[str, Dict[str, Any]]],
    custom_llm_provider: Optional[Literal[
        "cohere", "together_ai", "azure_ai", "infinity",
        "litellm_proxy", "hosted_vllm", "deepinfra",
        "fireworks_ai", "voyage",
    ]] = None,
    top_n: Optional[int] = None,
    rank_fields: Optional[List[str]] = None,
    return_documents: Optional[bool] = True,
    max_chunks_per_doc: Optional[int] = None,
    max_tokens_per_doc: Optional[int] = None,
    **kwargs,
) -> Union[RerankResponse, Coroutine[Any, Any, RerankResponse]]

@client
async def arerank(model, query, documents, ...) -> RerankResponse
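One common way to pair a synchronous core with an async entry point is to run the core in an executor. The sketch below illustrates that general pattern only and is not claimed to be how this module does it. `rerank_sketch` and `arerank_sketch` are hypothetical names, and the length-based score is a placeholder.

```python
import asyncio
import contextvars
from functools import partial
from typing import List, Optional, Sequence


def rerank_sketch(
    query: str, documents: Sequence[str], top_n: Optional[int] = None
) -> List[int]:
    """Hypothetical synchronous core. The score is a placeholder (document
    length); a real implementation would call a provider here."""
    order = sorted(range(len(documents)), key=lambda i: len(documents[i]), reverse=True)
    return order[:top_n] if top_n is not None else order


async def arerank_sketch(*args, **kwargs) -> List[int]:
    """Async wrapper: run the sync core in the default thread pool so the
    event loop is not blocked, preserving the caller's context variables."""
    loop = asyncio.get_running_loop()
    ctx = contextvars.copy_context()
    call = partial(ctx.run, rerank_sketch, *args, **kwargs)
    return await loop.run_in_executor(None, call)
```

Keeping all logic in the synchronous function means both entry points share one code path and cannot drift apart.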
Import
from litellm.rerank_api.main import rerank, arerank
I/O Contract
Inputs
| Parameter | Type | Required | Description |
| model | str | Yes | The reranking model identifier (e.g., "cohere/rerank-english-v3.0") |
| query | str | Yes | The search query to rank documents against |
| documents | List[Union[str, Dict]] | Yes | Documents to rerank (strings or dicts with text fields) |
| custom_llm_provider | Optional[str] | No | Provider name; auto-detected from model if not set |
| top_n | Optional[int] | No | Number of top results to return |
| rank_fields | Optional[List[str]] | No | Fields to use for ranking when documents are dicts |
| return_documents | Optional[bool] | No | Whether to include document text in results (default: True) |
| max_chunks_per_doc | Optional[int] | No | Maximum chunks per document |
| max_tokens_per_doc | Optional[int] | No | Maximum tokens per document |
Outputs
| Output | Type | Description |
| Response | RerankResponse | Contains ranked results with scores, indices, and optionally documents |
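Because each result refers to its document by index, callers usually need to map results back to the original list. The helper below is a small sketch of that step. It assumes each result can be read as a mapping with `index` and `relevance_score` keys (the field names used elsewhere on this page); if your version exposes them as attributes instead, adapt the accessors. `attach_documents` is a hypothetical name.

```python
from typing import Any, List, Mapping, Sequence, Tuple


def attach_documents(
    documents: Sequence[Any], results: Sequence[Mapping[str, Any]]
) -> List[Tuple[Any, float]]:
    """Pair each result with the original document it points to,
    ordered from most to least relevant."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]
```

This is mainly useful when return_documents is disabled and the response carries only indices and scores.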
Usage Examples
import litellm

response = litellm.rerank(
    model="cohere/rerank-english-v3.0",
    query="What is machine learning?",
    documents=[
        "Machine learning is a subset of artificial intelligence.",
        "The weather today is sunny.",
        "Deep learning uses neural networks.",
    ],
    top_n=2,
)
# Each result is read by key here, assuming dict-like result entries.
for result in response.results:
    print(f"Index: {result['index']}, Score: {result['relevance_score']}")

# Asynchronous variant using arerank.
import asyncio
import litellm

async def main():
    response = await litellm.arerank(
        model="together_ai/rerank-model",
        query="Python programming",
        documents=["Python is a language", "Java is a language", "Python for data science"],
    )
    print(response)

asyncio.run(main())
Related Pages