Implementation:BerriAI Litellm Rerank API

Property	Value
sources	`litellm/rerank_api/main.py`
domains	Reranking, Information Retrieval, Search
last_updated	2026-02-15 16:00 GMT

Overview

The Rerank API module provides a unified interface for reranking document lists by relevance to a query, supporting over 12 LLM providers including Cohere, Together AI, Azure AI, Bedrock, Jina AI, and others.

Description

This module implements the reranking API as a rerank/arerank function pair, each decorated with @client. The synchronous rerank function contains all the core logic: it resolves the provider via litellm.get_llm_provider(), loads the provider-specific BaseRerankConfig, maps optional parameters, and dispatches to the appropriate handler. Most providers go through base_llm_http_handler.rerank(), while Together AI and Bedrock have dedicated handler classes (TogetherAIRerank and BedrockRerankHandler). Provider-specific API keys and base URLs are resolved inline, falling back to environment variables when they are not passed explicitly. The module supports configurable parameters including top_n, rank_fields, return_documents, max_chunks_per_doc, and max_tokens_per_doc.
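The control flow described above can be pictured with a small standalone sketch. It does not reproduce the module's code: split_provider, resolve_api_key, and pick_handler are hypothetical helpers written for illustration, and only the handler names come from the description.

```python
import os
from typing import Optional, Tuple

# Providers that the description says have dedicated handler classes.
DEDICATED_HANDLERS = {
    "together_ai": "TogetherAIRerank",
    "bedrock": "BedrockRerankHandler",
}


def split_provider(model: str, custom_llm_provider: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical stand-in for provider resolution.

    'cohere/rerank-x' -> ('cohere', 'rerank-x')
    """
    provider, sep, name = model.partition("/")
    if custom_llm_provider is not None:
        # An explicit provider wins; strip a matching prefix if one is present.
        if sep and provider == custom_llm_provider:
            return custom_llm_provider, name
        return custom_llm_provider, model
    if not sep:
        raise ValueError(f"cannot infer provider from model {model!r}")
    return provider, name


def resolve_api_key(explicit: Optional[str], env_var: str) -> Optional[str]:
    """Prefer an explicitly passed key, then fall back to an environment variable."""
    return explicit or os.environ.get(env_var)


def pick_handler(provider: str) -> str:
    """Return the name of the handler that would serve this provider."""
    return DEDICATED_HANDLERS.get(provider, "base_llm_http_handler.rerank")


provider, model_name = split_provider("cohere/rerank-english-v3.0")
print(provider, model_name, pick_handler(provider))
```

The lookup table keeps the two special cases explicit and lets every other provider fall through to the shared HTTP handler, which mirrors the split the description draws.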

Usage

Import this module when you need to reorder a set of documents by their relevance to a search query. It is typically used in RAG (Retrieval-Augmented Generation) pipelines between the retrieval and generation stages.
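As a rough illustration of where reranking sits in such a pipeline, the sketch below takes retrieved candidates, reranks them, and joins the best ones into a context string for the generation stage. The overlap_rerank function is a hypothetical offline stand-in for a real rerank call, and build_context is likewise an illustrative name, not part of the library.

```python
from typing import Any, Callable, Dict, List

# A rerank function takes (query, documents, top_n) and returns scored results.
RerankFn = Callable[[str, List[str], int], List[Dict[str, Any]]]


def overlap_rerank(query: str, documents: List[str], top_n: int) -> List[Dict[str, Any]]:
    """Offline stand-in for a rerank call: scores by shared lowercase words."""
    query_words = set(query.lower().split())
    scored = [
        {"index": i, "relevance_score": float(len(query_words & set(doc.lower().split())))}
        for i, doc in enumerate(documents)
    ]
    # Highest score first; ties keep their retrieval order.
    scored.sort(key=lambda r: r["relevance_score"], reverse=True)
    return scored[:top_n]


def build_context(query: str, candidates: List[str], rerank_fn: RerankFn, top_n: int = 2) -> str:
    """Rerank retrieved candidates and join the best ones into a prompt context."""
    results = rerank_fn(query, candidates, top_n)
    return "\n".join(candidates[r["index"]] for r in results)


candidates = [
    "the weather is sunny",
    "machine learning is a subset of ai",
    "deep learning uses neural networks",
]
print(build_context("what is machine learning", candidates, overlap_rerank))
```

In a real pipeline the rerank_fn argument would wrap a call to rerank or arerank, so the retrieval and generation stages stay unaware of which provider does the scoring.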

Code Reference

Source Location

Property	Value
Repository	github.com/BerriAI/litellm
File	`litellm/rerank_api/main.py`
Lines	535
Module	`litellm.rerank_api.main`

Signature

@client
def rerank(
    model: str,
    query: str,
    documents: List[Union[str, Dict[str, Any]]],
    custom_llm_provider: Optional[Literal[
        "cohere", "together_ai", "azure_ai", "infinity",
        "litellm_proxy", "hosted_vllm", "deepinfra",
        "fireworks_ai", "voyage",
    ]] = None,
    top_n: Optional[int] = None,
    rank_fields: Optional[List[str]] = None,
    return_documents: Optional[bool] = True,
    max_chunks_per_doc: Optional[int] = None,
    max_tokens_per_doc: Optional[int] = None,
    **kwargs,
) -> Union[RerankResponse, Coroutine[Any, Any, RerankResponse]]

@client
async def arerank(model, query, documents, ...) -> RerankResponse

Import

from litellm.rerank_api.main import rerank, arerank

I/O Contract

Inputs

Parameter	Type	Required	Description
`model`	`str`	Yes	The reranking model identifier (e.g., "cohere/rerank-english-v3.0")
`query`	`str`	Yes	The search query to rank documents against
`documents`	`List[Union[str, Dict]]`	Yes	Documents to rerank (strings or dicts with text fields)
`custom_llm_provider`	`Optional[Literal[...]]`	No	Provider name (one of the values listed in the signature); auto-detected from model if not set
`top_n`	`Optional[int]`	No	Number of top results to return
`rank_fields`	`Optional[List[str]]`	No	Fields to use for ranking when documents are dicts
`return_documents`	`Optional[bool]`	No	Whether to include document text in results (default: True)
`max_chunks_per_doc`	`Optional[int]`	No	Maximum chunks per document
`max_tokens_per_doc`	`Optional[int]`	No	Maximum tokens per document
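When documents are dicts, it can help to check the inputs before making a call. The sketch below assembles keyword arguments using the parameter names from the table above; build_rerank_kwargs is a hypothetical helper, and its validation rules are assumptions for illustration rather than checks the library is known to perform.

```python
from typing import Any, Dict, List, Optional, Union

Document = Union[str, Dict[str, Any]]


def build_rerank_kwargs(
    model: str,
    query: str,
    documents: List[Document],
    top_n: Optional[int] = None,
    rank_fields: Optional[List[str]] = None,
    return_documents: Optional[bool] = True,
) -> Dict[str, Any]:
    """Validate inputs and drop optional parameters that were left unset."""
    if not documents:
        raise ValueError("documents must not be empty")
    if top_n is not None and top_n < 1:
        raise ValueError("top_n must be a positive integer")
    if rank_fields is not None:
        # Every dict document should contain the fields it will be ranked on.
        for position, doc in enumerate(documents):
            if isinstance(doc, dict):
                missing = [field for field in rank_fields if field not in doc]
                if missing:
                    raise ValueError(f"document {position} is missing fields {missing}")
    kwargs: Dict[str, Any] = {"model": model, "query": query, "documents": documents}
    optional = {
        "top_n": top_n,
        "rank_fields": rank_fields,
        "return_documents": return_documents,
    }
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs


kwargs = build_rerank_kwargs(
    model="cohere/rerank-english-v3.0",
    query="What is machine learning?",
    documents=[{"title": "ML", "text": "Machine learning is a subset of AI."}],
    rank_fields=["text"],
)
print(sorted(kwargs))
```

The resulting dictionary could then be passed on as litellm.rerank(**kwargs).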

Outputs

Output	Type	Description
Response	`RerankResponse`	Contains ranked results with scores, indices, and optionally documents
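Because each result refers to a document by its position in the input list, a common follow-up step is mapping indices back to the original texts. The sketch below assumes each result behaves like a mapping with index and relevance_score keys (adjust the access style if your version exposes attributes instead); ranked_documents and the sample scores are made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def ranked_documents(
    documents: List[str], results: List[Dict[str, Any]]
) -> List[Tuple[float, str]]:
    """Pair each score with the original document text, best match first."""
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(r["relevance_score"], documents[r["index"]]) for r in ordered]


documents = [
    "Machine learning is a subset of artificial intelligence.",
    "The weather today is sunny.",
    "Deep learning uses neural networks.",
]
# Illustrative values only; real scores come from the provider.
sample_results = [
    {"index": 2, "relevance_score": 0.41},
    {"index": 0, "relevance_score": 0.93},
]

for score, text in ranked_documents(documents, sample_results):
    print(f"{score:.2f}  {text}")
```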

Usage Examples

# Synchronous usage
import litellm

response = litellm.rerank(
    model="cohere/rerank-english-v3.0",
    query="What is machine learning?",
    documents=[
        "Machine learning is a subset of artificial intelligence.",
        "The weather today is sunny.",
        "Deep learning uses neural networks.",
    ],
    top_n=2,
)

# Each result is a dict carrying the original document index and its relevance score.
for result in response.results or []:
    print(f"Index: {result['index']}, Score: {result['relevance_score']}")

# Asynchronous usage
import asyncio
import litellm

async def main():
    response = await litellm.arerank(
        # Placeholder model name; substitute a real Together AI rerank model.
        model="together_ai/rerank-model",
        query="Python programming",
        documents=["Python is a language", "Java is a language", "Python for data science"],
    )
    print(response)

asyncio.run(main())

Related Pages

BerriAI_Litellm_Responses_API -- Responses API that may use reranking results as context
BerriAI_Litellm_Vector_Stores_API -- Vector store search that can be combined with reranking
