Implementation:BerriAI Litellm Rerank Types

Attribute	Value
Sources	litellm/types/rerank.py
Domains	Reranking, Search, Information Retrieval, Cohere API
Last Updated	2026-02-15 16:00 GMT

Overview

Pydantic models and TypedDicts defining the request and response schema for LiteLLM's reranking API, following the Cohere rerank API format.

Description

This module implements the type system for LiteLLM's reranking functionality. LiteLLM follows the Cohere rerank API format (see Cohere Rerank Reference) as the canonical interface, translating to other providers as needed. Key types include:

RerankRequest -- Pydantic model for rerank API requests (model, query, documents, top_n, rank_fields, etc.).
OptionalRerankParams -- TypedDict capturing optional rerank parameters, useful for parameter forwarding.
Response metadata -- RerankBilledUnits (search_units, total_tokens), RerankTokens (input/output tokens), RerankResponseMeta (api_version, billed_units, tokens).
Result types -- RerankResponseDocument (text content), RerankResponseResult (index, relevance_score, optional document).
RerankResponse -- Pydantic model for the full rerank response with results, metadata, and LiteLLM's private _hidden_params for cost tracking.

Usage

Import from this module when:

Making rerank API calls through litellm.rerank().
Processing rerank responses for downstream use (e.g., RAG pipelines).
Implementing custom reranking logic or provider transformations.
Working with the RAG query pipeline that includes a reranking step.

Code Reference

Source Location

litellm/types/rerank.py (77 lines)

Key Types

Type Name	Kind	Description
`RerankRequest`	BaseModel	Rerank API request with model, query, documents, and optional parameters
`OptionalRerankParams`	TypedDict	Optional rerank parameters for forwarding
`RerankBilledUnits`	TypedDict	Billing info: search_units, total_tokens
`RerankTokens`	TypedDict	Token usage: input_tokens, output_tokens
`RerankResponseMeta`	TypedDict	Response metadata: api_version, billed_units, tokens
`RerankResponseDocument`	TypedDict	Document content in rerank results (text field)
`RerankResponseResult`	TypedDict	Single result: index, relevance_score, optional document
`RerankResponse`	BaseModel	Full response with results list, metadata, and _hidden_params
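
To make the shape of the TypedDicts in the table concrete, here is a minimal standalone sketch built only from the field names listed above. These classes are stand-ins written for illustration and are not imported from litellm. The `Optional[...]` annotations, the `dict` type for `api_version`, and the helper base class `_ResultRequired` are assumptions, so check `litellm/types/rerank.py` for the actual definitions.

```python
from typing import Optional, TypedDict


# Stand-in definitions for illustration only; not the litellm source.
class RerankBilledUnits(TypedDict, total=False):
    search_units: Optional[int]
    total_tokens: Optional[int]


class RerankTokens(TypedDict, total=False):
    input_tokens: Optional[int]
    output_tokens: Optional[int]


class RerankResponseMeta(TypedDict, total=False):
    api_version: Optional[dict]  # assumed type
    billed_units: Optional[RerankBilledUnits]
    tokens: Optional[RerankTokens]


class RerankResponseDocument(TypedDict):
    text: str


class _ResultRequired(TypedDict):
    # Hypothetical base class holding the two required keys.
    index: int
    relevance_score: float


class RerankResponseResult(_ResultRequired, total=False):
    # `document` may be absent, e.g. when return_documents is not set.
    document: RerankResponseDocument


# TypedDicts are plain dicts at runtime, so literals work directly.
result: RerankResponseResult = {"index": 0, "relevance_score": 0.95}
meta: RerankResponseMeta = {"billed_units": {"search_units": 1}}
```

Because TypedDicts carry no runtime validation of their own, the annotations mainly serve static type checkers and readers of the code.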

Signature: RerankRequest

class RerankRequest(BaseModel):
    model: str
    query: str
    top_n: Optional[int] = None
    documents: List[Union[str, dict]]
    rank_fields: Optional[List[str]] = None
    return_documents: Optional[bool] = None
    max_chunks_per_doc: Optional[int] = None
    max_tokens_per_doc: Optional[int] = None

Signature: RerankResponse

class RerankResponse(BaseModel):
    id: Optional[str] = None
    results: Optional[List[RerankResponseResult]] = None
    meta: Optional[RerankResponseMeta] = None
    _hidden_params: dict = PrivateAttr(default_factory=dict)

Import

from litellm.types.rerank import (
    RerankRequest,
    RerankResponse,
    RerankResponseResult,
    RerankResponseDocument,
    RerankResponseMeta,
    RerankBilledUnits,
    RerankTokens,
    OptionalRerankParams,
)

I/O Contract

Inputs (RerankRequest)

Field	Type	Default	Description
`model`	`str`	(required)	Reranking model identifier (e.g., "cohere/rerank-english-v3.0")
`query`	`str`	(required)	The search query to rank documents against
`documents`	`List[Union[str, dict]]`	(required)	Documents to rerank (strings or dicts with text fields)
`top_n`	`Optional[int]`	None	Number of top results to return
`rank_fields`	`Optional[List[str]]`	None	Fields to use for ranking when documents are dicts
`return_documents`	`Optional[bool]`	None	Whether to return document text in results
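`max_chunks_per_doc`	`Optional[int]`	None	Maximum number of chunks a provider may split each document into (Cohere v1 parameter)
`max_tokens_per_doc`	`Optional[int]`	None	Maximum tokens per document; longer documents are truncated by providers that support this parameter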
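
Since every optional field defaults to None, code that forwards parameters to a provider typically drops the unset ones first. The sketch below shows one way to do that. `collect_optional_params` and `OPTIONAL_FIELDS` are hypothetical names introduced here, not LiteLLM functions; the field names come from the table above.

```python
from typing import Any, Dict

# Optional RerankRequest fields, taken from the inputs table.
OPTIONAL_FIELDS = (
    "top_n",
    "rank_fields",
    "return_documents",
    "max_chunks_per_doc",
    "max_tokens_per_doc",
)


def collect_optional_params(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: keep only known optional fields that were set."""
    return {
        key: value
        for key, value in kwargs.items()
        if key in OPTIONAL_FIELDS and value is not None
    }


# `rank_fields=None` is dropped as unset; `query` is not an optional field.
params = collect_optional_params(
    top_n=2, return_documents=True, rank_fields=None, query="ignored"
)
print(params)
```

The resulting dict has the same shape that an `OptionalRerankParams` value would carry when only `top_n` and `return_documents` are set.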

Outputs (RerankResponse)

Field	Type	Description
`id`	`Optional[str]`	Response identifier
`results`	`Optional[List[RerankResponseResult]]`	Ranked results with index and relevance_score
`meta`	`Optional[RerankResponseMeta]`	Metadata including api_version, billed_units, and token counts
`_hidden_params`	`dict`	Private LiteLLM metadata for cost tracking and logging
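
Because `meta` and its nested entries are all optional, downstream code should not assume any of them is present. The following sketch reads usage figures defensively from a plain dict shaped like `RerankResponseMeta`. `summarize_usage` is a hypothetical helper, and treating missing values as zero is a choice made for this example.

```python
from typing import Any, Dict, Optional


def summarize_usage(meta: Optional[Dict[str, Any]]) -> Dict[str, int]:
    """Hypothetical helper: flatten optional rerank metadata into counts."""
    meta = meta or {}
    billed = meta.get("billed_units") or {}
    tokens = meta.get("tokens") or {}
    return {
        # Missing or None values are reported as 0 (an assumption).
        "search_units": billed.get("search_units") or 0,
        "input_tokens": tokens.get("input_tokens") or 0,
        "output_tokens": tokens.get("output_tokens") or 0,
    }


usage = summarize_usage(
    {
        "billed_units": {"search_units": 1},
        "tokens": {"input_tokens": 50, "output_tokens": 0},
    }
)
print(usage)
```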

RerankResponseResult

Field	Type	Description
`index`	`int`	Index of the document in the original input list (required)
`relevance_score`	`float`	Relevance score of the document to the query (required)
`document`	`RerankResponseDocument`	The document text (optional, present if return_documents=True)
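
Since `index` points back into the original input list, results can be joined with the input documents even when `document` is absent. A minimal sketch, assuming results are plain dicts with the fields above; `attach_documents` is a hypothetical helper, not a LiteLLM function.

```python
from typing import Any, Dict, List, Tuple, Union

Document = Union[str, Dict[str, Any]]


def attach_documents(
    documents: List[Document], results: List[Dict[str, Any]]
) -> List[Tuple[float, Document]]:
    """Hypothetical helper: pair each score with its original document."""
    # Sort explicitly rather than relying on the order the provider returned.
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(r["relevance_score"], documents[r["index"]]) for r in ordered]


documents = [
    "Machine learning is a subset of AI.",
    "Python is a programming language.",
    "Deep learning uses neural networks.",
]
results = [
    {"index": 2, "relevance_score": 0.82},
    {"index": 0, "relevance_score": 0.95},
]

for score, doc in attach_documents(documents, results):
    print(f"{score:.2f}  {doc}")
```

In a RAG pipeline, the returned pairs can be passed on as the reranked context, highest score first.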

Usage Examples

Basic rerank request

from litellm.types.rerank import RerankRequest

request = RerankRequest(
    model="cohere/rerank-english-v3.0",
    query="What is machine learning?",
    documents=[
        "Machine learning is a subset of AI.",
        "Python is a programming language.",
        "Deep learning uses neural networks.",
    ],
    top_n=2,
    return_documents=True,
)

Processing a rerank response

from litellm.types.rerank import RerankResponse

response = RerankResponse(
    id="rerank-001",
    results=[
        {"index": 0, "relevance_score": 0.95, "document": {"text": "Machine learning is a subset of AI."}},
        {"index": 2, "relevance_score": 0.82, "document": {"text": "Deep learning uses neural networks."}},
    ],
    meta={
        "billed_units": {"search_units": 1},
        "tokens": {"input_tokens": 50, "output_tokens": 0},
    },
)

# Access results; `results` is Optional, so fall back to an empty list
for result in response.results or []:
    print(f"Index: {result['index']}, Score: {result['relevance_score']}")

Reranking with dict documents

from litellm.types.rerank import RerankRequest

request = RerankRequest(
    model="cohere/rerank-english-v3.0",
    query="climate change effects",
    documents=[
        {"title": "Global Warming", "text": "Rising temperatures affect ecosystems."},
        {"title": "Solar Energy", "text": "Solar panels convert sunlight to electricity."},
    ],
    rank_fields=["text"],
    top_n=1,
)
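
As a rough illustration of what `rank_fields` selects, the sketch below flattens a string or dict document into the text a ranker would score. This is not LiteLLM's or any provider's implementation. `document_to_text` is a hypothetical helper, and both the fallback to a `text` key and the newline join are assumptions made for the example.

```python
from typing import Any, Dict, List, Optional, Union


def document_to_text(
    doc: Union[str, Dict[str, Any]], rank_fields: Optional[List[str]] = None
) -> str:
    """Hypothetical helper: pick the fields of a document to rank on."""
    if isinstance(doc, str):
        return doc
    # Assumption: fall back to the "text" key when rank_fields is not given.
    fields = rank_fields or ["text"]
    return "\n".join(str(doc[name]) for name in fields if name in doc)


doc = {"title": "Global Warming", "text": "Rising temperatures affect ecosystems."}
print(document_to_text(doc, rank_fields=["text"]))
print(document_to_text(doc, rank_fields=["title", "text"]))
```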

Related Pages

RAG Types -- RAG query pipeline uses RerankConfig to integrate reranking after retrieval.
Embedding Types -- Embedding types used alongside reranking in retrieval pipelines.
Vector Store Types -- Vector store search results that feed into the reranking step.
