Implementation:BerriAI Litellm OCR API

Property	Value
sources	`litellm/ocr/main.py`
domains	OCR, Document Processing, Image Processing
last_updated	2026-02-15 16:00 GMT

Overview

The OCR API module provides a unified interface for optical character recognition, allowing extraction of text and structured content from documents and images through provider-specific OCR models.

Description

This module implements OCR through an ocr/aocr function pair decorated with @client. It accepts documents in the Mistral OCR format, which supports two document types: document_url for PDFs and other documents, and image_url for images. Base64-encoded document content is also supported for inline processing. For each call, the module validates the input document format, resolves the provider and model via litellm.get_llm_provider(), loads the provider's BaseOCRConfig, extracts and maps OCR-specific parameters (such as include_image_base64, pages, and image_limit) through that config, and delegates the HTTP call to BaseLLMHTTPHandler.ocr().
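
The input validation step can be pictured with a small standalone check. This is only a sketch of the document shape described above, not the module's actual validation code; the helper name `validate_ocr_document` and its error messages are hypothetical.

```python
from typing import Any, Dict

# Document types named in the description above.
SUPPORTED_TYPES = ("document_url", "image_url")


def validate_ocr_document(document: Dict[str, Any]) -> str:
    """Hypothetical helper: return the document's source string or raise ValueError."""
    if not isinstance(document, dict):
        raise ValueError("document must be a dict")
    doc_type = document.get("type")
    if doc_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported document type: {doc_type!r}")
    # In the Mistral format, the field holding the URL is named after the type.
    source = document.get(doc_type)
    if not isinstance(source, str) or not source:
        raise ValueError(f"missing or empty {doc_type!r} field")
    return source
```

Running a check like this before calling ocr surfaces malformed input locally instead of as a provider-side error.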

Usage

Import this module when you need to extract text content from PDFs, documents, or images through an OCR model. It follows the Mistral OCR API format and supports both sync and async operations.

Code Reference

Source Location

Property	Value
Repository	github.com/BerriAI/litellm
File	`litellm/ocr/main.py`
Lines	302
Module	`litellm.ocr.main`

Signature

@client
def ocr(
    model: str,
    document: Dict[str, str],
    api_key: Optional[str] = None,
    api_base: Optional[str] = None,
    timeout: Optional[Union[float, httpx.Timeout]] = None,
    custom_llm_provider: Optional[str] = None,
    extra_headers: Optional[Dict[str, Any]] = None,
    **kwargs,
) -> Union[OCRResponse, Coroutine[Any, Any, OCRResponse]]

@client
async def aocr(
    model: str,
    document: Dict[str, str],
    ...
) -> OCRResponse

Import

from litellm.ocr.main import ocr, aocr

I/O Contract

Inputs

Parameter	Type	Required	Description
`model`	`str`	Yes	The OCR model identifier (e.g., "mistral/mistral-ocr-latest")
`document`	`Dict[str, str]`	Yes	Document specification: a `type` of `document_url` or `image_url`, plus a field of the same name holding the URL
`api_key`	`Optional[str]`	No	API key for the OCR provider
`api_base`	`Optional[str]`	No	API base URL override
`timeout`	`Optional[Union[float, httpx.Timeout]]`	No	Request timeout
`custom_llm_provider`	`Optional[str]`	No	Provider override; auto-detected from `model` when omitted
`extra_headers`	`Optional[Dict[str, Any]]`	No	Additional HTTP headers
`include_image_base64`	via kwargs	No	Whether to include base64 image data in response
`pages`	via kwargs	No	Specific pages to process
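
For local files, the base64 support mentioned in the Description would typically mean placing a data URI in the URL field. The sketch below assumes the `data:<mime>;base64,<payload>` form used by the Mistral OCR format; the helper `build_inline_document` is hypothetical and not part of litellm.

```python
import base64
import mimetypes
from typing import Dict


def build_inline_document(data: bytes, filename: str) -> Dict[str, str]:
    """Hypothetical helper: wrap raw file bytes as a Mistral-format OCR document."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        raise ValueError(f"cannot infer MIME type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    # Assumption: inline content is passed as a data URI in the URL field.
    data_uri = f"data:{mime_type};base64,{encoded}"
    # Images use image_url; everything else is treated as a document.
    doc_type = "image_url" if mime_type.startswith("image/") else "document_url"
    return {"type": doc_type, doc_type: data_uri}
```

The returned dict can be passed as the `document` argument in place of a remote URL.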

Outputs

Output	Type	Description
Response	`OCRResponse`	Contains extracted pages with markdown content, model info, and usage data
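
Since the response carries one markdown string per page, a common follow-up is to merge them into a single document. The sketch below relies only on the `pages`, `index`, and `markdown` attributes that the usage examples read; `pages_to_markdown` is a hypothetical helper, and a stand-in object replaces a real OCRResponse so the snippet runs offline.

```python
from types import SimpleNamespace
from typing import Any


def pages_to_markdown(response: Any, separator: str = "\n\n") -> str:
    """Hypothetical helper: join page markdown in page-index order."""
    ordered = sorted(response.pages, key=lambda page: page.index)
    return separator.join(page.markdown for page in ordered)


# Stand-in object exposing the same attributes as the real response.
fake_response = SimpleNamespace(
    pages=[
        SimpleNamespace(index=1, markdown="Second page"),
        SimpleNamespace(index=0, markdown="# First page"),
    ]
)
combined = pages_to_markdown(fake_response)
```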

Usage Examples

import litellm

# OCR with a PDF document URL
response = litellm.ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    },
    include_image_base64=True,
)

for page in response.pages:
    print(f"Page {page.index}: {page.markdown[:100]}...")

# Async OCR with an image URL
import asyncio
import litellm

async def main():
    response = await litellm.aocr(
        model="mistral/mistral-ocr-latest",
        document={
            "type": "image_url",
            "image_url": "https://example.com/receipt.png"
        },
    )
    print(response)

asyncio.run(main())
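
The async variant makes it straightforward to process several documents concurrently. The following is a minimal sketch under stated assumptions: `ocr_many` and `fake_aocr` are hypothetical names, and `fake_aocr` stands in for litellm.aocr so the example runs without network access or credentials.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

OcrFn = Callable[..., Awaitable[Any]]


async def ocr_many(
    ocr_fn: OcrFn,
    model: str,
    documents: List[Dict[str, str]],
    max_concurrency: int = 4,
) -> List[Any]:
    """Hypothetical helper: run ocr_fn over documents, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(document: Dict[str, str]) -> Any:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await ocr_fn(model=model, document=document)

    return await asyncio.gather(*(run_one(doc) for doc in documents))


async def fake_aocr(model: str, document: Dict[str, str]) -> str:
    # Stand-in for litellm.aocr; simply echoes the document's URL.
    await asyncio.sleep(0)
    return document[document["type"]]
```

With the real library, `ocr_fn` would be `litellm.aocr`, called through `asyncio.run(ocr_many(litellm.aocr, model, documents))`. The concurrency cap is a design choice to stay within provider rate limits, and its default of 4 is arbitrary.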

Related Pages

BerriAI_Litellm_Responses_API -- Responses API that can use OCR output as context input
BerriAI_Litellm_Passthrough_API -- Passthrough API for direct provider OCR access
