Implementation:Mistralai Client python Ocr Process

Knowledge Sources	Mistral Client Python, Mistral AI OCR
Domains	Document_Processing, OCR, Vision
Last Updated	2026-02-15 14:00 GMT

Overview

A concrete tool for extracting text, tables, and images from documents through Mistral's OCR API, exposed by the client's Ocr resource.

Description

The Ocr.process() and Ocr.process_async() methods send a document to Mistral's OCR model and return an OCRResponse with per-page results. The document parameter accepts a DocumentURLChunk (URL) or FileChunk (uploaded file). Optional parameters control which pages to process, whether to include base64 images, image size limits, and table output format.

Usage

Call client.ocr.process() with a document URL or an uploaded file, then iterate over the response's pages list to read the extracted content.
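
As a minimal sketch of working with the pages list, the helper below joins the per-page markdown into a single string. It relies only on the page.index and page.markdown attributes listed under Outputs; the name pages_to_markdown and the separator are illustrative choices, not part of the SDK, and the demo pages are stand-ins so the snippet runs without an API call.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def pages_to_markdown(pages: Iterable[Any], separator: str = "\n\n---\n\n") -> str:
    """Join per-page markdown into one document, ordered by page index."""
    ordered = sorted(pages, key=lambda page: page.index)
    # Skip pages with empty markdown so the separator is never doubled.
    parts = [
        page.markdown.strip()
        for page in ordered
        if page.markdown and page.markdown.strip()
    ]
    return separator.join(parts)


# Stand-in pages for illustration; real pages come from response.pages.
demo_pages = [
    SimpleNamespace(index=1, markdown="Second page"),
    SimpleNamespace(index=0, markdown="# Title\n"),
]
combined = pages_to_markdown(demo_pages)
```

With a real response, the call would be pages_to_markdown(response.pages).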

Code Reference

Source Location

Repository: client-python
File: src/mistralai/client/ocr.py
Lines: L19-160 (sync), L162-303 (async)

Signature

class Ocr:
    def process(
        self,
        *,
        model: Nullable[str],
        document: Document,
        id: Optional[str] = None,
        pages: Optional[List[int]] = None,
        include_image_base64: Optional[bool] = None,
        image_limit: Optional[int] = None,
        image_min_size: Optional[int] = None,
        table_format: Optional[TableFormat] = None,
    ) -> OCRResponse:
        ...

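    async def process_async(
        self,
        *,
        model: Nullable[str],
        document: Document,
        id: Optional[str] = None,
        pages: Optional[List[int]] = None,
        include_image_base64: Optional[bool] = None,
        image_limit: Optional[int] = None,
        image_min_size: Optional[int] = None,
        table_format: Optional[TableFormat] = None,
    ) -> OCRResponse:
        ...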
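
Because process_async() is a coroutine, several documents can be processed concurrently. The sketch below shows one possible pattern using asyncio.gather with a semaphore. The names ocr_many and max_concurrency are hypothetical, and the client is passed in as an argument so the helper assumes nothing beyond the process_async(model=..., document=...) call in the signature above.

```python
import asyncio
from typing import Any, List, Sequence


async def ocr_many(
    client: Any,
    documents: Sequence[Any],
    model: str = "mistral-ocr-latest",
    max_concurrency: int = 4,
) -> List[Any]:
    """Run process_async() over several documents, a few at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def process_one(document: Any) -> Any:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await client.ocr.process_async(model=model, document=document)

    # gather() preserves input order: results[i] belongs to documents[i].
    return await asyncio.gather(*(process_one(doc) for doc in documents))
```

A call might look like asyncio.run(ocr_many(client, [DocumentURLChunk(document_url=u) for u in urls])), where urls is a list of document URLs. A suitable concurrency limit depends on the account's rate limits, which this page does not specify.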

Import

from mistralai import Mistral
from mistralai.models import DocumentURLChunk
# Access via: client.ocr.process(...)

I/O Contract

Inputs

Name	Type	Required	Description
model	str	Yes	OCR model (e.g., "mistral-ocr-latest")
document	Document	Yes	DocumentURLChunk (URL) or FileChunk (upload)
id	Optional[str]	No	Optional identifier passed with the request
pages	Optional[List[int]]	No	Specific pages to process (0-indexed)
include_image_base64	Optional[bool]	No	Include base64 images in response
image_limit	Optional[int]	No	Maximum number of images to extract
image_min_size	Optional[int]	No	Minimum size of images to extract
table_format	Optional[TableFormat]	No	Table output format
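
Since pages is 0-indexed while page numbers printed in a document usually start at 1, a small conversion step can help avoid off-by-one errors. The helper below is a hypothetical convenience, not part of the SDK: it turns a 1-based spec such as "1-3,5" into the 0-indexed list that the pages parameter expects.

```python
from typing import List


def to_zero_indexed_pages(spec: str) -> List[int]:
    """Convert a 1-based page spec like "1-3,5" to a sorted 0-indexed list."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Page N in 1-based numbering is index N - 1.
        pages.update(range(start - 1, end))
    return sorted(pages)
```

The result can be passed directly, for example pages=to_zero_indexed_pages("1-3,5").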

Outputs

Name	Type	Description
response	OCRResponse	Contains pages list and usage info
response.pages	List[OCRPageObject]	Per-page results
page.markdown	str	Extracted text as markdown
page.images	List[OCRImageObject]	Extracted images
page.tables	List[OCRTableObject]	Extracted tables
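
When include_image_base64 is set, the extracted images carry their data as base64 strings. The sketch below decodes such a string into raw bytes. It assumes the payload may arrive either as bare base64 or as a data URL of the form "data:image/jpeg;base64,..."; that format, and the image attribute names in the usage note, are assumptions that this page does not confirm.

```python
import base64


def decode_image_payload(payload: str) -> bytes:
    """Decode a base64 image string, tolerating an optional data-URL prefix."""
    if payload.startswith("data:"):
        # Data URLs look like "data:<mime>;base64,<data>"; keep only <data>.
        _, _, payload = payload.partition(",")
    return base64.b64decode(payload)
```

Assuming each OCRImageObject exposes the string as image.image_base64, the bytes could then be written out with Path("figure.jpg").write_bytes(decode_image_payload(image.image_base64)).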

Usage Examples

OCR from URL

import os
from mistralai import Mistral
from mistralai.models import DocumentURLChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=DocumentURLChunk(
        document_url="https://arxiv.org/pdf/2201.04234",
    ),
    include_image_base64=True,
)

for page in response.pages:
    print(f"Page {page.index}:")
    print(page.markdown[:200])
    print(f"  Images: {len(page.images)}")
    print(f"  Tables: {len(page.tables)}")

OCR from File

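import os
from pathlib import Path
from mistralai import Mistral
from mistralai.models import FileChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Upload the PDF first; FileChunk refers to the upload by its file ID
pdf_path = Path("document.pdf")
uploaded = client.files.upload(
    file={"file_name": pdf_path.name, "content": pdf_path.read_bytes()},
    purpose="ocr",
)

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=FileChunk(file_id=uploaded.id),
    pages=[0, 1, 2],  # First 3 pages only (0-indexed)
)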
