
Implementation:Mistralai Client python Ocr Process

From Leeroopedia
Knowledge Sources
Domains Document_Processing, OCR, Vision
Last Updated 2026-02-15 14:00 GMT

Overview

A concrete tool, exposed through the Ocr resource of the Mistral Python client, for extracting text, tables, and images from documents with Mistral's OCR API.

Description

The Ocr.process() and Ocr.process_async() methods send a document to Mistral's OCR model and return an OCRResponse with per-page results. The document parameter accepts a DocumentURLChunk (URL) or FileChunk (uploaded file). Optional parameters control which pages to process, whether to include base64 images, image size limits, and table output format.

Usage

Call client.ocr.process() with a document URL or an uploaded file, then iterate over response.pages to read each page's extracted markdown, images, and tables.
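The page-iteration step can be sketched as a small helper that stitches per-page markdown into a single document. The helper name join_pages and its separator are hypothetical. It assumes only that each page exposes index and markdown attributes, as the examples on this page do, and uses SimpleNamespace objects as stand-ins for real page results so it runs without an API key.

```python
from types import SimpleNamespace
from typing import Iterable


def join_pages(pages: Iterable, separator: str = "\n\n---\n\n") -> str:
    """Concatenate per-page markdown in page order (hypothetical helper)."""
    ordered = sorted(pages, key=lambda p: p.index)
    return separator.join(p.markdown.strip() for p in ordered)


# Stand-in objects carrying the same two attributes as OCR page results.
demo_pages = [
    SimpleNamespace(index=1, markdown="Second page text\n"),
    SimpleNamespace(index=0, markdown="# Title\n\nFirst page text\n"),
]
combined = join_pages(demo_pages)
print(combined)
```

In real use, the same call would be join_pages(response.pages).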

Code Reference

Source Location

  • Repository: client-python
  • File: src/mistralai/client/ocr.py
  • Lines: L19-160 (sync), L162-303 (async)

Signature

class Ocr:
    def process(
        self,
        *,
        model: Nullable[str],
        document: Document,
        id: Optional[str] = None,
        pages: Optional[List[int]] = None,
        include_image_base64: Optional[bool] = None,
        image_limit: Optional[int] = None,
        image_min_size: Optional[int] = None,
        table_format: Optional[TableFormat] = None,
    ) -> OCRResponse:
        ...

    async def process_async(
        self,
        *,
        model: Nullable[str],
        document: Document,
        # Same optional parameters as process()
    ) -> OCRResponse:
        ...

Import

from mistralai import Mistral
from mistralai.models import DocumentURLChunk
# Access via: client.ocr.process(...)

I/O Contract

Inputs

Name                 | Type                  | Required | Description
---------------------|-----------------------|----------|----------------------------------------------
model                | str                   | Yes      | OCR model (e.g., "mistral-ocr-latest")
document             | Document              | Yes      | DocumentURLChunk (URL) or FileChunk (upload)
pages                | Optional[List[int]]   | No       | Specific pages to process (0-indexed)
include_image_base64 | Optional[bool]        | No       | Include base64 images in response
image_limit          | Optional[int]         | No       | Maximum number of images to extract
table_format         | Optional[TableFormat] | No       | Table output format

Outputs

Name           | Type                 | Description
---------------|----------------------|------------------------------------
response       | OCRResponse          | Contains pages list and usage info
response.pages | List[OCRPageObject]  | Per-page results
page.markdown  | str                  | Extracted text as markdown
page.images    | List[OCRImageObject] | Extracted images
page.tables    | List[OCRTableObject] | Extracted tables
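When include_image_base64 is enabled, the extracted images need to be decoded before they can be written to disk. The sketch below shows one way to do that, under an assumption this page does not state: that each image object carries its payload as a base64 string (here called image_base64), possibly prefixed with a data-URI header. The helper name decode_image_payload is hypothetical.

```python
import base64
from typing import Optional


def decode_image_payload(image_base64: Optional[str]) -> bytes:
    """Decode a base64 image string, tolerating an optional data-URI prefix."""
    if not image_base64:
        return b""
    # Strip a leading "data:<mime>;base64," header if one is present.
    if image_base64.startswith("data:") and "," in image_base64:
        image_base64 = image_base64.split(",", 1)[1]
    return base64.b64decode(image_base64)


# Synthetic payload standing in for a real extracted image.
sample = "data:image/png;base64," + base64.b64encode(b"fake-bytes").decode()
decoded = decode_image_payload(sample)
print(decoded)
```

The decoded bytes can then be saved with pathlib.Path.write_bytes, using whatever identifier the image object provides as the file name.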

Usage Examples

OCR from URL

import os
from mistralai import Mistral
from mistralai.models import DocumentURLChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=DocumentURLChunk(
        document_url="https://arxiv.org/pdf/2201.04234",
    ),
    include_image_base64=True,
)

for page in response.pages:
    print(f"Page {page.index}:")
    print(page.markdown[:200])
    print(f"  Images: {len(page.images)}")
    print(f"  Tables: {len(page.tables or [])}")  # guard in case tables is None

OCR from File

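import os
from pathlib import Path
from mistralai import Mistral
from mistralai.models import FileChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Upload the PDF first; FileChunk refers to an uploaded file by its ID
pdf_path = Path("document.pdf")
uploaded = client.files.upload(
    file={"file_name": pdf_path.name, "content": pdf_path.read_bytes()},
    purpose="ocr",
)

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=FileChunk(file_id=uploaded.id),
    pages=[0, 1, 2],  # First 3 pages only
)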
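A local file can also be sent inline, without a separate upload, by encoding it as a base64 data URI. This is a sketch under the assumption that the OCR endpoint accepts data URIs in the document_url field of a DocumentURLChunk. The helper name pdf_to_data_uri is hypothetical, and the bytes below are a placeholder rather than a real PDF.

```python
import base64


def pdf_to_data_uri(pdf_bytes: bytes) -> str:
    """Wrap raw PDF bytes in a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return f"data:application/pdf;base64,{encoded}"


# Placeholder bytes; in real use, read them with Path("document.pdf").read_bytes().
uri = pdf_to_data_uri(b"%PDF-1.4 demo")
print(uri[:40])
```

Under that assumption, the result would be passed as DocumentURLChunk(document_url=uri). Inline encoding increases the request size by roughly a third, so uploading is likely the better fit for large files.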

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
