Implementation:Mistralai Client python Ocr Process
| Knowledge Sources | |
|---|---|
| Domains | Document_Processing, OCR, Vision |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
Concrete tool for extracting text, tables, and images from documents with Mistral's OCR API, exposed through the Ocr resource of the Python client.
Description
The Ocr.process() and Ocr.process_async() methods send a document to Mistral's OCR model and return an OCRResponse with per-page results. The document parameter accepts a DocumentURLChunk (URL) or FileChunk (uploaded file). Optional parameters control which pages to process, whether to include base64 images, image size limits, and table output format.
Usage
Call client.ocr.process() with a document URL or an uploaded file, then iterate over the response's pages list to read the extracted content.
Code Reference
Source Location
- Repository: client-python
- File: src/mistralai/client/ocr.py
- Lines: L19-160 (sync), L162-303 (async)
Signature
class Ocr:
    def process(
        self,
        *,
        model: Nullable[str],
        document: Document,
        id: Optional[str] = None,
        pages: Optional[List[int]] = None,
        include_image_base64: Optional[bool] = None,
        image_limit: Optional[int] = None,
        image_min_size: Optional[int] = None,
        table_format: Optional[TableFormat] = None,
    ) -> OCRResponse:
        ...

    async def process_async(
        self,
        *,
        model: Nullable[str],
        document: Document,
        # Same optional parameters as process()
    ) -> OCRResponse:
        ...
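Because process_async() is a coroutine, several documents can be submitted concurrently. The following is a minimal sketch, not part of the client: ocr_many is a hypothetical helper, and _FakeOcr is a stand-in that only mimics the keyword-only process_async(model=..., document=...) call shown in the signature, so the snippet runs without network access or an API key.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, List


async def ocr_many(
    client: Any, documents: List[Any], model: str = "mistral-ocr-latest"
) -> List[Any]:
    """Hypothetical helper: run OCR on several documents concurrently.

    `client` only needs to expose `client.ocr.process_async(model=..., document=...)`.
    """
    tasks = [client.ocr.process_async(model=model, document=doc) for doc in documents]
    # asyncio.gather preserves input order: results[i] belongs to documents[i]
    return await asyncio.gather(*tasks)


class _FakeOcr:
    """Offline stand-in for the Ocr resource (assumption, for illustration only)."""

    async def process_async(self, *, model: str, document: Any) -> Any:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return SimpleNamespace(model=model, document=document)


fake_client = SimpleNamespace(ocr=_FakeOcr())
results = asyncio.run(ocr_many(fake_client, ["doc-a", "doc-b", "doc-c"]))
```

With a real client, the same helper would be called with DocumentURLChunk or FileChunk objects in place of the placeholder strings.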
Import
from mistralai import Mistral
from mistralai.models import DocumentURLChunk
# Access via: client.ocr.process(...)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | Nullable[str] | Yes | OCR model (e.g., "mistral-ocr-latest") |
| document | Document | Yes | DocumentURLChunk (URL) or FileChunk (upload) |
| pages | Optional[List[int]] | No | Specific pages to process (0-indexed) |
| include_image_base64 | Optional[bool] | No | Include base64 images in response |
| image_limit | Optional[int] | No | Maximum number of images to extract |
| image_min_size | Optional[int] | No | Minimum size an image must have to be extracted |
| table_format | Optional[TableFormat] | No | Table output format |
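Since pages is 0-indexed, page numbers as a reader would see them (starting at 1) need to be shifted before they are passed in. A small sketch of that conversion follows; page_range_to_indices is a hypothetical helper, not part of the client.

```python
from typing import List


def page_range_to_indices(first_page: int, last_page: int) -> List[int]:
    """Convert an inclusive 1-based page range into 0-indexed page numbers."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    # Page 1 of the document is index 0, so shift the start down by one;
    # range() excludes its end, which makes last_page inclusive after the shift.
    return list(range(first_page - 1, last_page))


first_three = page_range_to_indices(1, 3)
```

The resulting list can be passed as the pages argument, for example pages=page_range_to_indices(1, 3) for the first three pages.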
Outputs
| Name | Type | Description |
|---|---|---|
| response | OCRResponse | Contains pages list and usage info |
| response.pages | List[OCRPageObject] | Per-page results |
| page.markdown | str | Extracted text as markdown |
| page.images | List[OCRImageObject] | Extracted images |
| page.tables | List[OCRTableObject] | Extracted tables |
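A common follow-up step is to merge the per-page markdown into a single document. The sketch below assumes only the index and markdown attributes used elsewhere on this page; pages_to_markdown is a hypothetical helper, and the SimpleNamespace objects are stand-ins for an OCRResponse so the snippet runs offline.

```python
from types import SimpleNamespace
from typing import Any


def pages_to_markdown(response: Any, separator: str = "\n\n") -> str:
    """Join the markdown of every page, ordered by page index."""
    ordered = sorted(response.pages, key=lambda page: page.index)
    return separator.join(page.markdown for page in ordered)


# Stand-in response with pages deliberately out of order
demo_response = SimpleNamespace(
    pages=[
        SimpleNamespace(index=1, markdown="## Section 2"),
        SimpleNamespace(index=0, markdown="# Title"),
    ]
)
combined = pages_to_markdown(demo_response)
```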
Usage Examples
OCR from URL
import os
from mistralai import Mistral
from mistralai.models import DocumentURLChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=DocumentURLChunk(
        document_url="https://arxiv.org/pdf/2201.04234",
    ),
    include_image_base64=True,
)

for page in response.pages:
    print(f"Page {page.index}:")
    print(page.markdown[:200])
    print(f"  Images: {len(page.images)}")
    print(f"  Tables: {len(page.tables)}")
OCR from File
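import os
from pathlib import Path
from mistralai import Mistral
from mistralai.models import FileChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# FileChunk refers to an uploaded file by its ID, so upload the PDF first
pdf_path = Path("document.pdf")
uploaded = client.files.upload(
    file={"file_name": pdf_path.name, "content": pdf_path.read_bytes()},
    purpose="ocr",
)

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=FileChunk(file_id=uploaded.id),
    pages=[0, 1, 2],  # First 3 pages only (0-indexed)
)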