Implementation:Mistralai Client python OCRResponse Model
| Knowledge Sources | |
|---|---|
| Domains | Document_Processing, OCR |
| Last Updated | 2026-02-15 14:00 GMT |
Overview
The OCRResponse and OCRPageObject models expose the structured document content returned by the OCR API: per-page markdown text, images, tables, and page dimensions.
Description
OCRResponse contains a pages list of OCRPageObject models. Each page has: markdown (extracted text), images (list of OCRImageObject with id, base64 data, and bounding box), tables (list of OCRTableObject), dimensions (OCRPageDimensions with width/height), and index (page number). The top-level response also has usage_info (OCRUsageInfo) and optional document_annotation.
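As a rough illustration of how the markdown and images fields relate, the sketch below inlines each image's base64 data into the page markdown. It assumes the markdown references an image with a placeholder of the form `![<id>](<id>)`; that placeholder format, the helper name `inline_images`, and the stand-in page object are assumptions for illustration, not part of the documented model.

```python
from types import SimpleNamespace


def inline_images(page) -> str:
    """Return page.markdown with image placeholders replaced by base64 data."""
    markdown = page.markdown
    for img in page.images:
        # image_base64 may be empty when base64 data was not requested.
        if img.image_base64:
            placeholder = f"![{img.id}]({img.id})"  # assumed placeholder format
            markdown = markdown.replace(placeholder, f"![{img.id}]({img.image_base64})")
    return markdown


# Stand-in page for illustration only; a real page comes from response.pages.
page = SimpleNamespace(
    markdown="Intro\n\n![img-0.jpeg](img-0.jpeg)",
    images=[SimpleNamespace(id="img-0.jpeg", image_base64="data:image/jpeg;base64,AAAA")],
)
print(inline_images(page))
```

If the placeholder format differs in practice, only the `placeholder` line needs to change.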
Usage
client.ocr.process() returns an OCRResponse. Iterate over response.pages and read .markdown, .images, and .tables on each page.
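A common follow-up to iterating the pages is stitching them back into one document. The sketch below is a minimal version of that idea: it sorts pages by index and joins their markdown. The helper name `document_markdown` and the stand-in response are hypothetical; the code relies only on the pages, index, and markdown fields described above.

```python
from types import SimpleNamespace


def document_markdown(response, separator: str = "\n\n") -> str:
    """Join the markdown of all pages, ordered by page index."""
    pages = sorted(response.pages, key=lambda p: p.index)
    return separator.join(p.markdown for p in pages)


# Stand-in response for illustration only; a real one comes from client.ocr.process().
response = SimpleNamespace(
    pages=[
        SimpleNamespace(index=1, markdown="Second page"),
        SimpleNamespace(index=0, markdown="First page"),
    ]
)
print(document_markdown(response))
```

Sorting is defensive; if the pages already arrive in order it has no effect.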
Code Reference
Source Location
- Repository: client-python
- File: src/mistralai/client/models/ocrresponse.py (L1-69), ocrpageobject.py (L1-91)
Signature
class OCRPageObject(BaseModel):
    index: int
    markdown: str
    images: List[OCRImageObject]
    tables: List[OCRTableObject]
    dimensions: OCRPageDimensions

class OCRResponse(BaseModel):
    pages: List[OCRPageObject]
    model: str
    usage_info: OCRUsageInfo
    document_annotation: Optional[str] = None
Import
from mistralai.models import OCRResponse
# Typically received as return value from client.ocr.process()
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| response | OCRResponse | Yes | Return value from ocr.process() |
Outputs
| Name | Type | Description |
|---|---|---|
| pages | List[OCRPageObject] | Per-page extraction results |
| page.markdown | str | Full page text as markdown |
| page.images | List[OCRImageObject] | Extracted images |
| page.tables | List[OCRTableObject] | Extracted tables |
| usage_info | OCRUsageInfo | Usage statistics, such as the number of pages processed |
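To make the output fields concrete, the sketch below builds a small per-page summary from them. The helper name `summarize_pages` and the stand-in response are hypothetical; the `or []` guards are a defensive assumption in case a list field is returned as None.

```python
from types import SimpleNamespace
from typing import Dict, List


def summarize_pages(response) -> List[Dict[str, int]]:
    """Return one summary dict per page: index, text length, image and table counts."""
    return [
        {
            "index": page.index,
            "chars": len(page.markdown),
            # `or []` guards against a field being None rather than an empty list.
            "images": len(page.images or []),
            "tables": len(page.tables or []),
        }
        for page in response.pages
    ]


# Stand-in response for illustration only; a real one comes from client.ocr.process().
response = SimpleNamespace(
    pages=[
        SimpleNamespace(index=0, markdown="# Title", images=[], tables=None),
        SimpleNamespace(index=1, markdown="Body text", images=[object()], tables=[object()]),
    ],
    usage_info=SimpleNamespace(pages_processed=2),
)

for row in summarize_pages(response):
    print(row)
print("pages processed:", response.usage_info.pages_processed)
```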
Usage Examples
Process OCR Results
import os

# Import paths follow the Import section above and may differ between SDK versions.
from mistralai import Mistral
from mistralai.models import DocumentURLChunk

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.ocr.process(
    model="mistral-ocr-latest",
    document=DocumentURLChunk(document_url="https://example.com/doc.pdf"),
    include_image_base64=True,
)

# Process each page
for page in response.pages:
    print(f"\n--- Page {page.index} ({page.dimensions.width}x{page.dimensions.height}) ---")

    # Extract text
    print(page.markdown)

    # Process images
    for img in page.images:
        print(f"  Image: {img.id} ({img.top_left_x},{img.top_left_y})")
        if img.image_base64:
            # Save or process base64 image data
            pass

    # Process tables (table text is exposed as .content; the list may be None)
    for table in page.tables or []:
        print(f"  Table: {table.content}")

# Usage info
print(f"\nPages processed: {response.usage_info.pages_processed}")
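The image loop in the example above leaves the save step as a placeholder. One way to fill it in is sketched below: decode the base64 string and write the bytes to disk. It assumes image_base64 may arrive either as raw base64 or as a data URI (`data:<mime>;base64,...`) and that the image id is usable as a file name; `decode_image` and `save_page_images` are hypothetical helper names.

```python
import base64
from pathlib import Path
from typing import List


def decode_image(image_base64: str) -> bytes:
    """Decode a base64 string, dropping a leading data-URI prefix if present."""
    if image_base64.startswith("data:") and "," in image_base64:
        image_base64 = image_base64.split(",", 1)[1]
    return base64.b64decode(image_base64)


def save_page_images(page, out_dir) -> List[Path]:
    """Write every image on the page that carries base64 data; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for img in page.images:
        if not img.image_base64:
            continue  # no payload for this image
        # Keep only the final path component so an id cannot escape out_dir.
        path = out / Path(img.id).name
        path.write_bytes(decode_image(img.image_base64))
        saved.append(path)
    return saved
```

Inside the page loop this would be called as `save_page_images(page, "ocr_images")`, where the directory name is arbitrary.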