Implementation:Open compass VLMEvalKit OmniDocBench Data Preprocess
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Benchmarking, Document Understanding, Data Preprocessing |
Overview
Data preprocessing utility module for the OmniDocBench document understanding benchmark in VLMEvalKit.
Description
This utility file does not define a dataset class. It provides data preprocessing functions for the OmniDocBench benchmark, handling document image preparation, text extraction, and format conversion needed before evaluation. It supports the end-to-end document recognition pipeline by transforming raw data into the format expected by the OmniDocBench evaluation system.
Usage
Imported by vlmeval/dataset/OmniDocBench/omnidocbench.py to support data preprocessing in the OmniDocBench dataset evaluation pipeline.
Code Reference
- Source: vlmeval/dataset/OmniDocBench/data_preprocess.py, Lines: L1-447
- Import: from vlmeval.dataset.OmniDocBench.data_preprocess import *
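Because the module is pulled in with a wildcard import, it can be useful to check which names that import actually binds. The sketch below follows Python's standard rule: `__all__` is used if the module defines it, otherwise every name without a leading underscore. The helper names `star_import_names` and `names_for` are hypothetical and are not part of VLMEvalKit. The page does not list the functions in data_preprocess.py, so none are assumed here.

```python
import importlib
from types import ModuleType


def star_import_names(module: ModuleType) -> list:
    """Return the names that `from <module> import *` would bind."""
    explicit = getattr(module, "__all__", None)
    if explicit is not None:
        # A module-level __all__ fully controls the wildcard import.
        return sorted(explicit)
    # Otherwise every module attribute without a leading underscore is exported.
    return sorted(name for name in vars(module) if not name.startswith("_"))


def names_for(module_path: str) -> list:
    """Import a module by dotted path and list its wildcard-exported names."""
    return star_import_names(importlib.import_module(module_path))


if __name__ == "__main__":
    # Assumes VLMEvalKit is installed; falls back to a message otherwise.
    target = "vlmeval.dataset.OmniDocBench.data_preprocess"
    try:
        for name in names_for(target):
            print(name)
    except ImportError as exc:
        print(f"Could not import {target}: {exc}")
```

Once the exported names are known, an explicit `from ... import name1, name2` is usually easier to trace than the wildcard form.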
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Raw document images and text data from OmniDocBench TSV files |
| Outputs | Preprocessed data structures ready for evaluation |
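The input side of this contract can be illustrated with a minimal, standard-library-only sketch of reading a TSV file whose image column holds base64-encoded bytes. This is an assumption-laden illustration, not the module's actual code. The column names (`index`, `image`, `answer`), the base64 encoding of the image field, and the helper names `make_demo_tsv`, `iter_tsv_rows`, and `preprocess_row` are all hypothetical.

```python
import base64
import csv
import io
from typing import Dict, Iterator

# Base64 image payloads can exceed the csv module's default field size limit.
csv.field_size_limit(2**31 - 1)

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_demo_tsv() -> str:
    """Build a one-row TSV in memory (hypothetical column layout)."""
    payload = base64.b64encode(PNG_SIGNATURE).decode("ascii")
    return "index\timage\tanswer\n" + f"0\t{payload}\t Hello \n"


def iter_tsv_rows(tsv_text: str) -> Iterator[Dict[str, str]]:
    """Yield each TSV row as a dict keyed by the header line."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        yield row


def preprocess_row(row: Dict[str, str]) -> dict:
    """Decode the image field and normalize the reference text."""
    return {
        "index": int(row["index"]),
        "image_bytes": base64.b64decode(row["image"]),
        "reference": row["answer"].strip(),
    }


if __name__ == "__main__":
    for raw in iter_tsv_rows(make_demo_tsv()):
        record = preprocess_row(raw)
        print(record["index"], len(record["image_bytes"]), record["reference"])
```

The decoded bytes could then be passed to an image library; that step is omitted to keep the sketch dependency-free.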