Implementation:Open compass VLMEvalKit OmniDocBench Data Preprocess

From Leeroopedia
Field    Value
source   VLMEvalKit
domain   Vision, Benchmarking, Document Understanding, Data Preprocessing

Overview

Data preprocessing utility module for the OmniDocBench document understanding benchmark in VLMEvalKit.

Description

This utility file does not define a dataset class. It provides data preprocessing functions for the OmniDocBench benchmark, handling document image preparation, text extraction, and format conversion needed before evaluation. It supports the end-to-end document recognition pipeline by transforming raw data into the format expected by the OmniDocBench evaluation system.
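As a rough illustration of the kind of text cleanup and format conversion described above, the following minimal sketch strips a wrapping Markdown code fence from a model response and normalizes its whitespace and Unicode form. The helper names strip_markdown_fences, normalize_text, and preprocess_prediction are hypothetical and chosen for this example only; they are not taken from data_preprocess.py, whose actual functions may differ.

```python
import re
import unicodedata

# NOTE: hypothetical helpers for illustration only; they are not the
# functions defined in vlmeval/dataset/OmniDocBench/data_preprocess.py.

FENCE = "`" * 3  # a Markdown code-fence marker


def strip_markdown_fences(text: str) -> str:
    """Remove one wrapping code fence if the whole string is fenced."""
    stripped = text.strip()
    is_fenced = (
        len(stripped) >= 2 * len(FENCE)
        and stripped.startswith(FENCE)
        and stripped.endswith(FENCE)
    )
    if not is_fenced:
        return text
    body = stripped[len(FENCE):-len(FENCE)]
    # Drop an optional language tag (e.g. "markdown") on the opening fence line.
    first_line, sep, rest = body.partition("\n")
    if sep and re.fullmatch(r"[A-Za-z0-9_+-]*", first_line.strip()):
        body = rest
    return body.strip()


def normalize_text(text: str) -> str:
    """Apply NFKC normalization and collapse runs of spaces/tabs per line."""
    text = unicodedata.normalize("NFKC", text)
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(lines).strip()


def preprocess_prediction(raw: str) -> str:
    """Chain the two steps: unwrap the fence, then normalize the text."""
    return normalize_text(strip_markdown_fences(raw))


raw_response = FENCE + "markdown\n# Title\n\nSome   text\n" + FENCE
cleaned = preprocess_prediction(raw_response)
```

Line breaks are kept in this sketch because document structure (headings, paragraphs) usually matters when comparing recognized text against ground truth; only horizontal whitespace is collapsed.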

Usage

Imported by vlmeval/dataset/OmniDocBench/omnidocbench.py to support data preprocessing in the OmniDocBench dataset evaluation pipeline.

Code Reference

  • Source: vlmeval/dataset/OmniDocBench/data_preprocess.py, Lines: L1-447
  • Import: from vlmeval.dataset.OmniDocBench.data_preprocess import *

I/O Contract

Direction   Description
Inputs      Raw document images and text data from OmniDocBench TSV files
Outputs     Preprocessed data structures ready for evaluation
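To make the contract above more concrete, here is a minimal sketch of turning TSV rows into in-memory records. It assumes a TSV with index, image, and answer columns in which the image is stored as a base64 string; that layout is an assumption for illustration and should be checked against the actual OmniDocBench TSV. The function load_records is hypothetical and not part of data_preprocess.py.

```python
import base64
import csv
import io

# Base64-encoded images can exceed the csv module's default field size limit,
# so raise it before parsing.
csv.field_size_limit(2**31 - 1)


def load_records(tsv_text: str) -> list:
    """Parse TSV text into dicts with decoded image bytes (hypothetical helper).

    Assumed columns: "index", "image" (base64), "answer" (ground-truth text).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    records = []
    for row in reader:
        records.append(
            {
                "index": int(row["index"]),
                "image_bytes": base64.b64decode(row["image"]),
                "answer": row["answer"],
            }
        )
    return records


# Build a tiny in-memory TSV with placeholder bytes instead of a real image.
fake_image = b"\x89PNG placeholder"
encoded = base64.b64encode(fake_image).decode("ascii")
sample_tsv = "index\timage\tanswer\n0\t" + encoded + "\tHello world\n"

records = load_records(sample_tsv)
```

Each resulting record holds raw image bytes and the ground-truth string, which downstream code could then pass to an image library or a text-normalization step.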
