Implementation:Open compass VLMEvalKit ScreenSpot Pro
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Benchmarking, GUI Grounding, Professional Software |
Overview
VLMEvalKit dataset implementation of the ScreenSpot Pro benchmark, which evaluates GUI grounding on professional software.
Description
ScreenSpot_Pro inherits from ImageBaseDataset and implements the ScreenSpot Pro benchmark, which evaluates the GUI grounding capabilities of GUI agents on professional software interfaces. Its TYPE field is set to 'GUI'. The benchmark covers six professional domains: Development, Creative, CAD, Scientific, Office, and OS. Evaluation is point-based (EVAL_TYPE = "point") and uses functional referring expressions (RE_TYPE = "functional").
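Point-based grounding evaluation is commonly scored by checking whether a predicted click point falls inside the target element's bounding box. The following is a minimal sketch of that check, not the code in screenspot_pro.py: the function name point_in_bbox, the (x1, y1, x2, y2) box layout, and the sample coordinates are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def point_in_bbox(point: Tuple[float, float], bbox: Sequence[float]) -> bool:
    """Return True if the (x, y) point lies inside the (x1, y1, x2, y2) box.

    The box layout is an assumption; the real dataset may store boxes differently.
    """
    x, y = point
    x1, y1, x2, y2 = bbox
    return x1 <= x <= x2 and y1 <= y <= y2


# Hypothetical coordinates: one click inside the target box, one outside it.
hit = point_in_bbox((120, 45), (100, 30, 180, 60))
miss = point_in_bbox((10, 10), (100, 30, 180, 60))
```

Points on the box boundary count as hits in this sketch. Whether the actual implementation treats boundaries the same way is not stated in the source.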
Usage
The class is registered in vlmeval/dataset/__init__.py and instantiated through build_dataset() by benchmark name.
Code Reference
- Source: vlmeval/dataset/GUI/screenspot_pro.py, Lines: L1-460
- Import: from vlmeval.dataset.GUI.screenspot_pro import ScreenSpot_Pro
Signature:
class ScreenSpot_Pro(ImageBaseDataset):
    MODALITY = "IMAGE"
    TYPE = "GUI"
    DATASET_URL = {...}
    DATASET_MD5 = {...}
    EVAL_TYPE = "point"
    RE_TYPE = "functional"
    ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | TSV dataset file with professional software screenshots and element grounding tasks |
| Outputs | Evaluation results DataFrame with accuracy scores per professional domain |
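The per-domain accuracy in the output can be pictured as a simple grouped average over per-sample outcomes. The sketch below is an illustration under stated assumptions rather than the evaluation code itself: accuracy_by_domain is a hypothetical helper, each sample is assumed to reduce to a (domain, is_correct) pair, and a plain dict stands in for the DataFrame. The sample values are made up and are not benchmark results.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_domain(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the fraction of correct predictions for each domain label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for domain, is_correct in results:
        total[domain] += 1
        correct[domain] += int(is_correct)
    # Every domain in `total` has at least one sample, so division is safe.
    return {domain: correct[domain] / total[domain] for domain in total}


# Hypothetical per-sample outcomes, not real benchmark results.
sample_results = [
    ("CAD", True),
    ("CAD", False),
    ("Office", True),
    ("Office", True),
]
scores = accuracy_by_domain(sample_results)
```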
Usage Examples
from vlmeval.dataset import build_dataset
dataset = build_dataset('ScreenSpot_Pro_Development')