Implementation:Open compass VLMEvalKit Result Transfer

From Leeroopedia
Field | Value
source | VLMEvalKit
domain | Vision, Utility

Overview

Utility functions that convert VLMEvalKit evaluation results into the submission file formats required by the MMMU and MMT-Bench benchmarks.

Description

This module provides functions to transform VLMEvalKit evaluation results into submission-ready formats for specific benchmarks. The MMMU_result_transfer function processes MMMU results by extracting multiple-choice answers using inference matching and exporting a JSON mapping of IDs to predictions. The MMTBench_result_transfer function handles MMT-Bench results by optionally using a GPT-based judge model for answer extraction, processing results in parallel with progress tracking, and producing a TSV submission file.
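The MMMU path described above (infer an option letter for multiple-choice rows, then export an ID-to-prediction mapping as JSON) can be sketched roughly as follows. This is a minimal illustration and not the module's code: `infer_choice` and `transfer_rows` are hypothetical names, the rows are plain dicts rather than a loaded result file, the column names `id` and `prediction` are assumed, and falling back to the raw prediction when no letter can be inferred is an assumption of this sketch.

```python
import json
import re
import string


def infer_choice(prediction, options):
    """Hypothetical stand-in for inference matching: return an option letter or None."""
    text = str(prediction).strip()
    # 1) Look for a standalone option letter such as "B", "(B)" or "B.".
    letters = [c for c in re.findall(r"\b([A-Z])\b", text) if c in options]
    if len(set(letters)) == 1:
        return letters[0]
    # 2) Otherwise accept the reply only if exactly one option's text appears in it.
    hits = [k for k, v in options.items() if str(v).lower() in text.lower()]
    return hits[0] if len(hits) == 1 else None


def transfer_rows(rows):
    """Map each row's id to an inferred letter (or to the raw prediction)."""
    res = {}
    for row in rows:
        # Option columns are the uppercase letters that hold a non-empty value.
        options = {
            c: row[c]
            for c in string.ascii_uppercase
            if row.get(c) not in (None, "")
        }
        choice = infer_choice(row["prediction"], options) if options else None
        # Assumption: keep the raw text for open-ended or unmatched rows.
        res[row["id"]] = choice if choice is not None else row["prediction"]
    return res


rows = [
    {"id": "q1", "A": "cat", "B": "dog", "prediction": "The answer is B."},
    {"id": "q2", "A": "red", "B": "blue", "prediction": "I think it is blue"},
    {"id": "q3", "prediction": "42"},  # open-ended question, no options
]
submission = transfer_rows(rows)
print(json.dumps(submission, indent=2))
```

Separating the matching step from the export step keeps the JSON writer independent of how answers are inferred, so a stricter matcher could be swapped in without touching the output format.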

Usage

Called internally by VLMEvalKit to convert evaluation output files into benchmark-specific submission formats.

Code Reference

  • Source: vlmeval/utils/result_transfer.py, Lines: L1-97
  • Import: from vlmeval.utils.result_transfer import MMMU_result_transfer, MMTBench_result_transfer

Key Functions:

def MMMU_result_transfer(result_path): ...
def MMTBench_result_transfer(eval_file, dataset='default', **judge_kwargs): ...

I/O Contract

Direction | Description
Inputs | Evaluation result file paths (.xlsx for MMMU, various formats for MMT-Bench) with prediction data
Outputs | Submission-ready files: JSON for MMMU, TSV for MMT-Bench
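For the MMT-Bench side of this contract, the general pattern of extracting answers in parallel and writing a TSV file could look like the sketch below. It rests on several assumptions: `extract_answer` and `rows_to_tsv` are hypothetical names, the `index` and `prediction` columns are illustrative, a trivial first-character rule stands in for the optional judge model, and progress tracking is omitted for brevity.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def extract_answer(row):
    """Hypothetical extractor; a judge-model call could replace this rule."""
    reply = row["prediction"].strip()
    return reply[0].upper() if reply else ""


def rows_to_tsv(rows, max_workers=4):
    """Extract answers concurrently and return the submission as TSV text."""
    # Executor.map yields results in input order, so answers line up with rows.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        answers = list(pool.map(extract_answer, rows))
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["index", "prediction"])  # assumed column names
    for row, answer in zip(rows, answers):
        writer.writerow([row["index"], answer])
    return buf.getvalue()


tsv_text = rows_to_tsv([
    {"index": 1, "prediction": "b) dog"},
    {"index": 2, "prediction": " A"},
])
print(tsv_text)
```

A thread pool suits this step when extraction is dominated by network calls to a judge model; for purely local string processing a plain loop would do just as well.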

Usage Examples

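# Internal usage
from vlmeval.utils.result_transfer import MMMU_result_transfer, MMTBench_result_transfer

# MMMU: convert an .xlsx result file into the JSON submission file.
json_path = MMMU_result_transfer('results/mmmu_eval.xlsx')

# MMT-Bench: `model` is not a named parameter, so it is collected into **judge_kwargs.
tsv_path = MMTBench_result_transfer('results/mmtbench_eval.xlsx', model='chatgpt-0125')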
