Implementation:Open compass VLMEvalKit MEGABench Analysis Utils

From Leeroopedia
Field | Value
source | VLMEvalKit
domain | Vision, Evaluation, Benchmark Analysis, Data Loading

Overview

Provides data loading, caching, and analysis utilities for the MEGA-Bench evaluation framework, including HuggingFace dataset integration.

Description

This module implements three helpers: `_load_hf`, which loads a MEGA-Bench subset from HuggingFace and caches the result; `_get_scoring_functions`, which extracts metric configurations from task metadata; and `_determine_eval_style`, which classifies a task's evaluation as rule-based or LLM-based. Two module-level caches (`_DATASET_CACHE`, `_SCORING_FUNCTIONS_CACHE`) let repeated calls reuse already-loaded dataset and scoring-function data for the core and open task subsets.
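The caching behaviour described above can be illustrated with a minimal sketch. This is not the module's actual code: `load_subset_cached`, `fake_fetch`, and the `task_name` row field are assumptions made for illustration, and the in-memory fetcher stands in for a real HuggingFace download.

```python
from typing import Any, Callable, Dict, List

# Module-level cache: subset name -> {task name -> list of rows}.
# Mirrors the role the text gives to _DATASET_CACHE; the layout is assumed.
_DATASET_CACHE: Dict[str, Dict[str, List[Dict[str, Any]]]] = {}


def load_subset_cached(
    subset_name: str,
    fetch_rows: Callable[[str], List[Dict[str, Any]]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Return rows grouped by task name, fetching each subset at most once."""
    if subset_name in _DATASET_CACHE:
        return _DATASET_CACHE[subset_name]

    grouped: Dict[str, List[Dict[str, Any]]] = {}
    for row in fetch_rows(subset_name):
        # "task_name" is an assumed field name for this sketch.
        grouped.setdefault(row["task_name"], []).append(row)

    _DATASET_CACHE[subset_name] = grouped
    return grouped


# Hypothetical in-memory fetcher standing in for a HuggingFace download.
fetch_calls: List[str] = []


def fake_fetch(subset_name: str) -> List[Dict[str, Any]]:
    fetch_calls.append(subset_name)
    return [
        {"task_name": "chart_qa", "query": "q1"},
        {"task_name": "chart_qa", "query": "q2"},
        {"task_name": "ocr_math", "query": "q3"},
    ]


first = load_subset_cached("core", fake_fetch)
second = load_subset_cached("core", fake_fetch)  # served from the cache
```

In this sketch the second call returns the same object as the first and the fetcher runs only once, which is the property a module-level cache is meant to provide.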

Usage

These helpers are private (underscore-prefixed) and are called internally by the corresponding dataset class during evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/megabench/tools/analysis_utils.py, Lines: L1-182
  • Import: from vlmeval.dataset.utils.megabench.tools.analysis_utils import _load_hf, _get_scoring_functions

Key Functions:

def _load_hf(subset_name: str) -> List[Dict[str, Any]]: ...
def _get_scoring_functions(): ...
def _determine_eval_style(task): ...

I/O Contract

Direction | Description
Inputs | HuggingFace dataset subset names ("core", "open"); task metadata dictionaries
Outputs | Task dictionaries keyed by task name; scoring function configurations; evaluation style strings ("rule" or "llm")
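As a rough illustration of how task metadata might map to the "rule" and "llm" style strings in the contract above, here is a minimal sketch. The metadata layout (`{"metrics": {field: metric_name}}`), the metric names `exact_match` and `llm_judge`, and the helper names are all hypothetical; the real module may decide the style differently.

```python
from typing import Any, Dict, List

# Hypothetical metric names that would require an LLM judge.
LLM_JUDGED_METRICS = {"llm_judge"}


def get_metric_names(task_meta: Dict[str, Any]) -> List[str]:
    """Collect metric names from an assumed {"metrics": {field: name}} layout."""
    return list(task_meta.get("metrics", {}).values())


def eval_style(task_meta: Dict[str, Any]) -> str:
    """Return "llm" if any metric needs an LLM judge, otherwise "rule"."""
    names = get_metric_names(task_meta)
    return "llm" if any(name in LLM_JUDGED_METRICS for name in names) else "rule"


# Two hypothetical task metadata dictionaries.
rule_task = {"task_name": "chart_qa", "metrics": {"answer": "exact_match"}}
open_task = {"task_name": "essay", "metrics": {"response": "llm_judge"}}

styles = {t["task_name"]: eval_style(t) for t in (rule_task, open_task)}
```

Here a task counts as LLM-evaluated as soon as one of its metrics is in the judged set, and falls back to "rule" otherwise; that tie-breaking choice is part of the sketch, not something the source states.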

Usage Examples

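from vlmeval.dataset.utils.megabench.tools.analysis_utils import _load_hf

# Load the "core" subset; per the I/O contract, the result is keyed by task name.
# Repeated calls with the same subset name reuse the module-level cache.
task_dict = _load_hf("core")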
