Implementation:Open compass VLMEvalKit MEGABench Analysis Utils
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Benchmark Analysis, Data Loading |
Overview
Provides data loading, caching, and analysis utilities for the MEGA-Bench evaluation framework, including loading the benchmark datasets from HuggingFace.
Description
This module implements three helpers: `_load_hf` loads a MEGA-Bench subset from HuggingFace and caches the result, `_get_scoring_functions` extracts metric configurations from task metadata, and `_determine_eval_style` classifies a task's evaluation as rule-based or LLM-based. Module-level caches (`_DATASET_CACHE`, `_SCORING_FUNCTIONS_CACHE`) avoid reloading dataset and scoring-function information on repeated access across the core and open task subsets.
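The module-level caching pattern can be sketched as follows. This is a minimal illustration, not the module's actual code: `fetch_rows` is a hypothetical stand-in for the HuggingFace download, and the `task_name` field and the grouping of rows by task are assumptions.

```python
from typing import Any, Dict, List

# Module-level cache: subset name -> rows grouped by task name.
_DATASET_CACHE: Dict[str, Dict[str, List[Dict[str, Any]]]] = {}

# Counts calls to the stand-in loader so that cache hits are observable.
FETCH_CALLS = 0


def fetch_rows(subset_name: str) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the HuggingFace download (assumption)."""
    global FETCH_CALLS
    FETCH_CALLS += 1
    return [
        {"task_name": f"{subset_name}_task_a", "question": "q1"},
        {"task_name": f"{subset_name}_task_a", "question": "q2"},
        {"task_name": f"{subset_name}_task_b", "question": "q3"},
    ]


def load_subset_cached(subset_name: str) -> Dict[str, List[Dict[str, Any]]]:
    """Return rows grouped by task name, loading each subset at most once."""
    if subset_name in _DATASET_CACHE:
        return _DATASET_CACHE[subset_name]

    task_dict: Dict[str, List[Dict[str, Any]]] = {}
    for row in fetch_rows(subset_name):
        task_dict.setdefault(row["task_name"], []).append(row)

    _DATASET_CACHE[subset_name] = task_dict
    return task_dict


first = load_subset_cached("core")
second = load_subset_cached("core")  # served from the cache, no second fetch
```

Because the cache lives at module scope, every caller in the same process shares it, which is what makes repeated lookups cheap. The trade-off is that the cached data stays in memory for the lifetime of the process.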
Usage
Called internally by the MEGA-Bench dataset class during evaluation. The leading underscores mark these functions as private helpers rather than a public API.
Code Reference
- Source: `vlmeval/dataset/utils/megabench/tools/analysis_utils.py`, Lines: L1-182
- Import: `from vlmeval.dataset.utils.megabench.tools.analysis_utils import _load_hf, _get_scoring_functions`
Key Functions:
def _load_hf(subset_name: str) -> List[Dict[str, Any]]: ...
def _get_scoring_functions(): ...
def _determine_eval_style(task): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | HuggingFace dataset subset names ("core", "open"); task metadata dictionaries |
| Outputs | Task dictionaries keyed by task name; scoring function configurations; evaluation style strings ("rule" or "llm") |
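To illustrate the last output in the table, the sketch below classifies a task as "rule" or "llm" from its scoring configuration. The metadata layout (a `metrics` list) and the set of LLM-judged metric names are hypothetical. The real `_determine_eval_style` may rely on different fields and criteria.

```python
from typing import Any, Dict

# Hypothetical names of metrics that require an LLM judge (assumption).
LLM_JUDGED_METRICS = {"llm_judge_score"}


def determine_eval_style_sketch(task: Dict[str, Any]) -> str:
    """Return "llm" if any metric of the task needs an LLM judge, else "rule"."""
    metrics = task.get("metrics", [])  # hypothetical field name
    if any(name in LLM_JUDGED_METRICS for name in metrics):
        return "llm"
    return "rule"


rule_task = {"task_name": "count_objects", "metrics": ["exact_match"]}
llm_task = {"task_name": "describe_scene", "metrics": ["llm_judge_score"]}
styles = [
    determine_eval_style_sketch(rule_task),
    determine_eval_style_sketch(llm_task),
]
```

In this sketch a task with no recognised LLM-judged metric falls back to "rule", so a deterministic scorer is the default.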
Usage Examples
from vlmeval.dataset.utils.megabench.tools.analysis_utils import _load_hf
task_dict = _load_hf("core")