Implementation: EvolvingLMMs-Lab lmms-eval Dataset Loading
| Knowledge Sources | |
|---|---|
| Domains | Data_Processing, Evaluation |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
The concrete mechanism the lmms-eval framework provides for retrieving and preparing evaluation datasets.
Description
The ConfigurableTask.download() method is the primary entry point for dataset loading in lmms-eval. It wraps the HuggingFace datasets.load_dataset() call with retry logic, download configuration, and special handling for video datasets that may need to be fetched from YouTube or extracted from zip/tar archives.
After loading the raw dataset, the method optionally applies a process_docs function (specified in the task YAML via the !function directive) to each relevant split. It then creates a parallel dataset_no_image copy with all Image, Sequence[Image], and Audio columns removed, which is used for lightweight operations such as logging and serialization.
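The media-column removal can be illustrated with a small sketch. This is not the library's actual code: it models each split's feature schema as a plain dict of `(type, inner_type)` pairs, and `strip_media_columns` / `is_media_feature` are hypothetical helper names.

```python
# Illustrative sketch (not the actual lmms-eval implementation): decide which
# columns survive into dataset_no_image by inspecting declared feature types.

MEDIA_TYPES = {"Image", "Audio"}

def is_media_feature(feature_type, inner_type=None):
    """A column is media if it is Image/Audio, or a Sequence of Image/Audio."""
    if feature_type in MEDIA_TYPES:
        return True
    return feature_type == "Sequence" and inner_type in MEDIA_TYPES

def strip_media_columns(features):
    """Given {column_name: (type, inner_type)}, return the non-media columns."""
    return [
        name for name, (ftype, inner) in features.items()
        if not is_media_feature(ftype, inner)
    ]

# Hypothetical feature schema for one split:
features = {
    "question": ("Value", None),
    "answer": ("Value", None),
    "image": ("Image", None),
    "frames": ("Sequence", "Image"),
    "audio": ("Audio", None),
}
print(strip_media_columns(features))  # -> ['question', 'answer']
```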
The method is decorated with @retry from the tenacity library, configured to stop after 5 attempts or 60 seconds of total elapsed time (whichever comes first), with a fixed 2-second wait between attempts, ensuring robustness against transient network failures during dataset download.
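The OR-ed stop condition can be mimicked in pure Python. This is a minimal sketch of the semantics of `stop_after_attempt(5) | stop_after_delay(60)` with `wait_fixed(2)`, not tenacity itself; `retry_with_stops` is a hypothetical helper name.

```python
import time

def retry_with_stops(fn, max_attempts=5, max_delay=60.0, wait=2.0):
    """Retry fn until it succeeds, stopping after max_attempts attempts
    OR max_delay total seconds, whichever comes first (mirrors tenacity's
    stop_after_attempt(5) | stop_after_delay(60) with wait_fixed(2))."""
    start = time.monotonic()
    attempt = 0
    while True:
        attempt += 1
        try:
            return fn()
        except Exception:
            elapsed = time.monotonic() - start
            # The stop conditions are OR-ed: either one ends retrying.
            if attempt >= max_attempts or elapsed >= max_delay:
                raise
            time.sleep(wait)

# Usage: a flaky callable that succeeds on the third call.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient network failure")
    return "dataset loaded"

print(retry_with_stops(flaky, wait=0.01))  # -> dataset loaded
```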
Usage
Use this when defining a custom task. Set dataset_path and optionally dataset_name in your YAML configuration. The download is triggered automatically during ConfigurableTask.__init__(). For datasets requiring special handling (video downloads, local disk loading), use dataset_kwargs in the YAML.
Code Reference
Source Location
- Repository: lmms-eval
- File: lmms_eval/api/task.py
- Lines: 892-1103
Signature
```python
@retry(stop=(stop_after_attempt(5) | stop_after_delay(60)), wait=wait_fixed(2))
def download(self, dataset_kwargs=None) -> None:
```
Import
```python
from lmms_eval.api.task import ConfigurableTask
# download() is called internally during ConfigurableTask.__init__()
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| dataset_kwargs | Optional[dict] | No | Additional keyword arguments forwarded to datasets.load_dataset(). May contain special keys such as "video", "From_YouTube", "load_from_disk", "builder_script", "cache_dir", and "local_files_only". |
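How the special keys interact with datasets.load_dataset() can be sketched as follows. This assumes the special control keys are separated out before the remaining kwargs are forwarded; `split_dataset_kwargs` is a hypothetical helper, not the actual implementation.

```python
# Sketch (hypothetical helper): separate lmms-eval's special control keys
# from keyword arguments that are safe to forward to datasets.load_dataset().

SPECIAL_KEYS = {"video", "From_YouTube", "load_from_disk", "builder_script"}

def split_dataset_kwargs(dataset_kwargs):
    """Return (special, forwardable) dicts from the task's dataset_kwargs."""
    dataset_kwargs = dict(dataset_kwargs or {})
    special = {k: dataset_kwargs.pop(k)
               for k in list(dataset_kwargs) if k in SPECIAL_KEYS}
    return special, dataset_kwargs

special, forward = split_dataset_kwargs(
    {"video": True, "cache_dir": "MyVideoDataset/videos"}
)
print(special)  # -> {'video': True}
print(forward)  # -> {'cache_dir': 'MyVideoDataset/videos'}
```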
The following fields are read from the task's YAML configuration (via self.config and self.DATASET_PATH / self.DATASET_NAME):
| Name | Type | Required | Description |
|---|---|---|---|
| dataset_path | str | Yes | HuggingFace dataset repository identifier (e.g., "lmms-lab/MME") or a local path. |
| dataset_name | Optional[str] | No | Name of a specific subset/configuration within the dataset repository. |
| process_docs | Optional[Callable] | No | A function referenced via !function in YAML that transforms a Dataset split. Applied to each split after loading. |
| test_split | Optional[str] | No | Name of the test split (e.g., "test"). |
| training_split | Optional[str] | No | Name of the training split. |
| validation_split | Optional[str] | No | Name of the validation split. |
| fewshot_split | Optional[str] | No | Name of the fewshot split. |
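The per-split application of process_docs can be sketched with a standalone helper. This is an illustration of the described behavior, not the library's code: `apply_process_docs` is a hypothetical name, and plain lists stand in for Dataset splits.

```python
# Sketch: apply a process_docs transform to each configured split that is
# actually present, skipping unconfigured (None) split names.

def apply_process_docs(dataset, process_docs, split_names):
    """Transform each named split in-place; ignore None or missing splits."""
    for split in split_names:
        if split is not None and split in dataset:
            dataset[split] = process_docs(dataset[split])
    return dataset

# Example with plain lists standing in for Dataset splits:
ds = {"train": [1, 2], "test": [3, 4, 5]}
ds = apply_process_docs(ds, lambda rows: [r * 10 for r in rows],
                        ["train", "test", None])
print(ds["test"])  # -> [30, 40, 50]
```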
Outputs
| Name | Type | Description |
|---|---|---|
| self.dataset | datasets.DatasetDict | The loaded HuggingFace DatasetDict containing all splits with full media columns. |
| self.dataset_no_image | datasets.DatasetDict | A lightweight copy of the dataset with Image, Sequence[Image], and Audio columns removed. |
Usage Examples
Basic Example
```python
# In your task YAML file (e.g., lmms_eval/tasks/my_task/my_task.yaml):
# dataset_path: lmms-lab/MME
# dataset_name: null
# test_split: test
# process_docs: !function utils.my_preprocess

# The download happens automatically when the task is initialized;
# you do not call download() directly in normal usage.
# Equivalent internal call:
from lmms_eval.api.task import ConfigurableTask

task = ConfigurableTask(config={
    "task": "my_task",
    "dataset_path": "lmms-lab/MME",
    "test_split": "test",
    "output_type": "generate_until",
    "doc_to_text": "question",
    "doc_to_target": "answer",
})
# task.dataset is now a loaded DatasetDict
# task.dataset["test"] contains the test split documents
```
With process_docs Preprocessing
```python
# In utils.py alongside your YAML:
def my_preprocess(dataset):
    """Filter and augment the dataset."""
    # Keep only rows with a valid image
    dataset = dataset.filter(lambda x: x["image"] is not None)
    # Add a derived column
    dataset = dataset.map(lambda x: {
        "full_prompt": f"Question: {x['question']}\nAnswer:"
    })
    return dataset

# In your YAML:
# process_docs: !function utils.my_preprocess
```
With Video Dataset Loading
```yaml
# For datasets with video content, use dataset_kwargs in YAML:
dataset_path: lmms-lab/MyVideoDataset
dataset_kwargs:
  video: true
  cache_dir: MyVideoDataset/videos
```