# Implementation: Task.build_all_requests (EvolvingLMMs-Lab lmms-eval)
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Data_Processing |
| Last Updated | 2026-02-14 00:00 GMT |
## Overview
Concrete tool for building evaluation request instances from task datasets, with sharding support for distributed evaluation, provided by the lmms-eval framework.
## Description
The Task.build_all_requests() method is the core request construction routine. It iterates over the evaluation documents (from the test or validation split), builds few-shot context for each document, and calls construct_requests() to create Instance objects. The method supports distributed evaluation through rank-based sharding, request caching for repeated runs, and optional limiting for debugging.
For ConfigurableTask, the construct_requests() method creates different Instance types based on OUTPUT_TYPE:
- `generate_until` -- a single Instance with the context and generation kwargs.
- `multiple_choice` -- one Instance per answer choice, each with a loglikelihood request.
- `loglikelihood` -- a single Instance for computing the log probability of the target.
- `generate_until_multi_round` -- a single Instance with an additional callable for multi-round dialog.
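A minimal sketch of this dispatch may help. Only the OUTPUT_TYPE strings come from the source; the function name and the tuples below are simplified, illustrative stand-ins for real Instance objects:

```python
# Illustrative sketch of the OUTPUT_TYPE dispatch in construct_requests().
# The tuples are simplified stand-ins for Instance objects, not the
# actual lmms-eval implementation.
def sketch_construct_requests(output_type, ctx, choices=None, target=" yes"):
    if output_type == "generate_until":
        # One instance: context plus generation kwargs
        return [("generate_until", (ctx, {"max_new_tokens": 128}))]
    if output_type == "multiple_choice":
        # One loglikelihood instance per answer choice
        return [("loglikelihood", (ctx, f" {c}")) for c in choices]
    if output_type == "loglikelihood":
        # One instance scoring the target continuation
        return [("loglikelihood", (ctx, target))]
    if output_type == "generate_until_multi_round":
        # One instance carrying an extra callable for multi-round dialog
        round_fn = lambda previous_output: previous_output
        return [("generate_until_multi_round", (ctx, {"max_new_tokens": 128}, round_fn))]
    raise ValueError(f"unknown OUTPUT_TYPE: {output_type}")
```

Note that `multiple_choice` is the only type that fans out into several requests per document, which is why the results are later flattened.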
After all instances are constructed, they are flattened from a list-of-lists into a single list and optionally cached for future reuse. The method also restores doc_to_visual references on cached instances since callables are not serializable.
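The flattening step amounts to chaining the per-document lists into one sequence; a toy illustration with placeholder strings standing in for Instance objects:

```python
from itertools import chain

# construct_requests() can return several instances per document
# (e.g. one per answer choice), so results arrive as a list of lists.
# Placeholder strings stand in for Instance objects here.
per_doc = [["doc0_choiceA", "doc0_choiceB"], ["doc1_gen"]]
instances = list(chain.from_iterable(per_doc))
# instances == ["doc0_choiceA", "doc0_choiceB", "doc1_gen"]
```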
## Usage
Use build_all_requests() when you need to:
- Prepare evaluation requests for a task before dispatching them to the model.
- Enable distributed evaluation by specifying rank and world size.
- Cache request construction for efficiency across repeated runs.
- Limit the number of evaluation documents for debugging.
## Code Reference
### Source Location
- Repository: lmms-eval
- File: lmms_eval/api/task.py
- Lines: 382-511
### Signature

```python
def build_all_requests(
    self,
    *,
    limit: Union[int, None] = None,
    offset: int = 0,
    rank: int = 0,
    world_size: int = 1,
    cache_requests: bool = False,
    rewrite_requests_cache: bool = False,
    system_instruction: Optional[str] = None,
    apply_chat_template: bool = False,
    fewshot_as_multiturn: bool = False,
    chat_template: Optional[Callable] = None,
    tokenizer_name: str = "",
) -> None:
    """Build a set of Instances for a task, and store them
    in task.instances"""
```
### Import

```python
from lmms_eval.api.task import Task, ConfigurableTask
```
## I/O Contract
### Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| limit | Optional[int] | No | Maximum number of documents to evaluate (None = all documents) |
| offset | int | No | Starting offset into the dataset before rank sharding (default: 0) |
| rank | int | No | Current process rank for distributed sharding (default: 0) |
| world_size | int | No | Total number of processes for distributed sharding (default: 1) |
| cache_requests | bool | No | Whether to cache constructed instances to disk (default: False) |
| rewrite_requests_cache | bool | No | Whether to overwrite existing cache (default: False) |
| system_instruction | Optional[str] | No | System instruction prepended to prompts |
| apply_chat_template | bool | No | Whether to format prompts using chat template (default: False) |
| fewshot_as_multiturn | bool | No | Whether to format few-shot examples as multi-turn conversation (default: False) |
| chat_template | Optional[Callable] | No | Callable that renders chat message lists to strings |
| tokenizer_name | str | No | Tokenizer identifier for cache key differentiation (default: "") |
### Outputs
| Name | Type | Description |
|---|---|---|
| self._instances | List[Instance] | Flattened list of Instance objects, each containing (context, args, doc_to_visual, doc_id, task, split) and metadata |
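The shape of a stored instance can be pictured with a small stand-in. The field names and tuple order here are assumptions based on the table above, not the real `lmms_eval.api.instance.Instance` definition:

```python
from dataclasses import dataclass

# Illustrative stand-in mirroring the argument layout described above.
# Field names and tuple order are assumptions, not the actual
# lmms_eval.api.instance.Instance class.
@dataclass
class InstanceSketch:
    request_type: str
    arguments: tuple  # (context, gen_kwargs, doc_to_visual, doc_id, task, split)
    idx: int

inst = InstanceSketch(
    request_type="generate_until",
    arguments=(
        "Describe the image.",     # context
        {"max_new_tokens": 64},    # generation kwargs
        lambda doc: [],            # doc_to_visual callable (not serializable)
        0,                         # doc_id
        "mme",                     # task name
        "test",                    # split
    ),
    idx=0,
)
```

Because the `doc_to_visual` element is a callable, it cannot survive a round-trip through the request cache, which is why the method re-attaches it after loading cached instances.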
## Usage Examples
### Basic Example

```python
from lmms_eval.tasks import TaskManager

# Load a task
tm = TaskManager()
task_dict = tm.load_task_or_group(["mme"])

# Access the task object
task = list(task_dict.values())[0]

# Build requests for all documents (single GPU)
task.build_all_requests(limit=10, rank=0, world_size=1)

# Inspect the constructed instances
print(len(task.instances))
for inst in task.instances[:3]:
    print(inst.request_type)       # 'generate_until'
    print(inst.arguments[0][:80])  # First 80 chars of context
```
### Distributed Example

```python
# On rank 0 of 4 GPUs
task.build_all_requests(
    limit=100,
    rank=0,
    world_size=4,
    offset=0,
)

# On rank 1 of 4 GPUs
task.build_all_requests(
    limit=100,
    rank=1,
    world_size=4,
    offset=0,
)

# Each rank gets a disjoint ~25-document shard
```
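The disjoint sharding can be pictured as interleaved slicing by rank. The sketch below assumes a stride-based scheme (each rank takes every world_size-th document); the exact details inside lmms-eval may differ:

```python
from itertools import islice

def shard_doc_ids(doc_ids, rank, world_size, limit=None, offset=0):
    # Stride-based sharding sketch: after applying offset and limit,
    # each rank takes every world_size-th document starting at index
    # `rank`. An illustrative assumption, not lmms-eval's exact code.
    docs = doc_ids[offset:]
    if limit is not None:
        docs = docs[:limit]
    return list(islice(docs, rank, None, world_size))

# 100 documents across 4 ranks -> four disjoint shards of 25 each
shards = [shard_doc_ids(list(range(100)), r, world_size=4) for r in range(4)]
```

Because every rank runs the same deterministic slicing, no inter-process communication is needed to agree on who evaluates which documents.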