# Implementation: Task.build_all_requests (EvolvingLMMs-Lab lmms-eval)
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Data_Processing |
| Last Updated | 2026-02-14 00:00 GMT |
## Overview
Concrete tool for building evaluation request instances from task datasets, with sharding support for distributed evaluation, provided by the lmms-eval framework.
## Description
The Task.build_all_requests() method is the core request construction routine. It iterates over the evaluation documents (from the test or validation split), builds few-shot context for each document, and calls construct_requests() to create Instance objects. The method supports distributed evaluation through rank-based sharding, request caching for repeated runs, and optional limiting for debugging.
For ConfigurableTask, the construct_requests() method creates different Instance types based on OUTPUT_TYPE:
- `generate_until` -- a single Instance with the context and generation kwargs.
- `multiple_choice` -- one Instance per answer choice, each with a loglikelihood request.
- `loglikelihood` -- a single Instance for computing the log probability of the target.
- `generate_until_multi_round` -- a single Instance with an additional callable for multi-round dialog.
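A minimal sketch of this dispatch may help. Only the OUTPUT_TYPE strings come from the source; the function name and the tuples below are simplified, illustrative stand-ins for real Instance objects:

```python
# Illustrative sketch of the OUTPUT_TYPE dispatch in construct_requests().
# The tuples are simplified stand-ins for Instance objects, not the
# actual lmms-eval implementation.
def sketch_construct_requests(output_type, ctx, choices=None, target=" yes"):
    if output_type == "generate_until":
        # One instance: context plus generation kwargs
        return [("generate_until", (ctx, {"max_new_tokens": 128}))]
    if output_type == "multiple_choice":
        # One loglikelihood instance per answer choice
        return [("loglikelihood", (ctx, f" {c}")) for c in choices]
    if output_type == "loglikelihood":
        # One instance scoring the target continuation
        return [("loglikelihood", (ctx, target))]
    if output_type == "generate_until_multi_round":
        # One instance carrying an extra callable for multi-round dialog
        round_fn = lambda previous_output: previous_output
        return [("generate_until_multi_round", (ctx, {"max_new_tokens": 128}, round_fn))]
    raise ValueError(f"unknown OUTPUT_TYPE: {output_type}")
```

Note that `multiple_choice` is the only type that fans out into several requests per document, which is why the results are later flattened.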
After all instances are constructed, they are flattened from a list-of-lists into a single list and optionally cached for future reuse. The method also restores doc_to_visual references on cached instances since callables are not serializable.
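The flattening step amounts to chaining the per-document lists into one sequence; a toy illustration with placeholder strings standing in for Instance objects:

```python
from itertools import chain

# construct_requests() can return several instances per document
# (e.g. one per answer choice), so results arrive as a list of lists.
# Placeholder strings stand in for Instance objects here.
per_doc = [["doc0_choiceA", "doc0_choiceB"], ["doc1_gen"]]
instances = list(chain.from_iterable(per_doc))
# instances == ["doc0_choiceA", "doc0_choiceB", "doc1_gen"]
```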
## Usage
Use build_all_requests() when you need to:
- Prepare evaluation requests for a task before dispatching them to the model.
- Enable distributed evaluation by specifying rank and world size.
- Cache request construction for efficiency across repeated runs.
- Limit the number of evaluation documents for debugging.
## Code Reference
### Source Location
- Repository: lmms-eval
- File: lmms_eval/api/task.py
- Lines: 382-511
### Signature

```python
def build_all_requests(
    self,
    *,
    limit: Union[int, None] = None,
    offset: int = 0,
    rank: int = 0,
    world_size: int = 1,
    cache_requests: bool = False,
    rewrite_requests_cache: bool = False,
    system_instruction: Optional[str] = None,
    apply_chat_template: bool = False,
    fewshot_as_multiturn: bool = False,
    chat_template: Optional[Callable] = None,
    tokenizer_name: str = "",
) -> None:
    """Build a set of Instances for a task, and store them
    in task.instances"""
```
### Import

```python
from lmms_eval.api.task import Task, ConfigurableTask
```
## I/O Contract
### Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| limit | Optional[int] | No | Maximum number of documents to evaluate (None = all documents) |
| offset | int | No | Starting offset into the dataset before rank sharding (default: 0) |
| rank | int | No | Current process rank for distributed sharding (default: 0) |
| world_size | int | No | Total number of processes for distributed sharding (default: 1) |
| cache_requests | bool | No | Whether to cache constructed instances to disk (default: False) |
| rewrite_requests_cache | bool | No | Whether to overwrite existing cache (default: False) |
| system_instruction | Optional[str] | No | System instruction prepended to prompts |
| apply_chat_template | bool | No | Whether to format prompts using chat template (default: False) |
| fewshot_as_multiturn | bool | No | Whether to format few-shot examples as multi-turn conversation (default: False) |
| chat_template | Optional[Callable] | No | Callable that renders chat message lists to strings |
| tokenizer_name | str | No | Tokenizer identifier for cache key differentiation (default: "") |
### Outputs
| Name | Type | Description |
|---|---|---|
| self._instances | List[Instance] | Flattened list of Instance objects, each containing (context, args, doc_to_visual, doc_id, task, split) and metadata |
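The shape of a stored instance can be pictured with a small stand-in. The field names and tuple order here are assumptions based on the table above, not the real `lmms_eval.api.instance.Instance` definition:

```python
from dataclasses import dataclass

# Illustrative stand-in mirroring the argument layout described above.
# Field names and tuple order are assumptions, not the actual
# lmms_eval.api.instance.Instance class.
@dataclass
class InstanceSketch:
    request_type: str
    arguments: tuple  # (context, gen_kwargs, doc_to_visual, doc_id, task, split)
    idx: int

inst = InstanceSketch(
    request_type="generate_until",
    arguments=(
        "Describe the image.",     # context
        {"max_new_tokens": 64},    # generation kwargs
        lambda doc: [],            # doc_to_visual callable (not serializable)
        0,                         # doc_id
        "mme",                     # task name
        "test",                    # split
    ),
    idx=0,
)
```

Because the `doc_to_visual` element is a callable, it cannot survive a round-trip through the request cache, which is why the method re-attaches it after loading cached instances.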
## Usage Examples
### Basic Example

```python
from lmms_eval.tasks import TaskManager

# Load a task
tm = TaskManager()
task_dict = tm.load_task_or_group(["mme"])

# Access the task object
task = list(task_dict.values())[0]

# Build requests for all documents (single GPU)
task.build_all_requests(limit=10, rank=0, world_size=1)

# Inspect the constructed instances
print(len(task.instances))
for inst in task.instances[:3]:
    print(inst.request_type)       # 'generate_until'
    print(inst.arguments[0][:80])  # First 80 chars of context
```
### Distributed Example

```python
# On rank 0 of 4 GPUs
task.build_all_requests(
    limit=100,
    rank=0,
    world_size=4,
    offset=0,
)

# On rank 1 of 4 GPUs
task.build_all_requests(
    limit=100,
    rank=1,
    world_size=4,
    offset=0,
)

# Each rank gets a disjoint ~25-document shard
```
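The disjoint sharding can be pictured as interleaved slicing by rank. The sketch below assumes a stride-based scheme (each rank takes every world_size-th document); the exact details inside lmms-eval may differ:

```python
from itertools import islice

def shard_doc_ids(doc_ids, rank, world_size, limit=None, offset=0):
    # Stride-based sharding sketch: after applying offset and limit,
    # each rank takes every world_size-th document starting at index
    # `rank`. An illustrative assumption, not lmms-eval's exact code.
    docs = doc_ids[offset:]
    if limit is not None:
        docs = docs[:limit]
    return list(islice(docs, rank, None, world_size))

# 100 documents across 4 ranks -> four disjoint shards of 25 each
shards = [shard_doc_ids(list(range(100)), r, world_size=4) for r in range(4)]
```

Because every rank runs the same deterministic slicing, no inter-process communication is needed to agree on who evaluates which documents.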