Principle: EvolvingLMMs-Lab lmms-eval Request Construction
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Data_Processing |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Request construction is the process of transforming evaluation dataset documents into structured inference request objects that a model can consume, including support for few-shot context, distributed sharding, and request caching.
Description
Before a model can be evaluated, each document in the test or validation split must be converted into one or more Instance objects. Each Instance bundles together the textual prompt, generation or loglikelihood parameters, a reference to the visual input loader, the document ID, and metadata about the task and split. This conversion is task-type-aware: generation tasks produce a single Instance per document, while multiple-choice tasks produce one Instance per answer choice.
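As a sketch, the Instance bundle described above might look like the following dataclass. The field names here are illustrative assumptions, not the library's exact API:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Instance:
    """Illustrative sketch of an inference request (field names assumed)."""
    request_type: str        # e.g. "generate_until" or "loglikelihood"
    arguments: tuple         # prompt plus generation/loglikelihood parameters
    doc_to_visual: Callable  # lazy loader for the document's visual input
    doc_id: int
    task: str
    split: str
    metadata: dict = field(default_factory=dict)
```

Keeping `doc_to_visual` as a callable rather than loaded pixels lets construction stay cheap; images are only decoded when the model actually consumes the request.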
The request construction process handles several concerns simultaneously:
- Split selection -- Preferring the test split, falling back to validation.
- Few-shot context -- Building a prompt that includes example demonstrations drawn from the training or fewshot split, with configurable shot count, delimiter, and chat template formatting.
- Distributed sharding -- Dividing documents across ranks using `rank`, `world_size`, and `offset` parameters via `utils.create_iterator()`, which applies interleaved slicing.
- Limiting -- Restricting the number of documents via `limit` for debugging or quick sanity checks.
- Caching -- Saving constructed instances to disk so that subsequent runs with the same configuration can skip the construction step entirely.
- Repetition -- Supporting `repeats > 1` for techniques like majority voting, where inference runs multiple times on each document.
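The interleaved slicing behind `utils.create_iterator()` can be sketched with `itertools.islice`; this is a minimal version consistent with the description above, and the real function's signature may differ:

```python
import itertools

def create_iterator(raw_iterator, rank, world_size, limit=None):
    """Interleaved slicing: rank r receives items r, r + W, r + 2W, ...,
    optionally truncated to `limit` items for that rank.
    Sketch only -- the actual utils.create_iterator may differ."""
    sharded = itertools.islice(raw_iterator, rank, None, world_size)
    return sharded if limit is None else itertools.islice(sharded, limit)
```

For example, with 10 documents and `world_size=3`, rank 1 receives documents 1, 4, and 7.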
Usage
Use request construction whenever:
- You are preparing an evaluation run and need to convert raw dataset documents into model-consumable requests.
- You need to debug prompt formatting by inspecting the constructed Instance objects.
- You are running distributed evaluation and need to ensure each rank processes a disjoint shard.
- You want to cache requests to speed up repeated runs with the same task configuration.
Theoretical Basis
The request construction algorithm proceeds as follows:
Input: Task T with dataset D, configuration parameters (limit, offset, rank, world_size).
Step 1 -- Split Resolution:
```
if T.has_test_docs():
    docs = T.test_docs()
    split = config.test_split
elif T.has_validation_docs():
    docs = T.validation_docs()
    split = config.validation_split
```
Step 2 -- Iterator Construction (Distributed Sharding):
For a dataset of N documents, rank r in a world of size W with offset o:
```
indices_for_rank_r = [o + r, o + r + W, o + r + 2W, ...] intersected with [0, N)
if limit is not None:
    indices_for_rank_r = indices_for_rank_r[:limit]
```
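The formula can be checked directly. `shard_indices` is a hypothetical helper, not library code:

```python
def shard_indices(n_docs, rank, world_size, offset=0, limit=None):
    """Indices assigned to one rank under interleaved sharding,
    per the formula above: o + r, o + r + W, ... within [0, N)."""
    indices = list(range(offset + rank, n_docs, world_size))
    return indices if limit is None else indices[:limit]
```

Because each rank starts at a distinct residue modulo `world_size`, the shards are disjoint and (with `offset=0` and no limit) jointly cover every document exactly once.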
Step 3 -- Per-Document Instance Creation:
For each document at index doc_id:
- Build few-shot context string from the fewshot split.
- Call `construct_requests(doc_id, ctx, metadata)`, which creates Instance objects based on `OUTPUT_TYPE`.
- Append the result to the instances list.
Step 4 -- Flatten and Cache:
Flatten nested instance lists, slice to the original limit, and optionally save to cache.
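A minimal sketch of this step, using pickle as an assumed cache format (the actual on-disk format used by lmms-eval may differ):

```python
import pickle

def flatten_and_cache(nested_instances, limit=None, cache_file=None):
    """Flatten per-document instance lists, apply the original limit,
    and optionally persist the result to disk."""
    flat = [inst for group in nested_instances for inst in group]
    if limit is not None:
        flat = flat[:limit]
    if cache_file is not None:
        with open(cache_file, "wb") as f:
            pickle.dump(flat, f)
    return flat
```

Flattening is needed because multiple-choice documents each contribute a list of Instances (one per choice), so the per-document results are nested.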
The output type determines the Instance structure:
- generate_until -- One Instance with `(context, generation_kwargs, doc_to_visual, doc_id, task, split)`.
- loglikelihood / multiple_choice -- One Instance per choice with `(context, target_string, doc_to_visual, doc_id, task, split)`.
- generate_until_multi_round -- One Instance with an additional `doc_to_text` callable for multi-round interactions.
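The mapping from output type to Instance structure can be sketched as a dispatcher. `Instance` here is a minimal stand-in with illustrative field names, and the argument tuples are simplified relative to the full signatures above:

```python
from collections import namedtuple

# Minimal stand-in for Instance (field names are illustrative).
Instance = namedtuple("Instance", "args doc_id task split")

def construct_requests(doc_id, ctx, output_type, task, split,
                       gen_kwargs=None, choices=None):
    """Sketch of the output-type-to-Instance mapping described above."""
    if output_type in ("generate_until", "generate_until_multi_round"):
        # one Instance carrying the context and generation parameters
        return [Instance((ctx, gen_kwargs or {}), doc_id, task, split)]
    if output_type in ("loglikelihood", "multiple_choice"):
        # one Instance per candidate answer string
        return [Instance((ctx, choice), doc_id, task, split)
                for choice in (choices or [])]
    raise ValueError(f"unsupported OUTPUT_TYPE: {output_type}")
```

Under this scheme, a four-option multiple-choice document yields four loglikelihood requests, while a generation document yields exactly one.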