Implementation:EvolvingLMMs Lab Lmms eval Lmms Generate Until

Knowledge Sources	lmms-eval
Domains	Evaluation, Model_Inference
Last Updated	2026-02-14 00:00 GMT

Overview

The model-side interface through which the lmms-eval framework dispatches evaluation requests for inference. It covers generation and log-likelihood tasks.

Description

The lmms class in lmms_eval/api/model.py is the abstract base class for all model implementations in the framework. It defines three abstract methods -- generate_until, loglikelihood, and generate_until_multi_round -- that every concrete model must implement.

The generate_until method receives a list of Instance objects and must return a list of generated strings, one per request. Each Instance's args contains the prompt context, generation kwargs (including stopping sequences, temperature, and sampling parameters), and a reference to the visual input loader, followed by the document ID, task name, and split name.

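The loglikelihood method receives a list of Instance objects, each containing a context-continuation pair, and returns one (log-probability, is_greedy) tuple per request. The is_greedy flag indicates whether the continuation is exactly what greedy decoding would have produced.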
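
To make the return value concrete, here is a minimal sketch of how one such pair could be derived from per-step log-probabilities. The helper name `score_continuation` and the plain-list input format are assumptions made for illustration; they are not part of lmms-eval, and a real backend would read these values from its own model outputs.

```python
import math
from typing import List, Tuple


def score_continuation(
    token_logprobs: List[List[float]],
    continuation_ids: List[int],
) -> Tuple[float, bool]:
    """Hypothetical helper: score a continuation from per-step log-probs.

    token_logprobs[i] holds the log-probability of every vocabulary entry
    at continuation position i; continuation_ids[i] is the target token.
    """
    total = 0.0
    is_greedy = True
    for step_logprobs, target in zip(token_logprobs, continuation_ids):
        # Accumulate the log-probability of the target token.
        total += step_logprobs[target]
        # Greedy decoding would pick the arg-max token at this step.
        best = max(range(len(step_logprobs)), key=step_logprobs.__getitem__)
        if best != target:
            is_greedy = False
    return total, is_greedy


# Toy vocabulary of three tokens, two continuation steps.
steps = [
    [math.log(0.7), math.log(0.2), math.log(0.1)],
    [math.log(0.1), math.log(0.3), math.log(0.6)],
]
logprob, is_greedy = score_continuation(steps, [0, 2])
```

In this toy case the target tokens are the most likely ones at both steps, so the flag is true and the score is log(0.7) + log(0.6).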

The generate_until_multi_round method extends generation to multi-round dialogs where subsequent prompts can depend on the model's previous outputs.
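
The control flow of a multi-round exchange can be sketched as a loop in which each prompt is built from the outputs produced so far. The names `run_multi_round`, `generate`, and `next_prompt`, as well as the `max_rounds` limit, are hypothetical and chosen for illustration; how lmms-eval itself signals the end of a dialog is not specified here.

```python
from typing import Callable, List, Optional


def run_multi_round(
    generate: Callable[[str], str],
    next_prompt: Callable[[List[str]], Optional[str]],
    max_rounds: int = 8,
) -> str:
    """Hypothetical driver: return the final-round output of a dialog."""
    previous_outputs: List[str] = []
    for _ in range(max_rounds):
        # The next prompt may depend on everything the model said so far.
        prompt = next_prompt(previous_outputs)
        if prompt is None:  # the task signals that the dialog is over
            break
        previous_outputs.append(generate(prompt))
    return previous_outputs[-1] if previous_outputs else ""


def toy_next_prompt(previous: List[str]) -> Optional[str]:
    # Round 1 uses a fixed prompt, round 2 reuses the first answer.
    if len(previous) == 0:
        return "hello"
    if len(previous) == 1:
        return previous[0] + " again"
    return None


final_output = run_multi_round(lambda p: p.upper(), toy_next_prompt)
```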

The class also provides infrastructure for:

Caching -- A JSONL-based caching mechanism (LMMS_EVAL_USE_CACHE) that stores and retrieves responses to avoid redundant inference.
Distributed execution -- Rank and world-size tracking for multi-GPU evaluation.
Argument parsing -- The create_from_arg_string() classmethod for instantiating models from CLI argument strings.
Memory management -- The clean() method for freeing GPU memory after inference.
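
As an illustration of the argument-parsing item, the sketch below turns a comma-separated key=value string into constructor keyword arguments. `parse_arg_string` and `ToyModel` are hypothetical stand-ins, not lmms-eval code; the framework's own parser may handle types and quoting differently, whereas this sketch keeps every parsed value as a string.

```python
from typing import Any, Dict, Optional


def parse_arg_string(arg_string: str) -> Dict[str, Any]:
    """Hypothetical parser for strings such as "a=1,b=two"."""
    args: Dict[str, Any] = {}
    for pair in arg_string.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty strings and trailing commas
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        args[key.strip()] = value.strip()
    return args


class ToyModel:
    """Stand-in for a model class; not an lmms subclass."""

    def __init__(self, pretrained: str, device: str = "cpu", **kwargs: Any):
        self.pretrained = pretrained
        self.device = device
        self.extra = kwargs

    @classmethod
    def create_from_arg_string(
        cls, arg_string: str, additional_config: Optional[dict] = None
    ) -> "ToyModel":
        args = parse_arg_string(arg_string)
        # Values in additional_config override those from the string.
        args.update(additional_config or {})
        return cls(**args)


model = ToyModel.create_from_arg_string(
    "pretrained=my-org/my-model,device=cuda:0",
    additional_config={"batch_size": 4},
)
```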

Usage

Use these methods when:

You are implementing a new model backend and need to conform to the evaluation interface.
You are running an evaluation and the evaluator dispatches requests via getattr(lm, reqtype)(reqs).
You need to understand the expected input/output contract for model inference.

Code Reference

Source Location

Repository: lmms-eval
File: lmms_eval/api/model.py
Lines: 253-270 (generate_until), 225-250 (loglikelihood), 272-289 (generate_until_multi_round)

Signature

class lmms(abc.ABC):
    is_simple: bool = True

    @abc.abstractmethod
    def generate_until(self, requests: list) -> List[str]:
        """Generate greedily until a stopping sequence.

        :param requests: list[Instance]
            Each Instance's args contains
            (context, generation_kwargs, doc_to_visual,
             doc_id, task, split).
        :return: list[str]
            A list of generated continuations.
        """
        pass

    @abc.abstractmethod
    def loglikelihood(
        self, requests: List[Instance]
    ) -> List[Tuple[float, bool]]:
        """Compute log-likelihood of generating a continuation
        from a context.

        :param requests: list[Instance]
            Each Instance's args contains
            (context, continuation, doc_to_visual,
             doc_id, task, split).
        :return: list[tuple[float, bool]]
            (logprob, is_greedy) pairs.
        """
        pass

    @abc.abstractmethod
    def generate_until_multi_round(
        self, requests: list
    ) -> List[str]:
        """Multi-round dialog generation.

        :param requests: list[Instance]
        :return: list[str]
        """
        pass

    @classmethod
    def create_from_arg_string(
        cls: Type[T],
        arg_string: str,
        additional_config: Optional[dict] = None,
    ) -> T:
        """Create model instance from key=value argument string."""
        ...

Import

from lmms_eval.api.model import lmms

I/O Contract

Inputs

Name	Type	Required	Description
requests	list[Instance]	Yes	List of Instance objects containing prompts, generation kwargs, visual input loaders, and metadata
request.arguments[0]	str	Yes	The prompt context string (or message list for chat models)
request.arguments[1]	dict	Yes	Generation kwargs including `until` (stop sequences), `do_sample`, `temperature`
request.arguments[2]	Callable	Yes	`doc_to_visual` function that loads visual inputs for the document
request.arguments[3]	int	Yes	Document ID within the evaluation split
request.arguments[4]	str	Yes	Task name string
request.arguments[5]	str	Yes	Split name (e.g., "test", "validation")
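
One way to keep the positional layout above readable inside a backend is to unpack it into named fields. The sketch below does that and also shows how the `until` stop sequences might be applied to raw model output. `GenerationArgs` and `truncate_at_stop` are hypothetical helpers, and the sample values are invented for illustration.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class GenerationArgs(NamedTuple):
    """Named view of the six positional request arguments."""

    context: str
    gen_kwargs: Dict[str, Any]
    doc_to_visual: Callable[[Any], List[Any]]
    doc_id: int
    task: str
    split: str


def truncate_at_stop(text: str, until: List[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in until:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


raw_arguments = (
    "Describe the image.",
    {"until": ["\n\n"], "do_sample": False, "temperature": 0.0},
    lambda doc: [doc["image"]],
    7,
    "toy_task",
    "test",
)
args = GenerationArgs(*raw_arguments)
visuals = args.doc_to_visual({"image": "<placeholder image>"})
raw_output = "A cat on a sofa.\n\nQuestion: what else"
answer = truncate_at_stop(raw_output, args.gen_kwargs.get("until", []))
```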

Outputs

Name	Type	Description
generate_until return	List[str]	List of generated text continuations, one per request
loglikelihood return	List[Tuple[float, bool]]	List of (log_probability, is_greedy) tuples, one per request
generate_until_multi_round return	List[str]	List of final-round generated text continuations

Usage Examples

Basic Example

from lmms_eval.api.model import lmms
from lmms_eval.api.instance import Instance

# Assumes `lm` is an already-constructed lmms subclass instance and
# `cloned_reqs` is the list of Instance objects for one request type.
# The evaluator dispatches by method name:
reqtype = "generate_until"
resps = getattr(lm, reqtype)(cloned_reqs)

# Each response is appended to the request that produced it
for resp, req in zip(resps, cloned_reqs):
    req.resps.append(resp)
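
The same dispatch pattern can be tried without a real model. The following sketch uses stand-in classes (`ToyInstance`, `EchoModel`) and a hypothetical `dispatch` function that groups requests by type before calling the matching method; none of these names come from lmms-eval.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ToyInstance:
    """Stand-in for Instance with only the fields used here."""

    request_type: str
    arguments: Tuple[Any, ...]
    resps: List[Any] = field(default_factory=list)


class EchoModel:
    """Stand-in model exposing two of the interface methods."""

    def generate_until(self, requests: List[ToyInstance]) -> List[str]:
        return [f"echo: {r.arguments[0]}" for r in requests]

    def loglikelihood(self, requests: List[ToyInstance]) -> List[Tuple[float, bool]]:
        return [(0.0, True) for _ in requests]


def dispatch(lm: Any, instances: List[ToyInstance]) -> None:
    # Group requests so each method is called once per request type.
    by_type: Dict[str, List[ToyInstance]] = defaultdict(list)
    for inst in instances:
        by_type[inst.request_type].append(inst)
    for reqtype, reqs in by_type.items():
        resps = getattr(lm, reqtype)(reqs)
        for resp, req in zip(resps, reqs):
            req.resps.append(resp)


instances = [
    ToyInstance("generate_until", ("hi", {})),
    ToyInstance("loglikelihood", ("context", "continuation")),
]
dispatch(EchoModel(), instances)
```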

Implementing a New Model

from lmms_eval.api.model import lmms
from lmms_eval.api.instance import Instance
from typing import List, Tuple

class MyCustomModel(lmms):
    is_simple = True

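    def __init__(self, pretrained: str, **kwargs):
        super().__init__()
        # Load your model here. `load_model` is a placeholder for your
        # own loading code, not an lmms-eval function.
        self.model = load_model(pretrained)

    def generate_until(
        self, requests: list
    ) -> List[str]:
        results = []
        for req in requests:
            # Unpack the six-element argument tuple
            context, gen_kwargs, doc_to_visual, doc_id, task, split = (
                req.arguments
            )
            # Look up the document and load its visual inputs
            # (assumes task_dict is indexed by task, then split, then doc_id)
            visuals = doc_to_visual(self.task_dict[task][split][doc_id])
            output = self.model.generate(
                context, visuals, **gen_kwargs
            )
            results.append(output)
        return results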

    def loglikelihood(
        self, requests: List[Instance]
    ) -> List[Tuple[float, bool]]:
        # Implement log-likelihood computation
        ...

    def generate_until_multi_round(
        self, requests: list
    ) -> List[str]:
        # Implement multi-round generation
        ...

Related Pages

Implements Principle
