Principle:EvolvingLMMs Lab Lmms eval Environment Setup

Knowledge Sources	lmms-eval
Domains	Evaluation, Infrastructure
Last Updated	2026-02-14 00:00 GMT

Overview

Environment setup is the process of installing all required dependencies and configuring entry points so that an evaluation framework can be invoked from the command line or imported as a library.

Description

Before any multi-modal model evaluation can take place, the runtime environment must contain the correct versions of dozens of interdependent packages spanning deep learning inference, dataset loading, metrics computation, and result formatting. The lmms-eval framework codifies these requirements in a single pyproject.toml manifest that enumerates core dependencies, optional feature groups (audio, video, metrics, server, TUI), and console script entry points.

The core dependency set includes:

accelerate (>=0.29.1) -- Hugging Face device-placement and distributed launch utilities.
datasets (>=2.19.0) -- Hugging Face Datasets for streaming and caching evaluation data.
torch (>=2.1.0) -- PyTorch runtime; needed for SDPA (scaled dot-product attention) and for running large models on a single GPU.
transformers (>=4.39.2) -- Model tokenizers, configs, and stopping-criteria utilities.
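
As a quick sanity check, the minimums listed above can be compared against what is actually installed. The sketch below is illustrative and not part of lmms-eval: the helper names (release_tuple, meets_minimum, check_core) are hypothetical, the version table is copied from this page rather than read from the live manifest, and the parser only looks at the leading numeric release segment, so pre-release and local suffixes are ignored.

```python
import re
from importlib import metadata

# Minimum versions as listed on this page; a snapshot, not the live manifest.
CORE_MINIMUMS = {
    "accelerate": "0.29.1",
    "datasets": "2.19.0",
    "torch": "2.1.0",
    "transformers": "4.39.2",
}


def release_tuple(version):
    """Parse the leading numeric release segment, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare release tuples after zero-padding them to the same length."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_core(minimums=CORE_MINIMUMS, get_version=metadata.version):
    """Return a {package: status} report; get_version is injectable for testing."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_core().items():
        print(f"{name}: {status}")
```

For strict PEP 440 comparison (pre-releases, epochs), the packaging library's Version class is a more robust choice than the simplified parser used here.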

Beyond core dependencies, optional groups such as [audio], [video], [metrics], and [server] allow users to install only what is needed for their specific evaluation tasks. The [all] extra pulls in every optional group.

The framework registers two console entry points:

lmms-eval -- maps to lmms_eval.__main__:cli_evaluate, the primary evaluation CLI.
lmms-eval-ui -- maps to lmms_eval.tui.cli:main, a terminal UI for browsing results.
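
An entry-point string such as lmms_eval.__main__:cli_evaluate names a module and, after the colon, an attribute inside it. The following minimal sketch shows roughly how such a string is turned into a callable; resolve_entry_point is a hypothetical helper written for illustration, not the code the installer generates, and the demonstration uses a standard-library target so that it runs without lmms-eval installed.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'package.module:attr.path' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted attributes after the colon; an empty path returns the module.
    for attr in filter(None, attr_path.split(".")):
        obj = getattr(obj, attr)
    return obj


if __name__ == "__main__":
    # Stand-in target from the standard library.
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
```

In an environment where the package is installed, passing "lmms_eval.__main__:cli_evaluate" should return the function that the lmms-eval command invokes, assuming the mapping listed above; note that doing so imports the module and runs any import-time code it contains.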

Usage

Use environment setup whenever:

You are setting up a new machine or CI runner for evaluation.
You need to add a new optional dependency group (e.g., adding video decoding support).
You want to ensure reproducible environments across team members via lockfile-based installation (uv sync).
You are troubleshooting version conflicts between torch, transformers, and accelerate.

Theoretical Basis

Installation and dependency resolution follow the PEP 517/518 build-system conventions:

The build backend (setuptools) reads pyproject.toml.
Core and optional dependencies are resolved against PyPI (or a configured index).
The installer's resolver selects a set of package versions that satisfies every declared version constraint; pinning that set for reuse is the job of a lockfile.
Console entry points are installed as executable shims pointing to the specified Python callables.

The key constraint in multi-modal evaluation frameworks is the torch-transformers-accelerate triangle: these three libraries must be version-compatible to enable features like SDPA attention, FSDP sharding, and automatic device mapping. The minimum versions declared in pyproject.toml encode the known-good lower bounds for this compatibility.

For reproducibility, the project supports lockfile-based workflows in which uv sync reads uv.lock and installs the same pinned package versions on every machine.

Related Pages

Implemented By

Implementation:EvolvingLMMs_Lab_Lmms_eval_Package_Installation