Environment:EvolvingLMMs Lab Lmms eval Python Runtime Environment

Knowledge Sources	lmms-eval pyproject.toml
Domains	Infrastructure, Multimodal_Evaluation
Last Updated	2026-02-14 00:00 GMT

Overview

Python 3.9+ runtime environment with PyTorch 2.1+, Transformers 4.39+, and Accelerate 0.29+ for multimodal language model evaluation.

Description

This environment defines the core Python runtime and dependency stack required for the lmms-eval framework. It is built on Python 3.9 or higher and requires PyTorch 2.1.0+ (to enable SDPA attention for running 34B-parameter models on a single 80GB GPU). The framework depends on HuggingFace Transformers (4.39.2+), Accelerate (0.29.1+), and Datasets (2.19.0+) as its primary ML backbone. Additional scientific computing libraries (NumPy 1.26.4+, scikit-learn 0.24.1+), NLP tooling (NLTK, sacrebleu, sentencepiece), and video processing (av <16.0.0, torchvision 0.16.0+) complete the stack.
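The SDPA requirement mentioned above can be probed at runtime. The following is a minimal sketch, not part of lmms-eval: the helper name `sdpa_available` is hypothetical, and it only checks that PyTorch exposes `torch.nn.functional.scaled_dot_product_attention`. It does not verify the 2.1.0 version floor, nor how the framework selects its attention mode.

```python
import importlib.util


def sdpa_available():
    """Return True if torch imports and exposes scaled_dot_product_attention.

    Hypothetical helper for illustration; returns False when torch is
    missing or fails to import instead of raising.
    """
    if importlib.util.find_spec("torch") is None:
        return False
    try:
        import torch.nn.functional as F
    except (ImportError, OSError):
        # A broken install (for example missing shared libraries) counts as unavailable.
        return False
    return hasattr(F, "scaled_dot_product_attention")


if __name__ == "__main__":
    print("SDPA function present:", sdpa_available())
```

A `False` result here suggests installing or upgrading PyTorch before running large-model evaluations.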

Usage

Use this environment for any lmms-eval evaluation workflow. It is a mandatory prerequisite for all five core workflows: End-to-End Evaluation, Custom Task Creation, Custom Model Integration, Server Mode Evaluation, and Distributed Multi-GPU Evaluation.

System Requirements

Category	Requirement	Notes
OS	Linux (Ubuntu 20.04+ recommended)	macOS supported with limitations (see Compatibility Notes)
Python	>= 3.9	No upper bound specified
Hardware	NVIDIA GPU recommended	CPU-only mode available but slow
Disk	10GB+ for dependencies	Additional space for model weights and dataset caches

Dependencies

Core Python Packages

torch >= 2.1.0 — Required for SDPA attention mode
torchvision >= 0.16.0
transformers >= 4.39.2
accelerate >= 0.29.1
datasets >= 2.19.0
evaluate >= 0.4.0
peft >= 0.2.0
numpy >= 1.26.4
scikit-learn >= 0.24.1
wandb >= 0.16.0
tenacity >= 8.3.0
sacrebleu >= 1.5.0
av < 16.0.0 — Upper bound constraint on video codec library
qwen-vl-utils >= 0.0.14
openai — For GPT-based evaluation judges
pydantic, Jinja2, pyyaml — Configuration and templating
loguru — Logging framework
python-dotenv — Environment variable loading
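
The version floors listed above can be checked against an existing environment before running an evaluation. The sketch below uses only the standard library; the helper names (`parse_version`, `check_environment`) and the chosen subset of packages are illustrative assumptions, and the version parsing is deliberately simplified (it ignores pre-release and local suffixes).

```python
import re
from importlib import metadata

# Lower bounds copied from the list above (a subset, for illustration).
MINIMUM_VERSIONS = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "transformers": "4.39.2",
    "accelerate": "0.29.1",
    "datasets": "2.19.0",
    "numpy": "1.26.4",
}
# Exclusive upper bounds from the list above.
EXCLUSIVE_UPPER_BOUNDS = {"av": "16.0.0"}


def parse_version(text):
    """Turn '2.1.0+cu121' into (2, 1, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '2.1' compares equal to '2.1.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def check_environment(installed_version=metadata.version):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name, minimum in MINIMUM_VERSIONS.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (need >= {minimum})")
            continue
        if parse_version(found) < parse_version(minimum):
            problems.append(f"{name}: found {found}, need >= {minimum}")
    for name, ceiling in EXCLUSIVE_UPPER_BOUNDS.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (need < {ceiling})")
            continue
        if parse_version(found) >= parse_version(ceiling):
            problems.append(f"{name}: found {found}, need < {ceiling}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["all checked packages satisfy their bounds"]:
        print(line)
```

The `installed_version` parameter exists only so the lookup can be replaced in tests; by default the function reads the metadata of the active environment.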

Optional Dependency Groups

server: fastapi, uvicorn
audio: librosa, soundfile, editdistance, zhconv
metrics: spacy, anls, rouge, Levenshtein
video: decord (Linux), eva-decord (macOS with Python < 3.12)
tui: fastapi >= 0.100.0, uvicorn >= 0.20.0
gemini: google-generativeai
qwen: decord, qwen_vl_utils

Credentials

No credentials are required for the base Python runtime. See Environment:EvolvingLMMs_Lab_Lmms_eval_API_Credentials_Environment for API key requirements.

Quick Install

# Basic installation
pip install lmms_eval

# With all optional dependencies
pip install 'lmms_eval[all]'

# Using uv (recommended by project)
uv sync

Code Evidence

Python version constraint from pyproject.toml:22:

requires-python = ">=3.9"

PyTorch version constraint with SDPA rationale from pyproject.toml:40:

"torch>=2.1.0", # to enable sdpa mode for running 34B model on one 80GB GPU

Video codec upper bound from pyproject.toml:47:

"av<16.0.0",

Platform-conditional video dependency from pyproject.toml:135-138:

video = [
    "decord; platform_system != 'Darwin'",
    "eva-decord; platform_system == 'Darwin' and python_version < '3.12'",
]
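
To make the two environment markers above concrete, the sketch below reproduces their logic in plain Python. The function name `select_video_backend` is hypothetical and is not part of lmms-eval or pip; it only illustrates which package the `video` extra would resolve to on a given platform.

```python
import platform
import sys


def select_video_backend(system, python_version):
    """Mirror the markers of the `video` extra (illustrative helper).

    system: value of platform.system(), e.g. "Linux" or "Darwin".
    python_version: (major, minor) tuple.
    Returns the package name that would be installed, or None.
    """
    if system != "Darwin":
        return "decord"
    if python_version < (3, 12):
        return "eva-decord"
    # macOS with Python 3.12+: neither marker matches.
    return None


if __name__ == "__main__":
    current = select_video_backend(platform.system(), sys.version_info[:2])
    print(f"video extra would install: {current}")
```

The `None` case corresponds to macOS with Python 3.12 or later, where neither marker matches and the extra installs no decoder.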

Optional import system from lmms_eval/imports.py:32-63:

# Imports this excerpt relies on
import importlib
import importlib.util
from functools import lru_cache


@lru_cache(maxsize=128)
def is_package_available(package_name: str) -> bool:
    """Check if a package is installed (cached)."""
    return importlib.util.find_spec(package_name) is not None


def optional_import(module_name, attribute=None, fallback=None):
    # Returns (object, True) on success, or (fallback, False) if the import fails.
    try:
        module = importlib.import_module(module_name)
        if attribute is not None:
            return getattr(module, attribute), True
        return module, True
    except (ImportError, AttributeError):
        return fallback, False
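
A short usage sketch for the helper in the excerpt above. The function body is repeated here only so the example runs standalone; the module name `not_a_real_package_xyz` is a made-up placeholder used to trigger the failure path.

```python
import importlib


def optional_import(module_name, attribute=None, fallback=None):
    # Same logic as the excerpt, repeated so this example is self-contained.
    try:
        module = importlib.import_module(module_name)
        if attribute is not None:
            return getattr(module, attribute), True
        return module, True
    except (ImportError, AttributeError):
        return fallback, False


# Success path: a standard-library module and an attribute that exists.
dumps, has_json = optional_import("json", "dumps")

# Failure path 1: the module cannot be imported (placeholder name).
missing, has_missing = optional_import("not_a_real_package_xyz")

# Failure path 2: the module imports but the attribute is absent.
sentinel = object()
attr, has_attr = optional_import("json", "no_such_attribute", fallback=sentinel)

if __name__ == "__main__":
    print(has_json, has_missing, has_attr)
```

Callers get a `(value, available)` pair, so they can branch on the flag instead of wrapping every optional import in its own `try`/`except`.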

Common Errors

Error Message	Cause	Solution
`MissingOptionalDependencyError: 'textual' is required for TUI`	TUI mode requires optional dependency	`pip install 'lmms_eval[tui]'`
`ImportError: decord`	Video decoding library not installed	`pip install 'lmms_eval[video]'` or `pip install decord`
`torch>=2.1.0 required`	Older PyTorch version installed	`pip install 'torch>=2.1.0'`
`av` version incompatibility	av >= 16.0.0 installed	`pip install 'av<16.0.0'`

Compatibility Notes

macOS (Darwin): Uses eva-decord instead of decord for video processing. The video extra installs eva-decord only on Python < 3.12.
Python 3.12+: On macOS the video extra installs no decoder, so video decoding may not be available.
CPU-only: Framework runs on CPU but GPU is strongly recommended for practical evaluation speeds.
Windows: Not officially tested. Use WSL2 for best compatibility.
