Principle: EvolvingLMMs-Lab lmms-eval Client Integration
| Knowledge Sources | |
|---|---|
| Domains | Server, Client |
| Last Updated | 2026-02-14 00:00 GMT |
Overview
Programmatic evaluation orchestration via synchronous and asynchronous Python clients that abstract the HTTP API.
Description
Client Integration provides Python classes that encapsulate all HTTP communication with the lmms-eval evaluation server, allowing developers to submit jobs, poll for results, manage the queue, and discover available models and tasks without writing raw HTTP requests. The framework offers two client variants: a synchronous `EvalClient` for straightforward scripting and an asynchronous `AsyncEvalClient` for integration into async applications.
Both clients share the same core capabilities:
- Job Submission (`evaluate()`): Accepts the same parameters as the `POST /evaluate` endpoint and returns a dictionary containing the job ID, status, and queue position. None-valued optional parameters are automatically stripped from the request payload.
- Status Polling (`get_job()`): Retrieves the full `JobInfo` for a given job ID, including status, timestamps, request parameters, results, and errors.
- Blocking Wait (`wait_for_job()`): Continuously polls the server at a configurable interval until the job reaches a terminal state. Raises `RuntimeError` if the job fails and `TimeoutError` if an optional timeout is exceeded. Optionally prints status updates to stdout.
- Queue Inspection (`get_queue_status()`): Returns aggregate queue statistics including queue depth, the running job ID, and counts of completed and failed jobs.
- Job Cancellation (`cancel_job()`): Cancels a queued job by its ID.
- Discovery (`list_tasks()`, `list_models()`, `health()`): Query server metadata including available evaluation tasks, supported model types, and server health status.
Both clients support context manager usage (`with` / `async with`) for proper HTTP connection cleanup.
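The calling pattern shared by both clients might look like the sketch below. The method names follow the description above, but the exact import path and parameter names are assumptions; to keep the example self-contained and runnable without a server, a tiny in-memory stand-in replaces the real HTTP-backed client.

```python
# Minimal in-memory stand-in for the real EvalClient, so the call sequence
# below runs without a server. Method names mirror the capabilities above;
# the parameter names (model, tasks, limit) are illustrative assumptions.
class FakeEvalClient:
    def __init__(self, base_url):
        self.base_url = base_url
        self._jobs = {}

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False  # the real client closes its httpx connection pool here

    def evaluate(self, model, tasks, **kwargs):
        # None-valued optional parameters are stripped from the payload
        payload = {"model": model, "tasks": tasks}
        payload.update({k: v for k, v in kwargs.items() if v is not None})
        job_id = f"job-{len(self._jobs)}"
        self._jobs[job_id] = {"job_id": job_id, "status": "queued",
                              "request": payload, "results": None}
        return {"job_id": job_id, "status": "queued",
                "queue_position": len(self._jobs)}

    def wait_for_job(self, job_id, poll_interval=5, timeout=None):
        # The real client polls the server; here the job "finishes" at once.
        job = self._jobs[job_id]
        job["status"] = "completed"
        job["results"] = {"note": "stub results"}
        return job


with FakeEvalClient("http://localhost:8000") as client:
    submission = client.evaluate(model="openai_compatible",
                                 tasks=["mme"], limit=None)
    job = client.wait_for_job(submission["job_id"], poll_interval=2)
    print(job["status"])  # → completed
```

The same sequence works unchanged against the async client by awaiting each call inside an `async with` block.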
Usage
Use the Client Integration principle when you need to:
- Write Python scripts or notebooks that submit and collect evaluation results programmatically
- Build higher-level orchestration tools that coordinate multiple evaluation runs
- Integrate lmms-eval into async frameworks like FastAPI services, Celery tasks, or event-driven pipelines
- Create automated evaluation workflows that submit, wait, and process results in a single script
Theoretical Basis
The Client Integration design follows the service client abstraction pattern:
Thin HTTP Wrapper: Both clients delegate all transport to the `httpx` library, adding a minimal layer that constructs URLs, serializes parameters, and deserializes responses. This keeps the client code simple and benefits from `httpx`'s robust connection pooling, timeout handling, and HTTP/2 support.
Shared Job Status Processing: A module-level `_process_job_status()` function handles the logic for interpreting terminal job states (completed, failed) and is shared between the sync and async clients. This avoids code duplication and ensures consistent behavior regardless of which client variant is used.
Synchronous vs. Asynchronous Parity: The `EvalClient` uses `httpx.Client` (blocking) while `AsyncEvalClient` uses `httpx.AsyncClient` (non-blocking). The async client's `wait_for_job()` uses `asyncio.sleep()` instead of `time.sleep()`, yielding control to the event loop between polls. Both expose identical method signatures, making it straightforward to migrate code between sync and async contexts.
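The async polling loop might look like this (a sketch under the assumptions that the signature matches the sync variant and that `get_job` stands in for the client's awaitable HTTP call):

```python
import asyncio


async def wait_for_job_async(get_job, job_id, poll_interval=5.0):
    """Async counterpart of the blocking wait: same signature, but it
    yields to the event loop between polls via asyncio.sleep() instead
    of blocking the thread with time.sleep().

    `get_job` is an awaitable stand-in for the real GET-job request.
    """
    while True:
        job = await get_job(job_id)
        if job["status"] in ("completed", "failed"):
            return job
        await asyncio.sleep(poll_interval)
```

Because the signatures match, migrating a script between the two clients is mostly a matter of adding or removing `await`.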
Configurable Timeout and Polling: The `wait_for_job()` method accepts both a `poll_interval` (time between status checks) and a `timeout` (maximum total wait time). This allows callers to balance responsiveness against server load. The default 5-second poll interval is suitable for evaluation jobs that typically run for minutes.
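Combining the two knobs, the blocking wait loop can be sketched as follows (the signature and the injected `get_job` callable are assumptions for illustration; the terminal-state and error behavior follow the description above):

```python
import time


def wait_for_job(get_job, job_id, poll_interval=5.0, timeout=None):
    """Sketch of the blocking wait loop.

    Polls `get_job` (a stand-in for the client's status request) every
    `poll_interval` seconds until the job completes. Raises RuntimeError
    if the job fails and TimeoutError once `timeout` seconds elapse.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        job = get_job(job_id)
        if job["status"] == "completed":
            return job
        if job["status"] == "failed":
            raise RuntimeError(f"Job {job_id} failed: {job.get('error')}")
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")
        time.sleep(poll_interval)
```

A shorter `poll_interval` reacts to completion faster but issues more requests; a `timeout` of `None` waits indefinitely.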
Resource Management: Both clients implement context manager protocols and destructor-based cleanup. The async client issues a `ResourceWarning` if it is garbage-collected without being properly closed, helping developers identify resource leaks during development.
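The destructor-based warning could be structured along these lines (the class and method names here are hypothetical; only the context manager protocol and the `ResourceWarning` behavior come from the description above):

```python
import warnings


class AsyncEvalClientSketch:
    """Sketch of destructor-based leak detection for the async client."""

    def __init__(self):
        self._closed = False

    async def aclose(self):
        self._closed = True  # the real client closes its httpx.AsyncClient here

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc):
        await self.aclose()

    def __del__(self):
        # Fires if the client is garbage-collected while still open,
        # flagging a forgotten `async with` or missing aclose() call.
        if not self._closed:
            warnings.warn(
                "AsyncEvalClientSketch was garbage-collected without being "
                "closed; use `async with` or call aclose()",
                ResourceWarning,
            )
```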
None Stripping: The `evaluate()` method removes `None`-valued parameters from the request payload before sending. This allows callers to pass through all possible parameters while relying on server-side defaults for unspecified values, producing cleaner HTTP requests.
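The stripping step itself reduces to a one-line dictionary comprehension (the helper name and the parameter names below are hypothetical; only the strip-`None` behavior is from the description above):

```python
def build_payload(**params):
    # Drop None-valued entries so the server applies its own defaults
    # for anything the caller left unspecified. (Helper name is
    # hypothetical; the real logic lives inside evaluate().)
    return {k: v for k, v in params.items() if v is not None}


payload = build_payload(model="openai_compatible", tasks=["mme"],
                        limit=None, batch_size=8, output_path=None)
# → {'model': 'openai_compatible', 'tasks': ['mme'], 'batch_size': 8}
```

Note that only `None` is treated as "unset"; falsy but meaningful values such as `0` or `""` survive, which matters if a caller genuinely wants, say, a zero limit.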