
Principle: EvolvingLMMs-Lab lmms-eval Job Submission

From Leeroopedia
Knowledge Sources
Domains Server, Evaluation
Last Updated 2026-02-14 00:00 GMT

Overview

Submitting evaluation jobs via a REST API with request validation and queue placement for asynchronous processing.

Description

Job Submission is the mechanism by which evaluation workloads are introduced into the lmms-eval server for processing. A client sends a structured request describing the model, tasks, and evaluation parameters. The server validates the request, assigns a unique identifier, places the job into a processing queue, and immediately returns a response containing the job ID and queue position -- without waiting for the evaluation to complete.

This design decouples job submission from execution, enabling clients to submit multiple evaluation jobs and track their progress independently. The submission workflow has three stages:

  1. Request Validation: The incoming JSON body is validated against the EvaluateRequest Pydantic model. Required fields (model and tasks) must be present. Optional fields like model_args, batch_size, device, limit, gen_kwargs, num_fewshot, log_samples, predict_only, and num_gpus provide fine-grained control over the evaluation run.
  2. Job Creation: The scheduler generates a UUID4 identifier for the job, creates a JobInfo record with status QUEUED and a creation timestamp, and stores it in its internal job registry under an async lock to ensure thread-safety.
  3. Queue Placement: The job ID is placed onto an asyncio.Queue. The queue position (zero-indexed count of items already in the queue) is captured at insertion time and returned to the client.
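The request schema from stage 1 can be sketched as a Pydantic model. The field names follow the article; the types and defaults below are assumptions for illustration, not the actual lmms-eval source.

```python
from typing import Optional
from pydantic import BaseModel

class EvaluateRequest(BaseModel):
    """Sketch of the evaluation request schema (types/defaults assumed)."""
    model: str                        # required: model name or path
    tasks: list[str]                  # required: task names to evaluate
    model_args: Optional[str] = None  # e.g. comma-separated key=value pairs
    batch_size: int = 1
    device: Optional[str] = None
    limit: Optional[int] = None       # cap on examples per task
    gen_kwargs: Optional[str] = None
    num_fewshot: Optional[int] = None
    log_samples: bool = False
    predict_only: bool = False
    num_gpus: int = 1

# Only "model" and "tasks" are required; everything else falls back to a default.
req = EvaluateRequest(model="llava", tasks=["mme", "mmmu"])
print(req.batch_size, req.num_gpus)  # 1 1
```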

The immediate response enables fire-and-forget submission patterns, where the client can poll for status later or use the synchronous wait_for_job client method to block until completion.
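The fire-and-forget pattern can be sketched as a small polling helper. The status names and the injected fetch_status callable are illustrative assumptions; the real client's wait_for_job method talks to the server's status endpoint, which is not shown here.

```python
import time
from typing import Callable

def wait_for_job(job_id: str,
                 fetch_status: Callable[[str], str],
                 poll_interval: float = 0.0,
                 timeout: float = 10.0) -> str:
    """Poll until the job leaves QUEUED/RUNNING, then return its final status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(job_id)
        if status not in ("QUEUED", "RUNNING"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")

# Fake status source standing in for the server: completes on the third poll.
states = iter(["QUEUED", "RUNNING", "COMPLETED"])
print(wait_for_job("1234", lambda _jid: next(states)))  # COMPLETED
```

Injecting the status lookup keeps the blocking loop independent of any HTTP client, which also makes it trivial to test.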

Usage

Use the Job Submission principle when you need to:

  • Queue one or more model evaluation runs without blocking on their completion
  • Integrate evaluation into automated pipelines that submit jobs and collect results asynchronously
  • Validate evaluation parameters before committing GPU resources to a run
  • Obtain a job identifier for subsequent status polling, result retrieval, or cancellation

Theoretical Basis

The Job Submission design follows the asynchronous command pattern from distributed systems:

Immediate Acknowledgment: The server responds with a JobSubmitResponse as soon as the job is queued, rather than waiting for the evaluation to complete. This keeps HTTP request latency low and prevents client timeouts for long-running GPU workloads that may take minutes or hours.

Schema-Driven Validation: Using Pydantic models for request validation ensures that malformed requests are rejected at the API boundary before consuming scheduler resources. FastAPI automatically generates 422 Unprocessable Entity responses for invalid payloads, with detailed error messages indicating which fields failed validation.
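FastAPI builds its 422 response body from Pydantic's per-field error report. The sketch below shows that same detail directly via ValidationError.errors(); the two-field EvaluateRequest is a cut-down stand-in for the full schema.

```python
from pydantic import BaseModel, ValidationError

class EvaluateRequest(BaseModel):
    model: str
    tasks: list[str]

# A payload missing the required "model" field is rejected at the boundary,
# and the error report names the offending field -- this is the information
# FastAPI surfaces in its 422 Unprocessable Entity response.
try:
    EvaluateRequest(tasks=["mme"])
except ValidationError as exc:
    for err in exc.errors():
        print(err["loc"], err["msg"])  # e.g. ('model',) with a "required"/"missing" message
```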

UUID4 Job Identifiers: Each job receives a universally unique identifier generated by uuid.uuid4(). This avoids the need for a centralized ID counter and ensures job IDs are unpredictable, which is useful in multi-tenant or shared-server scenarios.

Queue Position Tracking: Returning the queue position at submission time gives the client an immediate sense of expected wait time. The position is computed from the current queue size under a lock to avoid race conditions with concurrent submissions.
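Job creation and queue placement under a lock can be sketched as follows. The class and field names mirror the article (JobInfo, QUEUED status, asyncio.Queue); the exact structure is an assumption.

```python
import asyncio
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class JobInfo:
    job_id: str
    status: str = "QUEUED"
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class Scheduler:
    def __init__(self) -> None:
        self.jobs: dict[str, JobInfo] = {}
        self.queue: asyncio.Queue = asyncio.Queue()
        self.lock = asyncio.Lock()

    async def submit(self) -> tuple[str, int]:
        """Register a job and enqueue it; return (job_id, queue_position)."""
        job_id = str(uuid.uuid4())
        async with self.lock:                  # registry + position update is atomic
            self.jobs[job_id] = JobInfo(job_id)
            position = self.queue.qsize()      # zero-indexed: items already waiting
            await self.queue.put(job_id)
        return job_id, position

async def demo() -> list[int]:
    s = Scheduler()
    return [(await s.submit())[1] for _ in range(3)]

print(asyncio.run(demo()))  # [0, 1, 2]
```

Capturing qsize() inside the same critical section as the put() is what makes the reported position consistent under concurrent submissions.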

Sequential Execution Guarantee: Jobs are placed in a FIFO queue and processed one at a time by the background worker. This sequential execution model prevents GPU memory contention and ensures that each evaluation has exclusive access to hardware resources. While this limits throughput to one job at a time, it provides predictable resource usage and avoids out-of-memory failures from concurrent model loading.
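A minimal sketch of the sequential FIFO worker: one job at a time, with status transitions QUEUED → RUNNING → COMPLETED/FAILED. The run_evaluation callable stands in for the actual GPU workload and the status strings are assumptions.

```python
import asyncio

async def worker(queue: asyncio.Queue, jobs: dict, run_evaluation) -> None:
    """Drain the queue one job at a time, in submission order."""
    while True:
        job_id = await queue.get()           # blocks until a job is available
        jobs[job_id] = "RUNNING"
        try:
            await run_evaluation(job_id)     # exclusive access to the hardware
            jobs[job_id] = "COMPLETED"
        except Exception:
            jobs[job_id] = "FAILED"
        finally:
            queue.task_done()

async def demo():
    queue, jobs, order = asyncio.Queue(), {}, []

    async def fake_eval(job_id):             # records execution order
        order.append(job_id)

    for jid in ("a", "b"):
        jobs[jid] = "QUEUED"
        await queue.put(jid)
    task = asyncio.create_task(worker(queue, jobs, fake_eval))
    await queue.join()                       # wait until both jobs are done
    task.cancel()
    return order, jobs

order, jobs = asyncio.run(demo())
print(order, jobs)  # ['a', 'b'] {'a': 'COMPLETED', 'b': 'COMPLETED'}
```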

Related Pages

Implemented By
