Principle:Openai Evals Eval Progress Tracking

From Leeroopedia
Knowledge Sources
Domains: Evaluation, Reliability
Last Updated: 2026-02-14 10:00 GMT

Overview

A checkpoint-based progress tracking mechanism that enables resumption of interrupted batch evaluation runs.

Description

Eval Progress Tracking maintains a persistent record of completed evaluation commands in a JSON Lines file. When a batch run is interrupted (by an error, a timeout, or manual cancellation), the progress file lets the next run skip every eval that already completed instead of restarting from scratch. Each completed command is recorded on its own line as a JSON array of the command's tokens. On resume, the recorded commands are compared against the full command list, and any command found in the file is skipped.
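The record format and the skip decision described above can be illustrated with a minimal sketch. The helper names (encode_command, decode_progress, pending_commands) and the model and eval names are hypothetical placeholders chosen for illustration; they are not taken from the oaievalset source.

```python
import json

Command = list[str]


def encode_command(command: Command) -> str:
    """Serialize one completed command as a single JSON Lines record."""
    return json.dumps(command)


def decode_progress(text: str) -> list[Command]:
    """Parse progress-file contents back into a list of token lists."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def pending_commands(all_commands: list[Command], completed: list[Command]) -> list[Command]:
    """Keep only the commands that have no matching record."""
    return [c for c in all_commands if c not in completed]


if __name__ == "__main__":
    # Placeholder commands; "my-model", "eval-a" and "eval-b" are illustrative.
    all_commands = [
        ["oaieval", "my-model", "eval-a"],
        ["oaieval", "my-model", "eval-b"],
    ]
    progress_text = encode_command(all_commands[0]) + "\n"
    completed = decode_progress(progress_text)
    print(pending_commands(all_commands, completed))
```

Because the comparison in this sketch is made on the whole token list, a command that differs in any token (for example, an extra argument) would not match an earlier record and would be treated as not yet completed.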

Usage

Progress tracking is applied automatically by oaievalset whenever resume is enabled; the --resume flag is on by default. The progress file is stored at /tmp/oaievalset/{model}.{eval_set}.progress.txt, so each combination of model and eval set has its own file.
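As a small illustration of the path template above, the following sketch builds the progress file location for a given model and eval set. The function name progress_file and its root parameter are hypothetical; only the path template itself comes from the text.

```python
from pathlib import Path


def progress_file(model: str, eval_set: str, root: str = "/tmp/oaievalset") -> Path:
    """Build the progress file path from the documented template."""
    return Path(root) / f"{model}.{eval_set}.progress.txt"


if __name__ == "__main__":
    # "my-model" and "my-set" are placeholder names.
    print(progress_file("my-model", "my-set").as_posix())
```

Under this scheme, two runs that share a model and an eval set share a progress file, while changing either one starts from a separate, initially empty record.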

Theoretical Basis

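Progress tracking follows a simple append-log pattern:

  1. On start, load the existing progress file (if it exists and resume is enabled)
  2. Before execution, filter the command list to exclude commands that are already recorded as completed
  3. After each successful eval, append its command to the progress file
  4. On the next run, steps 1 and 2 repeat: the file is re-read to reconstruct the completed set, so only unfinished evals execute

Because a command is recorded only after it succeeds, an eval that was interrupted partway through is not in the file and runs again in full on resume.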
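The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the oaievalset implementation: the names ProgressLog, run_with_resume, and flaky_runner are hypothetical, the model and eval names are placeholders, and the reset behaviour for a run without resume is an assumption of this sketch.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Command = list[str]


class ProgressLog:
    """Record of completed commands, one JSON array per line."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.completed: list[Command] = []

    def load(self) -> bool:
        """Step 1: read an existing log; True if it held any entries."""
        if not self.path.exists():
            return False
        with self.path.open() as f:
            self.completed = [json.loads(line) for line in f if line.strip()]
        return bool(self.completed)

    def reset(self) -> None:
        """Assumption of this sketch: a run without resume discards a stale log."""
        self.completed = []
        self.path.unlink(missing_ok=True)

    def append(self, command: Command) -> None:
        """Step 3: append one finished command to the log."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("a") as f:
            f.write(json.dumps(command) + "\n")
        self.completed.append(command)


def run_with_resume(
    commands: list[Command],
    progress: ProgressLog,
    run_one: Callable[[Command], None],
    resume: bool = True,
) -> list[Command]:
    """Run every command that is not yet logged; return the ones executed."""
    if resume:
        progress.load()
    else:
        progress.reset()
    # Step 2: drop commands that are already recorded as completed.
    pending = [c for c in commands if c not in progress.completed]
    for command in pending:
        run_one(command)  # must raise on failure so nothing is recorded
        progress.append(command)
    return pending


if __name__ == "__main__":
    commands = [["oaieval", "my-model", n] for n in ("eval-a", "eval-b", "eval-c")]
    attempts: list[Command] = []

    def flaky_runner(command: Command) -> None:
        """Fail the first attempt at eval-b to simulate an interruption."""
        attempts.append(command)
        if command[2] == "eval-b" and attempts.count(command) == 1:
            raise RuntimeError("simulated interruption")

    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "my-model.my-set.progress.txt"
        try:
            run_with_resume(commands, ProgressLog(log), flaky_runner)
        except RuntimeError:
            pass  # the first run stops at eval-b; only eval-a is logged
        rerun = run_with_resume(commands, ProgressLog(log), flaky_runner)
        print([c[2] for c in rerun])  # evals executed by the second run
```

In this demo the second run executes only eval-b and eval-c, because eval-a was logged before the simulated interruption. A real runner would typically launch each command as a subprocess and raise on a non-zero exit code, which keeps failed evals out of the log.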

Related Pages

Implemented By

Uses Heuristic
