Principle:Openai Evals Eval Progress Tracking

Knowledge Sources	OpenAI Evals
Domains	Evaluation, Reliability
Last Updated	2026-02-14 10:00 GMT

Overview

A checkpoint-based progress tracking mechanism that enables resumption of interrupted batch evaluation runs.

Description

Eval Progress Tracking maintains a persistent record of completed evaluation commands in a JSON Lines file. When a batch run is interrupted (by an error, a timeout, or manual cancellation), the progress file allows the run to resume after the last completed eval rather than restart from scratch. Each completed command is recorded on its own line as a JSON array of the command's tokens. On resume, the recorded commands are compared against the full command list, and any command that already appears in the file is skipped.
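
As an illustration of the record format, the sketch below encodes one command as a JSON array on a single line and decodes the file contents again for comparison. The helper names encode_command and decode_progress are hypothetical, and the command tokens are placeholders rather than values taken from the Evals source.

```python
import json


def encode_command(command: list[str]) -> str:
    """Serialize one completed command as a single JSON Lines record."""
    return json.dumps(command)


def decode_progress(text: str) -> list[list[str]]:
    """Parse progress-file contents into token lists, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Placeholder commands; the model and eval names are assumptions.
all_commands = [
    ["oaieval", "my-model", "eval-a"],
    ["oaieval", "my-model", "eval-b"],
]

# Pretend the first command finished before the run was interrupted.
progress_text = encode_command(all_commands[0]) + "\n"
completed = decode_progress(progress_text)

# A command is skipped only if its full token list matches a recorded one.
remaining = [c for c in all_commands if c not in completed]
print(remaining)
```

Because whole token lists are compared, changing any argument of a command (for example, an extra flag) makes it count as a new, not yet completed command.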

Usage

Progress tracking is used automatically by oaievalset when the --resume flag is enabled (the default). The progress file is stored at /tmp/oaievalset/{model}.{eval_set}.progress.txt, where {model} and {eval_set} stand for the model name and eval set name of the run.
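
A minimal sketch of how the path template expands, assuming the placeholders are filled by plain string substitution. The function name progress_path and the example names are hypothetical.

```python
from pathlib import PurePosixPath

# Template as given in this page; {model} and {eval_set} are placeholders.
PROGRESS_TEMPLATE = "/tmp/oaievalset/{model}.{eval_set}.progress.txt"


def progress_path(model: str, eval_set: str) -> PurePosixPath:
    """Build the progress-file path for one (model, eval set) pair."""
    return PurePosixPath(PROGRESS_TEMPLATE.format(model=model, eval_set=eval_set))


print(progress_path("my-model", "my-set"))
```

Each (model, eval set) pair gets its own file, so progress from one batch does not affect another.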

Theoretical Basis

Progress tracking follows a simple append-log pattern:

1. On start, load the existing progress file (if it exists and resume is enabled).
2. Before execution, filter the command list to exclude already-completed commands.
3. After each successful eval, append the command to the progress file.
4. On resume, re-read the file to reconstruct the completed set.
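
The steps above can be sketched as a small append-only log. This is an illustration under stated assumptions, not the Evals implementation: the class name ProgressLog, the run_all helper, and the execute callback are hypothetical. The sketch also assumes that a failed command stops the batch, and that disabling resume discards any earlier progress.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Command = list[str]


class ProgressLog:
    """Append-only record of completed commands, one JSON array per line."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> list[Command]:
        """Return recorded commands, or an empty list if no file exists."""
        if not self.path.exists():
            return []
        with self.path.open() as f:
            return [json.loads(line) for line in f if line.strip()]

    def append(self, command: Command) -> None:
        """Record one completed command at the end of the file."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("a") as f:
            f.write(json.dumps(command) + "\n")

    def clear(self) -> None:
        """Discard earlier progress (assumed behavior when resume is off)."""
        self.path.unlink(missing_ok=True)


def run_all(
    commands: list[Command],
    log: ProgressLog,
    execute: Callable[[Command], bool],
    resume: bool = True,
) -> list[Command]:
    """Run pending commands in order and return those that completed."""
    if not resume:
        log.clear()
    completed = log.load()  # steps 1 and 4: rebuild the completed set
    pending = [c for c in commands if c not in completed]  # step 2
    executed = []
    for command in pending:
        if not execute(command):
            break  # assumption: stop on failure so the command is retried later
        log.append(command)  # step 3: record only after success
        executed.append(command)
    return executed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = ProgressLog(Path(tmp) / "demo.progress.txt")
        commands = [["run", "a"], ["run", "b"], ["run", "c"]]
        # First run: simulate an interruption by failing on the second command.
        first = run_all(commands, log, lambda c: c[1] != "b")
        # Second run: the command recorded earlier is skipped.
        second = run_all(commands, log, lambda c: True)
        print(first, second)
```

Appending after each success, rather than writing once at the end, means an interruption loses at most the eval that was in flight.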

Related Pages

Implemented By

Implementation:Openai_Evals_Progress_Class

Uses Heuristic

Heuristic:Openai_Evals_Eval_Resumption_Strategy
