Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:Openai Evals Eval Set Resolution

From Leeroopedia
Knowledge Sources
Domains Evaluation, Configuration
Last Updated 2026-02-14 10:00 GMT

Overview

A registry lookup mechanism that resolves an eval set name to an ordered list of individual eval names for batch execution.

Description

Eval Set Resolution maps a set name (such as "test-basic") to an EvalSetSpec dataclass containing a sequence of eval names. Eval sets are defined as YAML files in evals/registry/eval_sets/ and provide a way to group related evaluations for batch execution. The resolved list of eval names is then iterated by oaievalset to run each eval sequentially. The repository includes 18 pre-defined eval sets covering basic tests, model-graded evaluations, and domain-specific benchmarks.

Usage

Use eval set resolution when running a batch of related evaluations. This is the core lookup for the oaievalset CLI's second positional argument.

Theoretical Basis

An eval set is a simple grouping mechanism:

# Eval set YAML structure
test-basic:
  evals:
    - test-match
    - test-fuzzy-match
    - test-includes

The EvalSetSpec dataclass contains:

  • evals — Ordered sequence of eval name strings
  • key — Canonical set name
  • group — YAML filename grouping

Related Pages

Implemented By

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment