Principle:Openai Evals Eval Resolution

Knowledge Sources	OpenAI Evals
Domains	Evaluation, Configuration
Last Updated	2026-02-14 10:00 GMT

Overview

Eval Resolution is a registry lookup mechanism that resolves an eval name string to a structured specification describing which eval class and arguments to use.

Description

Eval Resolution converts a user-provided eval name (such as "test-match") into an EvalSpec dataclass containing the fully-qualified class path, constructor arguments, and registry metadata. The registry loads all YAML files from evals/registry/evals/ directories and supports alias dereferencing, where an eval name can point to another eval name. The resolved EvalSpec is then used to instantiate the actual Eval class for execution.
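The loading step described above can be sketched roughly as follows. This is a minimal illustration, not the library's code: it assumes the YAML files have already been parsed into dictionaries, and the function name merge_parsed_files, the file name, and the rule of taking the group from the file name are assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Any, Dict


def merge_parsed_files(parsed_files: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-file entries into one name -> entry mapping.

    parsed_files maps a (hypothetical) YAML file path to its parsed contents.
    """
    registry: Dict[str, Any] = {}
    for path, entries in parsed_files.items():
        # Assumption: the group is the file name up to its first dot.
        group = PurePosixPath(path).name.split(".")[0]
        for name, entry in entries.items():
            if name in registry:
                raise ValueError(f"duplicate entry {name!r} in {path}")
            if isinstance(entry, dict):
                # Copy so the caller's data is not mutated, then attach metadata.
                entry = {**entry, "key": name, "group": group}
            registry[name] = entry
    return registry


# Illustrative input: one file holding an alias and the spec it points to.
parsed = {
    "evals/registry/evals/test-basic.yaml": {
        "test-match": {"id": "test-match.s1.simple-v0"},
        "test-match.s1.simple-v0": {
            "cls": "evals.elsuite.basic.match.Match",
            "args": {"samples_jsonl": "test_match/samples.jsonl"},
        },
    }
}
registry = merge_parsed_files(parsed)
```

Rejecting duplicate names at load time keeps later lookups unambiguous, since every eval name maps to exactly one entry.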

Usage

Use eval resolution whenever a named evaluation needs to be looked up before execution. The eval name is the second positional argument to the oaieval CLI, after the completion function, and resolution is also used by oaievalset when iterating over eval sets.

Theoretical Basis

The resolution process follows a dictionary-based lookup with alias support:

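# Pseudocode for eval resolution
def resolve_eval(name):
    # Merge every YAML file under the registry directory into one name -> entry mapping
    raw_registry = load_yaml_files("evals/registry/evals/")
    # Follow alias entries until the name maps to a full specification
    while is_alias(raw_registry[name]):
        name = alias_target(raw_registry[name])
    spec = raw_registry[name]
    # key records the canonical (fully dereferenced) eval name
    return EvalSpec(cls=spec["cls"], args=spec.get("args"), key=name, ...)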
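The alias-following loop in the pseudocode can be made concrete with an in-memory dictionary. The sketch below is self-contained and makes two assumptions that are not stated above: an alias entry is either a bare string or a mapping with an "id" key, and cycles should be reported rather than looped on forever. The names alias_target and dereference are hypothetical helpers.

```python
from typing import Any, Dict, Optional, Set

# Hypothetical registry contents standing in for the parsed YAML files.
RAW_REGISTRY: Dict[str, Any] = {
    "test-match": {"id": "test-match.s1.simple-v0"},  # alias entry
    "test-match.s1.simple-v0": {                      # full specification
        "cls": "evals.elsuite.basic.match.Match",
        "args": {"samples_jsonl": "test_match/samples.jsonl"},
    },
}


def alias_target(entry: Any) -> Optional[str]:
    """Return the name an entry points to, or None if it is a full spec."""
    if isinstance(entry, str):
        return entry
    if isinstance(entry, dict) and "id" in entry:
        return entry["id"]
    return None


def dereference(name: str, registry: Dict[str, Any]) -> str:
    """Follow aliases from name to the canonical eval name.

    Raises KeyError for an unknown name and ValueError for an alias cycle.
    """
    seen: Set[str] = set()
    while True:
        if name in seen:
            raise ValueError(f"alias cycle detected at {name!r}")
        seen.add(name)
        target = alias_target(registry[name])
        if target is None:
            return name
        name = target


canonical = dereference("test-match", RAW_REGISTRY)
spec = RAW_REGISTRY[canonical]
```

Tracking visited names costs little and turns a misconfigured registry (two names pointing at each other) into an immediate error instead of a hang.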

The EvalSpec dataclass contains:

cls — Fully qualified class path (e.g. "evals.elsuite.basic.match.Match")
args — Constructor keyword arguments including dataset path
registry_path — Path to the registry that defined this eval
key — The canonical eval name
group — YAML filename grouping
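To illustrate how these fields fit together, here is a hedged sketch of a dataclass with the same five fields and a helper that turns the cls path into an object. EvalSpecSketch, load_class, and instantiate are hypothetical names, the field defaults are assumptions, and a standard-library class stands in for a real eval class so the example runs without the evals package installed.

```python
import importlib
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class EvalSpecSketch:
    """Illustrative stand-in for EvalSpec; defaults are assumptions."""
    cls: str                               # fully qualified class path
    registry_path: Path                    # registry that defined the eval
    args: Optional[Dict[str, Any]] = None  # constructor keyword arguments
    key: Optional[str] = None              # canonical eval name
    group: Optional[str] = None            # YAML filename grouping


def load_class(dotted_path: str) -> type:
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def instantiate(spec: EvalSpecSketch) -> Any:
    """Build an instance of spec.cls using spec.args as keyword arguments."""
    return load_class(spec.cls)(**(spec.args or {}))


# Demo values are made up; fractions.Fraction replaces a real eval class.
demo_spec = EvalSpecSketch(
    cls="fractions.Fraction",
    registry_path=Path("evals/registry"),
    args={"numerator": 3, "denominator": 4},
    key="demo-fraction.v0",
    group="demo",
)
demo_object = instantiate(demo_spec)
```

Storing the class as a string path rather than a class object keeps the specification serializable and defers the import until the eval is actually run.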

Related Pages

Implemented By

Implementation:Openai_Evals_Registry_Get_Eval
