Principle:Openai Evals Eval Resolution

From Leeroopedia
Knowledge Sources
Domains Evaluation, Configuration
Last Updated 2026-02-14 10:00 GMT

Overview

A registry lookup mechanism that resolves an eval name string to a structured specification describing which eval class and arguments to use.

Description

Eval Resolution converts a user-provided eval name (such as "test-match") into an EvalSpec dataclass containing the fully-qualified class path, constructor arguments, and registry metadata. The registry loads all YAML files from evals/registry/evals/ directories and supports alias dereferencing, where an eval name can point to another eval name. The resolved EvalSpec is then used to instantiate the actual Eval class for execution.
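The last step mentioned above, turning a class path from the spec into an instantiated object, can be sketched as follows. This is a minimal illustration, not the library's own code: `instantiate_from_path` is a hypothetical helper, the `module:ClassName` separator is an assumption about the path format, and a standard-library class stands in for a real Eval class.

```python
import importlib
from typing import Any, Dict, Optional


def instantiate_from_path(cls_path: str, args: Optional[Dict[str, Any]] = None) -> Any:
    """Import `module:ClassName` and call the class with keyword arguments.

    Hypothetical helper for illustration; the ':' separator is an assumption.
    """
    module_name, _, class_name = cls_path.partition(":")
    if not module_name or not class_name:
        raise ValueError(f"expected 'module:ClassName', got {cls_path!r}")
    module = importlib.import_module(module_name)  # import the module by name
    cls = getattr(module, class_name)              # look up the class on it
    return cls(**(args or {}))                     # pass args as constructor kwargs


# A standard-library class stands in for a real Eval class here.
obj = instantiate_from_path("fractions:Fraction", {"numerator": 3, "denominator": 4})
print(obj)
```

Keeping the class path as a string until this point means the registry can be loaded and inspected without importing every eval module up front.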

Usage

Use eval resolution whenever a named evaluation needs to be looked up before execution. The eval name is the second positional argument of the oaieval CLI (after the completion function), and oaievalset resolves each name the same way when iterating over an eval set.

Theoretical Basis

The resolution process follows a dictionary-based lookup with alias support:

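# Pseudocode for eval resolution
def resolve_eval(name):
    # Merge every YAML file under the registry directory into one dict
    raw_registry = load_yaml_files("evals/registry/evals/")
    # An alias entry is a string naming another eval; follow the chain
    while isinstance(raw_registry[name], str):
        name = raw_registry[name]
    spec = raw_registry[name]  # concrete entry with "cls" and optional "args"
    # key records the canonical (fully dereferenced) name; other fields omitted
    return EvalSpec(cls=spec["cls"], args=spec.get("args"), key=name)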

The EvalSpec dataclass contains:

  • cls: Class path in module:Class form (e.g. "evals.elsuite.basic.match:Match")
  • args: Constructor keyword arguments, including the dataset path
  • registry_path: Path to the registry that defined this eval
  • key: The canonical eval name, after aliases are dereferenced
  • group: Name of the YAML file that defined the eval, used for grouping
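The lookup and the fields listed above can be put together in a small runnable sketch. It is an illustration under stated assumptions rather than the library's implementation: the `EvalSpec` class here is a stand-in that mirrors the listed fields, `resolve_eval` is a hypothetical helper, an alias is assumed to be either a bare string or a mapping with an "id" key, and the registry contents are illustrative values held in memory instead of being loaded from YAML.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, Optional


@dataclass
class EvalSpec:
    """Illustrative stand-in mirroring the fields listed above."""
    cls: str
    registry_path: Path
    args: Optional[Dict[str, Any]] = None
    key: Optional[str] = None
    group: Optional[str] = None


def resolve_eval(name: str, raw_registry: Dict[str, Any]) -> EvalSpec:
    """Follow alias entries until a concrete spec is found (hypothetical helper)."""
    seen = set()
    while True:
        if name not in raw_registry:
            raise KeyError(f"eval {name!r} not found in registry")
        if name in seen:
            raise ValueError(f"alias cycle detected at {name!r}")
        seen.add(name)
        entry = raw_registry[name]
        # Assumption: an alias is a bare string or a mapping with an "id" key.
        if isinstance(entry, str):
            name = entry
        elif isinstance(entry, dict) and "id" in entry:
            name = entry["id"]
        else:
            return EvalSpec(**entry)  # concrete entry: build the spec


# In-memory stand-in for what loading the YAML files might produce.
RAW_REGISTRY = {
    "test-match": {"id": "test-match.s1.simple-v0"},
    "test-match.s1.simple-v0": {
        "cls": "evals.elsuite.basic.match:Match",
        "args": {"samples_jsonl": "test_match/samples.jsonl"},
        "registry_path": Path("evals/registry"),
        "key": "test-match.s1.simple-v0",
        "group": "test-basic",
    },
}

spec = resolve_eval("test-match", RAW_REGISTRY)
print(spec.key, spec.cls)
```

The cycle guard is an addition of this sketch: without it, two aliases that point at each other would loop forever.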
