Principle:Openai Evals Eval Registration
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Configuration |
| Last Updated | 2026-02-14 10:00 GMT |
Overview
A YAML-based configuration pattern that registers evaluation classes with the framework by defining their class path, constructor arguments, and metadata.
Description
Eval Registration is the mechanism by which new evaluations are made discoverable to the oaieval CLI and the Registry system. Each eval is defined as a YAML entry in files under evals/registry/evals/, specifying the fully-qualified Python class path and constructor arguments. The registration supports a two-level hierarchy: a base eval entry (with metrics and description) and a versioned split entry (with the actual class and args). Aliases allow one eval name to point to another, enabling versioning workflows.
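The alias behaviour described above can be sketched as a lookup that follows id pointers until it reaches an entry that defines a class. This is a minimal illustration and not the framework's Registry code: the REGISTRY dict, the resolve_eval function, and the max_hops guard are all hypothetical names introduced for this example, and the dict stands in for parsed YAML.

```python
# Hypothetical in-memory stand-in for parsed registry YAML (not the real Registry).
REGISTRY = {
    "my-eval": {"id": "my-eval.dev.v0", "metrics": ["accuracy"]},
    "my-eval.dev.v0": {
        "class": "evals.elsuite.basic.match:Match",
        "args": {"samples_jsonl": "my_data/test.jsonl"},
    },
}


def resolve_eval(name, registry, max_hops=10):
    """Follow `id` pointers until an entry that defines `class` is found.

    Returns the resolved name and its entry. `max_hops` is an assumed
    safeguard against alias cycles.
    """
    for _ in range(max_hops):
        if name not in registry:
            raise KeyError(f"eval {name!r} is not registered")
        entry = registry[name]
        if "class" in entry:
            return name, entry
        if "id" not in entry:
            raise ValueError(f"entry {name!r} has neither 'class' nor 'id'")
        name = entry["id"]  # alias: hop to the entry it points at
    raise ValueError("alias chain too long (possible cycle)")


resolved_name, resolved_entry = resolve_eval("my-eval", REGISTRY)
print(resolved_name, resolved_entry["class"])
```

Under these assumptions, asking for the base name and asking for the versioned split both end at the same entry, which is what lets a base name act as a stable alias while the split it points to changes.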
Usage
Register an eval after implementing an Eval class or when configuring a built-in template with new data. This is the step that makes the eval available via oaieval <model> <eval-name>.
Theoretical Basis
The registration follows a declarative configuration pattern:
Two-level YAML structure:
# Level 1: Base eval (metadata)
my-eval:
  id: my-eval.dev.v0
  metrics: [accuracy]
  description: "My custom evaluation"
# Level 2: Versioned split (implementation)
my-eval.dev.v0:
  class: evals.elsuite.basic.match:Match
  args:
    samples_jsonl: my_data/test.jsonl
    max_tokens: 100
Key conventions:
- Base eval name has no dots (e.g. "my-eval")
- Splits use dot notation (e.g. "my-eval.dev.v0")
- The id field in the base eval points to the default split
- class must be a fully-qualified Python class path in module:Class form (e.g. "evals.elsuite.basic.match:Match")
- args are passed as keyword arguments to the class constructor
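The last two conventions can be illustrated with a short sketch of how a class path string and an args mapping might be turned into an object. It assumes the module:Class form of the class path; load_class and instantiate are hypothetical helper names, and a standard-library class (datetime:timedelta) stands in for an Eval class so the example runs without the framework installed. The real framework may pass further constructor arguments in addition to args.

```python
import importlib


def load_class(class_path):
    """Import the object named by a 'module:QualName' string."""
    module_name, sep, qualname = class_path.partition(":")
    if not sep or not qualname:
        raise ValueError(f"expected 'module:Class', got {class_path!r}")
    obj = importlib.import_module(module_name)
    # Walk dotted qualified names so nested classes can be addressed too.
    for attr in qualname.split("."):
        obj = getattr(obj, attr)
    return obj


def instantiate(spec):
    """Build an object from a split entry: `args` become keyword arguments."""
    cls = load_class(spec["class"])
    return cls(**spec.get("args", {}))


# Stand-in entry: a stdlib class plays the role of the Eval class here.
spec = {"class": "datetime:timedelta", "args": {"hours": 1, "minutes": 30}}
delta = instantiate(spec)
print(delta.total_seconds())
```

Because args are expanded as keyword arguments, a misspelled key in the YAML surfaces as a TypeError at construction time rather than being silently ignored, unless the class constructor accepts arbitrary keyword arguments.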