Implementation:Openai Openai python Eval Create Response

From Leeroopedia
Knowledge Sources
Domains API_Types, Python
Last Updated 2026-02-15 00:00 GMT

Overview

Concrete type for the evaluation creation response object provided by the openai-python SDK.

Description

The EvalCreateResponse Pydantic model represents the Eval object returned after creating an evaluation. An Eval represents a task for your LLM integration, such as improving chatbot quality or comparing models. The model contains an id, a created_at timestamp, a data_source_config, optional metadata, a name, an object field fixed to "eval", and a testing_criteria list. The data_source_config is a union discriminated on its type field, with the variants EvalCustomDataSourceConfig, DataSourceConfigLogs, and EvalStoredCompletionsDataSourceConfig. Each entry in testing_criteria is one grader variant: LabelModelGrader, StringCheckGrader, or a TextSimilarityGrader, PythonGrader, or ScoreModelGrader extended with a pass_threshold field.
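Because data_source_config is discriminated on its type field, calling code can branch on that field to learn which variant it received. The following is a minimal sketch rather than SDK code: describe_data_source is a hypothetical helper, the type strings "custom", "logs", and "stored_completions" are assumed to correspond to the three variants, and a SimpleNamespace stands in for the real field so the snippet runs without an API call.

```python
from types import SimpleNamespace


def describe_data_source(config) -> str:
    """Return a short label for a data source config based on its `type`.

    Hypothetical helper: the type strings below are assumptions about the
    three variants, not values taken from this page.
    """
    kind = config.type
    if kind == "custom":
        return "custom data source"
    if kind == "logs":
        return "logs data source"
    if kind == "stored_completions":
        return "stored completions data source"
    # Fall through for variants this sketch does not know about.
    return f"unrecognized data source type: {kind!r}"


# Stand-in object (not a real SDK model) mimicking response.data_source_config.
fake_config = SimpleNamespace(type="logs", metadata={"usecase": "chatbot"})
print(describe_data_source(fake_config))
```

With a real response, the same helper would be called as describe_data_source(response.data_source_config).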

Usage

Import this type when inspecting the return value of client.evals.create().
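One common reason to import the type is to annotate helpers that receive the created eval. The sketch below assumes a hypothetical helper named eval_label. The import sits behind TYPE_CHECKING and the annotation is a string, so the snippet also runs where the SDK is not installed; a SimpleNamespace stands in for a real response.

```python
from types import SimpleNamespace
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed by type checkers; skipped at runtime.
    from openai.types import EvalCreateResponse


def eval_label(response: "EvalCreateResponse") -> str:
    """Build a short display label from the eval's name and id (hypothetical helper)."""
    return f"{response.name} [{response.id}]"


# Stand-in object with made-up values, used in place of a real API response.
fake_response = SimpleNamespace(id="eval_123", name="Quality Check")
print(eval_label(fake_response))
```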

Code Reference

Source Location

Signature

# Type aliases referenced by the model, listed first so the names resolve.
DataSourceConfig = Annotated[
    Union[EvalCustomDataSourceConfig, DataSourceConfigLogs, EvalStoredCompletionsDataSourceConfig],
    PropertyInfo(discriminator="type"),  # the variant is selected by its "type" field
]

TestingCriterion = Union[
    LabelModelGrader,
    StringCheckGrader,
    TestingCriterionEvalGraderTextSimilarity,
    TestingCriterionEvalGraderPython,
    TestingCriterionEvalGraderScoreModel,
]

class EvalCreateResponse(BaseModel):
    id: str
    created_at: int
    data_source_config: DataSourceConfig
    metadata: Optional[Metadata] = None
    name: str
    object: Literal["eval"]
    testing_criteria: List[TestingCriterion]

Import

from openai.types import EvalCreateResponse

I/O Contract

Fields

| Name | Type | Required | Description |
|---|---|---|---|
| id | str | Yes | Unique identifier for the evaluation |
| created_at | int | Yes | Unix timestamp (seconds) when the eval was created |
| data_source_config | DataSourceConfig | Yes | Configuration of data sources used in evaluation runs |
| metadata | Optional[Metadata] | No | Up to 16 key-value pairs for additional information |
| name | str | Yes | Name of the evaluation |
| object | Literal["eval"] | Yes | Object type, always "eval" |
| testing_criteria | List[TestingCriterion] | Yes | List of graders for the evaluation |
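Two of these fields usually need a little handling before use: created_at is a Unix timestamp in seconds, and metadata may be None. The sketch below shows one way to deal with both using only the standard library. The helper names created_at_utc and metadata_or_empty are hypothetical and not part of the SDK.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def created_at_utc(created_at: int) -> datetime:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(created_at, tz=timezone.utc)


def metadata_or_empty(metadata: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Return a copy of the metadata mapping, or an empty dict when it is None."""
    return dict(metadata) if metadata is not None else {}


print(created_at_utc(0).isoformat())  # prints 1970-01-01T00:00:00+00:00
print(metadata_or_empty(None))        # prints {}
```

With a real response these would be called as created_at_utc(response.created_at) and metadata_or_empty(response.metadata).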

Usage Examples

from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Create an eval with a logs data source and a single label-model grader.
response = client.evals.create(
    name="Quality Check",
    data_source_config={"type": "logs", "metadata": {"usecase": "chatbot"}},
    testing_criteria=[
        {
            "type": "label_model",
            "name": "accuracy",
            "model": "gpt-4o",
            "input": [{"role": "user", "content": "Evaluate: {{sample.output_text}}"}],
            "labels": ["pass", "fail"],
            "passing_labels": ["pass"],
        }
    ],
)

# The return value is an EvalCreateResponse.
print(response.id)
print(response.name)
print(response.data_source_config.type)
for criterion in response.testing_criteria:
    print(f"  Grader: {criterion}")
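Printing a whole criterion shows the full model representation, which can be noisy. As a sketch of a more compact report, the hypothetical helper below lists each grader's name and type and adds pass_threshold only when the variant carries one. It assumes every grader variant exposes name and type attributes, and it uses SimpleNamespace stand-ins with made-up values so it runs without an API call.

```python
from types import SimpleNamespace
from typing import Iterable, List


def summarize_criteria(criteria: Iterable) -> List[str]:
    """Return one summary line per grader (hypothetical helper, not SDK code)."""
    lines = []
    for criterion in criteria:
        line = f"{criterion.name} ({criterion.type})"
        # Only some grader variants define pass_threshold, and it may be None.
        threshold = getattr(criterion, "pass_threshold", None)
        if threshold is not None:
            line += f", pass_threshold={threshold}"
        lines.append(line)
    return lines


# Stand-ins for response.testing_criteria; the values are illustrative only.
fake_criteria = [
    SimpleNamespace(name="accuracy", type="label_model"),
    SimpleNamespace(name="similarity", type="text_similarity", pass_threshold=0.8),
]
for line in summarize_criteria(fake_criteria):
    print(line)
```

With a real response, the call would be summarize_criteria(response.testing_criteria).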
