Implementation:Openai Openai python Eval Create Response

Knowledge Sources	Openai_Openai_python OpenAI API Reference
Domains	API_Types, Python
Last Updated	2026-02-15 00:00 GMT

Overview

Concrete type for the evaluation creation response object provided by the openai-python SDK.

Description

The EvalCreateResponse Pydantic model represents the Eval object returned after creating an evaluation. An Eval describes a task for your LLM integration, such as improving chatbot quality or comparing models. The model carries an id, a created_at timestamp, a data_source_config (a union of EvalCustomDataSourceConfig, DataSourceConfigLogs, and EvalStoredCompletionsDataSourceConfig, discriminated on type), optional metadata, a name, an object field fixed to "eval", and a testing_criteria list of grader variants: LabelModelGrader, StringCheckGrader, and versions of TextSimilarityGrader, PythonGrader, and ScoreModelGrader extended with a pass_threshold field.
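Because data_source_config is a union discriminated on type, calling code usually branches on that field. The following is a minimal sketch of such a branch, not part of the SDK: describe_data_source is a hypothetical helper, the value "logs" comes from the usage example on this page, and "custom" and "stored_completions" are assumed from the class names. A stand-in object replaces a real response so the sketch runs without an API call.

```python
from types import SimpleNamespace

# Discriminator values; "logs" appears in the usage example on this page,
# the other two are assumptions based on the class names.
_DESCRIPTIONS = {
    "custom": "custom schema supplied by the caller",
    "logs": "logs data source",
    "stored_completions": "stored completions data source",
}


def describe_data_source(config) -> str:
    """Return a short label for a data_source_config variant.

    `config` is any object exposing a `type` attribute, such as
    `response.data_source_config`.
    """
    kind = getattr(config, "type", None)
    return _DESCRIPTIONS.get(kind, f"unrecognized data source type: {kind!r}")


# Stand-in object so the sketch runs without an API call.
fake_config = SimpleNamespace(type="logs", metadata={"usecase": "chatbot"})
print(describe_data_source(fake_config))
```

Falling back to a descriptive string for unknown values, rather than raising, keeps the helper tolerant of variants added in later SDK versions.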

Usage

Import this type when inspecting the return value of client.evals.create().
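One way to use the type is as an annotation on code that consumes the response. The sketch below assumes a hypothetical helper named eval_summary; the import sits behind TYPE_CHECKING so the function itself runs even where the SDK is not installed, and a stand-in object with the same attribute names is used in place of a real response.

```python
from types import SimpleNamespace
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed by type checkers; the function below runs without it.
    from openai.types import EvalCreateResponse


def eval_summary(response: "EvalCreateResponse") -> str:
    """Format the identifying fields of a created eval (hypothetical helper)."""
    count = len(response.testing_criteria)
    return f"{response.name} [{response.id}] with {count} grader(s)"


# Stand-in with the same attribute names, so no API call is needed here.
fake = SimpleNamespace(id="eval_123", name="Quality Check", testing_criteria=[object()])
print(eval_summary(fake))
```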

Code Reference

Source Location

Repository: openai-python
File: src/openai/types/eval_create_response.py

Signature

class EvalCreateResponse(BaseModel):
    id: str
    created_at: int
    data_source_config: DataSourceConfig
    metadata: Optional[Metadata] = None
    name: str
    object: Literal["eval"]
    testing_criteria: List[TestingCriterion]

DataSourceConfig = Annotated[
    Union[EvalCustomDataSourceConfig, DataSourceConfigLogs, EvalStoredCompletionsDataSourceConfig],
    PropertyInfo(discriminator="type"),
]

TestingCriterion = Union[
    LabelModelGrader, StringCheckGrader,
    TestingCriterionEvalGraderTextSimilarity,
    TestingCriterionEvalGraderPython,
    TestingCriterionEvalGraderScoreModel,
]
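Since TestingCriterion mixes grader variants with and without pass_threshold, code that walks testing_criteria may want to read that field defensively. The following sketch assumes a hypothetical helper, grader_thresholds, and assumes every variant exposes name and type; the "score_model" value in the stand-in data is likewise an assumption.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Optional, Tuple


def grader_thresholds(criteria: Iterable[Any]) -> List[Tuple[str, str, Optional[float]]]:
    """Return (name, type, pass_threshold) for each grader.

    pass_threshold is read with getattr because only some grader variants
    carry it; None means the attribute is absent or unset.
    """
    return [
        (c.name, c.type, getattr(c, "pass_threshold", None))
        for c in criteria
    ]


# Stand-ins mimicking two grader variants; "score_model" is an assumed value.
fake_criteria = [
    SimpleNamespace(name="accuracy", type="label_model"),
    SimpleNamespace(name="quality", type="score_model", pass_threshold=0.8),
]
for row in grader_thresholds(fake_criteria):
    print(row)
```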

Import

from openai.types import EvalCreateResponse

I/O Contract

Fields

Name	Type	Required	Description
id	str	Yes	Unique identifier for the evaluation
created_at	int	Yes	Unix timestamp (seconds) when the eval was created
data_source_config	DataSourceConfig	Yes	Configuration of data sources used in evaluation runs
metadata	Optional[Metadata]	No	Up to 16 key-value pairs for additional information
name	str	Yes	Name of the evaluation
object	Literal["eval"]	Yes	Object type, always "eval"
testing_criteria	List[TestingCriterion]	Yes	List of graders for the evaluation
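Two of these fields often need light post-processing: created_at is a Unix timestamp in seconds, and metadata may be None. The helpers below are a small sketch using only the standard library; the names created_at_utc and metadata_or_empty are hypothetical and not part of the SDK.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def created_at_utc(created_at: int) -> datetime:
    """Convert the created_at Unix timestamp (seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(created_at, tz=timezone.utc)


def metadata_or_empty(metadata: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Return a copy of metadata, or an empty dict when the field is None."""
    return dict(metadata) if metadata is not None else {}


print(created_at_utc(0).isoformat())
print(metadata_or_empty(None))
```

Returning a copy from metadata_or_empty keeps later edits by the caller from touching the response object's own dictionary.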

Usage Examples

from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Create an eval over logged data, graded by a single label-model grader.
response = client.evals.create(
    name="Quality Check",
    data_source_config={"type": "logs", "metadata": {"usecase": "chatbot"}},
    testing_criteria=[
        {
            "type": "label_model",
            "name": "accuracy",
            "model": "gpt-4o",
            "input": [{"role": "user", "content": "Evaluate: {{sample.output_text}}"}],
            "labels": ["pass", "fail"],
            "passing_labels": ["pass"],
        }
    ],
)

# The return value is an EvalCreateResponse; its fields are plain attributes.
print(response.id)
print(response.name)
print(response.data_source_config.type)
for criterion in response.testing_criteria:
    print(f"  Grader: {criterion}")

Related Pages

Environment:Openai_Openai_python_Python_3_9_Plus
