Implementation:Openai Openai python Eval Create Response
| Knowledge Sources | |
|---|---|
| Domains | API_Types, Python |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Concrete type for the evaluation creation response object provided by the openai-python SDK.
Description
The EvalCreateResponse Pydantic model represents the Eval object returned after creating an evaluation. An Eval represents a task for your LLM integration, such as improving chatbot quality or comparing models. The model carries an id, a created_at timestamp, a data_source_config, optional metadata, a name, an object field fixed to "eval", and a testing_criteria list. The data_source_config field is a union of EvalCustomDataSourceConfig, DataSourceConfigLogs, and EvalStoredCompletionsDataSourceConfig, discriminated on type. Each entry in testing_criteria is one grader variant: LabelModelGrader, StringCheckGrader, or a TextSimilarityGrader, PythonGrader, or ScoreModelGrader extended with pass_threshold.
Usage
Import this type when inspecting the return value of client.evals.create().
Code Reference
Source Location
- Repository: openai-python
- File: src/openai/types/eval_create_response.py
Signature
class EvalCreateResponse(BaseModel):
    id: str
    created_at: int
    data_source_config: DataSourceConfig
    metadata: Optional[Metadata] = None
    name: str
    object: Literal["eval"]
    testing_criteria: List[TestingCriterion]

DataSourceConfig = Annotated[
    Union[EvalCustomDataSourceConfig, DataSourceConfigLogs, EvalStoredCompletionsDataSourceConfig],
    PropertyInfo(discriminator="type"),
]

TestingCriterion = Union[
    LabelModelGrader,
    StringCheckGrader,
    TestingCriterionEvalGraderTextSimilarity,
    TestingCriterionEvalGraderPython,
    TestingCriterionEvalGraderScoreModel,
]
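The discriminator="type" annotation means the value of the type field decides which data source config variant a payload is parsed into. The sketch below illustrates that idea with standard-library dataclasses only. It is not how the SDK implements parsing: the classes CustomConfig, LogsConfig, and StoredCompletionsConfig, the VARIANTS table, and parse_data_source_config are hypothetical stand-ins, and the tag values "custom", "logs", and "stored_completions" are assumed to match the three variants.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional


# Hypothetical stand-ins for the three SDK variants; field sets are simplified.
@dataclass
class CustomConfig:
    type: str
    schema: Dict[str, Any] = field(default_factory=dict)


@dataclass
class LogsConfig:
    type: str
    schema: Dict[str, Any] = field(default_factory=dict)
    metadata: Optional[Dict[str, str]] = None


@dataclass
class StoredCompletionsConfig:
    type: str
    schema: Dict[str, Any] = field(default_factory=dict)
    metadata: Optional[Dict[str, str]] = None


# Assumed mapping from the "type" tag to the variant class.
VARIANTS = {
    "custom": CustomConfig,
    "logs": LogsConfig,
    "stored_completions": StoredCompletionsConfig,
}


def parse_data_source_config(payload: Dict[str, Any]) -> Any:
    """Pick a variant class from payload["type"] and build it from known fields."""
    tag = payload.get("type")
    if tag not in VARIANTS:
        raise ValueError(f"unknown data source config type: {tag!r}")
    variant = VARIANTS[tag]
    known = {f.name for f in fields(variant)}
    # Ignore keys the stand-in class does not declare.
    return variant(**{k: v for k, v in payload.items() if k in known})


config = parse_data_source_config(
    {"type": "logs", "schema": {}, "metadata": {"usecase": "chatbot"}}
)
print(type(config).__name__, config.metadata)
```

Dispatching on a single tag field keeps the parser from having to try each variant in turn, which is the usual motivation for a discriminated union.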
Import
from openai.types import EvalCreateResponse
I/O Contract
Fields
| Name | Type | Required | Description |
|---|---|---|---|
| id | str | Yes | Unique identifier for the evaluation |
| created_at | int | Yes | Unix timestamp (seconds) when the eval was created |
| data_source_config | DataSourceConfig | Yes | Configuration of data sources used in evaluation runs |
| metadata | Optional[Metadata] | No | Up to 16 key-value pairs for additional information |
| name | str | Yes | Name of the evaluation |
| object | Literal["eval"] | Yes | Object type, always "eval" |
| testing_criteria | List[TestingCriterion] | Yes | List of graders for the evaluation |
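Because created_at is a Unix timestamp in seconds, it usually needs converting before it is displayed or compared. A minimal sketch using only the standard library follows; the helper name created_at_to_datetime and the sample value are illustrative assumptions, not part of the SDK.

```python
from datetime import datetime, timezone


def created_at_to_datetime(created_at: int) -> datetime:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(created_at, tz=timezone.utc)


# Hypothetical value standing in for response.created_at.
sample_created_at = 0
print(created_at_to_datetime(sample_created_at).isoformat())
# prints "1970-01-01T00:00:00+00:00"
```

Passing tz=timezone.utc avoids the implicit local-time conversion that datetime.fromtimestamp performs when no time zone is given.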
Usage Examples
from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Create an eval with a logs data source and one label_model grader.
response = client.evals.create(
    name="Quality Check",
    data_source_config={"type": "logs", "metadata": {"usecase": "chatbot"}},
    testing_criteria=[
        {
            "type": "label_model",
            "name": "accuracy",
            "model": "gpt-4o",
            "input": [{"role": "user", "content": "Evaluate: {{sample.output_text}}"}],
            "labels": ["pass", "fail"],
            "passing_labels": ["pass"],
        }
    ],
)

# response is an EvalCreateResponse instance.
print(response.id)
print(response.name)
print(response.data_source_config.type)
for criterion in response.testing_criteria:
    print(f" Grader: {criterion}")
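Since testing_criteria mixes several grader variants, code that inspects it typically branches on each entry's type field. The sketch below shows one way to do that without calling the API. The SimpleNamespace objects are hypothetical stand-ins for the SDK grader models, describe_criterion is an illustrative helper, and the type values "label_model", "text_similarity", and "string_check" are assumed to match the grader variants listed above.

```python
from types import SimpleNamespace
from typing import Any


def describe_criterion(criterion: Any) -> str:
    """Return a one-line summary of a grader, branching on its type field."""
    kind = getattr(criterion, "type", "unknown")
    name = getattr(criterion, "name", "<unnamed>")
    if kind == "label_model":
        labels = ", ".join(getattr(criterion, "passing_labels", []))
        return f"{name}: label_model grader, passing labels [{labels}]"
    # Only some variants carry pass_threshold, so read it defensively.
    threshold = getattr(criterion, "pass_threshold", None)
    if threshold is not None:
        return f"{name}: {kind} grader, pass_threshold={threshold}"
    return f"{name}: {kind} grader"


# Hypothetical stand-ins for entries of response.testing_criteria.
criteria = [
    SimpleNamespace(type="label_model", name="accuracy", passing_labels=["pass"]),
    SimpleNamespace(type="text_similarity", name="closeness", pass_threshold=0.8),
    SimpleNamespace(type="string_check", name="exact"),
]

for criterion in criteria:
    print(describe_criterion(criterion))
```

Using getattr with a default keeps the helper tolerant of variants that omit a field, at the cost of losing the static type narrowing that isinstance checks against the SDK classes would give.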