Implementation:Openai Openai node Eval Runs Resource

From Leeroopedia
Knowledge Sources
Domains SDK, Evals
Last Updated 2026-02-15 12:00 GMT

Overview

The Runs class is the Eval Runs resource in the openai-node SDK, providing methods to create, retrieve, list, delete, and cancel evaluation runs that execute an evaluation against specific data sources and model configurations.

Description

The Runs class extends APIResource and wraps the /evals/{evalID}/runs REST endpoints. It is accessed via client.evals.runs and provides five methods: create for kicking off a new run, retrieve for fetching a run by ID, list for paginating runs within an evaluation, delete for removing a run, and cancel for stopping an ongoing run.
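To make the mapping between the five methods and the REST endpoints concrete, here is a minimal sketch. The helper runEndpoint and its types are hypothetical and not part of the SDK, and the HTTP verbs (in particular POST for cancel) are an assumption based on the public REST API rather than something stated on this page. It also shows why retrieve, delete, and cancel need both a run ID and an evaluation ID.

```typescript
// Hypothetical illustration only: the SDK builds these requests internally.
type RunOperation = 'create' | 'retrieve' | 'list' | 'delete' | 'cancel';

interface EndpointSketch {
  httpMethod: 'GET' | 'POST' | 'DELETE';
  path: string;
}

// Assumed verb/path mapping for the /evals/{eval_id}/runs endpoints.
function runEndpoint(op: RunOperation, evalId: string, runId?: string): EndpointSketch {
  const collection = `/evals/${encodeURIComponent(evalId)}/runs`;
  // Collection-level operations only need the evaluation ID.
  if (op === 'create') return { httpMethod: 'POST', path: collection };
  if (op === 'list') return { httpMethod: 'GET', path: collection };
  // Everything else addresses a single run inside the evaluation.
  if (runId === undefined) throw new Error(`${op} requires a run ID`);
  const single = `${collection}/${encodeURIComponent(runId)}`;
  if (op === 'retrieve') return { httpMethod: 'GET', path: single };
  if (op === 'delete') return { httpMethod: 'DELETE', path: single };
  return { httpMethod: 'POST', path: single }; // cancel
}

console.log(runEndpoint('cancel', 'eval_abc123', 'run_xyz789'));
```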

The create method accepts a RunCreateParams body with a required data_source field. The data source can be a CreateEvalJSONLRunDataSource (inline or file-based JSONL data), a CreateEvalCompletionsRunDataSource (completions-based with model sampling configuration), or a CreateEvalResponsesRunDataSource (responses-based with model sampling configuration). Completions and Responses data sources support input_messages (either a template with variable references or an item reference), a model name, and sampling_params (max_completion_tokens, reasoning_effort, seed, temperature, top_p, text format config, and tools).
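As a rough illustration of the completions-style data source described above, the sketch below assembles one from plain rows. The local types and the buildCompletionsDataSource helper are hypothetical stand-ins written for this example, not the SDK's exported types, and only a subset of the sampling parameters is modeled.

```typescript
// Local structural types mirroring the shapes described above. They are
// illustrative stand-ins, not the SDK's exported types.
type TemplateMessage = {
  role: 'system' | 'developer' | 'user' | 'assistant';
  content: string;
};

interface SamplingParamsSketch {
  max_completion_tokens?: number;
  seed?: number;
  temperature?: number;
  top_p?: number;
}

interface CompletionsDataSourceSketch {
  type: 'completions';
  source: { type: 'file_content'; content: Array<{ item: Record<string, unknown> }> };
  model: string;
  input_messages: { type: 'template'; template: TemplateMessage[] };
  sampling_params?: SamplingParamsSketch;
}

// Hypothetical helper: wraps plain rows as { item } entries and attaches a
// prompt template that references row fields via {{item.<field>}}.
function buildCompletionsDataSource(
  rows: Array<Record<string, unknown>>,
  model: string,
  template: TemplateMessage[],
  sampling: SamplingParamsSketch = {},
): CompletionsDataSourceSketch {
  return {
    type: 'completions',
    source: { type: 'file_content', content: rows.map((row) => ({ item: row })) },
    model,
    input_messages: { type: 'template', template },
    sampling_params: sampling,
  };
}

const gravityDataSource = buildCompletionsDataSource(
  [{ question: 'Explain gravity.' }],
  'gpt-4o',
  [{ role: 'user', content: '{{item.question}}' }],
  { temperature: 0, seed: 42, max_completion_tokens: 256 },
);
console.log(JSON.stringify(gravityDataSource, null, 2));
```

The resulting object is intended to be passed as the data_source field of the create body. Fixing seed and temperature is one way to make repeated runs easier to compare, though the page does not say how deterministic sampling is in practice.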

The resource exposes an outputItems sub-resource (of type OutputItems) for inspecting individual output items within a run. Run response objects (RunCreateResponse, RunRetrieveResponse, RunListResponse) include fields such as id, created_at, eval_id, metadata, model, name, status (queued, in_progress, completed, canceled, failed), data_source, error, per_grader_metrics, per_model_usage, result_counts, and report_url.
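Because a run moves through the statuses listed above, callers often need to wait until it settles. The following is a minimal polling sketch under two assumptions: that completed, canceled, and failed are the terminal statuses, and that fetching the run repeatedly is acceptable. The names RunSnapshot, isTerminalStatus, and waitForRun are hypothetical helpers, not SDK functions.

```typescript
// Status values as listed in the paragraph above.
type RunStatus = 'queued' | 'in_progress' | 'completed' | 'canceled' | 'failed';

interface RunSnapshot {
  id: string;
  status: RunStatus;
}

// Assumption: these three statuses are final and will not change again.
function isTerminalStatus(status: RunStatus): boolean {
  return status === 'completed' || status === 'canceled' || status === 'failed';
}

// Hypothetical helper: calls `fetchRun` until the run reaches a terminal
// status or `maxAttempts` is exhausted. `fetchRun` is injected so the loop
// can wrap client.evals.runs.retrieve or a test double.
async function waitForRun(
  fetchRun: () => Promise<RunSnapshot>,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<RunSnapshot> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = await fetchRun();
    if (isTerminalStatus(snapshot.status)) return snapshot;
    // Sleep between polls, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Run did not finish after ${maxAttempts} attempts`);
}
```

With a real client, fetchRun could be a closure such as () => client.evals.runs.retrieve(runId, { eval_id: evalId }), where runId and evalId are placeholders for existing IDs.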

Usage

Use this resource to execute evaluations against specific data and model configurations. After creating an eval via client.evals.create, use client.evals.runs.create to run the evaluation with your chosen data source and model, then inspect results with the outputItems sub-resource.

Code Reference

Source Location

Signature

export class Runs extends APIResource {
  outputItems: OutputItemsAPI.OutputItems;

  create(
    evalID: string,
    body: RunCreateParams,
    options?: RequestOptions,
  ): APIPromise<RunCreateResponse>;

  retrieve(
    runID: string,
    params: RunRetrieveParams,
    options?: RequestOptions,
  ): APIPromise<RunRetrieveResponse>;

  list(
    evalID: string,
    query?: RunListParams | null,
    options?: RequestOptions,
  ): PagePromise<RunListResponsesPage, RunListResponse>;

  delete(
    runID: string,
    params: RunDeleteParams,
    options?: RequestOptions,
  ): APIPromise<RunDeleteResponse>;

  cancel(
    runID: string,
    params: RunCancelParams,
    options?: RequestOptions,
  ): APIPromise<RunCancelResponse>;
}

Import

import OpenAI from 'openai';
// Access via client.evals.runs

I/O Contract

Inputs

Name Type Required Description
evalID (create/list) string Yes The ID of the evaluation to run against
data_source (create) CreateEvalJSONLRunDataSource | CreateEvalCompletionsRunDataSource | CreateEvalResponsesRunDataSource Yes Data source configuration: JSONL data, completions-based, or responses-based
name (create) string No Optional name for the run
metadata (create) Shared.Metadata | null No Up to 16 key-value pairs attached to the run for structured storage
runID (retrieve/delete/cancel) string Yes The ID of the run
eval_id (retrieve/delete/cancel) string Yes The ID of the evaluation (passed in params)
order (list) 'asc' | 'desc' No Sort order by timestamp (defaults to 'asc')
status (list) 'queued' | 'in_progress' | 'completed' | 'canceled' | 'failed' No Filter runs by status

Outputs

Name Type Description
RunCreateResponse RunCreateResponse Created run with id, created_at, eval_id, metadata, model, name, status, data_source, per_grader_metrics, per_model_usage, result_counts, report_url, error
RunRetrieveResponse RunRetrieveResponse Retrieved run with same structure
RunListResponse RunListResponse Paginated list item with same structure
RunDeleteResponse RunDeleteResponse Object with deleted (boolean), object, and run_id
RunCancelResponse RunCancelResponse Canceled run with same structure as create response
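The result_counts field on a run response can be condensed into a one-line summary. The sketch below assumes result_counts carries errored, failed, passed, and total as numbers; those field names are not spelled out on this page, so treat the interface as an assumption. passRate and summarizeCounts are hypothetical helpers.

```typescript
// Assumed shape of result_counts; the field names are not listed on this page.
interface ResultCountsSketch {
  errored: number;
  failed: number;
  passed: number;
  total: number;
}

// Hypothetical helper: share of all items that passed, or null when the
// run has no counted items yet (avoids dividing by zero).
function passRate(counts: ResultCountsSketch): number | null {
  return counts.total > 0 ? counts.passed / counts.total : null;
}

// Formats the counts as a single human-readable line.
function summarizeCounts(counts: ResultCountsSketch): string {
  const rate = passRate(counts);
  const pct = rate === null ? 'n/a' : `${(rate * 100).toFixed(1)}%`;
  return `passed ${counts.passed}/${counts.total} (${pct}), failed ${counts.failed}, errored ${counts.errored}`;
}

console.log(summarizeCounts({ errored: 1, failed: 2, passed: 7, total: 10 }));
// prints "passed 7/10 (70.0%), failed 2, errored 1"
```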

Usage Examples

import OpenAI from 'openai';

const client = new OpenAI();

const evalId = 'eval_abc123';

// Create a run with JSONL inline data
const run = await client.evals.runs.create(evalId, {
  name: 'Test Run v1',
  data_source: {
    type: 'jsonl',
    source: {
      type: 'file_content',
      content: [
        { item: { question: 'What is 2+2?', expected_answer: '4' } },
        { item: { question: 'Capital of France?', expected_answer: 'Paris' } },
      ],
    },
  },
});
console.log(run.id, run.status);

// Create a run with completions data source
const completionsRun = await client.evals.runs.create(evalId, {
  data_source: {
    type: 'completions',
    source: {
      type: 'file_content',
      content: [
        { item: { question: 'Explain gravity.' } },
      ],
    },
    model: 'gpt-4o',
    input_messages: {
      type: 'template',
      template: [
        { role: 'user', content: '{{item.question}}' },
      ],
    },
  },
});

// Retrieve a run
const retrieved = await client.evals.runs.retrieve(run.id, {
  eval_id: evalId,
});
console.log(retrieved.status, retrieved.result_counts);

// List runs for an evaluation
for await (const r of client.evals.runs.list(evalId, { order: 'desc' })) {
  console.log(r.id, r.status, r.name);
}

// Cancel an ongoing run
const canceled = await client.evals.runs.cancel(run.id, {
  eval_id: evalId,
});

// Delete a run
const deleted = await client.evals.runs.delete(run.id, {
  eval_id: evalId,
});
