Implementation:Openai Openai node Eval Run OutputItems
| Knowledge Sources | |
|---|---|
| Domains | SDK, Evals |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The OutputItems class is the Eval Run Output Items resource in the openai-node SDK, providing methods to retrieve and list individual output items from an evaluation run.
Description
The OutputItems class extends APIResource and wraps the /evals/{eval_id}/runs/{run_id}/output_items REST endpoints. It is accessed via client.evals.runs.outputItems and provides two methods: retrieve for fetching a single output item by ID, and list for paginating output items within a run.
Each output item (represented by OutputItemRetrieveResponse or OutputItemListResponse) contains a complete record of one data point that was evaluated during a run. The output item includes: id, created_at, datasource_item (the original input data), datasource_item_id, eval_id, run_id, object ('eval.run.output_item'), status, a results array of grader results, and a sample object.
The results array contains grader results, each with name, passed (boolean), score (numeric), optional sample data, and optional type. The sample object contains the full input/output context: input messages, output messages, model, error (an EvalAPIError), finish_reason, max_completion_tokens, seed, temperature, top_p, and usage token details.
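Because every output item carries its own results array, per-grader statistics can be derived client-side once the items are collected. The following is a minimal sketch, assuming items have already been fetched; the `GraderResultLike` and `GradedItem` interfaces and the `summarizeByGrader` helper are hypothetical names that mirror only the fields described above and are not part of the SDK.

```typescript
// Simplified local shapes (assumption): only the fields named in the description.
interface GraderResultLike {
  name: string;
  passed: boolean;
  score: number;
}

interface GradedItem {
  id: string;
  status: string;
  results: GraderResultLike[];
}

interface GraderSummary {
  total: number;     // number of results seen for this grader
  passed: number;    // how many of them had passed === true
  meanScore: number; // arithmetic mean of score
}

// Hypothetical helper: aggregate pass counts and mean score per grader name.
function summarizeByGrader(items: GradedItem[]): Map<string, GraderSummary> {
  const sums = new Map<string, { total: number; passed: number; scoreSum: number }>();
  for (const item of items) {
    for (const result of item.results) {
      const entry = sums.get(result.name) ?? { total: 0, passed: 0, scoreSum: 0 };
      entry.total += 1;
      if (result.passed) entry.passed += 1;
      entry.scoreSum += result.score;
      sums.set(result.name, entry);
    }
  }

  const summary = new Map<string, GraderSummary>();
  sums.forEach((s, name) => {
    summary.set(name, { total: s.total, passed: s.passed, meanScore: s.scoreSum / s.total });
  });
  return summary;
}
```

Items returned by list could be pushed into an array and passed to this helper, since they expose the same name, passed, and score fields on each result.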
Usage
Use this resource to inspect the detailed results of an evaluation run, including the original input data, model outputs, and grader scores for each individual data point.
Code Reference
Source Location
- Repository: openai-node
- File: src/resources/evals/runs/output-items.ts
Signature
export class OutputItems extends APIResource {
  retrieve(
    outputItemID: string,
    params: OutputItemRetrieveParams,
    options?: RequestOptions,
  ): APIPromise<OutputItemRetrieveResponse>;

  list(
    runID: string,
    params: OutputItemListParams,
    options?: RequestOptions,
  ): PagePromise<OutputItemListResponsesPage, OutputItemListResponse>;
}
Import
import OpenAI from 'openai';
// Access via client.evals.runs.outputItems
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| outputItemID (retrieve) | string | Yes | The ID of the output item to retrieve |
| eval_id (retrieve/list) | string | Yes | The ID of the evaluation |
| run_id (retrieve) | string | Yes | The ID of the run |
| runID (list) | string | Yes | The ID of the run to list output items for |
| order (list) | 'asc' \| 'desc' | No | Sort order by timestamp (defaults to 'asc') |
| status (list) | 'fail' \| 'pass' | No | Filter output items by pass/fail status |
| after (list) | string | No | Cursor for pagination: the ID of the last item from the previous page |
| limit (list) | number | No | Number of items per page |
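To make the interaction of order, status, after, and limit concrete, here is a minimal in-memory sketch of cursor pagination. It is synchronous and never contacts the API; `listPage`, `collectAll`, the `has_more` flag on the page shape, and the default limit of 20 are all assumptions made for illustration. With the real client, iterating the result of list with `for await` handles page fetching automatically.

```typescript
// Hypothetical shapes for illustration only; the SDK defines its own page types.
interface ListParamsLike {
  after?: string;
  limit?: number;
  order?: 'asc' | 'desc';
  status?: 'fail' | 'pass';
}

interface PageItem {
  id: string;
  created_at: number;
  status: 'fail' | 'pass';
}

interface PageLike {
  data: PageItem[];
  has_more: boolean;
}

// In-memory stand-in for a single list request (not an SDK function).
function listPage(all: PageItem[], params: ListParamsLike = {}): PageLike {
  const { after, limit = 20, order = 'asc', status } = params;
  const sorted = [...all].sort((a, b) =>
    order === 'asc' ? a.created_at - b.created_at : b.created_at - a.created_at,
  );
  const filtered = status ? sorted.filter((i) => i.status === status) : sorted;
  // Start just past the cursor item; an unknown cursor starts from the beginning.
  const start = after ? filtered.findIndex((i) => i.id === after) + 1 : 0;
  const data = filtered.slice(start, start + limit);
  return { data, has_more: start + data.length < filtered.length };
}

// Follow the cursor until no further pages remain.
function collectAll(all: PageItem[], params: ListParamsLike = {}): PageItem[] {
  const collected: PageItem[] = [];
  let after = params.after;
  while (true) {
    const page = listPage(all, { ...params, after });
    collected.push(...page.data);
    const last = page.data[page.data.length - 1];
    if (!page.has_more || !last) break;
    after = last.id; // the last item's ID becomes the next cursor
  }
  return collected;
}
```

The loop shows why after takes an item ID rather than a page number: each request resumes immediately after the last item already seen, under whatever order and status filter is in effect.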
Outputs
| Name | Type | Description |
|---|---|---|
| OutputItemRetrieveResponse | OutputItemRetrieveResponse | Single output item with id, created_at, datasource_item, eval_id, run_id, results, sample, status |
| OutputItemListResponse | OutputItemListResponse | One output item within a paginated list; same fields as the retrieve response |
| results[].name | string | The name of the grader that produced this result |
| results[].passed | boolean | Whether the grader considered the output a pass |
| results[].score | number | The numeric score produced by the grader |
| sample.input | Array<{ content: string; role: string }> | Input messages sent to the model |
| sample.output | Array<{ content?: string; role?: string }> | Output messages generated by the model |
| sample.usage | { cached_tokens, completion_tokens, prompt_tokens, total_tokens } | Token usage details |
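The sample fields in the table above lend themselves to simple post-processing. The sketch below assumes items have already been fetched and uses hypothetical local types (`UsageLike`, `SampledItem`) plus two hypothetical helpers, `sumUsage` and `outputText`. It totals the four usage counters across items and joins the text of the output messages, whose content is optional.

```typescript
// Simplified local shapes (assumption): they mirror only the sample fields listed above.
interface UsageLike {
  cached_tokens: number;
  completion_tokens: number;
  prompt_tokens: number;
  total_tokens: number;
}

interface SampledItem {
  sample: {
    output: Array<{ content?: string; role?: string }>;
    usage: UsageLike;
  };
}

// Hypothetical helper: add up each usage counter across all items.
function sumUsage(items: SampledItem[]): UsageLike {
  const totals: UsageLike = {
    cached_tokens: 0,
    completion_tokens: 0,
    prompt_tokens: 0,
    total_tokens: 0,
  };
  for (const { sample } of items) {
    totals.cached_tokens += sample.usage.cached_tokens;
    totals.completion_tokens += sample.usage.completion_tokens;
    totals.prompt_tokens += sample.usage.prompt_tokens;
    totals.total_tokens += sample.usage.total_tokens;
  }
  return totals;
}

// Hypothetical helper: join the text of all output messages, skipping those without content.
function outputText(item: SampledItem): string {
  return item.sample.output
    .map((message) => message.content)
    .filter((content): content is string => typeof content === 'string')
    .join('\n');
}
```

Keeping the helpers typed against small local interfaces means they accept both retrieve and list responses, as long as those objects carry the fields shown here.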
Usage Examples
import OpenAI from 'openai';

const client = new OpenAI();
const evalId = 'eval_abc123';
const runId = 'run_xyz789';

// List all output items for a run
for await (const item of client.evals.runs.outputItems.list(runId, {
  eval_id: evalId,
  order: 'asc',
})) {
  console.log(item.id, item.status);
  for (const result of item.results) {
    console.log(`  Grader: ${result.name}, Passed: ${result.passed}, Score: ${result.score}`);
  }
}

// List only failed output items
for await (const item of client.evals.runs.outputItems.list(runId, {
  eval_id: evalId,
  status: 'fail',
})) {
  console.log('Failed item:', item.datasource_item);
  console.log('Model output:', item.sample.output);
}

// Retrieve a specific output item
const outputItem = await client.evals.runs.outputItems.retrieve('output_item_456', {
  eval_id: evalId,
  run_id: runId,
});
console.log(outputItem.sample.input);
console.log(outputItem.sample.output);
console.log(outputItem.results);