Implementation:Openai Openai node Alpha Graders
| Knowledge Sources | |
|---|---|
| Domains | SDK, Fine_Tuning, Graders |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The Graders resource class provides methods for running and validating graders used in fine-tuning evaluation via the OpenAI Fine-Tuning Alpha API.
Description
The Graders class extends APIResource and exposes two methods: run and validate. The run method sends a POST request to /fine_tuning/alpha/graders/run to execute a grader against a model sample, returning a GraderRunResponse with the computed reward score, sub-rewards, token usage per model, and detailed metadata. The validate method sends a POST to /fine_tuning/alpha/graders/validate to check whether a grader configuration is valid, returning a GraderValidateResponse.
Both methods accept a grader configuration of one of five types: StringCheckGrader (exact or pattern-based string matching), TextSimilarityGrader (scoring text against a reference with a similarity metric), PythonGrader (custom Python evaluation scripts), ScoreModelGrader (model-based scoring), or MultiGrader (combining multiple graders). These grader types are imported from the grader-models module.
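The five grader types can be told apart by a `type` discriminator. The sketch below is illustrative only: `'string_check'` appears in this page's usage example, while the other four literals are assumed from the SDK's naming pattern, and `GraderKind` and `describeGrader` are hypothetical names that are not part of the SDK.

```typescript
// ASSUMPTION: only 'string_check' is confirmed by this page; the other
// literals are assumed discriminator values for the remaining grader types.
type GraderKind = 'string_check' | 'text_similarity' | 'python' | 'score_model' | 'multi';

// Hypothetical helper: map a discriminator value to the grader type it names.
function describeGrader(kind: GraderKind): string {
  switch (kind) {
    case 'string_check':
      return 'StringCheckGrader';
    case 'text_similarity':
      return 'TextSimilarityGrader';
    case 'python':
      return 'PythonGrader';
    case 'score_model':
      return 'ScoreModelGrader';
    case 'multi':
      return 'MultiGrader';
  }
}

console.log(describeGrader('string_check'));
```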
The GraderRunResponse metadata includes detailed error tracking fields such as formula_parse_error, model_grader_parse_error, model_grader_refusal_error, python_grader_runtime_error, and several other error categories, enabling fine-grained debugging of grader execution issues.
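As a rough illustration of how these error fields might be inspected, the sketch below collects the names of the flags that are set. It assumes the flags are booleans held in a flat object keyed by the field names listed above; the actual Metadata type may nest or name them differently. `GraderErrorFlags` and `activeErrors` are hypothetical names.

```typescript
// ASSUMPTION: error flags are booleans in a flat record. This is a local
// stand-in for illustration, not the SDK's Metadata type.
type GraderErrorFlags = Record<string, boolean>;

// Return the names of all flags that are set, sorted for stable output.
function activeErrors(flags: GraderErrorFlags): string[] {
  return Object.entries(flags)
    .filter(([, isSet]) => isSet)
    .map(([name]) => name)
    .sort();
}

// Example values are made up for the sketch.
const exampleFlags: GraderErrorFlags = {
  formula_parse_error: false,
  model_grader_parse_error: false,
  model_grader_refusal_error: true,
  python_grader_runtime_error: false,
};

console.log(activeErrors(exampleFlags));
```

Logging only the active flags keeps debugging output short when most error categories are clear.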
Usage
Use the Graders resource when developing and testing evaluation criteria for fine-tuning jobs. Call run to test a grader against a specific model sample to see the reward score, and call validate to verify a grader definition is well-formed before using it in a fine-tuning job. Access via client.fineTuning.alpha.graders.
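The validate-then-run workflow described above can be sketched as follows. To keep the example runnable without network access, it is written against a small structural stand-in (`GradersLike`, a hypothetical interface) rather than the real client. It also assumes that a missing `grader` field in the validate response signals an invalid configuration; the API may instead reject the request with an error.

```typescript
// Hypothetical stand-in for client.fineTuning.alpha.graders, modeling only
// the two methods described on this page.
interface GradersLike<G> {
  validate(body: { grader: G }): Promise<{ grader?: G }>;
  run(body: { grader: G; model_sample: string; item?: unknown }): Promise<{ reward: number }>;
}

// Validate first; run only when validation echoes a grader back.
// Returns the reward, or null when the grader did not validate.
async function validateThenRun<G>(
  graders: GradersLike<G>,
  grader: G,
  modelSample: string,
  item?: unknown,
): Promise<number | null> {
  const validated = await graders.validate({ grader });
  if (!validated.grader) {
    return null; // ASSUMPTION: absent grader means the config was not accepted
  }
  const result = await graders.run({
    grader: validated.grader,
    model_sample: modelSample,
    item,
  });
  return result.reward;
}
```

Validating first avoids spending a grader run on a configuration that would not be accepted.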
Code Reference
Source Location
- Repository: openai-node
- File: src/resources/fine-tuning/alpha/graders.ts
Signature
export class Graders extends APIResource {
  run(body: GraderRunParams, options?: RequestOptions): APIPromise<GraderRunResponse>;
  validate(body: GraderValidateParams, options?: RequestOptions): APIPromise<GraderValidateResponse>;
}

export interface GraderRunResponse {
  metadata: GraderRunResponse.Metadata;
  model_grader_token_usage_per_model: { [key: string]: unknown };
  reward: number;
  sub_rewards: { [key: string]: unknown };
}

export interface GraderRunParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader
    | ScoreModelGrader | MultiGrader;
  model_sample: string;
  item?: unknown;
}

export interface GraderValidateParams {
  grader: StringCheckGrader | TextSimilarityGrader | PythonGrader
    | ScoreModelGrader | MultiGrader;
}

export interface GraderValidateResponse {
  grader?: StringCheckGrader | TextSimilarityGrader | PythonGrader
    | ScoreModelGrader | MultiGrader;
}
Import
import OpenAI from 'openai';
I/O Contract
Inputs
run:
| Name | Type | Required | Description |
|---|---|---|---|
| grader | GraderUnion | Yes | The grader configuration (StringCheckGrader, TextSimilarityGrader, PythonGrader, ScoreModelGrader, or MultiGrader) |
| model_sample | string | Yes | The model output to evaluate; populates the sample namespace |
| item | unknown | No | The dataset item provided to the grader; populates the item namespace |
validate:
| Name | Type | Required | Description |
|---|---|---|---|
| grader | GraderUnion | Yes | The grader configuration to validate |
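To illustrate how the run inputs relate to the sample and item namespaces, the sketch below builds a request body whose grader refers to both. The double-brace template syntax and the `sample.output_text` path are assumptions not confirmed by this page, and the `expected` item field, the local interfaces, and `buildExactMatchRun` are hypothetical.

```typescript
// Local sketch of the string-check grader shape, limited to the fields used
// in this page's usage example. Not the SDK's own type.
interface StringCheckGraderSketch {
  type: 'string_check';
  name: string;
  operation: 'eq';
  input: string;
  reference: string;
}

interface RunParamsSketch {
  grader: StringCheckGraderSketch;
  model_sample: string;
  item?: unknown;
}

// Hypothetical helper: build a run body that compares the model output
// against an expected value carried in the dataset item.
function buildExactMatchRun(modelSample: string, expected: string): RunParamsSketch {
  return {
    grader: {
      type: 'string_check',
      name: 'exact_match',
      operation: 'eq',
      // ASSUMPTION: templates resolve against the sample and item namespaces.
      input: '{{sample.output_text}}',
      reference: '{{item.expected}}',
    },
    model_sample: modelSample, // populates the sample namespace
    item: { expected }, // populates the item namespace
  };
}

console.log(JSON.stringify(buildExactMatchRun('Paris', 'Paris'), null, 2));
```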
Outputs
run:
| Name | Type | Description |
|---|---|---|
| reward | number | The computed reward score from grader evaluation |
| sub_rewards | object | Individual sub-reward scores from each grader component |
| model_grader_token_usage_per_model | object | Token usage breakdown by model for model-based graders |
| metadata | GraderRunResponse.Metadata | Execution details including errors, timing, scores, and token usage |
validate:
| Name | Type | Description |
|---|---|---|
| grader | GraderUnion (optional) | The validated grader configuration (if valid) |
Usage Examples
import OpenAI from 'openai';

const client = new OpenAI();

// Run a string check grader
const runResult = await client.fineTuning.alpha.graders.run({
  grader: {
    input: 'input',
    name: 'exact_match',
    operation: 'eq',
    reference: 'reference',
    type: 'string_check',
  },
  model_sample: 'The model output to evaluate',
});
console.log('Reward:', runResult.reward);
console.log('Execution time:', runResult.metadata.execution_time);

// Validate a grader configuration
const validateResult = await client.fineTuning.alpha.graders.validate({
  grader: {
    input: 'input',
    name: 'exact_match',
    operation: 'eq',
    reference: 'reference',
    type: 'string_check',
  },
});
console.log('Grader valid:', !!validateResult.grader);