Implementation:Openai Openai node Translations Resource
| Knowledge Sources | |
|---|---|
| Domains | SDK, Audio, Translation |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The Translations resource class provides the create method for translating audio files into English via the OpenAI Audio Translations API.
Description
The Translations class extends APIResource and exposes a single create method with several overloaded signatures. It sends a multipart form POST request to the /audio/translations endpoint. The method accepts an audio file and returns a result whose type depends on the requested response_format: a Translation object for json (also used when response_format is omitted), a TranslationVerbose object for verbose_json, or a raw string for text, srt, or vtt.
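The relationship between response_format and the result type can be mirrored at runtime. The following is a minimal sketch rather than SDK code: `TranslationFormat`, `ResultKind`, and `resultKindFor` are hypothetical names introduced here for illustration, and the mapping only restates the overloads described above.

```typescript
// Formats listed for the translations endpoint on this page.
type TranslationFormat = 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt';

// Labels for the three result shapes described above.
type ResultKind = 'Translation' | 'TranslationVerbose' | 'string';

// Hypothetical helper (not part of openai-node): reports which result shape
// a given response_format is expected to produce.
function resultKindFor(format?: TranslationFormat): ResultKind {
  switch (format) {
    case undefined: // format omitted: handled like json by the first overload
    case 'json':
      return 'Translation';
    case 'verbose_json':
      return 'TranslationVerbose';
    default: // 'text', 'srt', 'vtt'
      return 'string';
  }
}
```

A helper like this can be useful when the format is only known at runtime and the static overloads cannot narrow the result type.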
The Translation interface contains a simple text field with the translated text, while TranslationVerbose adds duration, language, and optional segments metadata. The TranslationCreateParams interface is generic over ResponseFormat and accepts parameters for the audio file, model, optional prompt, response format, and sampling temperature.
Currently only the whisper-1 model is supported for translations. The audio file can be in formats including flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
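A caller may want to reject obviously unsupported files before uploading. The sketch below assumes the file type can be judged from its name alone, which is a simplification; `SUPPORTED_EXTENSIONS` and `hasSupportedAudioExtension` are hypothetical names, and the list is copied from this page rather than read from the SDK.

```typescript
// Extensions listed in the description above (copied from this page).
const SUPPORTED_EXTENSIONS: readonly string[] = [
  'flac', 'mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'ogg', 'wav', 'webm',
];

// Hypothetical helper (not part of openai-node): returns true when the file
// name ends in one of the listed extensions, compared case-insensitively.
function hasSupportedAudioExtension(fileName: string): boolean {
  const dot = fileName.lastIndexOf('.');
  if (dot < 0) return false; // no extension at all
  const ext = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(ext) !== -1;
}
```

This checks only the name, not the actual encoding, so the API may still reject a file that passes.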
Usage
Use the Translations resource when you need to translate non-English audio into English text. Access it via client.audio.translations.create() and provide an audio file along with the model identifier. Choose the response format based on whether you need plain text, JSON, subtitles (SRT/VTT), or verbose metadata.
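For the common case of wanting only the English text, the call can be wrapped in a small function. This is a sketch under stated assumptions: `TranslationsLike` is a hypothetical structural stand-in for the part of client.audio.translations used here (the real SDK types are richer), and `translateToEnglish` is an illustrative name, not an SDK function.

```typescript
// Hypothetical minimal view of the translations resource: only the default
// JSON call shape is modelled.
interface TranslationsLike {
  create(body: { file: unknown; model: string }): Promise<{ text: string }>;
}

// Hypothetical wrapper: requests the default (json) format and returns the
// translated text with surrounding whitespace removed.
async function translateToEnglish(
  translations: TranslationsLike,
  file: unknown,
): Promise<string> {
  const result = await translations.create({ file, model: 'whisper-1' });
  return result.text.trim();
}
```

Accepting the resource as a parameter keeps the function easy to exercise with a stub. With a real client, the intended argument would be client.audio.translations, assuming the structural type lines up with the SDK's overloads.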
Code Reference
Source Location
- Repository: openai-node
- File: src/resources/audio/translations.ts
Signature
export class Translations extends APIResource {
  create(body: TranslationCreateParams<'json' | undefined>, options?: RequestOptions): APIPromise<Translation>;
  create(body: TranslationCreateParams<'verbose_json'>, options?: RequestOptions): APIPromise<TranslationVerbose>;
  create(body: TranslationCreateParams<'text' | 'srt' | 'vtt'>, options?: RequestOptions): APIPromise<string>;
  create(body: TranslationCreateParams, options?: RequestOptions): APIPromise<Translation>;
}

export interface Translation {
  text: string;
}

export interface TranslationVerbose {
  duration: number;
  language: string;
  text: string;
  segments?: Array<TranscriptionSegment>;
}

export interface TranslationCreateParams<ResponseFormat> {
  file: Uploadable;
  model: (string & {}) | AudioModel;
  prompt?: string;
  response_format?: 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt';
  temperature?: number;
}
Import
import OpenAI from 'openai';
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| file | Uploadable | Yes | The audio file to translate (flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm) |
| model | AudioModel | Yes | Model ID to use (currently only whisper-1) |
| prompt | string | No | Optional text to guide style or continue a previous segment (should be in English) |
| response_format | string | No | Output format: json, text, srt, verbose_json, or vtt |
| temperature | number | No | Sampling temperature between 0 and 1 |
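The constraints in the table can be checked before a request is sent. The following is a minimal sketch, not SDK behaviour: `TranslationRequestSketch` and `checkTranslationRequest` are hypothetical names, the file is typed as unknown to keep the example self-contained, and only the rules stated in the table are enforced.

```typescript
// Hypothetical local mirror of the inputs listed in the table above.
interface TranslationRequestSketch {
  file: unknown;
  model: string;
  prompt?: string;
  response_format?: 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt';
  temperature?: number;
}

// Hypothetical helper: returns a list of problems; an empty list means the
// request satisfies the constraints stated in the table.
function checkTranslationRequest(req: TranslationRequestSketch): string[] {
  const problems: string[] = [];
  if (req.file === undefined || req.file === null) {
    problems.push('file is required');
  }
  if (!req.model) {
    problems.push('model is required');
  }
  // Written as a negated range check so that NaN is rejected as well.
  if (req.temperature !== undefined && !(req.temperature >= 0 && req.temperature <= 1)) {
    problems.push('temperature must be between 0 and 1');
  }
  return problems;
}
```

Returning a list instead of throwing lets a caller report every problem at once.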
Outputs
| Name | Type | Description |
|---|---|---|
| Translation | { text: string } | Simple translation result (for json format) |
| TranslationVerbose | TranslationVerbose | Detailed result with duration, language, text, and optional segments (for verbose_json format) |
| string | string | Raw text output (for text, srt, or vtt formats) |
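When the response format is not known statically, code may receive any of the three outputs and need to tell them apart. The sketch below assumes local stand-in types (`TranslationShape`, `TranslationVerboseShape`) that copy only the fields listed on this page; `isVerboseResult` and `rawText` are hypothetical helpers, not SDK exports.

```typescript
// Local stand-ins for the documented result shapes (fields copied from this page).
interface TranslationShape {
  text: string;
}
interface TranslationVerboseShape {
  duration: number;
  language: string;
  text: string;
  segments?: unknown[];
}
type AnyTranslationResult = string | TranslationShape | TranslationVerboseShape;

// Type guard: true when the result carries the verbose metadata fields.
function isVerboseResult(result: AnyTranslationResult): result is TranslationVerboseShape {
  return (
    typeof result === 'object' &&
    result !== null &&
    'duration' in result &&
    'language' in result
  );
}

// Returns the textual payload of any result. For srt and vtt the string is
// returned unchanged, so it still contains cue numbers and timestamps.
function rawText(result: AnyTranslationResult): string {
  return typeof result === 'string' ? result : result.text;
}
```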
Usage Examples
import OpenAI from 'openai';
import fs from 'fs';

const client = new OpenAI();

// Basic translation (returns JSON)
const translation = await client.audio.translations.create({
  file: fs.createReadStream('speech.mp3'),
  model: 'whisper-1',
});
console.log(translation.text);

// Verbose translation with metadata
const verbose = await client.audio.translations.create({
  file: fs.createReadStream('speech.mp3'),
  model: 'whisper-1',
  response_format: 'verbose_json',
});
console.log(verbose.duration, verbose.language, verbose.text);

// Get SRT subtitle format
const srt = await client.audio.translations.create({
  file: fs.createReadStream('speech.mp3'),
  model: 'whisper-1',
  response_format: 'srt',
});
console.log(srt); // raw SRT string