Implementation:Openai Openai node Audio Resource
| Knowledge Sources | |
|---|---|
| Domains | SDK, Audio |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The Audio resource class serves as a namespace that groups together the speech, transcriptions, and translations sub-resources for the OpenAI Audio API.
Description
The Audio class extends APIResource and acts as a container that organizes the three audio-related sub-resources: Transcriptions, Translations, and Speech. It does not define any API methods of its own; instead, it instantiates each sub-resource and exposes them as properties so that callers access audio functionality through a hierarchical namespace such as client.audio.speech.create().
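The container pattern described above can be illustrated with a minimal, self-contained sketch. Every name in it (ClientLike, ResourceSketch, SpeechSketch, TranscriptionsSketch, TranslationsSketch, AudioSketch, and the endpoint() helper) is a hypothetical stand-in rather than part of the openai package, and the base URL is a placeholder. It only shows how a parent resource can instantiate its children and expose them as properties without defining API methods of its own.

```typescript
// Hypothetical stand-ins: none of these types come from the openai package.
interface ClientLike {
  baseURL: string;
}

// Assumed base class: it only stores the client handed to it.
class ResourceSketch {
  protected client: ClientLike;

  constructor(client: ClientLike) {
    this.client = client;
  }
}

// Each sub-resource reports the path it would call; no request is sent.
class SpeechSketch extends ResourceSketch {
  endpoint(): string {
    return `${this.client.baseURL}/audio/speech`;
  }
}

class TranscriptionsSketch extends ResourceSketch {
  endpoint(): string {
    return `${this.client.baseURL}/audio/transcriptions`;
  }
}

class TranslationsSketch extends ResourceSketch {
  endpoint(): string {
    return `${this.client.baseURL}/audio/translations`;
  }
}

// The container defines no methods; it only wires up its three children.
class AudioSketch extends ResourceSketch {
  transcriptions: TranscriptionsSketch = new TranscriptionsSketch(this.client);
  translations: TranslationsSketch = new TranslationsSketch(this.client);
  speech: SpeechSketch = new SpeechSketch(this.client);
}

const audio = new AudioSketch({ baseURL: 'https://example.test/v1' });
console.log(audio.speech.endpoint()); // prints "https://example.test/v1/audio/speech"
```

Because all three children receive the same client object, configuration set once on the client reaches every sub-resource, which is one plausible reason for grouping them under a single namespace.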
The class also exports two key type aliases: AudioModel, which enumerates the supported audio models (including whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-mini-transcribe-2025-12-15, and gpt-4o-transcribe-diarize), and AudioResponseFormat, which lists the available output formats (json, text, srt, verbose_json, vtt, and diarized_json).
This file is auto-generated from the OpenAI OpenAPI specification by the Stainless code generator and re-exports all types from the child sub-resource modules for convenient access.
Usage
Use the Audio resource when you need to interact with any of the OpenAI Audio API endpoints. Rather than instantiating individual sub-resources, access them through the client.audio property, which provides .speech, .transcriptions, and .translations.
Code Reference
Source Location
- Repository: openai-node
- File: src/resources/audio/audio.ts
Signature
export class Audio extends APIResource {
  transcriptions: TranscriptionsAPI.Transcriptions;
  translations: TranslationsAPI.Translations;
  speech: SpeechAPI.Speech;
}

export type AudioModel =
  | 'whisper-1'
  | 'gpt-4o-transcribe'
  | 'gpt-4o-mini-transcribe'
  | 'gpt-4o-mini-transcribe-2025-12-15'
  | 'gpt-4o-transcribe-diarize';

export type AudioResponseFormat =
  | 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt' | 'diarized_json';
Import
import OpenAI from 'openai';
I/O Contract
Inputs
The Audio class itself does not accept direct inputs. It delegates to its sub-resources:
| Sub-Resource | Access Path | Description |
|---|---|---|
| Speech | client.audio.speech | Text-to-speech generation |
| Transcriptions | client.audio.transcriptions | Audio-to-text transcription |
| Translations | client.audio.translations | Audio translation to English |
Outputs
| Name | Type | Description |
|---|---|---|
| AudioModel | string union | Type alias listing the supported audio model identifiers |
| AudioResponseFormat | string union | Type alias listing the supported output format options |
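Because both aliases are string unions that exist only at compile time, values arriving as plain strings (for example from configuration or user input) cannot be checked against them at runtime without a separate list. The sketch below shows one way to do that, under stated assumptions: the two unions are copied locally from the Signature section so the block needs no imports, and the arrays and guard functions (AUDIO_MODELS, AUDIO_RESPONSE_FORMATS, isAudioModel, isAudioResponseFormat) are hypothetical helpers, not exports of the openai package.

```typescript
// Local copies of the unions shown in the Signature section.
type AudioModel =
  | 'whisper-1'
  | 'gpt-4o-transcribe'
  | 'gpt-4o-mini-transcribe'
  | 'gpt-4o-mini-transcribe-2025-12-15'
  | 'gpt-4o-transcribe-diarize';

type AudioResponseFormat =
  | 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt' | 'diarized_json';

// Hypothetical runtime lists that mirror the unions above.
const AUDIO_MODELS: readonly AudioModel[] = [
  'whisper-1',
  'gpt-4o-transcribe',
  'gpt-4o-mini-transcribe',
  'gpt-4o-mini-transcribe-2025-12-15',
  'gpt-4o-transcribe-diarize',
];

const AUDIO_RESPONSE_FORMATS: readonly AudioResponseFormat[] = [
  'json', 'text', 'srt', 'verbose_json', 'vtt', 'diarized_json',
];

// Type guards: narrow an arbitrary string to one of the unions.
function isAudioModel(value: string): value is AudioModel {
  return (AUDIO_MODELS as readonly string[]).includes(value);
}

function isAudioResponseFormat(value: string): value is AudioResponseFormat {
  return (AUDIO_RESPONSE_FORMATS as readonly string[]).includes(value);
}

const requested: string = 'gpt-4o-transcribe';
if (isAudioModel(requested)) {
  // Inside this branch, `requested` has type AudioModel.
  console.log(`${requested} is a listed audio model`);
}
```

Note that 'tts-1', the model used with the speech sub-resource, is not a member of the AudioModel union as listed here, so isAudioModel('tts-1') returns false.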
Usage Examples
import fs from 'fs';
import OpenAI from 'openai';

// Reads the API key from the OPENAI_API_KEY environment variable by default
const client = new OpenAI();

// Access speech sub-resource
const speech = await client.audio.speech.create({
  input: 'Hello world',
  model: 'tts-1',
  voice: 'alloy',
});

// Access transcriptions sub-resource
const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream('audio.mp3'),
  model: 'whisper-1',
});

// Access translations sub-resource
const translation = await client.audio.translations.create({
  file: fs.createReadStream('speech.mp3'),
  model: 'whisper-1',
});