Implementation:Openai Openai node Audio Resource

Knowledge Sources	Openai_Openai_node
Domains	SDK, Audio
Last Updated	2026-02-15 12:00 GMT

Overview

The Audio resource class serves as a namespace that groups together the speech, transcriptions, and translations sub-resources for the OpenAI Audio API.

Description

The Audio class extends APIResource and acts as a container that organizes the three audio-related sub-resources: Transcriptions, Translations, and Speech. It does not define any API methods of its own; instead, it instantiates each sub-resource and exposes them as properties so that callers access audio functionality through a hierarchical namespace such as client.audio.speech.create().
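
The container pattern described above can be sketched without the SDK. This is a minimal illustration, not the generated source: FakeClient, ResourceBase, and the *Sketch classes are hypothetical stand-ins, and the request method simply echoes the path it is given so the wiring is visible.

```typescript
// Hypothetical stand-in for the SDK client; it only echoes the requested path.
interface FakeClient {
  request(path: string): string;
}

// Hypothetical base class playing the role APIResource plays in the SDK.
class ResourceBase {
  protected readonly client: FakeClient;
  constructor(client: FakeClient) {
    this.client = client;
  }
}

class SpeechSketch extends ResourceBase {
  create(): string {
    return this.client.request('/audio/speech');
  }
}

class TranscriptionsSketch extends ResourceBase {
  create(): string {
    return this.client.request('/audio/transcriptions');
  }
}

class TranslationsSketch extends ResourceBase {
  create(): string {
    return this.client.request('/audio/translations');
  }
}

// The container defines no request methods of its own. It only instantiates
// each sub-resource with the shared client and exposes it as a property.
class AudioSketch extends ResourceBase {
  readonly transcriptions: TranscriptionsSketch;
  readonly translations: TranslationsSketch;
  readonly speech: SpeechSketch;

  constructor(client: FakeClient) {
    super(client);
    this.transcriptions = new TranscriptionsSketch(client);
    this.translations = new TranslationsSketch(client);
    this.speech = new SpeechSketch(client);
  }
}
```

With this shape, a call such as new AudioSketch(client).speech.create() reaches the sub-resource through the namespace, which mirrors how client.audio.speech.create() reads in caller code.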

The module also exports two key type aliases: AudioModel, a string literal union of the supported audio models (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-mini-transcribe-2025-12-15, and gpt-4o-transcribe-diarize), and AudioResponseFormat, a string literal union of the available output formats (json, text, srt, verbose_json, vtt, and diarized_json).
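
Because these aliases are compile-time types only, a value that arrives as a plain string at runtime (for example from a CLI flag) has to be checked before it can be treated as one of them. The following is a sketch under that assumption: the union is re-declared locally from the values listed above, and AUDIO_RESPONSE_FORMATS, AudioResponseFormatSketch, and isAudioResponseFormat are hypothetical helper names, not SDK exports.

```typescript
// Local copy of the format values listed in this page; not imported from the SDK.
const AUDIO_RESPONSE_FORMATS = [
  'json',
  'text',
  'srt',
  'verbose_json',
  'vtt',
  'diarized_json',
] as const;

// Derive the string literal union from the tuple so the two cannot drift apart.
type AudioResponseFormatSketch = (typeof AUDIO_RESPONSE_FORMATS)[number];

// Hypothetical type guard: narrows an arbitrary string to the union at runtime.
function isAudioResponseFormat(value: string): value is AudioResponseFormatSketch {
  return (AUDIO_RESPONSE_FORMATS as readonly string[]).includes(value);
}

// Example: fall back to 'json' when user input is not a known format.
function pickFormat(userInput: string): AudioResponseFormatSketch {
  return isAudioResponseFormat(userInput) ? userInput : 'json';
}
```

Deriving the type from a const array keeps a single list of values for both the runtime check and the static type; the SDK's own alias remains the source of truth for which formats are accepted.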

This file is auto-generated from the OpenAI OpenAPI specification by the Stainless code generator and re-exports all types from the child sub-resource modules for convenient access.

Usage

Use the Audio resource when you need to interact with any of the OpenAI Audio API endpoints. Rather than instantiating individual sub-resources, access them through the client.audio property, which provides .speech, .transcriptions, and .translations.

Code Reference

Source Location

Repository: openai-node
File: src/resources/audio/audio.ts

Signature

export class Audio extends APIResource {
  transcriptions: TranscriptionsAPI.Transcriptions;
  translations: TranslationsAPI.Translations;
  speech: SpeechAPI.Speech;
}

export type AudioModel =
  | 'whisper-1'
  | 'gpt-4o-transcribe'
  | 'gpt-4o-mini-transcribe'
  | 'gpt-4o-mini-transcribe-2025-12-15'
  | 'gpt-4o-transcribe-diarize';

export type AudioResponseFormat =
  | 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt' | 'diarized_json';

Import

import OpenAI from 'openai';

I/O Contract

Inputs

The Audio class itself does not accept direct inputs. It delegates to its sub-resources:

Sub-Resource	Access Path	Description
Speech	`client.audio.speech`	Text-to-speech generation
Transcriptions	`client.audio.transcriptions`	Audio-to-text transcription
Translations	`client.audio.translations`	Audio translation to English

Outputs

Name	Type	Description
AudioModel	string literal union	Supported audio model identifiers
AudioResponseFormat	string literal union	Supported output format options

Usage Examples

import fs from 'fs';
import OpenAI from 'openai';

const client = new OpenAI();

// Access speech sub-resource
const speech = await client.audio.speech.create({
  input: 'Hello world',
  model: 'tts-1',
  voice: 'alloy',
});

// Access transcriptions sub-resource
const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream('audio.mp3'),
  model: 'whisper-1',
});

// Access translations sub-resource
const translation = await client.audio.translations.create({
  file: fs.createReadStream('speech.mp3'),
  model: 'whisper-1',
});
