Implementation:Openai Openai node Speech Resource
| Knowledge Sources | |
|---|---|
| Domains | SDK, Audio, Text_to_Speech |
| Last Updated | 2026-02-15 12:00 GMT |
Overview
The Speech resource class provides the create method for generating audio from input text via the OpenAI text-to-speech API.
Description
The Speech class extends APIResource and exposes a single create method that sends a POST request to the /audio/speech endpoint. The method accepts a SpeechCreateParams body and returns a binary Response object wrapped in an APIPromise. The response contains raw audio data in the requested format (defaulting to MP3), which can be consumed as a blob, array buffer, or streamed.
The class supports multiple TTS models including tts-1, tts-1-hd, gpt-4o-mini-tts, and gpt-4o-mini-tts-2025-12-15. It offers a selection of built-in voices such as alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, and cedar. Optional parameters control the output format, playback speed, voice instructions (not supported for tts-1 or tts-1-hd), and stream format.
The request sets the Accept: application/octet-stream header and uses the __binaryResponse flag to indicate that the response should be treated as binary data rather than parsed as JSON.
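To make the request shape concrete, here is a minimal sketch of a roughly equivalent raw HTTP request. It is not the SDK's implementation: the helper name `buildSpeechRequest`, the `RawRequest` type, the default base URL, and the placeholder API key are all assumptions for illustration, and the SDK handles authentication, retries, and header merging on its own.

```typescript
// Hypothetical helper: describes the HTTP request that roughly corresponds to
// client.audio.speech.create(). It only builds the request; nothing is sent.
interface SpeechRequestBody {
  input: string;
  model: string;
  voice: string;
  response_format?: 'mp3' | 'opus' | 'aac' | 'flac' | 'wav' | 'pcm';
}

interface RawRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function buildSpeechRequest(
  body: SpeechRequestBody,
  apiKey: string,
  baseURL = 'https://api.openai.com/v1', // assumed default base URL
): RawRequest {
  return {
    url: `${baseURL}/audio/speech`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
      // Ask for raw bytes; the response body is audio, not JSON.
      Accept: 'application/octet-stream',
    },
    body: JSON.stringify(body),
  };
}

const request = buildSpeechRequest(
  { input: 'Hello', model: 'tts-1', voice: 'alloy' },
  'sk-example', // placeholder, not a real key
);
console.log(request.method, request.url);
```

Because the response is binary, a caller of such a request would read it with arrayBuffer() or blob() rather than json(), which is what the __binaryResponse flag arranges inside the SDK.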
Usage
Use the Speech resource when you need to convert text to spoken audio. Access it via client.audio.speech.create() and provide the input text, a TTS model, and a voice selection. The resulting binary response can be saved to a file or streamed to an audio player.
Code Reference
Source Location
- Repository: openai-node
- File: src/resources/audio/speech.ts
Signature
export class Speech extends APIResource {
  create(body: SpeechCreateParams, options?: RequestOptions): APIPromise<Response>;
}

export type SpeechModel = 'tts-1' | 'tts-1-hd' | 'gpt-4o-mini-tts' | 'gpt-4o-mini-tts-2025-12-15';

export interface SpeechCreateParams {
  input: string;
  model: (string & {}) | SpeechModel;
  voice:
    | (string & {})
    | 'alloy' | 'ash' | 'ballad' | 'coral' | 'echo'
    | 'sage' | 'shimmer' | 'verse' | 'marin' | 'cedar';
  instructions?: string;
  response_format?: 'mp3' | 'opus' | 'aac' | 'flac' | 'wav' | 'pcm';
  speed?: number;
  stream_format?: 'sse' | 'audio';
}
Import
import OpenAI from 'openai';
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input | string | Yes | The text to generate audio for (max 4096 characters) |
| model | SpeechModel | Yes | TTS model to use (e.g., tts-1, tts-1-hd, gpt-4o-mini-tts) |
| voice | VoiceUnion | Yes | The voice to use (e.g., alloy, ash, coral, shimmer) |
| instructions | string | No | Additional voice control instructions (not supported for tts-1 or tts-1-hd) |
| response_format | string | No | Audio output format: mp3, opus, aac, flac, wav, or pcm |
| speed | number | No | Playback speed from 0.25 to 4.0 (default 1.0) |
| stream_format | string | No | Stream format: sse or audio |
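The constraints listed for these inputs can be checked before a request is sent. The sketch below is an illustration only: `validateSpeechParams` and the `SpeechParams` type are hypothetical names, not part of the SDK, and the API remains the authority on what it accepts.

```typescript
// Hypothetical pre-flight check mirroring the documented input constraints.
interface SpeechParams {
  input: string;
  model: string;
  voice: string;
  instructions?: string;
  speed?: number;
}

// Returns a list of human-readable problems; an empty list means no issue found.
function validateSpeechParams(params: SpeechParams): string[] {
  const problems: string[] = [];

  // input is limited to 4096 characters.
  if (params.input.length > 4096) {
    problems.push('input exceeds 4096 characters');
  }

  // speed must lie in the range 0.25 to 4.0 when provided.
  if (params.speed !== undefined && (params.speed < 0.25 || params.speed > 4.0)) {
    problems.push('speed must be between 0.25 and 4.0');
  }

  // instructions are not supported for tts-1 or tts-1-hd.
  const modelsWithoutInstructions = ['tts-1', 'tts-1-hd'];
  if (params.instructions !== undefined && modelsWithoutInstructions.includes(params.model)) {
    problems.push(`instructions are not supported for ${params.model}`);
  }

  return problems;
}

const problems = validateSpeechParams({
  input: 'Hello there',
  model: 'tts-1',
  voice: 'alloy',
  instructions: 'Speak cheerfully',
  speed: 5,
});
console.log(problems);
```

Returning a list rather than throwing on the first problem lets a caller report every issue at once; either style would work.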
Outputs
| Name | Type | Description |
|---|---|---|
| Response | Response | A binary HTTP Response object containing the generated audio data |
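Since the output is a standard Response, its body can also be read incrementally instead of all at once. The following is a minimal sketch assuming a runtime with the Fetch API globals (for example Node.js 18 or later); `collectAudio` is a hypothetical helper name, and the demo uses a locally constructed Response in place of a real API call.

```typescript
// Hypothetical helper: reads a binary Response chunk by chunk and returns
// the concatenated bytes. A Response body can only be consumed once.
async function collectAudio(response: Response): Promise<Uint8Array> {
  if (!response.body) {
    // No stream available: fall back to reading the whole body at once.
    return new Uint8Array(await response.arrayBuffer());
  }

  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;

  for (;;) {
    const { done, value } = await reader.read();
    if (done || !value) break;
    chunks.push(value);
    total += value.length;
  }

  // Concatenate the chunks into one contiguous byte array.
  const audio = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    audio.set(chunk, offset);
    offset += chunk.length;
  }
  return audio;
}

// Stand-in for the Response returned by client.audio.speech.create().
const demoResponse = new Response(new Uint8Array([1, 2, 3, 4]));
collectAudio(demoResponse).then((audio) => {
  console.log('bytes received:', audio.length);
});
```

In a real application each chunk could be written to a file or handed to an audio player as it arrives, rather than buffered in memory as done here.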
Usage Examples
import fs from 'node:fs';
import OpenAI from 'openai';

// Reads the API key from the OPENAI_API_KEY environment variable by default
const client = new OpenAI();

const speech = await client.audio.speech.create({
  input: 'Today is a wonderful day to build something people love!',
  model: 'tts-1',
  voice: 'alloy',
});

// Get as a Blob
const blob = await speech.blob();
console.log('Audio blob size:', blob.size);

// Or save to a file in Node.js. The response body can only be read once,
// so reuse the Blob here instead of also calling speech.arrayBuffer().
const buffer = Buffer.from(await blob.arrayBuffer());
await fs.promises.writeFile('output.mp3', buffer);