Implementation:Openai Openai python Speech Create
| Knowledge Sources | |
|---|---|
| Domains | Audio, Speech_Synthesis |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Concrete tool, provided by the OpenAI Python SDK, for generating speech audio from text.
Description
The Speech resource provides a create() method that synthesizes text into audio using OpenAI's TTS models. It supports multiple voices, output formats (MP3, Opus, AAC, FLAC, WAV, PCM), adjustable speed, and voice style instructions. The response is binary audio data that can be saved to a file or streamed for real-time playback.
Usage
Call client.audio.speech.create() with text input, model, and voice selection. Save the response to a file or use .with_streaming_response for streaming playback.
Code Reference
Source Location
- Repository: openai-python
- File: src/openai/resources/audio/speech.py
- Lines: L48-122 (sync), L145-219 (async)
Signature
class Speech(SyncAPIResource):
    def create(
        self,
        *,
        input: str,
        model: Union[str, SpeechModel],
        voice: Union[str, Literal["alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", "verse", "marin", "cedar"]],
        instructions: str | NotGiven = NOT_GIVEN,
        response_format: Literal["mp3", "opus", "aac", "flac", "wav", "pcm"] | NotGiven = NOT_GIVEN,
        speed: float | NotGiven = NOT_GIVEN,
    ) -> HttpxBinaryResponseContent:
        """
        Generates audio from text input.

        Args:
            input: Text to synthesize (max 4096 characters).
            model: TTS model (e.g. "tts-1", "tts-1-hd", "gpt-4o-mini-tts").
            voice: Voice selection.
            instructions: Voice style instructions.
            response_format: Output format (default "mp3").
            speed: Playback speed 0.25-4.0 (default 1.0).
        """
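Because `input` is capped at 4096 characters, longer text has to be split before it is sent. The following is a minimal sketch of one way to do that, not part of the SDK: `split_for_tts` and `MAX_INPUT_CHARS` are hypothetical names, and the sentence-boundary heuristic (break after `.`, `!` or `?`) is an assumption that may not suit every language.

```python
import re
from typing import List

MAX_INPUT_CHARS = 4096  # limit stated in the create() docstring


def split_for_tts(text: str, limit: int = MAX_INPUT_CHARS) -> List[str]:
    """Split text into pieces of at most `limit` characters, preferring sentence ends."""
    # Break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned piece could then be passed as `input` to a separate `create()` call, with the resulting audio segments concatenated by the caller.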
Import
from openai import OpenAI
# Access via client.audio.speech.create()
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input | str | Yes | Text to synthesize (max 4096 chars) |
| model | Union[str, SpeechModel] | Yes | TTS model (e.g. tts-1, tts-1-hd, gpt-4o-mini-tts) |
| voice | str | Yes | Voice (alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, cedar) |
| instructions | str | No | Voice style instructions (per the API reference, not supported by tts-1 or tts-1-hd) |
| response_format | str | No | Output format: mp3, opus, aac, flac, wav, pcm (default mp3) |
| speed | float | No | Speed 0.25-4.0 (default 1.0) |
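The limits in the table above can be checked locally before a request is made. This is a hedged sketch rather than SDK behavior: `check_speech_args` is a hypothetical helper, and the API performs its own validation regardless.

```python
from typing import Optional

ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def check_speech_args(
    text: str,
    response_format: Optional[str] = None,
    speed: Optional[float] = None,
) -> None:
    """Raise ValueError if an argument falls outside the limits in the Inputs table."""
    if not text:
        raise ValueError("input must be a non-empty string")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input has {len(text)} characters; the limit is {MAX_INPUT_CHARS}")
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if speed is not None and not (MIN_SPEED <= speed <= MAX_SPEED):
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}")
```

Calling `check_speech_args(text, response_format="opus", speed=0.9)` before `client.audio.speech.create(...)` surfaces obvious mistakes without a network round trip.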
Outputs
| Name | Type | Description |
|---|---|---|
| audio | HttpxBinaryResponseContent | Binary audio data (streamable) |
Usage Examples
Save to File
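from pathlib import Path

from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.audio.speech.create(
    model="tts-1-hd",
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)

# The full audio is already in memory here, so write it out in one step.
# (stream_to_file on a non-streaming response is deprecated in the SDK.)
response.write_to_file(Path("output.mp3"))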
Streaming Playback
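from openai import OpenAI

client = OpenAI()

def play_audio(chunk: bytes) -> None:
    """Hypothetical placeholder: hand raw PCM bytes to an audio output device."""

# Stream the response so playback can begin before synthesis finishes.
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="nova",
    input="Streaming audio playback in real time.",
    response_format="pcm",
) as response:
    for chunk in response.iter_bytes(chunk_size=1024):
        play_audio(chunk)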
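Raw `pcm` output has no header, so most players cannot open it directly. The sketch below wraps streamed PCM chunks in a WAV container using the standard library. It assumes the layout described in OpenAI's text-to-speech guide (24 kHz, 16-bit signed little-endian, mono); `write_pcm_as_wav` is a hypothetical helper, not an SDK function.

```python
import wave
from typing import Iterable

# Assumed PCM layout: 24 kHz, 16-bit signed little-endian, mono.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def write_pcm_as_wav(chunks: Iterable[bytes], path: str) -> int:
    """Wrap raw PCM chunks in a WAV container; return the number of PCM bytes written."""
    total = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        for chunk in chunks:
            # The wave module fixes up the header sizes when the file is closed.
            wav_file.writeframes(chunk)
            total += len(chunk)
    return total
```

Inside the streaming `with` block, this could be called as `write_pcm_as_wav(response.iter_bytes(chunk_size=1024), "output.wav")`.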
Custom Voice Style
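# Per the API reference, `instructions` is not supported by tts-1 or tts-1-hd,
# so this example uses gpt-4o-mini-tts. Reuses `client` from the example above.
response = client.audio.speech.create(
    model="gpt-4o-mini-tts",
    voice="shimmer",
    input="Welcome to the future of voice AI.",
    instructions="Speak in a warm, enthusiastic tone.",
    speed=0.9,
    response_format="opus",
)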
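When optional parameters come from user settings, it can help to assemble the keyword arguments in one place and leave out anything that was not set, so the API defaults apply. The following is a sketch under stated assumptions: `build_speech_kwargs` and `MODELS_WITHOUT_INSTRUCTIONS` are hypothetical names, and the model restriction reflects the API reference's note that `instructions` does not work with `tts-1` or `tts-1-hd`.

```python
from typing import Any, Dict, Optional

# Assumption: these models do not support `instructions`.
MODELS_WITHOUT_INSTRUCTIONS = {"tts-1", "tts-1-hd"}


def build_speech_kwargs(
    text: str,
    model: str,
    voice: str,
    instructions: Optional[str] = None,
    response_format: Optional[str] = None,
    speed: Optional[float] = None,
) -> Dict[str, Any]:
    """Return keyword arguments for create(), omitting options left as None."""
    if instructions is not None and model in MODELS_WITHOUT_INSTRUCTIONS:
        raise ValueError(f"model {model!r} is assumed not to support instructions")
    kwargs: Dict[str, Any] = {"input": text, "model": model, "voice": voice}
    optional = {
        "instructions": instructions,
        "response_format": response_format,
        "speed": speed,
    }
    # Only forward options the caller actually set.
    kwargs.update({name: value for name, value in optional.items() if value is not None})
    return kwargs
```

The result can be unpacked into the call, for example `client.audio.speech.create(**build_speech_kwargs("Hello", "gpt-4o-mini-tts", "shimmer", instructions="Speak warmly."))`.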
Related Pages
Implements Principle
Requires Environment