
Implementation:Openai Openai python Speech Create

From Leeroopedia
Knowledge Sources
Domains Audio, Speech_Synthesis
Last Updated 2026-02-15 00:00 GMT

Overview

Concrete tool, provided by the OpenAI Python SDK, for generating speech audio from text.

Description

The Speech resource provides a create() method that synthesizes text into audio using OpenAI's TTS models. It supports multiple voices, output formats (MP3, Opus, AAC, FLAC, WAV, PCM), adjustable speed, and voice style instructions. The response is binary audio data that can be saved to a file or streamed for real-time playback.

Usage

Call client.audio.speech.create() with the input text, a model, and a voice. Write the returned audio to a file, or call the method through .with_streaming_response to process audio chunks as they arrive (for example, for real-time playback).

Code Reference

Source Location

  • Repository: openai-python
  • File: src/openai/resources/audio/speech.py
  • Lines: L48-122 (sync), L145-219 (async)

Signature

class Speech(SyncAPIResource):
    def create(
        self,
        *,
        input: str,
        model: Union[str, SpeechModel],
        voice: Union[str, Literal["alloy", "ash", "ballad", "coral", "echo", "sage", "shimmer", "verse", "marin", "cedar"]],
        instructions: str | NotGiven = NOT_GIVEN,
        response_format: Literal["mp3", "opus", "aac", "flac", "wav", "pcm"] | NotGiven = NOT_GIVEN,
        speed: float | NotGiven = NOT_GIVEN,
    ) -> HttpxBinaryResponseContent:
        """
        Generates audio from text input.

        Args:
            input: Text to synthesize (max 4096 characters).
            model: TTS model ("tts-1", "tts-1-hd", "gpt-4o-mini-tts").
            voice: Voice selection.
            instructions: Voice style instructions (not supported by "tts-1" or "tts-1-hd").
            response_format: Output format (default "mp3").
            speed: Playback speed 0.25-4.0 (default 1.0).
        """
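The limits stated in the docstring above can be checked locally before a request is sent. The following is a minimal sketch, not part of the SDK: the helper name validate_speech_args and the constants are hypothetical, and the limits are simply copied from this page (4096 characters, speed 0.25-4.0, six output formats).

```python
# Hypothetical pre-flight check; the limits are taken from the docstring above.
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def validate_speech_args(input_text, response_format="mp3", speed=1.0):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    if len(input_text) > MAX_INPUT_CHARS:
        problems.append(
            f"input has {len(input_text)} characters (max {MAX_INPUT_CHARS})"
        )
    if response_format not in ALLOWED_FORMATS:
        problems.append(f"unsupported response_format: {response_format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        problems.append(f"speed {speed} is outside {MIN_SPEED}-{MAX_SPEED}")
    return problems
```

Running such a check first surfaces all problems at once instead of one API error per attempt; the server remains the authority on what is actually accepted.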

Import

from openai import OpenAI
# Access via client.audio.speech.create()

I/O Contract

Inputs

Name | Type | Required | Description
input | str | Yes | Text to synthesize (max 4096 characters)
model | Union[str, SpeechModel] | Yes | TTS model (tts-1, tts-1-hd, gpt-4o-mini-tts)
voice | str | Yes | Voice (alloy, ash, ballad, coral, echo, sage, shimmer, verse, marin, cedar)
instructions | str | No | Voice style instructions (not supported by tts-1 or tts-1-hd)
response_format | str | No | Output format: mp3, opus, aac, flac, wav, pcm (default mp3)
speed | float | No | Speed 0.25-4.0 (default 1.0)
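Because input is capped at 4096 characters, longer text has to be split into several requests. One possible approach, sketched here with the standard library only, is to wrap the text at whitespace boundaries; the function name split_for_tts is hypothetical, and the limit is the one given in the table above.

```python
import textwrap

MAX_INPUT_CHARS = 4096  # limit stated in the Inputs table


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters.

    textwrap.wrap breaks on whitespace where it can, hard-splits words that
    are longer than the limit, and converts newlines to spaces.
    """
    return textwrap.wrap(text, width=limit)
```

Each chunk can then be passed as input to a separate create() call. Splitting at sentence boundaries would likely sound more natural, but that is beyond this sketch.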

Outputs

Name | Type | Description
audio | HttpxBinaryResponseContent | Binary audio data (streamable)

Usage Examples

Save to File

from openai import OpenAI
from pathlib import Path

client = OpenAI()
response = client.audio.speech.create(
    model="tts-1-hd",
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)
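# write_to_file saves the fully downloaded audio. stream_to_file on a
# non-streaming response is deprecated in the SDK.
response.write_to_file(Path("output.mp3"))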

Streaming Playback

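from openai import OpenAI

client = OpenAI()

# play_audio is a placeholder for your own audio sink; it is not part of the SDK.
with client.audio.speech.with_streaming_response.create(
    model="tts-1",
    voice="nova",
    input="Streaming audio playback in real time.",
    response_format="pcm",  # raw samples without a container header
) as response:
    for chunk in response.iter_bytes(chunk_size=1024):
        play_audio(chunk)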
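The pcm format carries raw samples with no header, so most players cannot open the bytes directly. A minimal sketch for wrapping collected PCM bytes in a WAV container with the standard library follows. The function name pcm_to_wav is hypothetical, and the defaults (24 kHz, 16-bit, mono) are an assumption about the API's PCM output that should be checked against the current OpenAI documentation.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=24_000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the WAV bytes.

    The defaults (24 kHz, mono, 2 bytes per sample) are assumptions about
    the API's pcm output, not values confirmed by this page.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

For example, the chunks from the streaming loop above could be joined with b"".join(...) and passed to pcm_to_wav to produce a file that standard players accept.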

Custom Voice Style

# instructions has no effect on tts-1 or tts-1-hd, so use a model that supports it.
response = client.audio.speech.create(
    model="gpt-4o-mini-tts",
    voice="shimmer",
    input="Welcome to the future of voice AI.",
    instructions="Speak in a warm, enthusiastic tone.",
    speed=0.9,
    response_format="opus",
)

Related Pages

Implements Principle

Requires Environment
