Implementation: elevenlabs-python RealtimeTextToSpeechClient.convert_realtime
| Knowledge Sources | |
|---|---|
| Domains | Speech_Synthesis, Streaming, WebSocket |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Concrete tool for streaming text-to-speech synthesis over WebSocket provided by the elevenlabs-python SDK.
Description
The RealtimeTextToSpeechClient.convert_realtime method opens a synchronous WebSocket connection to the ElevenLabs streaming TTS endpoint and processes text chunks in real time. It extends TextToSpeechClient to add WebSocket capabilities. Internally, it uses the text_chunker utility to buffer text at sentence boundaries before sending, and performs non-blocking receives (10ms timeout) between sends to yield audio as early as possible.
The method is a Python generator that yields base64-decoded audio byte chunks. After all text is sent, it sends an empty-text flush signal and performs a blocking drain to collect remaining audio until the server closes the connection (code 1000).
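The sentence-boundary buffering described above can be sketched as follows. This is a simplified illustration of what a chunker like text_chunker does, not the SDK's actual implementation:

```python
from typing import Iterator


def chunk_text(chunks: Iterator[str]) -> Iterator[str]:
    """Buffer incoming text and yield it at sentence boundaries.

    Simplified sketch: the real text_chunker in elevenlabs-python may
    use a different splitter set and buffering policy.
    """
    splitters = (".", "!", "?", ";", ":")
    buffer = ""
    for text in chunks:
        buffer += text
        # Flush everything up to the last sentence boundary seen so far.
        last = max(buffer.rfind(s) for s in splitters)
        if last != -1:
            yield buffer[: last + 1]
            buffer = buffer[last + 1:]
    if buffer:
        yield buffer  # final flush of any trailing text
```

Buffering this way keeps each WebSocket send prosodically coherent, which generally yields more natural-sounding synthesis than sending arbitrary fragments.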
Usage
Use this method when you have a streaming text source (Iterator[str]) such as output from an LLM, and need real-time audio generation. The method returns an Iterator[bytes] that can be passed directly to play(), save(), or stream() utilities.
Code Reference
Source Location
- Repository: elevenlabs-python
- File: src/elevenlabs/realtime_tts.py
- Lines: L42-145
Signature
class RealtimeTextToSpeechClient(TextToSpeechClient):
    def convert_realtime(
        self,
        voice_id: str,
        *,
        text: typing.Iterator[str],
        model_id: typing.Optional[str] = OMIT,
        output_format: typing.Optional[OutputFormat] = "mp3_44100_128",
        voice_settings: typing.Optional[VoiceSettings] = OMIT,
        request_options: typing.Optional[RequestOptions] = None,
    ) -> typing.Iterator[bytes]:
        """
        Converts streaming text into speech using a voice and returns
        audio chunks via WebSocket.

        Args:
            voice_id: Voice ID to use for synthesis.
            text: Iterator of text strings to synthesize.
            model_id: TTS model identifier.
            output_format: Audio format (default "mp3_44100_128").
            voice_settings: Override stability/similarity/style settings.
            request_options: Additional request options.
        """
Import
from elevenlabs import ElevenLabs
client = ElevenLabs()
# Access via: client.text_to_speech.convert_realtime(...)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| voice_id | str | Yes | Voice ID to use for synthesis |
| text | Iterator[str] | Yes | Streaming text input (generator or iterator) |
| model_id | Optional[str] | No | TTS model identifier |
| output_format | Optional[OutputFormat] | No | Audio encoding format (default "mp3_44100_128") |
| voice_settings | Optional[VoiceSettings] | No | Override stability, similarity_boost, style, use_speaker_boost |
| request_options | Optional[RequestOptions] | No | Additional request options (e.g., extra headers) |
Outputs
| Name | Type | Description |
|---|---|---|
| (return) | Iterator[bytes] | Streaming base64-decoded audio byte chunks |
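The decoding step implied by the output contract can be illustrated with a small sketch. The {"audio": "<base64>"} message shape used here is an assumption for illustration only; convert_realtime handles the actual server schema internally and yields already-decoded bytes:

```python
import base64
import json


def decode_audio_message(raw: str):
    """Decode one WebSocket text frame into raw audio bytes.

    NOTE: the {"audio": "<base64>"} shape is an illustrative
    assumption, not the documented ElevenLabs schema. Frames without
    an "audio" field (e.g., a final/close marker) return None.
    """
    msg = json.loads(raw)
    data = msg.get("audio")
    return base64.b64decode(data) if data else None
```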
Usage Examples
Basic Realtime TTS
from elevenlabs import ElevenLabs, stream

client = ElevenLabs()

def text_stream():
    yield "Hello, "
    yield "how are you "
    yield "doing today?"

audio = client.text_to_speech.convert_realtime(
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    text=text_stream(),
    model_id="eleven_multilingual_v2",
)

# stream() plays audio progressively as it is generated
stream(audio)
With LLM Output Stream
from elevenlabs import ElevenLabs, VoiceSettings, stream

client = ElevenLabs()

def get_llm_response():
    """Simulate streaming LLM output."""
    # In practice, yield chunks from OpenAI/Anthropic streaming API
    for word in "The quick brown fox jumps over the lazy dog".split():
        yield word + " "

audio = client.text_to_speech.convert_realtime(
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    text=get_llm_response(),
    model_id="eleven_turbo_v2_5",
    voice_settings=VoiceSettings(
        stability=0.5,
        similarity_boost=0.75,
        style=0.0,
        use_speaker_boost=True,
    ),
)

full_audio = stream(audio)
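When no local audio player is available for stream(), the returned Iterator[bytes] can instead be drained straight to a file. A minimal sketch, with the ElevenLabs call stood in for by any byte-chunk iterator:

```python
from typing import Iterator


def save_audio(chunks: Iterator[bytes], path: str) -> int:
    """Write streaming audio chunks to a file; returns total bytes written.

    Works with any Iterator[bytes], including the iterator returned by
    convert_realtime(...).
    """
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total
```

Because the iterator is consumed lazily, audio is written to disk as it arrives rather than after synthesis completes.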