Implementation:Elevenlabs Elevenlabs python Audio Source Pattern

Knowledge Sources	ElevenLabs Python
Domains	Speech_Recognition, Data_Preparation
Last Updated	2026-02-15 00:00 GMT

Overview

User-defined pattern for selecting and preparing audio input for speech-to-text transcription.

Description

This is a Pattern Doc. Audio source selection is a user decision, not a library API. The SDK provides three intake methods:

Batch file: Pass file=open(path, "rb") to speech_to_text.convert()
Batch URL: Pass cloud_storage_url=url to speech_to_text.convert()
Realtime audio: Pass RealtimeAudioOptions (manual audio chunks) or RealtimeUrlOptions (URL streaming) to speech_to_text.realtime.connect()

Usage

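Choose the intake method from where the audio lives and how soon the transcript is needed. Use a batch method when the complete recording already exists as a local file or at a cloud storage URL. Use the realtime method when transcripts are needed while the audio is still arriving, either as audio chunks you send yourself or as a stream URL.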
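
The decision above can be sketched as a small helper. This is a minimal illustration, not part of the SDK: the function name choose_intake and the returned labels are hypothetical, and it assumes that any http or https string is a remote source and everything else is local audio.

```python
from urllib.parse import urlparse


def choose_intake(source: str, realtime: bool = False) -> str:
    """Return a label for the intake method that fits the source.

    Hypothetical helper for illustration only; the labels are not SDK names.
    """
    # Assumption: only http(s) strings count as remote sources.
    is_url = urlparse(source).scheme in ("http", "https")
    if realtime:
        # Realtime: stream from a URL, or send audio chunks manually.
        return "realtime_url" if is_url else "realtime_audio"
    # Batch: hand over a cloud storage URL, or open a local file.
    return "batch_url" if is_url else "batch_file"
```

Each label corresponds to one of the calls listed under Interface Specification: batch_file and batch_url map to speech_to_text.convert(), and realtime_audio and realtime_url map to speech_to_text.realtime.connect().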

Interface Specification

# Batch: local file (the with block closes the file handle after the call)
with open("audio.mp3", "rb") as audio_file:
    result = client.speech_to_text.convert(
        model_id="scribe_v1",
        file=audio_file,
    )

# Batch: cloud URL
result = client.speech_to_text.convert(
    model_id="scribe_v1",
    cloud_storage_url="https://storage.example.com/audio.mp3",
)

# Realtime: manual audio chunks (await must run inside an async function)
connection = await client.speech_to_text.realtime.connect({
    "model_id": "scribe_v2_realtime",
    "audio_format": AudioFormat.PCM_16000,
    "sample_rate": 16000,
})

# Realtime: URL streaming (uses ffmpeg)
connection = await client.speech_to_text.realtime.connect({
    "model_id": "scribe_v2_realtime",
    "url": "https://stream.example.com/live.mp3",
})
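
For the manual-chunk realtime path, the caller has to split the audio before sending it. The sketch below shows one way to cut raw PCM into fixed-duration chunks. It assumes 16-bit mono samples at the 16000 Hz rate used above, and it base64-encodes each chunk on the assumption that the connection expects text-safe payloads. The helper name pcm_chunks and the 100 ms default are hypothetical; how each chunk is actually sent on the connection is not shown here.

```python
import base64
from typing import Iterator

SAMPLE_RATE = 16000    # matches "sample_rate": 16000 in the connect options
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono PCM


def pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[str]:
    """Yield base64-encoded slices of raw PCM, each chunk_ms long.

    Hypothetical helper; the final chunk may be shorter than chunk_ms.
    """
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        piece = pcm[start:start + chunk_bytes]
        yield base64.b64encode(piece).decode("ascii")
```

Under these assumptions one second of audio is 32000 bytes, so the default setting produces ten chunks of 3200 bytes each.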

I/O Contract

Inputs

Name	Type	Required	Description
audio source	file, URL, or stream	Yes	The audio to transcribe

Outputs

Name	Type	Description
prepared input	core.File, str, or RealtimeAudioOptions / RealtimeUrlOptions	Value passed as file, as cloud_storage_url, or to realtime.connect(), respectively

Related Pages

Implements Principle

Principle:Elevenlabs_Elevenlabs_python_Audio_Source_Selection
