Implementation:Elevenlabs Elevenlabs python ScribeRealtime Connect
| Knowledge Sources | |
|---|---|
| Domains | Speech_Recognition, Streaming, WebSocket |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
A concrete tool, provided by the elevenlabs-python SDK, for creating real-time speech-to-text (STT) WebSocket connections.
Description
The ScribeRealtime.connect method establishes an async WebSocket connection to the ElevenLabs realtime STT endpoint. It supports two modes:
- Audio mode (RealtimeAudioOptions): Client sends audio chunks manually via connection.send()
- URL mode (RealtimeUrlOptions): Audio is streamed from a URL via an ffmpeg subprocess
The method returns a RealtimeConnection instance that provides event-driven transcription through the .on() method for registering event callbacks.
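The two modes differ in which keys the options carry. The sketch below shows one way a caller could tell them apart before connecting. `detect_mode` is a hypothetical helper written for this page, not part of the SDK, and the key names it checks are taken from the I/O Contract tables on this page; the SDK's own dispatch logic may differ.

```python
from typing import Any, Mapping


def detect_mode(options: Mapping[str, Any]) -> str:
    """Return "url" or "audio" depending on which option keys are present.

    Hypothetical helper for illustration only; not the SDK's implementation.
    """
    if "url" in options:
        return "url"
    if "audio_format" in options and "sample_rate" in options:
        return "audio"
    # Mirrors the documented ValueError for invalid options.
    raise ValueError(
        "options must contain 'url' (URL mode) or "
        "'audio_format' and 'sample_rate' (audio mode)"
    )
```

Calling it with `{"model_id": "scribe_v2_realtime", "url": "https://example.com/audio-stream.mp3"}` selects URL mode, while options carrying `audio_format` and `sample_rate` select audio mode.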
Usage
Use this method via client.speech_to_text.realtime.connect(options). The method is async and returns a RealtimeConnection. Register event handlers with .on() before sending audio data.
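Why the registration order matters can be illustrated with a toy event emitter. This is a minimal sketch of the general callback pattern, not the internals of RealtimeConnection: `TinyEmitter` and the event name string `"partial_transcript"` are assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class TinyEmitter:
    """Toy stand-in for an object exposing .on(); illustration only."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def on(self, event: str, handler: Callable[[Any], None]) -> None:
        # Remember the handler so later emits for this event reach it.
        self._handlers[event].append(handler)

    def emit(self, event: str, data: Any) -> int:
        # Call every handler registered so far; return how many were called.
        handlers = self._handlers.get(event, [])
        for handler in handlers:
            handler(data)
        return len(handlers)


emitter = TinyEmitter()
received = []

# Emitted before any handler exists: nobody is listening, so it is dropped.
delivered_early = emitter.emit("partial_transcript", {"transcript": "hel"})

# After registration, events reach the handler.
emitter.on("partial_transcript", received.append)
delivered_late = emitter.emit("partial_transcript", {"transcript": "hello"})
```

In this toy model the first event reaches no handler, which is the situation that registering handlers before sending audio is meant to avoid.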
Code Reference
Source Location
- Repository: elevenlabs-python
- File: src/elevenlabs/realtime/scribe.py
- Lines: L97-185 (ScribeRealtime class)
Signature
class ScribeRealtime:
    def __init__(self, api_key: str, base_url: str = "wss://api.elevenlabs.io"):
        """
        Args:
            api_key: ElevenLabs API key.
            base_url: WebSocket base URL.
        """

    async def connect(
        self,
        options: typing.Union[RealtimeAudioOptions, RealtimeUrlOptions],
    ) -> RealtimeConnection:
        """Create a realtime transcription connection.

        Args:
            options: RealtimeAudioOptions for manual audio or
                RealtimeUrlOptions for URL streaming.

        Returns:
            RealtimeConnection ready for event registration and audio sending.

        Raises:
            ValueError: If invalid options provided.
            RuntimeError: If ffmpeg unavailable (URL mode).
        """
Import
from elevenlabs import ElevenLabs

# With no api_key argument, the client reads the ELEVENLABS_API_KEY environment variable.
client = ElevenLabs()
# Access via: await client.speech_to_text.realtime.connect(...)
I/O Contract
Inputs (RealtimeAudioOptions)
| Name | Type | Required | Description |
|---|---|---|---|
| model_id | str | Yes | STT model (e.g., "scribe_v2_realtime") |
| audio_format | AudioFormat | Yes | PCM format (e.g., AudioFormat.PCM_16000) |
| sample_rate | int | Yes | Sample rate in Hz |
| commit_strategy | CommitStrategy | No | VAD (auto) or MANUAL (default MANUAL) |
| vad_silence_threshold_secs | float | No | Silence threshold for VAD (0.3-3.0) |
| vad_threshold | float | No | Voice activity threshold (0.1-0.9) |
| language_code | str | No | ISO language code hint |
| include_timestamps | bool | No | Include word timestamps in committed transcripts |
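The constraints in the table above can be checked on the client side before opening a connection. The following is a sketch only: `validate_audio_options` is a hypothetical helper, the ranges are copied from the table, and treating both bounds as inclusive is an assumption.

```python
from typing import Any, Mapping

# Required keys and numeric ranges as listed in the table above.
REQUIRED_KEYS = ("model_id", "audio_format", "sample_rate")
RANGES = {
    "vad_silence_threshold_secs": (0.3, 3.0),
    "vad_threshold": (0.1, 0.9),
}


def validate_audio_options(options: Mapping[str, Any]) -> None:
    """Raise ValueError if required keys are missing or a value is out of range.

    Hypothetical pre-flight check; bounds are assumed to be inclusive.
    """
    missing = [key for key in REQUIRED_KEYS if key not in options]
    if missing:
        raise ValueError(f"missing required options: {', '.join(missing)}")
    for key, (low, high) in RANGES.items():
        if key in options and not (low <= options[key] <= high):
            raise ValueError(f"{key} must be between {low} and {high}")
```

A check like this fails fast with a local error message instead of waiting for the server to reject the session.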
Inputs (RealtimeUrlOptions)
| Name | Type | Required | Description |
|---|---|---|---|
| model_id | str | Yes | STT model (e.g., "scribe_v2_realtime") |
| url | str | Yes | Audio stream URL (ffmpeg handles conversion) |
| commit_strategy | CommitStrategy | No | VAD or MANUAL (default MANUAL) |
| language_code | str | No | ISO language code hint |
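Because URL mode depends on ffmpeg, it can help to confirm the binary is on PATH before connecting. The sketch below uses the standard library's `shutil.which`; `require_ffmpeg` is a hypothetical helper, and how the SDK itself locates ffmpeg is not shown in the source.

```python
import shutil
from typing import Callable, Optional


def require_ffmpeg(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the path to ffmpeg, or raise RuntimeError if it is not on PATH.

    The `which` parameter is injectable so the check can be tested without
    ffmpeg installed. Hypothetical helper, not part of the SDK.
    """
    path = which("ffmpeg")
    if path is None:
        raise RuntimeError("ffmpeg not found on PATH; URL mode requires it")
    return path
```

Calling `require_ffmpeg()` before `connect(...)` surfaces a missing dependency with an explicit message, matching the RuntimeError documented for URL mode.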
Outputs
| Name | Type | Description |
|---|---|---|
| (return) | RealtimeConnection | WebSocket connection with .on(), .send(), .commit(), .close() methods |
Usage Examples
Manual Audio Chunks
import asyncio
import base64

from elevenlabs import ElevenLabs
from elevenlabs.realtime.scribe import AudioFormat, CommitStrategy
from elevenlabs.realtime.connection import RealtimeEvents


async def transcribe_audio():
    client = ElevenLabs()

    # Open the realtime connection in audio mode (chunks are sent manually).
    connection = await client.speech_to_text.realtime.connect({
        "model_id": "scribe_v2_realtime",
        "audio_format": AudioFormat.PCM_16000,
        "sample_rate": 16000,
        "commit_strategy": CommitStrategy.VAD,
    })

    # Register handlers before any audio is sent.
    connection.on(
        RealtimeEvents.PARTIAL_TRANSCRIPT,
        lambda data: print(f"Partial: {data.get('transcript', '')}")
    )
    connection.on(
        RealtimeEvents.COMMITTED_TRANSCRIPT,
        lambda data: print(f"Final: {data.get('transcript', '')}")
    )

    # Send audio chunks (base64-encoded PCM)
    with open("audio.pcm", "rb") as f:
        while chunk := f.read(8192):
            await connection.send({
                "audio_base_64": base64.b64encode(chunk).decode()
            })

    await connection.close()


asyncio.run(transcribe_audio())
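The chunking step in the example above can be separated from the network code. The sketch below builds the same `{"audio_base_64": ...}` payloads from an in-memory buffer and estimates how much audio each chunk holds. Both function names are hypothetical, and the duration formula assumes 16-bit (2 bytes per sample) mono PCM, which the source does not state.

```python
import base64
from typing import Dict, Iterator


def iter_audio_messages(pcm: bytes, chunk_size: int = 8192) -> Iterator[Dict[str, str]]:
    """Yield one base64 payload per chunk, shaped like the send() calls above."""
    for start in range(0, len(pcm), chunk_size):
        chunk = pcm[start:start + chunk_size]
        yield {"audio_base_64": base64.b64encode(chunk).decode("ascii")}


def chunk_duration_secs(
    chunk_bytes: int,
    sample_rate: int,
    bytes_per_sample: int = 2,  # assumption: 16-bit samples
    channels: int = 1,          # assumption: mono
) -> float:
    """Seconds of audio contained in a chunk of raw PCM."""
    return chunk_bytes / (sample_rate * bytes_per_sample * channels)
```

Under those assumptions an 8192-byte chunk at 16000 Hz holds 8192 / 32000 = 0.256 seconds of audio, which gives a rough idea of how often chunks would need to be sent to keep pace with a live source.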
URL-Based Streaming
import asyncio

from elevenlabs import ElevenLabs
from elevenlabs.realtime.connection import RealtimeEvents


async def transcribe_url():
    client = ElevenLabs()

    # Open the realtime connection in URL mode (ffmpeg fetches and converts the audio).
    connection = await client.speech_to_text.realtime.connect({
        "model_id": "scribe_v2_realtime",
        "url": "https://example.com/audio-stream.mp3",
    })

    connection.on(
        RealtimeEvents.COMMITTED_TRANSCRIPT,
        lambda data: print(f"Transcript: {data.get('transcript', '')}")
    )
    connection.on(
        RealtimeEvents.ERROR,
        lambda data: print(f"Error: {data}")
    )

    # Wait for transcription to complete
    await asyncio.sleep(60)
    await connection.close()


asyncio.run(transcribe_url())
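The fixed `asyncio.sleep(60)` above either waits too long or cuts the stream short. One alternative is to wait on an `asyncio.Event` that a handler sets, with a timeout as a fallback. The sketch below uses only the standard library; `wait_until` is a hypothetical helper, the handler is a stand-in for one registered with `connection.on(...)`, and it assumes handlers are invoked on the event loop thread.

```python
import asyncio
from typing import Tuple


async def wait_until(done: asyncio.Event, timeout: float) -> bool:
    """Return True if `done` was set within `timeout` seconds, else False."""
    try:
        await asyncio.wait_for(done.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False


async def demo() -> Tuple[bool, bool]:
    done = asyncio.Event()

    # Stand-in for a callback registered with connection.on(...):
    # it marks the stream as finished when the chosen event arrives.
    def handler(data: dict) -> None:
        done.set()

    # Simulate the event arriving shortly after the wait starts.
    asyncio.get_running_loop().call_later(0.01, handler, {"transcript": "hi"})
    signalled = await wait_until(done, timeout=1.0)

    # An event that is never set falls back to the timeout.
    timed_out = not await wait_until(asyncio.Event(), timeout=0.01)
    return signalled, timed_out
```

In the URL example, the handler that calls `done.set()` would be attached to whichever event the caller treats as terminal, and `await wait_until(done, timeout=60)` would replace the sleep.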
Related Pages
Implements Principle
Requires Environment
- Environment:Elevenlabs_Elevenlabs_python_Python_Websockets
- Environment:Elevenlabs_Elevenlabs_python_FFmpeg_Mpv
Uses Heuristic