Principle: ElevenLabs Python Realtime Speech to Text
| Knowledge Sources | |
|---|---|
| Domains | Speech_Recognition, Streaming, WebSocket |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
A streaming transcription technique that converts audio input to text in real time over a WebSocket connection, providing both partial (interim) and committed (final) transcript results.
Description
Realtime Speech-to-Text enables live transcription of audio as it is captured. Unlike batch STT, which requires the complete audio file before processing can begin, realtime STT processes audio chunks as they arrive and provides two types of results:
- Partial transcripts: Interim results that update as more audio arrives (useful for live captions)
- Committed transcripts: Finalized results after a speech segment ends (triggered by VAD or manual commit)
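The two result types might be handled with a small dispatcher. The event names and JSON fields below (`type`, `text`) are assumptions chosen to match the terminology in this document, not the documented ElevenLabs wire format; check the API reference for the exact schema.

```python
import json


def handle_stt_event(raw: str, on_partial, on_committed) -> None:
    """Route one server message to the matching callback.

    Assumes each message is JSON with a "type" of either
    "partial_transcript" or "committed_transcript" and a "text"
    field (illustrative schema, not the official one).
    """
    event = json.loads(raw)
    if event["type"] == "partial_transcript":
        on_partial(event["text"])       # interim: may still change
    elif event["type"] == "committed_transcript":
        on_committed(event["text"])     # final: safe to persist
```

A caption UI would typically overwrite the display on each partial callback and append on each committed callback.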
The system supports two input modes:
- Manual audio chunks: Client sends base64-encoded PCM audio directly
- URL streaming: Client provides an audio URL and ffmpeg handles conversion and streaming
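In the manual-chunk mode, each raw PCM buffer has to be base64-encoded before it is sent over the WebSocket. A minimal sketch; the `audio_chunk` field name is a hypothetical stand-in for whatever the API actually expects:

```python
import base64
import json


def pcm_chunk_message(pcm_bytes: bytes) -> str:
    """Wrap raw PCM audio in a JSON WebSocket message.

    The "audio_chunk" field name is illustrative; consult the
    API reference for the real message schema.
    """
    encoded = base64.b64encode(pcm_bytes).decode("ascii")
    return json.dumps({"audio_chunk": encoded})
```

Base64 inflates the payload by roughly a third, which is the usual cost of sending binary audio through a text-frame protocol.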
The commit strategy can be VAD (voice activity detection; segments are committed automatically when silence is detected) or MANUAL (the client decides when to commit).
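The commit strategy would typically be chosen when the session is opened, with an explicit commit message used only under MANUAL. The field names and values below are assumptions for illustration, not the documented configuration schema:

```python
import json


def session_config(strategy: str = "vad") -> str:
    """Build an opening configuration message (assumed schema).

    strategy: "vad" to commit automatically on detected silence,
    "manual" if the client will send explicit commit signals.
    """
    if strategy not in ("vad", "manual"):
        raise ValueError(f"unknown commit strategy: {strategy}")
    return json.dumps({"commit_strategy": strategy})


def commit_signal() -> str:
    """Explicit commit message for the MANUAL strategy (assumed schema)."""
    return json.dumps({"type": "commit"})
```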
Usage
Use this principle when you need live transcription of ongoing audio, such as live captioning, real-time translation, voice command detection, live meeting transcription, or streaming audio analysis.
Theoretical Basis
Realtime STT uses an incremental decoding approach:
# Abstract streaming STT pipeline
ws = connect(stt_endpoint, model_id, audio_format)
for audio_chunk in audio_source:
    ws.send(audio_chunk)
    # Server emits partial_transcript events as audio is processed
    # Server emits committed_transcript when a speech segment ends (VAD)

# Manual commit (if using the MANUAL strategy):
ws.send(commit_signal)
# Server emits committed_transcript with the final result
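The pipeline above can be exercised end to end with a stand-in for the server. The `FakeSTTSocket` below is purely illustrative: it emits a growing partial transcript per chunk and treats an empty chunk as silence that triggers a VAD-style commit, so the partial/committed flow can be observed without a live connection.

```python
class FakeSTTSocket:
    """Stand-in for an STT WebSocket: one partial transcript per speech
    chunk, and a committed transcript when silence (an empty chunk)
    arrives, mimicking a VAD commit."""

    def __init__(self):
        self._words = []
        self.events = []

    def send(self, chunk: str) -> None:
        if chunk:                      # speech: extend the running hypothesis
            self._words.append(chunk)
            self.events.append(("partial_transcript", " ".join(self._words)))
        elif self._words:              # silence: VAD commits the segment
            self.events.append(("committed_transcript", " ".join(self._words)))
            self._words = []


def run_pipeline(chunks):
    """Drive the abstract pipeline over a sequence of audio chunks."""
    ws = FakeSTTSocket()
    for chunk in chunks:
        ws.send(chunk)
    return ws.events
```

For example, `run_pipeline(["hello", "world", ""])` yields two partial transcripts ("hello", then "hello world") followed by one committed transcript ("hello world"), mirroring the event sequence a real VAD-driven session would produce.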
Key tradeoffs:
- Partial transcripts are fast but may change as more context arrives
- Committed transcripts are final and more accurate but have higher latency
- VAD commit is automatic but adds silence detection latency
- Manual commit gives client control but requires explicit segmentation