Implementation:Elevenlabs Elevenlabs python StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel

From Leeroopedia
Field | Value
source | Elevenlabs_Elevenlabs_python
domains | Audio, Streaming, Text-to-Speech, Timestamps, Voice Segments
last_updated | 2026-02-15

Overview

Description

StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel is a Pydantic model representing a single chunk of streamed audio together with character-level timestamp alignments and voice segment information. It carries base64-encoded audio data, alignment data for both the original and the normalized text, and a list of voice segments. It is structurally identical to AudioWithTimestampsAndVoiceSegmentsResponseModel but is used in streaming contexts, where audio is delivered incrementally. The class is auto-generated by Fern from the ElevenLabs API definition and extends UncheckedBaseModel.

Usage

This model is received as part of streaming text-to-speech responses when requesting audio with timestamps and voice segments. Each instance represents one chunk of a larger audio stream, enabling real-time playback with precise timing and voice attribution data.
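
The incremental delivery described above can be sketched with the standard library alone. This is a minimal illustration, not the SDK's own code: `FakeChunk` is a hypothetical stand-in that carries only the `audio_base_64` field, and `collect_audio` is an illustrative helper name.

```python
import base64
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeChunk:
    """Hypothetical stand-in for one streamed chunk (audio field only)."""
    audio_base_64: str


def collect_audio(chunks: Iterable[FakeChunk]) -> bytes:
    """Decode each chunk's base64 payload and join the bytes in arrival order."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(base64.b64decode(chunk.audio_base_64))
    return bytes(buffer)


# Simulated stream: two chunks whose decoded bytes are b"abc" and b"def".
stream = [
    FakeChunk(audio_base_64=base64.b64encode(b"abc").decode("ascii")),
    FakeChunk(audio_base_64=base64.b64encode(b"def").decode("ascii")),
]
audio = collect_audio(stream)
```

For real-time playback, a player would usually consume each decoded chunk as it arrives rather than waiting for the full buffer; the accumulation here only keeps the sketch easy to check.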

Code Reference

Source Location

src/elevenlabs/types/streaming_audio_chunk_with_timestamps_and_voice_segments_response_model.py

Class Signature

class StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel(UncheckedBaseModel):
    ...

Import Statement

from elevenlabs.types import StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel

I/O Contract

Field | Type | Required | Description
audio_base_64 | str | Yes | Base64-encoded audio data. Serialized with alias audio_base64.
alignment | Optional[CharacterAlignmentResponseModel] | No | Timestamp information for each character in the original text.
normalized_alignment | Optional[CharacterAlignmentResponseModel] | No | Timestamp information for each character in the normalized text.
voice_segments | List[VoiceSegment] | Yes | Voice segments for the audio.
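
The alias in the table means the JSON key `audio_base64` surfaces as the Python attribute `audio_base_64`. The sketch below imitates that mapping with a plain dataclass; it is an assumption-laden illustration, not the generated Pydantic class. `ChunkSketch` and `chunk_from_wire` are hypothetical names, and the nested alignment and segment objects are left as untyped dictionaries.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChunkSketch:
    """Hypothetical stand-in mirroring the four fields in the table above."""
    audio_base_64: str
    voice_segments: List[Dict[str, Any]]
    alignment: Optional[Dict[str, Any]] = None
    normalized_alignment: Optional[Dict[str, Any]] = None


def chunk_from_wire(payload: str) -> ChunkSketch:
    """Build a ChunkSketch from a JSON string that uses the wire alias."""
    data = json.loads(payload)
    return ChunkSketch(
        audio_base_64=data["audio_base64"],     # wire alias -> Python attribute
        voice_segments=data["voice_segments"],  # required field
        alignment=data.get("alignment"),        # optional, may be absent
        normalized_alignment=data.get("normalized_alignment"),
    )


# Example payload with only the required fields present.
wire = '{"audio_base64": "YWJj", "voice_segments": []}'
chunk = chunk_from_wire(wire)
```

Accessing the required keys by index and the optional ones with `.get` mirrors the Required column: a missing required key fails loudly, while a missing optional key becomes `None`.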

Usage Examples

import base64

from elevenlabs.types import StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel

# Typically received in a streaming loop. `streaming_response` is a placeholder
# for the iterable of chunks returned by a streaming text-to-speech request.
chunk: StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel
for chunk in streaming_response:
    # Decode the base64 payload into raw audio bytes
    raw_audio = base64.b64decode(chunk.audio_base_64)

    # Alignment is optional, so check for None before using it
    if chunk.alignment is not None:
        print("Character alignments:", chunk.alignment)

    # voice_segments is required and is always a list
    for segment in chunk.voice_segments:
        print("Voice segment:", segment)
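
Because `alignment` is optional, code that consumes it should handle `None`. The following sketch pairs each character with its start and end time. It rests on an assumption not stated on this page: that the alignment object exposes parallel lists named `characters`, `character_start_times_seconds`, and `character_end_times_seconds`. `AlignmentSketch` and `character_timings` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AlignmentSketch:
    """Hypothetical stand-in; the three field names are assumptions."""
    characters: List[str]
    character_start_times_seconds: List[float]
    character_end_times_seconds: List[float]


def character_timings(
    alignment: Optional[AlignmentSketch],
) -> List[Tuple[str, float, float]]:
    """Return (character, start, end) triples, or an empty list if no alignment."""
    if alignment is None:
        return []
    return list(zip(
        alignment.characters,
        alignment.character_start_times_seconds,
        alignment.character_end_times_seconds,
    ))


# Two characters with made-up timings, purely for demonstration.
sample = AlignmentSketch(
    characters=["H", "i"],
    character_start_times_seconds=[0.0, 0.12],
    character_end_times_seconds=[0.12, 0.3],
)
timings = character_timings(sample)
```

Returning an empty list for a missing alignment lets the caller loop over the result unconditionally, which keeps the streaming loop free of extra branches.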
