Implementation:Elevenlabs Elevenlabs python StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel

Field	Value
source	Elevenlabs_Elevenlabs_python
domains	Audio, Streaming, Text-to-Speech, Timestamps, Voice Segments
last_updated	2026-02-15

Overview

Description

StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel is a Pydantic model representing a single chunk of streamed audio together with character-level timestamp alignments and voice segment information. Each chunk carries base64-encoded audio data, optional alignment data for both the original and the normalized text, and a list of voice segments. It is structurally identical to AudioWithTimestampsAndVoiceSegmentsResponseModel but is used in streaming contexts, where audio is delivered incrementally. The class is auto-generated by Fern from the ElevenLabs API definition and extends UncheckedBaseModel.

Usage

This model is received as part of streaming text-to-speech responses when audio is requested with timestamps and voice segments. Each instance represents one chunk of a larger audio stream, so playback can start as chunks arrive while each chunk still carries its own timing and voice attribution data.

Code Reference

Source Location

src/elevenlabs/types/streaming_audio_chunk_with_timestamps_and_voice_segments_response_model.py

Class Signature

class StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel(UncheckedBaseModel):
    ...

Import Statement

from elevenlabs.types import StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel

I/O Contract

Field	Type	Required	Description
audio_base_64	`str`	Yes	Base64-encoded audio data. Serialized under the alias `audio_base64`.
alignment	`Optional[CharacterAlignmentResponseModel]`	No	Timestamp information for each character in the original text.
normalized_alignment	`Optional[CharacterAlignmentResponseModel]`	No	Timestamp information for each character in the normalized text.
voice_segments	`List[VoiceSegment]`	Yes	Voice segments for the audio.
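
The alias noted in the table means the wire payload and the Python attribute use different spellings. The sketch below illustrates that mapping only. `ChunkSketch` and `chunk_from_wire` are hypothetical stand-ins, not part of the SDK, and the alignment and voice segment payloads are kept as plain dicts because their fields are not listed on this page.

```python
import base64
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChunkSketch:
    """Hypothetical stand-in mirroring the four documented fields."""
    audio_base_64: str
    voice_segments: List[Dict[str, Any]]
    alignment: Optional[Dict[str, Any]] = None
    normalized_alignment: Optional[Dict[str, Any]] = None


def chunk_from_wire(payload: Dict[str, Any]) -> ChunkSketch:
    """Build a ChunkSketch from a JSON-style dict keyed by the wire alias."""
    return ChunkSketch(
        audio_base_64=payload["audio_base64"],  # wire alias -> attribute name
        voice_segments=payload["voice_segments"],
        alignment=payload.get("alignment"),  # optional, may be absent
        normalized_alignment=payload.get("normalized_alignment"),
    )


# Illustrative payload; the two audio bytes are made up for the example.
example_payload = {
    "audio_base64": base64.b64encode(b"\x00\x01").decode("ascii"),
    "voice_segments": [],
}
example_chunk = chunk_from_wire(example_payload)
```

With the real model, the same distinction applies: code reads `chunk.audio_base_64`, while the serialized form uses the key `audio_base64`.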

Usage Examples

import base64

from elevenlabs.types import StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel

# `streaming_response` is a placeholder for an iterable of chunks returned by a
# streaming text-to-speech call; it is not defined in this snippet.
for chunk in streaming_response:
    # Each chunk is a StreamingAudioChunkWithTimestampsAndVoiceSegmentsResponseModel
    raw_audio = base64.b64decode(chunk.audio_base_64)

    # Alignment is optional, so check for None before using it
    if chunk.alignment is not None:
        print("Character alignments:", chunk.alignment)

    # Process voice segments
    for segment in chunk.voice_segments:
        print("Voice segment:", segment)
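
Because audio arrives incrementally, a consumer often wants the decoded bytes of all chunks in arrival order. A minimal sketch of that accumulation follows. `collect_audio` is a hypothetical helper, and the stream is simulated with `SimpleNamespace` objects that expose only `audio_base_64`. Whether the joined bytes are directly playable depends on the requested output format, which this page does not specify.

```python
import base64
from types import SimpleNamespace
from typing import Any, Iterable, List


def collect_audio(chunks: Iterable[Any]) -> bytes:
    """Decode each chunk's base64 payload and join the bytes in arrival order."""
    parts: List[bytes] = []
    for chunk in chunks:
        parts.append(base64.b64decode(chunk.audio_base_64))
    return b"".join(parts)


# Stand-in chunks: only the audio_base_64 attribute is needed here.
fake_stream = [
    SimpleNamespace(audio_base_64=base64.b64encode(b"abc").decode("ascii")),
    SimpleNamespace(audio_base_64=base64.b64encode(b"def").decode("ascii")),
]
audio_bytes = collect_audio(fake_stream)
```

For real-time playback, each decoded part would be handed to the player as it arrives instead of being buffered until the end.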
