Implementation:Elevenlabs Elevenlabs python Alignment
| Attribute | Value |
|---|---|
| Sources | src/elevenlabs/types/alignment.py |
| Domains | Audio Alignment, Text-to-Speech, Timing |
| Last Updated | 2026-02-15 |
Overview
Description
The Alignment model provides alignment information that maps generated audio to its input text sequence in the ElevenLabs SDK. It contains three parallel lists: character start times, character durations, and the characters themselves. Together, these arrays enable precise character-level timing synchronization between text and audio output. This is particularly useful for applications requiring lip-sync, subtitle generation, or karaoke-style text highlighting.
Note: The timing fields use camelCase JSON aliases (charStartTimesMs, charDurationsMs) mapped to snake_case Python attributes via FieldMetadata(alias=...).
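The alias mapping described in the note can be illustrated with a small standard-library sketch. This is not the SDK's FieldMetadata mechanism; `ALIASES` and `from_wire` are hypothetical names used only to show how camelCase wire keys correspond to the snake_case attributes.

```python
import json
from typing import Any, Dict

# Hypothetical alias table mirroring the three Alignment fields.
ALIASES = {
    "charStartTimesMs": "char_start_times_ms",
    "charDurationsMs": "char_durations_ms",
    "chars": "chars",  # no alias: same name on the wire and in Python
}


def from_wire(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Rename camelCase wire keys to snake_case; absent fields become None."""
    return {py_name: payload.get(wire_name) for wire_name, py_name in ALIASES.items()}


raw = '{"charStartTimesMs": [0, 80], "charDurationsMs": [80, 70], "chars": ["H", "i"]}'
fields = from_wire(json.loads(raw))
print(fields["char_start_times_ms"])
```

In the SDK itself this renaming is handled by the model class, so application code normally reads the snake_case attributes directly.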
Usage
The Alignment model is returned as part of text-to-speech responses when alignment information is requested. The timing values are relative to the start of the returned audio chunk, not to the full audio response. All three fields are optional; when populated, the lists (char_start_times_ms, char_durations_ms, chars) share the same length and are positionally correlated.
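Because the times are chunk-relative, building a timeline for the whole utterance requires adding each chunk's offset. The following is a minimal sketch under the assumption that the caller can measure each chunk's audio duration in milliseconds (for example, after decoding it); the source does not specify how. `absolute_start_times` is a hypothetical helper, not part of the SDK.

```python
from typing import List, Optional, Sequence, Tuple


def absolute_start_times(
    chunks: Sequence[Tuple[int, Optional[List[int]]]]
) -> List[int]:
    """Convert chunk-relative start times to times relative to the full audio.

    Each item is (chunk_audio_duration_ms, chunk_relative_start_times_or_None).
    The chunk duration is an assumed input supplied by the caller.
    """
    absolute: List[int] = []
    offset_ms = 0
    for duration_ms, starts in chunks:
        if starts:
            absolute.extend(offset_ms + s for s in starts)
        # Advance the offset even when a chunk carries no alignment data.
        offset_ms += duration_ms
    return absolute


timeline = absolute_start_times([(200, [0, 90]), (150, None), (300, [0, 120])])
print(timeline)
```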
Code Reference
Source Location
src/elevenlabs/types/alignment.py
Class Signature
class Alignment(UncheckedBaseModel):
    """
    Alignment information for the generated audio given the input text sequence.
    """
    ...
Import Statement
from elevenlabs.types import Alignment
Base Class
UncheckedBaseModel (from elevenlabs.core.unchecked_base_model)
I/O Contract
| Field | Type | Required | JSON Alias | Description |
|---|---|---|---|---|
| char_start_times_ms | Optional[List[int]] | No | charStartTimesMs | A list of starting times (in milliseconds) for each character in the text as it corresponds to the audio. Times are relative to the returned chunk. |
| char_durations_ms | Optional[List[int]] | No | charDurationsMs | A list of durations (in milliseconds) for each character in the text as it corresponds to the audio. Times are relative to the returned chunk. |
| chars | Optional[List[str]] | No | (none) | A list of characters in the text sequence. May contain spaces, punctuation, and special characters. Length matches charStartTimesMs and charDurationsMs. |
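Since every field is optional and the lists are expected to be the same length, a consumer may want to check both conditions before pairing values. The sketch below takes plain lists rather than an Alignment instance so it runs without the SDK; `aligned_triples` is a hypothetical helper name.

```python
from typing import List, Optional, Tuple


def aligned_triples(
    chars: Optional[List[str]],
    start_times_ms: Optional[List[int]],
    durations_ms: Optional[List[int]],
) -> List[Tuple[str, int, int]]:
    """Return (char, start_ms, duration_ms) triples, or [] if any field is missing."""
    if chars is None or start_times_ms is None or durations_ms is None:
        return []
    if not (len(chars) == len(start_times_ms) == len(durations_ms)):
        # zip() would silently truncate, so fail loudly on a length mismatch.
        raise ValueError("alignment lists differ in length")
    return list(zip(chars, start_times_ms, durations_ms))


triples = aligned_triples(["H", "i"], [0, 80], [80, 70])
print(triples)
```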
Usage Examples
from elevenlabs import ElevenLabs

client = ElevenLabs(api_key="your_api_key")

# Generate speech with alignment data.
# Method and parameter names may differ between SDK versions; check the
# text_to_speech client in your installed release.
response = client.text_to_speech.convert(
    voice_id="voice_abc123",
    text="Hello world",
    output_format="mp3_44100_128",
    with_timestamps=True,
)

# Access alignment information from the response, if present
alignment = getattr(response, "alignment", None)
if alignment:
    # All three lists are optional, so confirm each is populated before zipping
    if alignment.chars and alignment.char_start_times_ms and alignment.char_durations_ms:
        # Iterate through character-level timing
        for char, start_ms, duration_ms in zip(
            alignment.chars,
            alignment.char_start_times_ms,
            alignment.char_durations_ms,
        ):
            end_ms = start_ms + duration_ms
            print(f"'{char}': {start_ms}ms - {end_ms}ms")
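For subtitle or karaoke-style highlighting, character timings are often grouped into words. The following sketch shows one way to do this by splitting on whitespace characters; it works on plain lists shaped like the three Alignment fields, and `word_timings` is a hypothetical helper rather than an SDK function.

```python
from typing import List, Optional, Tuple


def word_timings(
    chars: List[str], start_times_ms: List[int], durations_ms: List[int]
) -> List[Tuple[str, int, int]]:
    """Group character timings into (word, start_ms, end_ms) on whitespace."""
    words: List[Tuple[str, int, int]] = []
    current = ""
    word_start: Optional[int] = None
    word_end = 0
    for ch, start, duration in zip(chars, start_times_ms, durations_ms):
        if ch.isspace():
            # Whitespace closes the word in progress, if any.
            if current and word_start is not None:
                words.append((current, word_start, word_end))
            current, word_start = "", None
            continue
        if word_start is None:
            word_start = start
        current += ch
        word_end = start + duration
    if current and word_start is not None:
        words.append((current, word_start, word_end))
    return words


result = word_timings(list("Hi yo"), [0, 50, 100, 130, 180], [50, 50, 30, 50, 60])
print(result)
```

Punctuation stays attached to the adjacent word in this sketch; a different tokenization rule may suit some display styles better.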
Related Pages
- ForcedAlignmentCharacterResponseModel - Character-level forced alignment timing
- ForcedAlignmentWordResponseModel - Word-level forced alignment timing