
Implementation:Elevenlabs elevenlabs-python Text Chunker

From Leeroopedia
Domains NLP, Streaming, Text_Processing
Last Updated 2026-02-15 00:00 GMT

Overview

A concrete tool from the elevenlabs-python SDK for buffering and splitting streaming text at sentence boundaries.

Description

The text_chunker function is a Python generator that takes an iterator of text fragments and yields sentence-boundary-aligned chunks. It accumulates incoming text in a buffer and emits a chunk when it detects one of 14 splitter characters (listed in the docstring below) at a fragment boundary. Each emitted chunk ends with a space character, so consecutive chunks concatenate cleanly.

The function is used internally by convert_realtime to preprocess text before sending over WebSocket, but is also exported for standalone use in custom streaming pipelines.
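The buffering behavior described above can be sketched as follows. This is an illustrative reimplementation based on the documented behavior, not the SDK's actual source, so the real text_chunker may differ in detail:

```python
import typing

# Splitter characters from the docstring below.
SPLITTERS = (".", ",", "?", "!", ";", ":", "—", "-", "(", ")", "[", "]", "}", " ")

def buffered_chunker(chunks: typing.Iterator[str]) -> typing.Iterator[str]:
    """Illustrative sketch: accumulate fragments, flush at splitter boundaries."""
    buffer = ""
    for text in chunks:
        if buffer.endswith(SPLITTERS):
            # The buffered text already ends on a boundary: flush it.
            yield buffer if buffer.endswith(" ") else buffer + " "
            buffer = text
        elif text.startswith(SPLITTERS):
            # The new fragment opens with a boundary character: flush up to it.
            output = buffer + text[0]
            yield output if output.endswith(" ") else output + " "
            buffer = text[1:]
        else:
            buffer += text
    if buffer:
        # Flush whatever remains, still guaranteeing the trailing space.
        yield buffer if buffer.endswith(" ") else buffer + " "

print(list(buffered_chunker(iter(["Hello", ", world", "!"]))))
# → ['Hello, ', ' world! ']
```

Note that str.endswith and str.startswith accept a tuple of candidates, which keeps the boundary checks to a single call each.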

Usage

Use this function when building a custom streaming TTS pipeline where you need to preprocess text fragments into sentence-aligned chunks before sending to the WebSocket API.

Code Reference

Source Location

Signature

def text_chunker(chunks: typing.Iterator[str]) -> typing.Iterator[str]:
    """Used during input streaming to chunk text blocks and set last char to space.

    Splits text at sentence boundaries defined by:
    (".", ",", "?", "!", ";", ":", "—", "-", "(", ")", "[", "]", "}", " ")

    Args:
        chunks: Iterator of text fragments (e.g., from LLM stream).

    Yields:
        str: Sentence-boundary-aligned text chunks, each ending with a space.
    """

Import

from elevenlabs.realtime_tts import text_chunker

I/O Contract

Inputs

Name Type Required Description
chunks Iterator[str] Yes Stream of text fragments (words, tokens, partial sentences)

Outputs

Name Type Description
(yields) Iterator[str] Sentence-boundary-aligned chunks, each ending with a space

Usage Examples

Basic Usage

from elevenlabs.realtime_tts import text_chunker

def llm_tokens():
    """Simulate LLM token stream."""
    tokens = ["Hello", ", ", "how ", "are ", "you", "? ", "I'm ", "fine", "."]
    for token in tokens:
        yield token

for chunk in text_chunker(llm_tokens()):
    print(repr(chunk))
# Each yielded chunk ends at a splitter boundary and carries a trailing space,
# e.g. 'Hello, '; the exact grouping depends on how the fragments arrive.

Custom Streaming Pipeline

import json
import websockets
from elevenlabs.realtime_tts import text_chunker

def get_llm_stream():
    for word in "The weather is nice today. Let's go outside.".split():
        yield word + " "

# Use text_chunker as preprocessing before manual WebSocket send
for chunk in text_chunker(get_llm_stream()):
    # chunk is now aligned to sentence boundaries
    print(f"Sending: {repr(chunk)}")
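When driving the WebSocket connection yourself, each chunk typically becomes one JSON text message. The helper below is hypothetical, and the message shape follows the publicly documented stream-input format; treat the field names as assumptions to check against the current API reference:

```python
import json

def to_ws_message(chunk: str, *, flush: bool = False) -> str:
    """Hypothetical helper: wrap one chunk as a stream-input text message."""
    payload = {"text": chunk}
    if flush:
        # "flush" asks the server to generate audio without waiting for more text.
        payload["flush"] = True
    return json.dumps(payload)

print(to_ws_message("Hello, "))  # → {"text": "Hello, "}
# An empty "text" conventionally signals end of stream.
print(to_ws_message("", flush=True))
```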

Related Pages

Implements Principle

Uses Heuristic
