Implementation:Run llama Llama index BaseVoiceAgentInterface

From Leeroopedia

Overview

The BaseVoiceAgentInterface class is an abstract base class that defines the contract for audio input/output interfaces used by voice agents. It specifies the methods that any concrete audio interface must implement: speaker and microphone callbacks, lifecycle management (start/stop/interrupt), audio output processing, and audio data reception. This abstraction allows different audio hardware or software backends to be used interchangeably.

Source File: llama-index-core/llama_index/core/voice_agents/interface.py

Module: llama_index.core.voice_agents.interface

Lines of Code: 115

Dependencies

| Dependency | Type | Purpose |
|---|---|---|
| abc.ABC | Standard Library | Abstract base class support |
| abc.abstractmethod | Standard Library | Decorator for abstract methods |
| typing.Any | Standard Library | Flexible type annotations for audio data |

Class: BaseVoiceAgentInterface

class BaseVoiceAgentInterface(ABC)

All methods in this class are abstract, making it a pure interface definition. Concrete implementations must provide every method.
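The enforcement described above comes from Python's standard `abc` machinery. The following is a minimal sketch of that behavior. It uses a local stand-in class that mirrors the signatures documented on this page so that it runs without llama-index installed; `MicOnlyInterface` and `can_instantiate` are hypothetical names introduced only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseVoiceAgentInterface(ABC):
    """Local stand-in mirroring the signatures listed on this page."""

    @abstractmethod
    def __init__(self, *args: Any, **kwargs: Any) -> None: ...

    @abstractmethod
    def _speaker_callback(self, *args: Any, **kwargs: Any) -> Any: ...

    @abstractmethod
    def _microphone_callback(self, *args: Any, **kwargs: Any) -> Any: ...

    @abstractmethod
    def start(self, *args: Any, **kwargs: Any) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def interrupt(self) -> None: ...

    @abstractmethod
    def output(self, *args: Any, **kwargs: Any) -> Any: ...

    @abstractmethod
    def receive(self, data: Any, *args: Any, **kwargs: Any) -> Any: ...


class MicOnlyInterface(BaseVoiceAgentInterface):
    """Hypothetical subclass that implements only part of the contract."""

    def __init__(self) -> None:
        self.chunks: list = []

    def _microphone_callback(self, chunk: bytes) -> None:
        self.chunks.append(chunk)


def can_instantiate(cls: type) -> bool:
    """Return True if cls() succeeds, False if ABC enforcement rejects it."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Calling `can_instantiate(MicOnlyInterface)` returns `False` here because six abstract methods are still unimplemented; the remaining names are available in `MicOnlyInterface.__abstractmethods__`.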

Abstract Methods

__init__

@abstractmethod
def __init__(self, *args: Any, **kwargs: Any) -> None

The constructor is explicitly abstract, requiring subclasses to define their own initialization with whatever parameters are needed for their specific audio backend (sample rate, buffer size, device selection, etc.).

_speaker_callback

@abstractmethod
def _speaker_callback(self, *args: Any, **kwargs: Any) -> Any

Callback function for the audio output device (speaker). This method is invoked when the audio system needs to output audio data. The implementation should handle writing audio samples to the output device or buffer.

| Aspect | Detail |
|---|---|
| Access | Private (prefixed with underscore) |
| Parameters | Flexible (*args, **kwargs) |
| Returns | Any (implementation-dependent) |
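As an illustration of what a speaker callback might do, the sketch below assumes a pull-style backend that asks for a fixed number of bytes on each call. The class name `BufferedSpeaker`, the `enqueue` helper, and the `(n_bytes) -> bytes` signature are assumptions made for this example; real audio libraries define their own callback signatures, and the interface itself leaves the signature open.

```python
import threading


class BufferedSpeaker:
    """Hypothetical output side; would subclass BaseVoiceAgentInterface in real use."""

    def __init__(self) -> None:
        self._buffer = bytearray()
        # The callback may run on an audio thread, so guard the buffer.
        self._lock = threading.Lock()

    def enqueue(self, audio: bytes) -> None:
        """Add audio bytes to the playback buffer."""
        with self._lock:
            self._buffer.extend(audio)

    def _speaker_callback(self, n_bytes: int) -> bytes:
        """Return exactly n_bytes, padding with zero bytes when the buffer runs dry."""
        with self._lock:
            chunk = bytes(self._buffer[:n_bytes])
            del self._buffer[:n_bytes]
        # Assumption: zero bytes represent silence (true for signed PCM).
        return chunk + b"\x00" * (n_bytes - len(chunk))
```

Padding rather than returning a short chunk keeps the backend supplied with a full buffer on every call, which is the usual expectation of pull-style audio APIs.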

_microphone_callback

@abstractmethod
def _microphone_callback(self, *args: Any, **kwargs: Any) -> Any

Callback function for the audio input device (microphone). This method is invoked when the audio system captures audio data from the input device. The implementation should handle reading audio samples and making them available for processing.

| Aspect | Detail |
|---|---|
| Access | Private (prefixed with underscore) |
| Parameters | Flexible (*args, **kwargs) |
| Returns | Any (implementation-dependent) |
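One possible shape for a microphone callback is sketched below: each captured chunk is placed on a bounded, thread-safe queue for later processing. `QueuedMicrophone`, the `max_chunks` parameter, and the boolean return value are assumptions for illustration and are not part of the documented interface.

```python
import queue


class QueuedMicrophone:
    """Hypothetical input side; would subclass BaseVoiceAgentInterface in real use."""

    def __init__(self, max_chunks: int = 64) -> None:
        self.captured: "queue.Queue[bytes]" = queue.Queue(maxsize=max_chunks)

    def _microphone_callback(self, chunk: bytes) -> bool:
        """Store one captured chunk; return False if it had to be dropped."""
        try:
            # Never block inside an audio callback: drop the chunk instead.
            self.captured.put_nowait(chunk)
        except queue.Full:
            return False
        return True
```

Dropping chunks when the queue is full is one design choice among several; an implementation could instead discard the oldest chunk or grow the buffer.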

start

@abstractmethod
def start(self, *args: Any, **kwargs: Any) -> None

Starts the audio interface. Implementations should initialize audio streams, open device connections, and begin capturing/playing audio.

stop

@abstractmethod
def stop(self) -> None

Stops the audio interface. Implementations should close audio streams, release device resources, and perform cleanup.
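The start/stop lifecycle can be sketched with a worker thread standing in for real audio streams. Everything in this example (`LifecycleSketch`, the `running` property, the `ticks` counter) is hypothetical; an actual implementation would open and close device streams where the placeholder loop runs.

```python
import threading
import time
from typing import Optional


class LifecycleSketch:
    """Hypothetical lifecycle; a worker thread stands in for real audio streams."""

    def __init__(self) -> None:
        self._stop_event = threading.Event()
        self._worker: Optional[threading.Thread] = None
        self.ticks = 0

    def _run(self) -> None:
        # Placeholder loop; a real backend would read/write device buffers here.
        while not self._stop_event.is_set():
            self.ticks += 1
            time.sleep(0.001)

    def start(self) -> None:
        if self.running:
            return  # already running; treat a repeated start() as a no-op
        self._stop_event.clear()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def stop(self) -> None:
        # Signal the worker, then wait for it so resources are released on return.
        self._stop_event.set()
        if self._worker is not None:
            self._worker.join(timeout=1.0)
            self._worker = None

    @property
    def running(self) -> bool:
        return self._worker is not None and self._worker.is_alive()
```

Making `stop()` safe to call more than once simplifies cleanup paths such as `finally` blocks.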

interrupt

@abstractmethod
def interrupt(self) -> None

Interrupts the audio interface. Used for scenarios such as barge-in, where the output audio must be stopped immediately because the user has started speaking. Implementations should clear output buffers and halt playback.

output

@abstractmethod
def output(self, *args: Any, **kwargs: Any) -> Any

Processes and outputs audio data. This method handles the delivery of audio to the output device (speaker). The flexible signature accommodates various audio formats and output parameters.
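The relationship between `output` and `interrupt` can be sketched with a queue of pending audio: `output` appends chunks for playback, and `interrupt` discards whatever has not yet been played. The class name `InterruptibleOutput`, the `pending_chunks` helper, and the `bytes` parameter type are assumptions made for this example.

```python
import queue


class InterruptibleOutput:
    """Hypothetical output path; would subclass BaseVoiceAgentInterface in real use."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[bytes]" = queue.Queue()

    def output(self, audio: bytes) -> None:
        """Queue audio for later playback by a speaker callback."""
        self._pending.put(audio)

    def interrupt(self) -> None:
        """Discard all queued audio so playback stops as soon as possible."""
        while True:
            try:
                self._pending.get_nowait()
            except queue.Empty:
                return

    def pending_chunks(self) -> int:
        """Number of chunks still waiting to be played (illustrative helper)."""
        return self._pending.qsize()
```

In a barge-in scenario, the agent would call `interrupt()` as soon as user speech is detected, and subsequent `output()` calls would start from an empty queue.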

receive

@abstractmethod
def receive(self, data: Any, *args: Any, **kwargs: Any) -> Any

Receives audio data from an external source. The data parameter is typed as Any to accommodate various audio formats (bytes, strings, NumPy arrays, etc.).

| Parameter | Type | Description |
|---|---|---|
| data | Any | Received audio data (typically bytes or string, but open to other formats) |
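Because `data` is typed as Any, an implementation usually has to normalize it before use. The sketch below accepts raw bytes or a string and stores everything as bytes. Treating string payloads as base64-encoded audio is an assumption of this example, not something the interface specifies; `ReceiveSketch` and its integer return value are likewise hypothetical.

```python
import base64
from typing import Union


class ReceiveSketch:
    """Hypothetical receiver; would subclass BaseVoiceAgentInterface in real use."""

    def __init__(self) -> None:
        self.received = bytearray()

    def receive(self, data: Union[bytes, bytearray, str]) -> int:
        """Append incoming audio to the buffer; return the number of bytes added."""
        if isinstance(data, str):
            # Assumption: string payloads carry base64-encoded audio.
            data = base64.b64decode(data)
        elif not isinstance(data, (bytes, bytearray)):
            raise TypeError(f"unsupported audio payload: {type(data).__name__}")
        self.received.extend(data)
        return len(data)
```

Rejecting unknown types early with a TypeError makes format mismatches visible at the boundary rather than later in playback.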

Method Summary

| Method | Synchronous | Purpose | Access |
|---|---|---|---|
| __init__ | Yes | Initialize audio backend | Public |
| _speaker_callback | Yes | Handle audio output device callback | Private |
| _microphone_callback | Yes | Handle audio input device callback | Private |
| start | Yes | Start audio streams | Public |
| stop | Yes | Stop audio streams | Public |
| interrupt | Yes | Interrupt audio playback | Public |
| output | Yes | Process and output audio | Public |
| receive | Yes | Receive external audio data | Public |

Design Patterns

Pure Interface

Unlike BaseVoiceAgent, which provides some concrete methods, BaseVoiceAgentInterface is a pure interface with every method being abstract. This enforces a strict contract on all implementations while leaving maximum flexibility in how the interface is realized.

Callback Pattern

The _speaker_callback and _microphone_callback methods follow the callback pattern commonly used in audio programming. Audio libraries (such as PyAudio or sounddevice) typically require callback functions that are invoked by the audio system's event loop. By making these abstract, the interface supports any audio backend that uses callbacks.

Synchronous Design

All methods are synchronous (not async), reflecting that audio I/O callbacks are typically invoked from audio driver threads rather than asyncio event loops. The voice agent (BaseVoiceAgent) bridges the async/sync boundary by using its async methods to coordinate the synchronous audio interface.
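One common way to cross the thread/asyncio boundary described above is `loop.call_soon_threadsafe`, which schedules work on the event loop from another thread. The sketch below is a generic illustration of that technique and does not claim to show how BaseVoiceAgent does it; `ThreadToAsyncBridge`, `collect`, and the fake driver thread are hypothetical.

```python
import asyncio
import threading
from typing import List


class ThreadToAsyncBridge:
    """Hypothetical helper: hands chunks from an audio thread to an asyncio consumer."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self.chunks: asyncio.Queue = asyncio.Queue()

    def _microphone_callback(self, chunk: bytes) -> None:
        # Called from a non-asyncio thread. asyncio.Queue is not thread-safe,
        # so schedule put_nowait to run on the event loop's own thread.
        self._loop.call_soon_threadsafe(self.chunks.put_nowait, chunk)


async def collect(n: int) -> List[bytes]:
    """Receive n chunks produced by a simulated audio driver thread."""
    bridge = ThreadToAsyncBridge(asyncio.get_running_loop())

    def fake_driver() -> None:
        # Stands in for an audio driver invoking the synchronous callback.
        for i in range(n):
            bridge._microphone_callback(bytes([i]))

    driver = threading.Thread(target=fake_driver)
    driver.start()
    received = [await bridge.chunks.get() for _ in range(n)]
    driver.join()
    return received
```

Running `asyncio.run(collect(3))` yields the three chunks in the order the driver thread produced them, since callbacks scheduled with `call_soon_threadsafe` run in FIFO order.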

Separation of Input and Output

The interface explicitly separates:

  • Input: _microphone_callback for capturing audio, receive for receiving audio data
  • Output: _speaker_callback for the output device, output for processing/delivering audio

This separation enables clear data flow management in voice applications.
