Implementation:Run llama Llama index BaseVoiceAgentInterface
Overview
The BaseVoiceAgentInterface class is an abstract base class that defines the contract for audio input/output interfaces used by voice agents. It specifies the methods that any concrete audio interface must implement: speaker and microphone callbacks, lifecycle management (start/stop/interrupt), audio output processing, and audio data reception. This abstraction allows different audio hardware or software backends to be used interchangeably.
Source File: llama-index-core/llama_index/core/voice_agents/interface.py
Module: llama_index.core.voice_agents.interface
Lines of Code: 115
Dependencies
| Dependency | Type | Purpose |
|---|---|---|
| abc.ABC | Standard Library | Abstract base class support |
| abc.abstractmethod | Standard Library | Decorator for abstract methods |
| typing.Any | Standard Library | Flexible type annotations for audio data |
Class: BaseVoiceAgentInterface
class BaseVoiceAgentInterface(ABC)
All methods in this class are abstract, making it a pure interface definition. Concrete implementations must provide every method.
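The effect of an all-abstract class can be sketched as follows. This is a local stand-in, not the library class: `VoiceInterfaceSketch` only mirrors the eight signatures documented on this page, and `PartialInterface` and `try_instantiate` are hypothetical names used for illustration. It shows that Python refuses to instantiate a subclass that leaves any abstract method unimplemented.

```python
from abc import ABC, abstractmethod
from typing import Any


class VoiceInterfaceSketch(ABC):
    """Local stand-in mirroring the abstract signatures listed on this page."""

    @abstractmethod
    def __init__(self, *args: Any, **kwargs: Any) -> None: ...

    @abstractmethod
    def _speaker_callback(self, *args: Any, **kwargs: Any) -> Any: ...

    @abstractmethod
    def _microphone_callback(self, *args: Any, **kwargs: Any) -> Any: ...

    @abstractmethod
    def start(self, *args: Any, **kwargs: Any) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def interrupt(self) -> None: ...

    @abstractmethod
    def output(self, *args: Any, **kwargs: Any) -> Any: ...

    @abstractmethod
    def receive(self, data: Any, *args: Any, **kwargs: Any) -> Any: ...


class PartialInterface(VoiceInterfaceSketch):
    """Hypothetical subclass that implements only part of the contract."""

    def __init__(self) -> None:
        self.running = False

    def start(self, *args: Any, **kwargs: Any) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False


def try_instantiate() -> str:
    """Return the error message raised for the incomplete subclass, or ''."""
    try:
        PartialInterface()
    except TypeError as exc:
        return str(exc)
    return ""


message = try_instantiate()
print(message)  # names the abstract methods that are still missing
```

The error is raised at instantiation time, not at class definition time, so an incomplete backend is caught as soon as it is constructed.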
Abstract Methods
__init__
@abstractmethod
def __init__(self, *args: Any, **kwargs: Any) -> None
The constructor is explicitly abstract, requiring subclasses to define their own initialization with whatever parameters are needed for their specific audio backend (sample rate, buffer size, device selection, etc.).
_speaker_callback
@abstractmethod
def _speaker_callback(self, *args: Any, **kwargs: Any) -> Any
Callback function for the audio output device (speaker). This method is invoked when the audio system needs to output audio data. The implementation should handle writing audio samples to the output device or buffer.
| Aspect | Detail |
|---|---|
| Access | Private (prefixed with underscore) |
| Parameters | Flexible (*args, **kwargs) |
| Returns | Any - Implementation-dependent |
_microphone_callback
@abstractmethod
def _microphone_callback(self, *args: Any, **kwargs: Any) -> Any
Callback function for the audio input device (microphone). This method is invoked when the audio system captures audio data from the input device. The implementation should handle reading audio samples and making them available for processing.
| Aspect | Detail |
|---|---|
| Access | Private (prefixed with underscore) |
| Parameters | Flexible (*args, **kwargs) |
| Returns | Any - Implementation-dependent |
start
@abstractmethod
def start(self, *args: Any, **kwargs: Any) -> None
Starts the audio interface. Implementations should initialize audio streams, open device connections, and begin capturing/playing audio.
stop
@abstractmethod
def stop(self) -> None
Stops the audio interface. Implementations should close audio streams, release device resources, and perform cleanup.
interrupt
@abstractmethod
def interrupt(self) -> None
Interrupts the audio interface. Used for scenarios such as barge-in, where the output audio must be stopped immediately because the user has started speaking. Implementations should clear output buffers and halt playback.
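One way to realize this barge-in behavior is to keep pending output in a queue and drain it on interrupt. The sketch below is illustrative only: `BufferedPlayback` is a hypothetical class (a real implementation would subclass the interface and talk to an audio device), and the zero-argument `_speaker_callback` is an assumption made to keep the example small.

```python
import queue
from typing import Optional


class BufferedPlayback:
    """Hypothetical playback buffer, shown without a real audio device."""

    def __init__(self) -> None:
        self._pending: "queue.Queue[bytes]" = queue.Queue()

    def output(self, chunk: bytes) -> None:
        # Queue a chunk for later playback.
        self._pending.put(chunk)

    def _speaker_callback(self) -> Optional[bytes]:
        # Called when the backend wants the next chunk; None means silence.
        try:
            return self._pending.get_nowait()
        except queue.Empty:
            return None

    def interrupt(self) -> None:
        # Drop everything not yet played so the next callback yields silence.
        while True:
            try:
                self._pending.get_nowait()
            except queue.Empty:
                break


playback = BufferedPlayback()
playback.output(b"hello")
playback.output(b"world")
first = playback._speaker_callback()   # first queued chunk is played
playback.interrupt()                   # user starts speaking: discard the rest
after = playback._speaker_callback()   # nothing left to play
```

Using `queue.Queue` keeps the buffer safe to share between the thread that calls `output` and the thread that runs the speaker callback.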
output
@abstractmethod
def output(self, *args: Any, **kwargs: Any) -> Any
Processes and outputs audio data. This method handles the delivery of audio to the output device (speaker). The flexible signature accommodates various audio formats and output parameters.
receive
@abstractmethod
def receive(self, data: Any, *args: Any, **kwargs: Any) -> Any
Receives audio data from an external source. The data parameter is typed as Any to accommodate various audio formats (bytes, strings, NumPy arrays, etc.).
| Parameter | Type | Description |
|---|---|---|
| data | Any | Received audio data (typically bytes or string, but open to other formats) |
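Because data is typed as Any, a receive implementation usually has to normalize its input first. The helper below is a minimal sketch of that step. `normalize_audio` is a hypothetical name, and the choice to treat string payloads as base64 is an assumption (common for JSON or websocket transports); this page does not specify an encoding.

```python
import base64
import binascii
from typing import Union


def normalize_audio(data: Union[bytes, bytearray, str]) -> bytes:
    """Coerce received audio into raw bytes.

    Assumption: string payloads are base64-encoded. Other encodings
    would need their own branch.
    """
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    if isinstance(data, str):
        try:
            return base64.b64decode(data, validate=True)
        except binascii.Error as exc:
            raise ValueError("string payload is not valid base64") from exc
    raise TypeError(f"unsupported audio payload type: {type(data).__name__}")


raw = b"\x00\x01\x02\x03"
encoded = base64.b64encode(raw).decode("ascii")
from_bytes = normalize_audio(raw)
from_text = normalize_audio(encoded)
```

Rejecting unknown types explicitly makes a mismatched backend fail loudly instead of passing unusable data to the speaker.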
Method Summary
| Method | Synchronous | Purpose | Access |
|---|---|---|---|
| __init__ | Yes | Initialize audio backend | Public |
| _speaker_callback | Yes | Handle audio output device callback | Private |
| _microphone_callback | Yes | Handle audio input device callback | Private |
| start | Yes | Start audio streams | Public |
| stop | Yes | Stop audio streams | Public |
| interrupt | Yes | Interrupt audio playback | Public |
| output | Yes | Process and output audio | Public |
| receive | Yes | Receive external audio data | Public |
Design Patterns
Pure Interface
Unlike BaseVoiceAgent, which provides some concrete methods, BaseVoiceAgentInterface is a pure interface: every method is abstract. This enforces a strict contract on all implementations while leaving each one free to choose how the methods are realized.
Callback Pattern
The _speaker_callback and _microphone_callback methods follow the callback pattern commonly used in audio programming. Audio libraries (such as PyAudio or sounddevice) typically require callback functions that are invoked by the audio system's event loop. By making these abstract, the interface supports any audio backend that uses callbacks.
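The callback pattern described above can be sketched without any audio hardware. In the example below, `CallbackCapture` and `fake_backend` are hypothetical names, and `CONTINUE` is a local stand-in for whatever "keep streaming" flag a given library expects. The callback's argument shape `(in_data, frame_count, time_info, status)` and its `(data, flag)` return value are loosely modeled on PyAudio-style stream callbacks; other backends use different signatures.

```python
import threading
from typing import Any, Callable, List, Optional, Tuple

CONTINUE = 0  # local stand-in for a backend's "keep streaming" flag


class CallbackCapture:
    """Hypothetical capture object whose callback is driven by a backend thread."""

    def __init__(self) -> None:
        self.captured: List[bytes] = []
        self._lock = threading.Lock()

    def _microphone_callback(
        self, in_data: bytes, frame_count: int, time_info: Any, status: int
    ) -> Tuple[Optional[bytes], int]:
        # Runs on the backend's thread: do minimal work and return quickly.
        with self._lock:
            self.captured.append(in_data)
        return (None, CONTINUE)


def fake_backend(callback: Callable[..., Any], chunks: List[bytes]) -> None:
    """Stand-in for an audio library's capture thread (2 bytes per frame)."""
    for chunk in chunks:
        callback(chunk, len(chunk) // 2, {}, 0)


capture = CallbackCapture()
chunks = [b"\x00\x00" * 4, b"\x01\x00" * 4]
worker = threading.Thread(
    target=fake_backend, args=(capture._microphone_callback, chunks)
)
worker.start()
worker.join()
```

The lock matters because the callback runs on a different thread from the code that later reads `captured`.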
Synchronous Design
All methods are synchronous (not async), reflecting that audio I/O callbacks are typically invoked from audio driver threads rather than asyncio event loops. The voice agent (BaseVoiceAgent) bridges the async/sync boundary by using its async methods to coordinate the synchronous audio interface.
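One common way to cross this sync/async boundary is to have the synchronous callback hand data to the event loop with `loop.call_soon_threadsafe`. The sketch below illustrates that general technique under stated assumptions; it is not taken from BaseVoiceAgent, and `collect_chunks`, `microphone_callback`, and `audio_thread` are hypothetical names.

```python
import asyncio
import threading
from typing import List, Optional


async def collect_chunks(chunks: List[bytes]) -> List[bytes]:
    """Receive chunks pushed from a plain thread into an asyncio queue."""
    loop = asyncio.get_running_loop()
    inbox: asyncio.Queue = asyncio.Queue()

    def microphone_callback(chunk: Optional[bytes]) -> None:
        # Called from a non-asyncio thread; schedule the put on the loop.
        loop.call_soon_threadsafe(inbox.put_nowait, chunk)

    def audio_thread() -> None:
        # Stand-in for an audio driver thread delivering captured audio.
        for chunk in chunks:
            microphone_callback(chunk)
        microphone_callback(None)  # sentinel: capture finished

    worker = threading.Thread(target=audio_thread)
    worker.start()

    received: List[bytes] = []
    while True:
        item = await inbox.get()
        if item is None:
            break
        received.append(item)
    worker.join()
    return received


result = asyncio.run(collect_chunks([b"a", b"b", b"c"]))
```

The callback itself stays synchronous and returns immediately, which is what audio driver threads generally require; all awaiting happens on the event loop side.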
Separation of Input and Output
The interface explicitly separates:
- Input: _microphone_callback for capturing audio, receive for receiving audio data
- Output: _speaker_callback for the output device, output for processing/delivering audio
This separation enables clear data flow management in voice applications.
See Also
- Run_llama_Llama_index_BaseVoiceAgent - Voice agent base class that uses this interface
- Run_llama_Llama_index_BaseVoiceAgentWebsocket - Websocket base class for voice service communication