Implementation:Neuml Txtai Audio Signal Processing

Knowledge Sources	Neuml_Txtai
Domains	Audio, Signal Processing, Utilities
Last Updated	2026-02-10 01:00 GMT

Overview

Signal is a concrete utility class in txtai for low-level audio signal processing.

Description

Signal is a utility class that provides static methods for common audio signal processing operations: converting stereo audio to mono, resampling audio to a target sample rate using scipy, converting between 16-bit integer and 32-bit float representations, mixing two audio segments with per-segment scaling, computing signal energy via FFT for frequency analysis, and trimming leading and trailing silence based on an energy threshold. Other audio pipelines in txtai use this class as their shared signal processing foundation.
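
The mono and format conversions described above can be sketched in plain NumPy. This is a minimal illustration, not txtai's code: the helper names to_mono, to_float32, and to_int16 are hypothetical, and the 32768 scale factor and the (samples, channels) layout are assumptions of the sketch.

```python
import numpy as np

def to_mono(audio):
    """Average the channels of a (samples, channels) array; pass 1-D audio through."""
    return audio.mean(axis=1) if audio.ndim > 1 else audio

def to_float32(audio):
    """Scale int16 samples into the range [-1.0, 1.0) as float32 (assumed convention)."""
    return audio.astype(np.float32) / 32768.0

def to_int16(audio):
    """Scale float samples back to int16, clipping values that fall out of range."""
    return np.clip(audio * 32768.0, -32768, 32767).astype(np.int16)

samples = np.array([0, 16384, -16384, 32767], dtype=np.int16)
as_float = to_float32(samples)   # float32 values between -1.0 and 1.0
round_trip = to_int16(as_float)  # int16 values again
```

Under these assumptions the int16 to float32 to int16 round trip preserves the original samples, which is why the two formats can be used interchangeably between pipeline stages.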

Usage

Use Signal when you need low-level audio signal processing operations such as resampling, format conversion, audio mixing, silence trimming, or frequency energy analysis. It is primarily used internally by other txtai audio pipelines (TextToSpeech, AudioMixer, AudioStream, Microphone, Transcription) but can also be called directly for custom audio processing workflows.

Code Reference

Source Location

Repository: Neuml_Txtai
File: src/python/txtai/pipeline/audio/signal.py

Signature

class Signal:
    @staticmethod
    def mono(audio)

    @staticmethod
    def resample(audio, rate, target)

    @staticmethod
    def float32(audio)

    @staticmethod
    def int16(audio)

    @staticmethod
    def mix(audio1, audio2, scale1=1, scale2=1)

    @staticmethod
    def energy(audio, rate)

    @staticmethod
    def trim(audio, rate, threshold=1, leading=True, trailing=True)

Import

from txtai.pipeline.audio.signal import Signal

I/O Contract

Inputs

Name	Type	Required	Description
audio	numpy.ndarray	Yes	Input audio data as a NumPy array. Every method takes it; mix takes two arrays, audio1 and audio2.
rate	int	Yes*	Current sample rate of the audio. Required by resample, energy, and trim.
target	int	Yes*	Target sample rate for resampling. Required by resample.
scale1	float	No	Scaling factor for first audio in mix. Defaults to 1.
scale2	float	No	Scaling factor for second audio in mix. Defaults to 1.
threshold	float	No	Energy threshold for silence detection in trim. Defaults to 1.
leading	bool	No	Whether to trim leading silence. Defaults to True.
trailing	bool	No	Whether to trim trailing silence. Defaults to True.
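
To show how threshold, leading, and trailing could interact, here is a minimal sketch of energy-based silence trimming in plain NumPy. It is not txtai's implementation: the helper names chunk_energy and trim_silence, the 20 ms chunk size (chunk_ms), and the energy definition (sum of squared FFT magnitudes without the DC bin) are all assumptions, so threshold values are not comparable with those passed to Signal.trim.

```python
import numpy as np

def chunk_energy(chunk):
    """Sum of squared FFT magnitudes, skipping the DC bin (assumed energy measure)."""
    spectrum = np.abs(np.fft.rfft(chunk))[1:]
    return float(np.sum(spectrum ** 2))

def trim_silence(audio, rate, threshold=1.0, leading=True, trailing=True, chunk_ms=20):
    """Drop leading/trailing chunks whose energy is at or below threshold."""
    n = max(1, int(rate * chunk_ms / 1000))  # samples per chunk
    start, end = 0, len(audio)

    # Advance past quiet chunks at the start
    while leading and start < end and chunk_energy(audio[start:start + n]) <= threshold:
        start += n

    # Step back over quiet chunks at the end
    while trailing and end > start and chunk_energy(audio[max(start, end - n):end]) <= threshold:
        end -= n

    return audio[start:max(start, end)]

# 100 silent samples, 200 samples of a 50 Hz tone, 100 silent samples at 1000 Hz
rate = 1000
tone = np.sin(2 * np.pi * 50 * np.arange(200) / rate)
audio = np.concatenate([np.zeros(100), tone, np.zeros(100)])

trimmed = trim_silence(audio, rate)                      # only the tone remains
tail_only = trim_silence(audio, rate, leading=False)     # leading silence is kept
```

Working in fixed-size chunks keeps the trim boundaries aligned to chunk multiples, so up to one chunk of silence can remain on either side when the audible part does not start on a chunk boundary.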

Outputs

Name	Type	Description
mono	numpy.ndarray	Single-channel audio data.
resample	numpy.ndarray	Audio resampled to target sample rate.
float32	numpy.ndarray	Audio converted from int16 to float32 format.
int16	numpy.ndarray	Audio converted from float32 to int16 format.
mix	numpy.ndarray	Two audio segments mixed into one, with the shorter segment tiled to match the longer.
energy	dict	Dictionary mapping frequency (float) to energy value (float) for the input audio.
trim	numpy.ndarray	Audio with leading and/or trailing silence removed.
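
The tiling behavior noted for mix can be illustrated with a short NumPy sketch. The function mix_tiled is a hypothetical stand-in for 1-D, non-empty inputs, not the library's code; it only shows one way the shorter segment can be repeated to cover the longer one before the two are summed.

```python
import numpy as np

def mix_tiled(audio1, audio2, scale1=1.0, scale2=1.0):
    """Mix two 1-D signals, repeating the shorter one to cover the longer."""
    a = np.asarray(audio1, dtype=np.float32) * scale1
    b = np.asarray(audio2, dtype=np.float32) * scale2

    # Identify the longer and shorter signal
    large, small = (a, b) if len(a) >= len(b) else (b, a)

    # Repeat the shorter signal enough times, then cut it to the longer length
    repeats = -(-len(large) // len(small))  # ceiling division
    small = np.tile(small, repeats)[: len(large)]

    return large + small

# Five samples mixed with a two-sample loop played at half volume
mixed = mix_tiled(np.array([1.0, 1.0, 1.0, 1.0, 1.0]), np.array([2.0, 4.0]), scale2=0.5)
```

Here the two-sample segment is scaled to [1.0, 2.0], repeated to five samples, and added to the longer segment, so the output always has the length of the longer input.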

Usage Examples

from txtai.pipeline.audio.signal import Signal
import numpy as np

# Convert stereo to mono
stereo_audio = np.random.randn(22050, 2)
mono_audio = Signal.mono(stereo_audio)

# Resample audio from 44100 Hz to 16000 Hz
audio = np.random.randn(44100).astype(np.float32)
resampled = Signal.resample(audio, 44100, 16000)

# Convert int16 audio to float32
int_audio = np.array([0, 16384, -16384], dtype=np.int16)
float_audio = Signal.float32(int_audio)

# Convert float32 audio back to int16
int_audio = Signal.int16(float_audio)

# Mix two audio segments with scaling
audio1 = np.random.randn(22050).astype(np.float32)
audio2 = np.random.randn(11025).astype(np.float32)
mixed = Signal.mix(audio1, audio2, scale1=1.0, scale2=0.5)

# Calculate signal energy per frequency
energy_map = Signal.energy(audio1, 22050)

# Trim silence from audio
trimmed = Signal.trim(audio1, 22050, threshold=1.0)
