Implementation:Neuml Txtai Audio Signal Processing

From Leeroopedia


Knowledge Sources
Domains: Audio, Signal Processing, Utilities
Last Updated: 2026-02-10 01:00 GMT

Overview

Signal is a concrete utility class in txtai that provides low-level audio signal processing operations.

Description

Signal is a utility class whose static methods cover common audio signal processing operations:

  • mono: converts stereo audio to mono.
  • resample: resamples audio to a target sample rate using scipy.
  • float32 and int16: convert between 16-bit integer and 32-bit float representations.
  • mix: mixes two audio segments together, with optional scaling of each.
  • energy: computes signal energy via FFT for frequency analysis.
  • trim: removes leading and trailing silence based on an energy threshold.

Other txtai audio pipelines use this class as their shared signal processing foundation.
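To make the mono and format conversion operations concrete, here is a minimal NumPy sketch. The helper names (to_mono, int16_to_float32, float32_to_int16), the (samples, channels) layout, and the 32768 scaling constant are illustrative assumptions; they are not taken from the txtai source, which may handle offsets and rounding differently.

```python
import numpy as np

def to_mono(audio):
    # Hypothetical helper: average across channels, assuming shape (samples, channels)
    return audio.mean(axis=1) if audio.ndim > 1 else audio

def int16_to_float32(audio):
    # Hypothetical helper: scale 16-bit integers into the range [-1.0, 1.0)
    return audio.astype(np.float32) / 32768.0

def float32_to_int16(audio):
    # Hypothetical helper: scale back, clip to the int16 range, then cast
    return np.clip(audio * 32768.0, -32768, 32767).astype(np.int16)

pcm = np.array([0, 16384, -16384, 32767], dtype=np.int16)
floats = int16_to_float32(pcm)        # 16384 maps to 0.5, -16384 to -0.5
roundtrip = float32_to_int16(floats)  # recovers the original integers

stereo = np.ones((4, 2), dtype=np.float32)
mono = to_mono(stereo)                # one value per sample
```

Clipping before the cast matters: without it, a float value of exactly 1.0 would scale to 32768, which does not fit in int16.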

Usage

Use Signal when you need low-level audio signal processing operations such as resampling, format conversion, audio mixing, silence trimming, or frequency energy analysis. It is primarily used internally by other txtai audio pipelines (TextToSpeech, AudioMixer, AudioStream, Microphone, Transcription) but can also be called directly for custom audio processing workflows.
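For a custom workflow, the resampling step might look like the sketch below. It calls scipy.signal.resample directly and is not the txtai implementation; the function name resample_sketch and the rounding rule for the output length are assumptions made for illustration.

```python
import numpy as np
from scipy import signal

def resample_sketch(audio, rate, target):
    # Hypothetical helper: skip the work when the rates already match
    if rate == target:
        return audio

    # Output length is the input length scaled by the ratio of the two rates
    samples = round(len(audio) * target / rate)

    # Fourier-based resampling along the first axis
    return signal.resample(audio, samples)

rate, target = 44100, 16000
t = np.arange(rate) / rate                       # one second of audio
tone = np.sin(2 * np.pi * 440 * t).astype(np.float32)
resampled = resample_sketch(tone, rate, target)  # one second at 16 kHz
```

Because scipy.signal.resample works in the frequency domain, it assumes the signal is periodic; short clips with abrupt edges can show ringing near the boundaries.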

Code Reference

Source Location

  • Repository: Neuml_Txtai
  • File: src/python/txtai/pipeline/audio/signal.py

Signature

class Signal:
    @staticmethod
    def mono(audio)

    @staticmethod
    def resample(audio, rate, target)

    @staticmethod
    def float32(audio)

    @staticmethod
    def int16(audio)

    @staticmethod
    def mix(audio1, audio2, scale1=1, scale2=1)

    @staticmethod
    def energy(audio, rate)

    @staticmethod
    def trim(audio, rate, threshold=1, leading=True, trailing=True)

Import

from txtai.pipeline.audio.signal import Signal

I/O Contract

Inputs

Name | Type | Required | Description
--- | --- | --- | ---
audio | numpy.ndarray | Yes | Input audio data as a NumPy array. Used by every method except mix.
audio1 | numpy.ndarray | Yes* | First audio segment. Required by mix.
audio2 | numpy.ndarray | Yes* | Second audio segment. Required by mix.
rate | int | Yes* | Current sample rate of the audio. Required by resample, energy, and trim.
target | int | Yes* | Target sample rate for resampling. Required by resample.
scale1 | float | No | Scaling factor for the first audio segment in mix. Defaults to 1.
scale2 | float | No | Scaling factor for the second audio segment in mix. Defaults to 1.
threshold | float | No | Energy threshold for silence detection in trim. Defaults to 1.
leading | bool | No | Whether to trim leading silence. Defaults to True.
trailing | bool | No | Whether to trim trailing silence. Defaults to True.

* Required only by the methods named in the description.
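The threshold, leading, and trailing parameters can be illustrated with a small energy-based trimming sketch. This is not the txtai implementation: the name trim_sketch, the frame_ms parameter, the 20 ms frame length, and the choice to return the input unchanged when no frame is active are all assumptions made for the example.

```python
import numpy as np

def frame_energy(frame):
    # Sum of squared FFT magnitudes, skipping the DC bin
    return float(np.sum(np.abs(np.fft.rfft(frame))[1:] ** 2))

def trim_sketch(audio, rate, threshold=1.0, leading=True, trailing=True, frame_ms=20):
    # Split the signal into fixed-length frames (frame_ms is a hypothetical parameter)
    n = max(1, int(rate * frame_ms / 1000))
    frames = [audio[i:i + n] for i in range(0, len(audio), n)]

    # A frame counts as active when its energy reaches the threshold
    active = [frame_energy(f) >= threshold for f in frames]
    if not any(active):
        return audio  # sketch choice: leave fully silent input untouched

    # Keep everything between the first and last active frames
    first = active.index(True) if leading else 0
    last = len(active) - active[::-1].index(True) if trailing else len(active)
    return np.concatenate(frames[first:last])

rate = 16000
silence = np.zeros(rate // 2, dtype=np.float32)   # 0.5 s of silence
tone = np.sin(2 * np.pi * 440 * np.arange(rate) / rate).astype(np.float32)
padded = np.concatenate([silence, tone, silence])

trimmed = trim_sketch(padded, rate)                      # both ends removed
keep_lead = trim_sketch(padded, rate, leading=False)     # only trailing silence removed
```

Raising the threshold makes the detector stricter, so quiet passages at the edges are more likely to be treated as silence.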

Outputs

Name | Type | Description
--- | --- | ---
mono | numpy.ndarray | Single-channel audio data.
resample | numpy.ndarray | Audio resampled to the target sample rate.
float32 | numpy.ndarray | Audio converted from int16 to float32 format.
int16 | numpy.ndarray | Audio converted from float32 to int16 format.
mix | numpy.ndarray | Two audio segments mixed into one, with the shorter segment tiled to match the longer.
energy | dict | Dictionary mapping frequency (float) to energy value (float) for the input audio.
trim | numpy.ndarray | Audio with leading and/or trailing silence removed.
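The tiling behavior described for mix can be sketched for one-dimensional arrays as follows. The function mix_sketch is hypothetical and uses np.resize to repeat the shorter segment; the txtai source may shape and tile its arrays differently.

```python
import numpy as np

def mix_sketch(audio1, audio2, scale1=1, scale2=1):
    # Apply the per-segment scaling factors
    a, b = audio1 * scale1, audio2 * scale2

    # Identify the longer and shorter segments
    large, small = (a, b) if len(a) >= len(b) else (b, a)

    # np.resize repeats the shorter array cyclically up to the requested length
    small = np.resize(small, len(large))

    # Sample-wise sum of the two aligned segments
    return large + small

long_clip = np.ones(5, dtype=np.float32)
short_clip = np.array([2.0, 4.0], dtype=np.float32)

# short_clip * 0.5 is [1, 2], tiled to [1, 2, 1, 2, 1], then added to long_clip
mixed = mix_sketch(long_clip, short_clip, scale1=1.0, scale2=0.5)
# mixed is [2, 3, 2, 3, 2]
```

Summing two full-scale signals can exceed the [-1.0, 1.0] range, which is one reason to pass scaling factors below 1 before converting the result back to int16.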

Usage Examples

from txtai.pipeline.audio.signal import Signal
import numpy as np

# Convert stereo to mono
stereo_audio = np.random.randn(22050, 2)
mono_audio = Signal.mono(stereo_audio)

# Resample audio from 44100 Hz to 16000 Hz
audio = np.random.randn(44100).astype(np.float32)
resampled = Signal.resample(audio, 44100, 16000)

# Convert int16 audio to float32
int_audio = np.array([0, 16384, -16384], dtype=np.int16)
float_audio = Signal.float32(int_audio)

# Convert float32 audio back to int16
int_audio = Signal.int16(float_audio)

# Mix two audio segments with scaling
audio1 = np.random.randn(22050).astype(np.float32)
audio2 = np.random.randn(11025).astype(np.float32)
mixed = Signal.mix(audio1, audio2, scale1=1.0, scale2=0.5)

# Calculate signal energy per frequency
energy_map = Signal.energy(audio1, 22050)

# Trim silence from audio
trimmed = Signal.trim(audio1, 22050, threshold=1.0)
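A frequency-to-energy dictionary like the one energy returns can be inspected with ordinary dict operations. The sketch below builds such a dictionary with NumPy alone and finds the dominant frequency; energy_sketch is a hypothetical stand-in, and the exact scaling txtai applies to its energy values is not shown on this page.

```python
import numpy as np

def energy_sketch(audio, rate):
    # Frequency of each FFT bin in Hz, skipping the DC bin
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / rate)[1:]

    # Squared magnitude of each bin as a simple energy measure
    power = np.abs(np.fft.rfft(audio))[1:] ** 2

    # Map frequency to energy using plain Python floats
    return {float(f): float(p) for f, p in zip(freqs, power)}

rate = 8000
t = np.arange(rate) / rate            # one second, so bins are 1 Hz apart
tone = np.sin(2 * np.pi * 440 * t)    # pure 440 Hz tone

energy_map = energy_sketch(tone, rate)
peak = max(energy_map, key=energy_map.get)   # frequency with the most energy
total = sum(energy_map.values())             # overall energy across all bins
```

For this pure tone, nearly all of the energy sits in the bin at 440 Hz, so peak identifies the tone's frequency and total is dominated by that single entry.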
