Principle:Facebookresearch Audiocraft Sound Dataset Augmented Loading

From Leeroopedia
Knowledge Sources
Domains: Audio_Data, Data_Augmentation
Last Updated: 2026-02-14 01:00 GMT

Overview

A data loading strategy that augments environmental sound datasets by mixing pairs of audio samples at random signal-to-noise ratios (SNRs), with the aim of making sound generation models more robust.

Description

Sound Dataset Augmented Loading is a data pipeline technique used in audio generation training. Rather than training on isolated sound clips, this approach pairs audio samples and mixes them at configurable SNR levels. The mixed audio is presented alongside its text description, helping the model learn to generate sounds that are robust to background noise and overlapping audio events. The pairing is controlled via a pre-computed pairing list that maps each sample to a mixing partner.
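The pairing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Audiocraft implementation: the helper name `build_pairing`, the use of string sample ids, and the rule that a sample is never paired with itself are all hypothetical choices made for the example.

```python
import random


def build_pairing(sample_ids, seed=0):
    """Pre-compute a mixing partner for every sample (hypothetical helper).

    Assumes the ids are unique. Each id is mapped to a different id
    chosen at random; a fixed seed keeps the pairing reproducible.
    """
    if len(sample_ids) < 2:
        raise ValueError("need at least two samples to form pairs")
    rng = random.Random(seed)
    pairing = {}
    for sid in sample_ids:
        # Exclude the sample itself so it is never mixed with its own audio.
        candidates = [other for other in sample_ids if other != sid]
        pairing[sid] = rng.choice(candidates)
    return pairing


# Illustrative ids only; a real dataset would use its own sample keys.
ids = ["dog_bark", "rain", "car_horn", "footsteps"]
pairing = build_pairing(ids)
```

At load time, a dataset wrapper could look up `pairing[sample_id]` to fetch the secondary clip before mixing. Computing the mapping once, ahead of training, keeps the augmentation reproducible across epochs and workers.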

Usage

Use this principle when training sound generation models (e.g., AudioGen) on environmental sound datasets where data augmentation through audio mixing can improve the diversity and robustness of generated outputs.

Theoretical Basis

The core idea is SNR-controlled mixing of two audio signals:

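$$y_{\text{mixed}} = x_{\text{primary}} + 10^{-SNR/20} \cdot x_{\text{secondary}}$$

where $SNR$ is sampled uniformly from $[SNR_{low}, SNR_{high}]$ in decibels. Both signals are assumed to be normalized to the same RMS level, so a higher SNR attenuates the secondary signal more strongly. A minimum overlap constraint ensures that the secondary signal temporally overlaps the primary. The mixing probability $p$ controls how often augmentation is applied versus returning the original signal unchanged.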
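The mixing rule can be sketched in plain Python as follows. This is an illustrative sketch, not the Audiocraft code, and it rests on several assumptions: the function names `snr_mix` and `maybe_mix` are hypothetical, audio is represented as a list of floats, the secondary gain is taken as 10^(-SNR/20) after matching RMS levels (the standard reading of SNR in decibels), and `min_overlap` is interpreted as a fraction of the primary length.

```python
import math
import random


def rms(x):
    """Root-mean-square level of a signal given as a list of floats."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0


def snr_mix(primary, secondary, snr_db, offset=0):
    """Add `secondary` onto `primary`, starting at sample index `offset`.

    The secondary is first matched to the primary's RMS level and then
    scaled by 10^(-snr_db/20), so a higher SNR gives a quieter secondary.
    RMS is measured over each whole signal, not just the overlapping part.
    """
    eps = 1e-12  # avoids division by zero for a silent secondary
    gain = rms(primary) / (rms(secondary) + eps) * 10 ** (-snr_db / 20)
    mixed = list(primary)
    for i, v in enumerate(secondary):
        j = offset + i
        if j >= len(mixed):
            break  # the output keeps the primary's length
        mixed[j] += gain * v
    return mixed


def maybe_mix(primary, secondary, p, snr_low, snr_high, min_overlap, rng):
    """With probability p, mix at an SNR drawn uniformly from the range."""
    if rng.random() >= p:
        return list(primary)  # augmentation skipped
    snr_db = rng.uniform(snr_low, snr_high)
    # Latest start index that still leaves `min_overlap` of the primary
    # covered, assuming the secondary is at least as long as the primary.
    max_offset = int(len(primary) * (1 - min_overlap))
    offset = rng.randint(0, max_offset)
    return snr_mix(primary, secondary, snr_db, offset)
```

A call such as `maybe_mix(primary, secondary, p=0.5, snr_low=-5, snr_high=5, min_overlap=0.5, rng=random.Random(0))` would return either a copy of the primary or a mixed signal of the same length. The parameter values here are placeholders, not recommended settings.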
