Implementation:Facebookresearch Audiocraft LoudnessLoss
| Knowledge Sources | |
|---|---|
| Domains | Audio_Processing, Loss_Functions |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
Concrete tool for computing perceptual loudness-ratio losses between an output audio signal and a reference audio signal in the time domain, the frequency domain, or both combined.
Description
This module implements three loudness-based loss functions: FLoudnessRatio (frequency-domain SNR across mel bands), TLoudnessRatio (time-domain SNR across overlapping frames), and TFLoudnessRatio (combined time-frequency loudness ratio). Each loss applies softmax weighting to its per-region ratios (bands, frames, or both) so that the noisiest regions dominate the result, which gives a perceptually focused training signal for audio compression and watermarking models.
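The softmax weighting described above can be illustrated with a minimal sketch in plain Python. It assumes each region (a band or a frame) has already been reduced to a single ratio, taken here as noise loudness minus reference loudness. The function name `softmax_weighted_sum` and the sample values are hypothetical and are not part of the Audiocraft API.

```python
import math

def softmax_weighted_sum(ratios):
    """Aggregate per-region ratios, weighting each one by softmax(ratio)."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(ratios)
    exps = [math.exp(r - peak) for r in ratios]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum: regions with a higher ratio contribute more.
    return sum(w * r for w, r in zip(weights, ratios)), weights

# Hypothetical per-region ratios (higher means noisier relative to the reference).
ratios = [-30.0, -25.0, -5.0]
loss, weights = softmax_weighted_sum(ratios)
```

With these values the last region receives almost all of the weight, so the aggregate sits near that region's ratio rather than near the average of the three.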
Usage
Import these loss functions when training audio compression or watermarking models where perceptual quality assessment based on loudness ratios is needed.
Code Reference
Source Location
- Repository: Facebookresearch_Audiocraft
- File: audiocraft/losses/loudnessloss.py
- Lines: 1-204
Signature
class FLoudnessRatio(nn.Module):
    def __init__(self, sample_rate=16000, segment=20, overlap=0.5, epsilon=..., n_bands=0):
        """Frequency-domain loudness ratio loss."""
    def forward(self, out_sig: torch.Tensor, ref_sig: torch.Tensor) -> torch.Tensor: ...
class TLoudnessRatio(nn.Module):
    def __init__(self, sample_rate=16000, segment=0.5, overlap=0.5):
        """Time-domain loudness ratio loss."""
    def forward(self, out_sig: torch.Tensor, ref_sig: torch.Tensor) -> torch.Tensor: ...
class TFLoudnessRatio(nn.Module):
    def __init__(self, sample_rate=16000, segment=0.5, overlap=0.5, n_bands=0, ...):
        """Combined time-frequency loudness ratio loss."""
    def forward(self, out_sig: torch.Tensor, ref_sig: torch.Tensor) -> torch.Tensor: ...
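To make the `segment` and `overlap` parameters concrete, the following sketch shows one plausible way such values translate into frame boundaries. It assumes the frame length is `segment * sample_rate` samples and the hop between frames is `frame * (1 - overlap)`. The helper `frame_starts` is hypothetical, keeps only full frames, and does not reproduce how the library treats a trailing partial frame.

```python
def frame_starts(num_samples, sample_rate=16000, segment=0.5, overlap=0.5):
    """Return (frame_length, stride, start indices) for overlapping full frames."""
    frame = int(segment * sample_rate)    # samples per frame
    stride = int(frame * (1 - overlap))   # hop between frame starts (assumed rule)
    if frame <= 0 or stride <= 0:
        raise ValueError("segment and overlap must yield a positive frame and stride")
    if num_samples < frame:
        return frame, stride, []          # signal shorter than one frame
    starts = list(range(0, num_samples - frame + 1, stride))
    return frame, stride, starts

# One second of 16 kHz audio with segment=0.5 and overlap=0.5.
frame, stride, starts = frame_starts(16000)
```

For a one-second clip this gives 8000-sample frames that start every 4000 samples, so every sample away from the clip edges falls in two frames.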
Import
from audiocraft.losses.loudnessloss import FLoudnessRatio, TLoudnessRatio, TFLoudnessRatio
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| out_sig | torch.Tensor | Yes | Output audio signal [B, C, T] |
| ref_sig | torch.Tensor | Yes | Reference audio signal [B, C, T] |
Outputs
| Name | Type | Description |
|---|---|---|
| loss | torch.Tensor | Scalar loudness ratio loss |
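The contract above can be checked before calling a loss. The sketch below is a hypothetical helper, not part of the library; it assumes that both signals must be three-dimensional and have identical shapes, which the table suggests but does not state outright. It works on plain shape tuples, so `tensor.shape` can be passed directly.

```python
def check_signal_shapes(out_shape, ref_shape):
    """Validate that both shapes are [B, C, T] and identical; return (B, C, T)."""
    out_shape, ref_shape = tuple(out_shape), tuple(ref_shape)
    if len(ref_shape) != 3:
        raise ValueError(f"expected [B, C, T], got {ref_shape}")
    if out_shape != ref_shape:
        # Assumption: output and reference must match exactly.
        raise ValueError(f"shape mismatch: {out_shape} vs {ref_shape}")
    return ref_shape

# Four mono clips of 16000 samples each.
batch, channels, samples = check_signal_shapes((4, 1, 16000), (4, 1, 16000))
```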
Usage Examples
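import torch
from audiocraft.losses.loudnessloss import TFLoudnessRatio
# Combined time-frequency loss over 32 bands at 16 kHz
loss_fn = TFLoudnessRatio(sample_rate=16000, n_bands=32)
# Random stand-ins for model output and reference, shaped [B, C, T]: 4 mono clips of 1 s
output = torch.randn(4, 1, 16000)
reference = torch.randn(4, 1, 16000)
# Scalar loudness ratio loss
loss = loss_fn(output, reference)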