Implementation:Facebookresearch Audiocraft RelativeVolumeMel

From Leeroopedia
Revision as of 12:33, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Facebookresearch_Audiocraft_RelativeVolumeMel.md)
Knowledge Sources
Domains: Audio_Metrics, Diffusion
Last Updated: 2026-02-14 01:00 GMT

Overview

A concrete tool that measures, in decibels, the relative volume of the distortion between generated and reference audio at the mel-spectrogram level.

Description

RelativeVolumeMel computes the Relative Volume Mel (RVM) metric from the Multi-Band Diffusion paper. It normalizes both signals by the RMS of the ground truth, computes their mel spectrograms, measures the delta between them in dB, and aggregates the result into an overall score and per-band scores. Because the metric reports the volume of the distortion, more negative values indicate lower distortion.
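The dB and aggregation steps described above can be sketched in plain Python. This is a simplified illustration, not the library's implementation: it starts from precomputed magnitude mel spectrograms (RMS normalization and the mel transform are omitted), and the names `to_db` and `relative_volume`, the 25 dB clamp, and the floor values are assumptions chosen for the example.

```python
import math


def to_db(value: float, floor_db: float = -120.0) -> float:
    """Convert an amplitude to decibels, flooring it so zero does not map to -inf."""
    floor = 10 ** (floor_db / 20)
    return 20 * math.log10(max(value, floor))


def relative_volume(ref_mel, est_mel, n_bands=4, clamp_db=25.0, ref_floor_db=-25.0):
    """Hypothetical sketch of an RVM-style score.

    ref_mel, est_mel: magnitude mel spectrograms as nested lists [n_mels][n_frames].
    clamp_db and ref_floor_db are illustrative values (assumptions).
    """
    per_mel = []
    for ref_row, est_row in zip(ref_mel, est_mel):
        values = []
        for r, e in zip(ref_row, est_row):
            # Volume of the error relative to the volume of the reference bin.
            rel = to_db(abs(r - e)) - to_db(r, ref_floor_db)
            # Clamp so near-silent or badly missed bins do not dominate the mean.
            values.append(min(max(rel, -clamp_db), clamp_db))
        per_mel.append(sum(values) / len(values))

    # Split the mel bins into contiguous bands and average each band.
    size = math.ceil(len(per_mel) / n_bands)
    chunks = [per_mel[i:i + size] for i in range(0, len(per_mel), size)]
    metrics = {f"rvm_{i}": sum(c) / len(c) for i, c in enumerate(chunks)}
    metrics["rvm"] = sum(per_mel) / len(per_mel)
    return metrics
```

In this sketch, an estimate that is off by 10% of the reference amplitude in a bin scores about -20 dB for that bin, while a perfect match is clamped to the lower bound.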

Usage

Import this metric when evaluating diffusion-based audio generation or enhancement quality.

Code Reference

Source Location

Signature

class RelativeVolumeMel(nn.Module):
    def __init__(self, sample_rate=24000, n_mels=80, n_fft=512, hop_length=128, ...): ...
    def forward(self, estimate: torch.Tensor, ground_truth: torch.Tensor) -> tp.Dict[str, torch.Tensor]: ...

Import

from audiocraft.metrics.rvm import RelativeVolumeMel

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| estimate | torch.Tensor | Yes | Generated audio [B, C, T] |
| ground_truth | torch.Tensor | Yes | Reference audio [B, C, T] |

Outputs

| Name | Type | Description |
|------|------|-------------|
| metrics | Dict[str, torch.Tensor] | Dict with "rvm" (overall) and "rvm_0" through "rvm_3" (per band) |
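A minimal usage sketch of the contract above, assuming `torch` and `audiocraft` are installed. The helpers `rvm_report` and `evaluate_pair` are hypothetical names introduced for this example; only the import path, the `sample_rate` argument, and the `forward` inputs and outputs come from this page.

```python
import typing as tp


def rvm_report(metrics: tp.Mapping[str, tp.Any]) -> tp.Dict[str, float]:
    """Convert a metrics dict to plain floats: overall score first, then bands in order."""
    def as_float(value: tp.Any) -> float:
        # Scalar tensors expose .item(); plain numbers are passed through.
        return float(value.item()) if hasattr(value, "item") else float(value)

    bands = sorted(
        (key for key in metrics if key.startswith("rvm_")),
        key=lambda key: int(key.split("_")[1]),
    )
    return {key: as_float(metrics[key]) for key in ["rvm", *bands]}


def evaluate_pair(estimate, ground_truth, sample_rate: int = 24000) -> tp.Dict[str, float]:
    """Score one batch of generated audio against its reference (both [B, C, T])."""
    # Imported here because these packages are only needed when actually scoring audio.
    import torch
    from audiocraft.metrics.rvm import RelativeVolumeMel

    metric = RelativeVolumeMel(sample_rate=sample_rate)
    with torch.no_grad():
        return rvm_report(metric(estimate, ground_truth))
```

Calling `evaluate_pair(estimate, ground_truth)` with two tensors of shape [B, C, T] should yield the overall "rvm" entry followed by the per-band entries listed in the output table.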
