Implementation:Facebookresearch Audiocraft MultiBandDiffusion

Knowledge Sources	Facebookresearch_Audiocraft Multi-Band Diffusion
Domains	Audio_Generation, Diffusion
Last Updated	2026-02-14 01:00 GMT

Overview

Concrete tool for converting discrete audio tokens to high-fidelity waveforms using multiple band-specific diffusion processes.

Description

MultiBandDiffusion orchestrates multiple DiffusionProcess instances, one per frequency band, alongside a shared codec model. Each diffusion process generates the waveform for its band, conditioned on the codec's latent embeddings, and the per-band outputs are summed into a single waveform. An EQ matching step then rescales the result band by band (n_bands, 32 by default) so that its spectral profile matches the codec decoder output. The static factory methods get_mbd_musicgen and get_mbd_24khz load pretrained models.
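
The summation and EQ matching steps can be illustrated with a small pure-Python sketch. It is a simplified illustration, not the library's code: the function names sum_bands and match_band_energy are hypothetical, the toy lists stand in for band-split waveforms, and the gain rule (ratio of per-band standard deviations) is an assumption about how such matching can be done.

```python
import statistics
from typing import List

Band = List[float]


def sum_bands(band_outputs: List[Band]) -> Band:
    """Sum per-band waveforms sample by sample into one waveform."""
    return [sum(samples) for samples in zip(*band_outputs)]


def match_band_energy(generated_bands: List[Band], ref_bands: List[Band]) -> Band:
    """Rescale each generated band to the reference band's standard deviation, then sum."""
    rescaled = []
    for band, ref in zip(generated_bands, ref_bands):
        # Hypothetical gain rule: match the spread of the reference band
        gain = statistics.pstdev(ref) / statistics.pstdev(band)
        rescaled.append([gain * x for x in band])
    return sum_bands(rescaled)


# Two toy bands standing in for the generated waveform split into bands
generated = [[1.0, -1.0, 1.0, -1.0], [0.5, 0.5, -0.5, -0.5]]
# Toy bands standing in for the band-split codec decoder output
reference = [[2.0, -2.0, 2.0, -2.0], [0.25, 0.25, -0.25, -0.25]]

summed = sum_bands(generated)
matched = match_band_energy(generated, reference)
```

In this toy case the first band is scaled up by 2 and the second scaled down by 0.5, so the matched signal follows the reference's balance between bands while keeping the generated waveform's shape within each band.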

Usage

Import this class to improve the quality of audio produced by a codec-based model by decoding its tokens with diffusion instead of relying on the codec's own decoder.

Code Reference

Source Location

Repository: Facebookresearch_Audiocraft
File: audiocraft/models/multibanddiffusion.py
Lines: 1-191

Signature

class MultiBandDiffusion:
    def __init__(self, DPs: tp.List[DiffusionProcess], codec_model: CompressionModel): ...

    @staticmethod
    def get_mbd_musicgen(device=None): ...

    @staticmethod
    def get_mbd_24khz(bw=3.0, device=None, n_q=None): ...

    def tokens_to_wav(self, tokens: torch.Tensor, n_bands: int = 32) -> torch.Tensor: ...
    def regenerate(self, wav: torch.Tensor, sample_rate: int) -> torch.Tensor: ...
    def generate(self, emb: torch.Tensor, size=None, step_list=None) -> torch.Tensor: ...

Import

from audiocraft.models import MultiBandDiffusion

I/O Contract

Inputs

Name	Type	Required	Description
tokens	torch.Tensor	Yes	Discrete codec tokens [B, K, T]; input to tokens_to_wav
wav	torch.Tensor	No	Audio waveform; needed only when calling regenerate (together with sample_rate)
emb	torch.Tensor	No	Codec latent embeddings; needed only when calling generate

Outputs

Name	Type	Description
wav	torch.Tensor	Generated/regenerated high-fidelity audio [B, C, T]
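
As a rough guide to the shapes involved, the sketch below computes the waveform shape expected for a given token shape, assuming the output length is the number of token frames times the ratio of sample rate to frame rate. The helper expected_wav_shape is hypothetical, and the values in the example (32 kHz sample rate, 50 Hz frame rate, one channel, four codebooks) are assumptions about the MusicGen codec rather than figures stated on this page.

```python
from typing import Tuple


def expected_wav_shape(token_shape: Tuple[int, int, int], sample_rate: int,
                       frame_rate: int, channels: int) -> Tuple[int, int, int]:
    """Map a token shape [B, K, T] to a waveform shape [B, C, T * upsampling]."""
    batch, _num_codebooks, num_frames = token_shape
    upsampling = sample_rate // frame_rate  # audio samples per token frame
    return (batch, channels, num_frames * upsampling)


# Assumed codec settings: 32 kHz audio, 50 token frames per second, mono
shape = expected_wav_shape((1, 4, 250), sample_rate=32000, frame_rate=50, channels=1)
print(shape)  # (1, 1, 160000), i.e. 5 seconds at 32 kHz
```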

Usage Examples

from audiocraft.models import MultiBandDiffusion, MusicGen

# Load MBD for MusicGen
mbd = MultiBandDiffusion.get_mbd_musicgen()

# Generate tokens with MusicGen, then decode with MBD
musicgen = MusicGen.get_pretrained('facebook/musicgen-small')
# With return_tokens=True, generate() returns a (decoded audio, tokens) pair
_, tokens = musicgen.generate(['ambient piano music'], return_tokens=True)
wav = mbd.tokens_to_wav(tokens)
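
The regenerate entry point can be used in a similar way to re-encode an existing waveform and decode it with diffusion. The following is a minimal sketch under stated assumptions: the wrapper name regenerate_with_mbd is hypothetical, wav is assumed to be a torch.Tensor shaped [B, C, T], and the model is loaded on the CPU by default. Running it requires audiocraft and downloads the pretrained weights.

```python
def regenerate_with_mbd(wav, sample_rate, bandwidth=3.0, device="cpu"):
    """Regenerate a waveform with the 24 kHz MultiBandDiffusion model.

    Assumes wav is a torch.Tensor shaped [B, C, T]; this wrapper is illustrative only.
    """
    # Imported lazily so that defining this helper does not require audiocraft
    import torch
    from audiocraft.models import MultiBandDiffusion

    mbd = MultiBandDiffusion.get_mbd_24khz(bw=bandwidth, device=device)
    with torch.no_grad():
        # Keep the input on the same device as the model
        return mbd.regenerate(wav.to(device), sample_rate=sample_rate)
```

The returned tensor can then be compared against the original waveform or against the plain codec reconstruction.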

Related Pages

Principle:Facebookresearch_Audiocraft_Diffusion_UNet_Architecture
