Implementation:Facebookresearch Audiocraft MagnetSolver
| Knowledge Sources | |
|---|---|
| Domains | Audio_Generation, Training |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
Solver for training MAGNeT masked audio generation models with span-based masking and a per-codebook cross-entropy loss.
Description
MagnetSolver extends MusicGenSolver to implement the MAGNeT training procedure. Each training step randomly selects one codebook level, masks spans of tokens at that level with a mask rate drawn from a cosine schedule, and computes the cross-entropy loss only on the masked positions. AudioMagnetSolver subclasses MagnetSolver for environmental sound generation by setting the dataset type to DatasetType.SOUND.
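The masking step described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the solver's code: the helper names `cosine_mask_rate` and `span_mask`, the example sizes, and the way the number of spans is derived from the mask rate are all hypothetical, and the sketch ignores overlap between spans, which the actual `_get_mask` may handle differently.

```python
import math
import random


def cosine_mask_rate(u: float) -> float:
    """Map a uniform draw u in [0, 1] to a mask rate via a cosine schedule."""
    return math.cos(0.5 * math.pi * u)


def span_mask(seq_len: int, mask_rate: float, span_len: int,
              rng: random.Random) -> list:
    """Return a list of seq_len booleans; True marks a masked position."""
    # Number of span starts needed to cover roughly mask_rate of the
    # sequence, ignoring overlap between spans (assumed simplification).
    n_spans = max(1, round(mask_rate * seq_len / span_len))
    n_spans = min(n_spans, seq_len)
    starts = rng.sample(range(seq_len), n_spans)
    mask = [False] * seq_len
    for start in starts:
        # Mask span_len consecutive positions, clipped at the sequence end.
        for t in range(start, min(start + span_len, seq_len)):
            mask[t] = True
    return mask


rng = random.Random(0)
n_codebooks, seq_len, span_len = 4, 50, 3   # hypothetical sizes
stage = rng.randrange(n_codebooks)          # codebook level trained this step
rate = cosine_mask_rate(rng.random())       # mask rate for this step
mask = span_mask(seq_len, rate, span_len, rng)
```

Because overlapping spans cover fewer positions than disjoint ones, the realized mask rate in this sketch can fall below the sampled `rate`.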
Usage
Use this solver when training MAGNeT models for non-autoregressive text-to-music or text-to-sound generation.
Code Reference
Source Location
- Repository: Facebookresearch_Audiocraft
- File: audiocraft/solvers/magnet.py
- Lines: 1-276
Signature
class MagnetSolver(musicgen.MusicGenSolver):
    def __init__(self, cfg: DictConfig): ...
    def run_step(self, idx: int, batch, metrics: dict) -> dict: ...
    def _get_mask(self, mask_probs, B, T, device) -> torch.Tensor: ...
    def _compute_cross_entropy_magnet(self, logits, targets, mask, stage) -> torch.Tensor: ...

class AudioMagnetSolver(MagnetSolver):
    DATASET_TYPE = builders.DatasetType.SOUND
Import
from audiocraft.solvers.magnet import MagnetSolver, AudioMagnetSolver
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch | tuple | Yes | (audio_tokens, segment_attributes) tuple |
| cfg | DictConfig | Yes | Hydra config with masking params |
Outputs
| Name | Type | Description |
|---|---|---|
| metrics | dict | Training metrics, including cross-entropy (ce), perplexity (ppl), and learning rate (lr) |
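
To illustrate how the ce and ppl metrics relate to the masked loss, here is a minimal stdlib-only sketch. It assumes that ce is the mean cross-entropy over masked positions and that ppl is its exponential, which is the standard definition of perplexity. The function name `masked_cross_entropy` and the nested-list data layout are hypothetical; the solver itself operates on tensors.

```python
import math


def masked_cross_entropy(logits, targets, mask):
    """Mean cross-entropy over the positions where mask is True.

    logits:  list of T rows, each a list of per-class scores
    targets: list of T integer class indices
    mask:    list of T booleans (True = position contributes to the loss)
    """
    total, count = 0.0, 0
    for row, target, is_masked in zip(logits, targets, mask):
        if not is_masked:
            continue  # unmasked positions do not contribute
        # Numerically stable log-sum-exp for the softmax normalizer.
        peak = max(row)
        log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_z - row[target]
        count += 1
    if count == 0:
        raise ValueError("mask selects no positions")
    return total / count


# Uniform logits over 4 classes at the two masked positions; the
# unmasked middle position is ignored regardless of its logits.
logits = [[0.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
targets = [1, 3, 2]
mask = [True, False, True]

ce = masked_cross_entropy(logits, targets, mask)
ppl = math.exp(ce)  # perplexity is the exponential of cross-entropy
```

With uniform logits over four classes, ce equals log(4) and ppl equals 4, matching the intuition that perplexity counts the effective number of equally likely choices.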