Implementation:Facebookresearch Audiocraft MagnetSolver

From Leeroopedia
Revision as of 12:33, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Facebookresearch_Audiocraft_MagnetSolver.md)
Knowledge Sources
Domains: Audio_Generation, Training
Last Updated: 2026-02-14 01:00 GMT

Overview

Concrete tool for training MAGNeT masked audio generation models using span-based masking and per-codebook cross-entropy loss.

Description

MagnetSolver extends MusicGenSolver to implement the MAGNeT training procedure. Each training step randomly selects a codebook level, applies span-based masking with a cosine schedule, and computes cross-entropy loss only on the masked positions. AudioMagnetSolver further specializes this for environmental sound generation.
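
The masking step described above can be illustrated with a small, dependency-free sketch. It is not the audiocraft implementation: the helper names (`cosine_mask_rate`, `span_mask`, `sample_training_mask`) and the rule for choosing how many spans to place are assumptions made for illustration, and the real solver operates on batched tensors and may derive the span count differently.

```python
import math
import random


def cosine_mask_rate(u: float) -> float:
    """Map u in [0, 1] to a masking rate using a cosine schedule."""
    return math.cos(u * math.pi / 2)


def span_mask(T: int, span_len: int, mask_rate: float, rng: random.Random) -> list:
    """Return a length-T boolean mask made of spans of `span_len` steps.

    Assumption (simplification): the number of span starts is chosen so that
    non-overlapping spans would cover about `mask_rate * T` positions. Spans
    may overlap or be clipped at the end, so real coverage can be lower.
    """
    n_starts = max(1, min(T, round(mask_rate * T / span_len)))
    starts = rng.sample(range(T), n_starts)
    mask = [False] * T
    for s in starts:
        for t in range(s, min(s + span_len, T)):
            mask[t] = True
    return mask


def sample_training_mask(n_q: int, T: int, span_len: int, rng: random.Random):
    """Pick one codebook level at random and draw a span mask for it."""
    stage = rng.randrange(n_q)             # codebook level trained this step
    rate = cosine_mask_rate(rng.random())  # masking rate from the schedule
    return stage, span_mask(T, span_len, rate, rng)


rng = random.Random(0)
stage, mask = sample_training_mask(n_q=4, T=50, span_len=3, rng=rng)
```

Here `stage` is the index of the codebook whose tokens are masked for the step, and `mask` marks the time steps that would be hidden from the model and later scored by the loss.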

Usage

Use this solver when training MAGNeT models for non-autoregressive text-to-music or text-to-sound generation.

Code Reference

Source Location

Signature

class MagnetSolver(musicgen.MusicGenSolver):
    def __init__(self, cfg: DictConfig): ...
    def run_step(self, idx: int, batch, metrics: dict) -> dict: ...
    def _get_mask(self, mask_probs, B, T, device) -> torch.Tensor: ...
    def _compute_cross_entropy_magnet(self, logits, targets, mask, stage) -> torch.Tensor: ...

class AudioMagnetSolver(MagnetSolver):
    DATASET_TYPE = builders.DatasetType.SOUND

Import

from audiocraft.solvers.magnet import MagnetSolver, AudioMagnetSolver

I/O Contract

Inputs

Name | Type | Required | Description
batch | tuple | Yes | (audio_tokens, segment_attributes) tuple
cfg | DictConfig | Yes | Hydra config with masking params

Outputs

Name | Type | Description
metrics | dict | Training metrics including ce, ppl, lr
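
To make the `ce` and `ppl` entries concrete, the following sketch computes a cross-entropy restricted to masked positions and derives a perplexity from it. It uses plain Python lists rather than tensors, the function name `masked_cross_entropy` is hypothetical, and `ppl = exp(ce)` is the usual definition assumed here rather than something this page states.

```python
import math


def masked_cross_entropy(logits, targets, mask):
    """Mean cross-entropy over positions where mask is True.

    logits:  list of T rows, each a list of unnormalized scores per token id
    targets: list of T integer token ids
    mask:    list of T booleans (True = position was masked and is scored)
    """
    total, count = 0.0, 0
    for row, target, is_masked in zip(logits, targets, mask):
        if not is_masked:
            continue  # unmasked positions do not contribute to the loss
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[target]
        count += 1
    if count == 0:
        raise ValueError("mask selects no positions")
    return total / count


# Toy example: 3 time steps, vocabulary of 2 tokens, middle step unmasked.
logits = [[0.0, 0.0], [5.0, 0.0], [0.0, 0.0]]
targets = [0, 1, 1]
mask = [True, False, True]

ce = masked_cross_entropy(logits, targets, mask)
ppl = math.exp(ce)  # assumed definition of perplexity
```

Both scored positions have uniform logits over two tokens, so `ce` equals ln 2 and `ppl` equals 2; the confident but wrong prediction at the unmasked middle step is ignored.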
