Implementation:Facebookresearch Audiocraft MagnetLMModel
| Knowledge Sources | |
|---|---|
| Domains | Audio_Generation, Transformer, Masked_Generation |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
Concrete tool for non-autoregressive, masked audio-token generation via iterative parallel decoding, provided by the AudioCraft library.
Description
MagnetLMModel extends the base LMModel transformer to support MAGNeT-style masked generation. Instead of predicting tokens autoregressively from left to right, it uses iterative parallel decoding: all tokens start out masked and are progressively revealed over several decoding steps, with the fraction left masked at each step following a cosine schedule. Codebook levels are decoded one at a time, each with its own number of decoding steps, and the classifier-free guidance coefficient is annealed from a maximum to a minimum value across those steps.
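The schedule described above can be sketched in plain Python. This is a minimal illustration, not the library's code: the helper names `cosine_mask_fractions` and `annealed_cfg_coef` are hypothetical, and the default coefficients of 10.0 and 1.0, the linear interpolation, and the exact form of the cosine curve are assumptions chosen for the example.

```python
import math


def cosine_mask_fractions(num_steps):
    """Fraction of tokens left masked at each decoding step, from 1.0 down to ~0.0.

    Hypothetical helper: assumes steps are spaced evenly on [0, 1] and mapped
    through cos(pi/2 * t).
    """
    if num_steps == 1:
        return [1.0]
    return [math.cos(0.5 * math.pi * i / (num_steps - 1)) for i in range(num_steps)]


def annealed_cfg_coef(mask_fraction, max_cfg_coef=10.0, min_cfg_coef=1.0):
    """Interpolate the guidance coefficient (assumed linear in the masked fraction).

    Fully masked -> max_cfg_coef, fully revealed -> min_cfg_coef.
    """
    return mask_fraction * max_cfg_coef + (1.0 - mask_fraction) * min_cfg_coef


# Illustrative per-codebook step counts, coarse to fine.
decoding_steps = [20, 10, 10, 10]
for level, num_steps in enumerate(decoding_steps):
    fractions = cosine_mask_fractions(num_steps)
    coefs = [annealed_cfg_coef(f) for f in fractions]
    print(f"codebook {level}: first step masks {fractions[0]:.2f} (cfg {coefs[0]:.1f}), "
          f"last step masks {fractions[-1]:.2f} (cfg {coefs[-1]:.1f})")
```

Under these assumptions, guidance is strongest while most tokens are still unknown and relaxes as more of the sequence is revealed.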
Usage
Import this class when building or loading MAGNeT models for text-to-music or text-to-sound generation. It replaces the autoregressive generation of standard LMModel with faster parallel masked decoding.
Code Reference
Source Location
- Repository: Facebookresearch_Audiocraft
- File: audiocraft/models/lm_magnet.py
- Lines: 1-500
Signature
class MagnetLMModel(LMModel):
    def __init__(self, subcodes_context: int = 5, compression_model_framerate: int = 50,
                 segment_duration: int = 10, span_len: int = 3, **kwargs):
        """
        Args:
            subcodes_context: Temporal context, in timesteps, of the restricted self-attention applied to codebooks after the first.
            compression_model_framerate: Frame rate of the compression model, in frames per second.
            segment_duration: Duration of audio segments in seconds.
            span_len: Length of atomic masking spans, in timesteps.
            **kwargs: Additional arguments forwarded to LMModel.
        """
Import
from audiocraft.models.lm_magnet import MagnetLMModel
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| sequence | torch.Tensor | Yes | Input token sequences of shape [B, K, T] (batch, codebooks, timesteps) |
| conditions | list | Yes | List of ConditioningAttributes |
| condition_tensors | dict | No | Pre-computed condition tensors |
| stage | int | No | Current codebook stage for training |
Outputs
| Name | Type | Description |
|---|---|---|
| logits | torch.Tensor | Predicted logits of shape [B, K, T, card], where card is the codebook cardinality |
| mask | torch.Tensor | Valid token mask [B, K, T] |
Usage Examples
Loading via MAGNeT Model
from audiocraft.models import MAGNeT
# MagnetLMModel is loaded internally by MAGNeT.get_pretrained
model = MAGNeT.get_pretrained('facebook/magnet-small-10secs')
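model.set_generation_params(
    use_sampling=True,
    top_p=0.9,
    temperature=3.0,
    # One entry per codebook level: number of iterative decoding steps for that level.
    decoding_steps=[20, 10, 10, 10],
)
# Generate one clip per text description; the result is a batch of waveforms.
wav = model.generate(['dog barking in a park'])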