
Implementation:Facebookresearch Audiocraft MusicGen get pretrained

From Leeroopedia

Summary

MusicGen.get_pretrained is a static factory method that instantiates a fully configured MusicGen object from a pretrained checkpoint. It downloads (or loads from cache) both the compression model and the language model, places them on the target device, and returns a ready-to-use generation interface.

API Signature

@staticmethod
def get_pretrained(name: str = 'facebook/musicgen-melody', device=None) -> MusicGen

Parameters

  • name (str; default 'facebook/musicgen-melody'): HuggingFace model ID or local path to the checkpoint. Supported values: 'facebook/musicgen-small' (300M), 'facebook/musicgen-medium' (1.5B), 'facebook/musicgen-large' (3.3B), 'facebook/musicgen-melody' (1.5B, text+melody), 'facebook/musicgen-style' (1.5B, text+style).
  • device (str, torch.device, or None; default None): Target device. If None, automatically selects 'cuda' if available, otherwise 'cpu'.
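
The device auto-selection can be sketched as follows. This is a minimal stand-in, not the real implementation: the actual method queries torch.cuda.device_count() directly, whereas here the count is passed in as a parameter so the logic can be exercised without a GPU.

```python
def resolve_device(device=None, cuda_device_count=0):
    """Sketch of get_pretrained's device auto-selection.

    cuda_device_count is an illustrative stand-in for
    torch.cuda.device_count(); an explicit device is passed through.
    """
    if device is None:
        device = 'cuda' if cuda_device_count > 0 else 'cpu'
    return device
```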

Return Value

  • MusicGen: Fully initialized instance with the compression_model and lm attributes loaded and set to evaluation mode. The default generation duration is 15 seconds.

Source Location

  • File: audiocraft/models/musicgen.py, lines 57-94
  • Class: MusicGen (extends BaseGenModel)
  • Import: from audiocraft.models import MusicGen

Internal Workflow

The method follows this sequence:

  1. Device resolution: If device is None, check torch.cuda.device_count() and select 'cuda' or 'cpu'.
  2. Debug mode check: If name == 'debug', return a lightweight debug model (unit testing only).
  3. Legacy name mapping: If name matches a short alias (e.g., 'small', 'melody'), map it to the full HuggingFace ID via _HF_MODEL_CHECKPOINTS_MAP and emit a deprecation warning.
  4. Load language model: Call load_lm_model(name, device=device) from audiocraft/models/loaders.py (lines 111-126).
  5. Load compression model: Call load_compression_model(name, device=device) from audiocraft/models/loaders.py (lines 78-91).
  6. Configure conditioners: If the LM has a 'self_wav' conditioner (melody or style models), set match_len_on_eval = True and _use_masking = False.
  7. Construct and return: Return MusicGen(name, compression_model, lm), which internally calls set_generation_params(duration=15).
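
The steps above can be sketched as a simplified orchestration. This is hypothetical illustration, not audiocraft's code: the loader callables stand in for the real load_lm_model/load_compression_model, only a subset of the alias map is shown, and a plain dict stands in for the returned MusicGen object.

```python
def get_pretrained_sketch(name, device, load_lm, load_compression):
    # Step 3: map legacy short names to full HuggingFace IDs
    # (illustrative subset of _HF_MODEL_CHECKPOINTS_MAP).
    aliases = {'small': 'facebook/musicgen-small',
               'melody': 'facebook/musicgen-melody'}
    name = aliases.get(name, name)
    # Steps 4-5: load the language model and the compression model.
    lm = load_lm(name, device)
    compression = load_compression(name, device)
    # Step 6: melody/style checkpoints carry a 'self_wav' conditioner,
    # which triggers the match_len_on_eval tweak.
    has_self_wav = 'melody' in name or 'style' in name
    # Step 7: the real method returns MusicGen(name, compression, lm),
    # whose constructor sets a 15-second default duration.
    return {'name': name, 'lm': lm, 'compression': compression,
            'match_len_on_eval': has_self_wav, 'default_duration': 15}
```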

Internal Calls

  • load_lm_model() (audiocraft/models/loaders.py, lines 111-126): Downloads and reconstructs the transformer language model from state_dict.bin. Handles the HuggingFace Hub download, OmegaConf config reconstruction, dtype selection (float16 on GPU, float32 on CPU), and builder-based model instantiation.
  • load_compression_model() (audiocraft/models/loaders.py, lines 78-91): Downloads and reconstructs the audio compression model from compression_state_dict.bin. Supports both custom-trained EnCodec models and HuggingFace pretrained references.
  • _get_state_dict() (audiocraft/models/loaders.py, lines 40-71): Low-level checkpoint loading utility. Supports local files, local directories, HTTPS URLs, and HuggingFace Hub IDs. Uses hf_hub_download for hub-based retrieval, with caching via AUDIOCRAFT_CACHE_DIR.
  • builders.get_lm_model() (audiocraft/models/builders.py): Reconstructs the LMModel architecture from the OmegaConf configuration embedded in the checkpoint.
  • builders.get_compression_model() (audiocraft/models/builders.py): Reconstructs the EncodecModel architecture from its stored configuration.
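
The source dispatch performed by _get_state_dict() can be sketched as a branching function. Only the branching order described above is reproduced; the return labels are illustrative, and no actual loading or downloading happens here.

```python
import os

def classify_checkpoint_source(name):
    """Sketch of the source dispatch described for _get_state_dict()."""
    if os.path.isfile(name):
        return 'local_file'   # load the checkpoint file directly
    if os.path.isdir(name):
        return 'local_dir'    # look up the checkpoint inside the directory
    if name.startswith('https://'):
        return 'url'          # fetch over HTTPS
    return 'hf_hub'           # hf_hub_download, cached via AUDIOCRAFT_CACHE_DIR
```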

Example Usage

from audiocraft.models import MusicGen

# Load the melody-conditioned model (device auto-selected: GPU if available)
model = MusicGen.get_pretrained('facebook/musicgen-melody')

# Configure generation parameters
model.set_generation_params(duration=8.0)

# Generate audio from text
wav = model.generate(['cheerful acoustic guitar melody'])

# Load a smaller model on CPU
model_small = MusicGen.get_pretrained('facebook/musicgen-small', device='cpu')

Name Resolution Map

The following legacy short names are mapped to full HuggingFace IDs for backward compatibility:

  • 'small' → 'facebook/musicgen-small': 300M-parameter text-to-music model
  • 'medium' → 'facebook/musicgen-medium': 1.5B-parameter text-to-music model
  • 'large' → 'facebook/musicgen-large': 3.3B-parameter text-to-music model
  • 'melody' → 'facebook/musicgen-melody': 1.5B-parameter text+melody-to-music model
  • 'style' → 'facebook/musicgen-style': 1.5B-parameter text+style-to-music model
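
The mapping plus the deprecation warning can be sketched as below. The dict entries mirror the table above; the function name and warning text are illustrative, not audiocraft's actual _HF_MODEL_CHECKPOINTS_MAP code.

```python
import warnings

# Illustrative copy of the legacy short-name map from the table above.
_SHORT_NAME_MAP = {
    'small': 'facebook/musicgen-small',
    'medium': 'facebook/musicgen-medium',
    'large': 'facebook/musicgen-large',
    'melody': 'facebook/musicgen-melody',
    'style': 'facebook/musicgen-style',
}

def resolve_name(name):
    """Map a legacy short name to its full HuggingFace ID, warning on use."""
    if name in _SHORT_NAME_MAP:
        full = _SHORT_NAME_MAP[name]
        warnings.warn(f"'{name}' is deprecated; use '{full}' instead.",
                      DeprecationWarning)
        return full
    return name  # full IDs and local paths pass through unchanged
```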

Dependencies

  • torch - Model loading, device management, state dict handling
  • huggingface_hub - Downloading checkpoints from HuggingFace Hub
  • omegaconf - Parsing and managing model configuration from checkpoints
