
Implementation:Facebookresearch Audiocraft MusicGen get pretrained

From Leeroopedia

Summary

MusicGen.get_pretrained is a static factory method that instantiates a fully configured MusicGen object from a pretrained checkpoint. It downloads (or loads from cache) both the compression model and the language model, places them on the target device, and returns a ready-to-use generation interface.

API Signature

@staticmethod
def get_pretrained(name: str = 'facebook/musicgen-melody', device=None) -> MusicGen

Parameters

  • name (str; default 'facebook/musicgen-melody'): HuggingFace model ID or local path to the checkpoint. Supported values: 'facebook/musicgen-small' (300M), 'facebook/musicgen-medium' (1.5B), 'facebook/musicgen-large' (3.3B), 'facebook/musicgen-melody' (1.5B, text+melody), 'facebook/musicgen-style' (1.5B, text+style).
  • device (str, torch.device, or None; default None): Target device. If None, automatically selects 'cuda' if available, otherwise 'cpu'.
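
The device auto-selection can be sketched as follows. This is a minimal stand-in, not the real implementation: the actual method queries torch.cuda.device_count() directly, whereas here the count is passed in as a parameter so the logic can be exercised without a GPU.

```python
def resolve_device(device=None, cuda_device_count=0):
    """Sketch of get_pretrained's device auto-selection.

    cuda_device_count is an illustrative stand-in for
    torch.cuda.device_count(); an explicit device is passed through.
    """
    if device is None:
        device = 'cuda' if cuda_device_count > 0 else 'cpu'
    return device
```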

Return Value

  • MusicGen: Fully initialized instance with the compression_model and lm attributes loaded and set to evaluation mode. The default generation duration is 15 seconds.

Source Location

  • File: audiocraft/models/musicgen.py, lines 57-94
  • Class: MusicGen (extends BaseGenModel)
  • Import: from audiocraft.models import MusicGen

Internal Workflow

The method follows this sequence:

  1. Device resolution: If device is None, check torch.cuda.device_count() and select 'cuda' or 'cpu'.
  2. Debug mode check: If name == 'debug', return a lightweight debug model (unit testing only).
  3. Legacy name mapping: If name matches a short alias (e.g., 'small', 'melody'), map it to the full HuggingFace ID via _HF_MODEL_CHECKPOINTS_MAP and emit a deprecation warning.
  4. Load language model: Call load_lm_model(name, device=device) from audiocraft/models/loaders.py (lines 111-126).
  5. Load compression model: Call load_compression_model(name, device=device) from audiocraft/models/loaders.py (lines 78-91).
  6. Configure conditioners: If the LM has a 'self_wav' conditioner (melody or style models), set match_len_on_eval = True and _use_masking = False.
  7. Construct and return: Return MusicGen(name, compression_model, lm), which internally calls set_generation_params(duration=15).
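
The steps above can be sketched as a simplified orchestration. This is hypothetical illustration, not audiocraft's code: the loader callables stand in for the real load_lm_model/load_compression_model, only a subset of the alias map is shown, and a plain dict stands in for the returned MusicGen object.

```python
def get_pretrained_sketch(name, device, load_lm, load_compression):
    # Step 3: map legacy short names to full HuggingFace IDs
    # (illustrative subset of _HF_MODEL_CHECKPOINTS_MAP).
    aliases = {'small': 'facebook/musicgen-small',
               'melody': 'facebook/musicgen-melody'}
    name = aliases.get(name, name)
    # Steps 4-5: load the language model and the compression model.
    lm = load_lm(name, device)
    compression = load_compression(name, device)
    # Step 6: melody/style checkpoints carry a 'self_wav' conditioner,
    # which triggers the match_len_on_eval tweak.
    has_self_wav = 'melody' in name or 'style' in name
    # Step 7: the real method returns MusicGen(name, compression, lm),
    # whose constructor sets a 15-second default duration.
    return {'name': name, 'lm': lm, 'compression': compression,
            'match_len_on_eval': has_self_wav, 'default_duration': 15}
```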

Internal Calls

  • load_lm_model() (audiocraft/models/loaders.py, lines 111-126): Downloads and reconstructs the transformer language model from state_dict.bin. Handles the HuggingFace Hub download, OmegaConf config reconstruction, dtype selection (float16 on GPU, float32 on CPU), and builder-based model instantiation.
  • load_compression_model() (audiocraft/models/loaders.py, lines 78-91): Downloads and reconstructs the audio compression model from compression_state_dict.bin. Supports both custom-trained EnCodec models and HuggingFace pretrained references.
  • _get_state_dict() (audiocraft/models/loaders.py, lines 40-71): Low-level checkpoint loading utility. Supports local files, local directories, HTTPS URLs, and HuggingFace Hub IDs. Uses hf_hub_download for hub-based retrieval, with caching via AUDIOCRAFT_CACHE_DIR.
  • builders.get_lm_model() (audiocraft/models/builders.py): Reconstructs the LMModel architecture from the OmegaConf configuration embedded in the checkpoint.
  • builders.get_compression_model() (audiocraft/models/builders.py): Reconstructs the EncodecModel architecture from its stored configuration.
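
The source dispatch performed by _get_state_dict() can be sketched as a branching function. Only the branching order described above is reproduced; the return labels are illustrative, and no actual loading or downloading happens here.

```python
import os

def classify_checkpoint_source(name):
    """Sketch of the source dispatch described for _get_state_dict()."""
    if os.path.isfile(name):
        return 'local_file'   # load the checkpoint file directly
    if os.path.isdir(name):
        return 'local_dir'    # look up the checkpoint inside the directory
    if name.startswith('https://'):
        return 'url'          # fetch over HTTPS
    return 'hf_hub'           # hf_hub_download, cached via AUDIOCRAFT_CACHE_DIR
```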

Example Usage

from audiocraft.models import MusicGen

# Load the melody-conditioned model (device auto-selected: GPU if available)
model = MusicGen.get_pretrained('facebook/musicgen-melody')

# Configure generation parameters
model.set_generation_params(duration=8.0)

# Generate audio from text
wav = model.generate(['cheerful acoustic guitar melody'])

# Load a smaller model on CPU
model_small = MusicGen.get_pretrained('facebook/musicgen-small', device='cpu')

Name Resolution Map

The following legacy short names are mapped to full HuggingFace IDs for backward compatibility:

  • 'small' → 'facebook/musicgen-small': 300M-parameter text-to-music model
  • 'medium' → 'facebook/musicgen-medium': 1.5B-parameter text-to-music model
  • 'large' → 'facebook/musicgen-large': 3.3B-parameter text-to-music model
  • 'melody' → 'facebook/musicgen-melody': 1.5B-parameter text+melody-to-music model
  • 'style' → 'facebook/musicgen-style': 1.5B-parameter text+style-to-music model
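
The mapping plus the deprecation warning can be sketched as below. The dict entries mirror the table above; the function name and warning text are illustrative, not audiocraft's actual _HF_MODEL_CHECKPOINTS_MAP code.

```python
import warnings

# Illustrative copy of the legacy short-name map from the table above.
_SHORT_NAME_MAP = {
    'small': 'facebook/musicgen-small',
    'medium': 'facebook/musicgen-medium',
    'large': 'facebook/musicgen-large',
    'melody': 'facebook/musicgen-melody',
    'style': 'facebook/musicgen-style',
}

def resolve_name(name):
    """Map a legacy short name to its full HuggingFace ID, warning on use."""
    if name in _SHORT_NAME_MAP:
        full = _SHORT_NAME_MAP[name]
        warnings.warn(f"'{name}' is deprecated; use '{full}' instead.",
                      DeprecationWarning)
        return full
    return name  # full IDs and local paths pass through unchanged
```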

Dependencies

  • torch - Model loading, device management, state dict handling
  • huggingface_hub - Downloading checkpoints from HuggingFace Hub
  • omegaconf - Parsing and managing model configuration from checkpoints
