Implementation:Facebookresearch Audiocraft MusicGen set generation params

Summary

MusicGen.set_generation_params is an instance method that configures the sampling strategy, duration, classifier-free guidance strength, and extended-generation behavior for subsequent audio generation calls. It stores duration and extend_stride as instance attributes and packages the sampling settings into a generation_params dictionary, which is later unpacked and passed to LMModel.generate().

API Signature

def set_generation_params(
    self,
    use_sampling: bool = True,
    top_k: int = 250,
    top_p: float = 0.0,
    temperature: float = 1.0,
    duration: float = 30.0,
    cfg_coef: float = 3.0,
    cfg_coef_beta: Optional[float] = None,
    two_step_cfg: bool = False,
    extend_stride: float = 18,
) -> None

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| use_sampling | bool | True | Use probabilistic sampling if True; use argmax (greedy) decoding if False. |
| top_k | int | 250 | Number of highest-probability tokens to consider during top-k sampling. |
| top_p | float | 0.0 | Cumulative probability threshold for nucleus (top-p) sampling. When set to 0.0, top-k sampling is used instead. |
| temperature | float | 1.0 | Softmax temperature. Values below 1.0 produce sharper (more deterministic) distributions; values above 1.0 produce flatter (more diverse) distributions. |
| duration | float | 30.0 | Target duration of the generated audio, in seconds. If this exceeds max_duration, extended generation with sliding windows is used. |
| cfg_coef | float | 3.0 | Classifier-free guidance coefficient. Higher values increase adherence to the conditioning, at the potential cost of artifacts. |
| cfg_coef_beta | Optional[float] | None | Beta coefficient for double classifier-free guidance (MusicGen-Style only). Controls the balance between text and audio style conditioning. See arXiv:2407.12563, paragraph 4.3. |
| two_step_cfg | bool | False | If True, performs classifier-free guidance with two separate forward passes instead of batching them. Ensures identical padding between training and inference. |
| extend_stride | float | 18 | Number of seconds to advance the generation window when producing audio longer than max_duration. Must be less than max_duration. |
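
To make the sampling and guidance parameters concrete, the sketch below shows one common way cfg_coef, temperature, and top_k can shape the choice of the next token. It is an illustrative pure-Python approximation, not audiocraft's implementation: the helper names (apply_cfg, softmax, sample_top_k, next_token) are hypothetical, the guidance formula is the standard classifier-free guidance blend and is an assumption here, and top_p and cfg_coef_beta are left out for brevity.

```python
import math
import random


def apply_cfg(cond_logits, uncond_logits, cfg_coef=3.0):
    """Blend logits as uncond + (cond - uncond) * cfg_coef (assumed CFG form)."""
    return [u + (c - u) * cfg_coef for c, u in zip(cond_logits, uncond_logits)]


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; assumes temperature > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_top_k(probs, top_k, rng):
    """Keep the top_k most probable token indices and draw one of them."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = ranked[:top_k]
    # random.choices normalizes the weights, so no explicit renormalization.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def next_token(cond_logits, uncond_logits, use_sampling=True, top_k=250,
               temperature=1.0, cfg_coef=3.0, rng=None):
    """Pick a token index from guided logits (hypothetical helper)."""
    logits = apply_cfg(cond_logits, uncond_logits, cfg_coef)
    if not use_sampling:
        # Greedy decoding: take the argmax of the guided logits.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return sample_top_k(probs, top_k, rng or random.Random())


if __name__ == "__main__":
    cond, uncond = [2.0, 0.0, 1.0], [1.0, 1.0, 1.0]
    print(apply_cfg(cond, uncond, cfg_coef=3.0))
    print(next_token(cond, uncond, use_sampling=False))
```

In this sketch, cfg_coef=1.0 reproduces the conditional logits unchanged, and top_k=1 makes sampling equivalent to greedy decoding.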

Return Value

| Type | Description |
|---|---|
| None | Configures model state in place. No return value. |

Source Location

  • File: audiocraft/models/musicgen.py, lines 96-132
  • Class: MusicGen (extends BaseGenModel)
  • Import: Method on a MusicGen instance (no standalone import)

Internal Behavior

The method performs the following operations:

  1. Validates extend_stride: Asserts that extend_stride < self.max_duration to ensure sufficient context overlap during extended generation.
  2. Stores duration and stride: Sets self.extend_stride and self.duration as instance attributes.
  3. Builds generation_params dictionary: Packages the sampling parameters into self.generation_params, a dictionary with keys: use_sampling, temp, top_k, top_p, cfg_coef, two_step_cfg, cfg_coef_beta.

Note that the temperature parameter is stored under the key 'temp' in the dictionary, matching the parameter name expected by LMModel.generate().

self.generation_params = {
    'use_sampling': use_sampling,
    'temp': temperature,
    'top_k': top_k,
    'top_p': top_p,
    'cfg_coef': cfg_coef,
    'two_step_cfg': two_step_cfg,
    'cfg_coef_beta': cfg_coef_beta,
}
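
The three steps above can be sketched with a small stand-in class. This is a minimal illustration under stated assumptions, not the audiocraft source: GenerationConfigSketch is a hypothetical name, and max_duration=30.0 is an assumed value chosen for the example.

```python
from typing import Optional


class GenerationConfigSketch:
    """Hypothetical stand-in that mirrors the three documented steps."""

    def __init__(self, max_duration: float = 30.0) -> None:
        self.max_duration = max_duration  # assumed value, for illustration
        self.set_generation_params()      # start from the default settings

    def set_generation_params(
        self,
        use_sampling: bool = True,
        top_k: int = 250,
        top_p: float = 0.0,
        temperature: float = 1.0,
        duration: float = 30.0,
        cfg_coef: float = 3.0,
        cfg_coef_beta: Optional[float] = None,
        two_step_cfg: bool = False,
        extend_stride: float = 18,
    ) -> None:
        # Step 1: the stride must leave some overlap between windows.
        assert extend_stride < self.max_duration, "stride must be < max_duration"
        # Step 2: duration and stride live directly on the instance.
        self.extend_stride = extend_stride
        self.duration = duration
        # Step 3: sampling settings go into one dict; note 'temp', not 'temperature'.
        self.generation_params = {
            'use_sampling': use_sampling,
            'temp': temperature,
            'top_k': top_k,
            'top_p': top_p,
            'cfg_coef': cfg_coef,
            'two_step_cfg': two_step_cfg,
            'cfg_coef_beta': cfg_coef_beta,
        }


if __name__ == "__main__":
    cfg = GenerationConfigSketch()
    cfg.set_generation_params(temperature=0.7, duration=15.0)
    print(cfg.duration, cfg.generation_params['temp'])
```

Because the dictionary is rebuilt from the arguments on every call, any parameter omitted from a later call falls back to its default rather than keeping its previous value.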

How Parameters Flow to Generation

When any generation method is called (e.g., generate(), generate_with_chroma()), the stored parameters are unpacked and passed to the language model:

# In _generate_tokens():
gen_tokens = self.lm.generate(
    prompt_tokens, attributes,
    callback=callback, max_gen_len=total_gen_len,
    **self.generation_params  # unpacks all configured params
)
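
The unpacking step relies on ordinary Python keyword-argument expansion, so every dictionary key has to match a parameter name on the receiving function. The sketch below illustrates this with a hypothetical stub, fake_lm_generate, whose signature is an assumption modeled on the keys listed above; it is not LMModel.generate() itself.

```python
def fake_lm_generate(prompt=None, conditions=(), *, callback=None,
                     max_gen_len=256, use_sampling=True, temp=1.0,
                     top_k=250, top_p=0.0, cfg_coef=None,
                     cfg_coef_beta=None, two_step_cfg=None):
    """Hypothetical stub that echoes a few of the settings it received."""
    return {'temp': temp, 'top_k': top_k, 'max_gen_len': max_gen_len}


generation_params = {
    'use_sampling': True,
    'temp': 0.7,
    'top_k': 250,
    'top_p': 0.0,
    'cfg_coef': 3.0,
    'two_step_cfg': False,
    'cfg_coef_beta': None,
}

# ** expands each key/value pair into a keyword argument.
received = fake_lm_generate(None, [], max_gen_len=400, **generation_params)

# A key that does not match a parameter name is rejected by Python itself.
try:
    fake_lm_generate(temperature=0.7)
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True

if __name__ == "__main__":
    print(received, mismatch_rejected)
```

This is why the dictionary uses the key 'temp': a 'temperature' key would raise a TypeError at call time in a function that only accepts temp.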

Example Usage

from audiocraft.models import MusicGen

model = MusicGen.get_pretrained('facebook/musicgen-melody')

# Default sampling settings (top-k 250, temperature 1.0, CFG 3.0) for an 8-second clip
model.set_generation_params(
    use_sampling=True,
    top_k=250,
    temperature=1.0,
    duration=8.0,
    cfg_coef=3.0,
)

# More deterministic output: lower temperature and stronger guidance
model.set_generation_params(
    use_sampling=True,
    top_k=250,
    temperature=0.7,
    duration=15.0,
    cfg_coef=4.0,
)

# Extended generation beyond max_duration
model.set_generation_params(
    duration=60.0,
    extend_stride=18,
)
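
For the extended-generation case, the following sketch estimates which time windows a sliding-window scheme would cover. It assumes that each window spans at most max_duration seconds and starts extend_stride seconds after the previous one; plan_windows is a hypothetical helper, max_duration=30.0 is an assumed value, and the positive-stride check is an addition of this sketch rather than a documented library check.

```python
def plan_windows(duration, max_duration=30.0, extend_stride=18.0):
    """Return (start, end) times in seconds for each generation window."""
    if not 0 < extend_stride < max_duration:
        raise ValueError("extend_stride must be positive and < max_duration")
    windows = []
    start = 0.0
    covered = 0.0  # end time of the audio produced so far
    while covered < duration:
        # Each window is capped by both max_duration and the target duration.
        end = min(start + max_duration, duration)
        windows.append((start, end))
        covered = end
        start += extend_stride  # slide forward, keeping the overlap as context
    return windows


if __name__ == "__main__":
    for start, end in plan_windows(60.0):
        print(f"window from {start:.0f}s to {end:.0f}s")
```

Under these assumptions, duration=60.0 with extend_stride=18 yields three windows, (0, 30), (18, 48), and (36, 60), so each later window reuses 12 seconds of previously generated audio as context. A smaller stride increases that overlap at the cost of more windows.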

Dependencies

  • No external dependencies beyond the MusicGen instance itself. This method only sets Python attributes.
