Principle:Facebookresearch Audiocraft Token Sampling Strategies
| Knowledge Sources | |
|---|---|
| Domains | Audio_Generation, Sampling |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
Strategies for sampling discrete tokens from a probability distribution during autoregressive or iterative audio generation, including top-k filtering and nucleus (top-p) sampling.
Description
Token Sampling Strategies control the randomness and quality of generated audio by filtering the predicted probability distribution before sampling. Top-k sampling restricts the distribution to the k most likely tokens, while nucleus (top-p) sampling dynamically selects the minimal set of tokens whose cumulative probability exceeds p. These strategies balance diversity and quality in audio token generation.
Usage
Use these strategies when configuring the generation parameters of MusicGen, AudioGen, MAGNeT, or JASCO models. Smaller values of k or p concentrate sampling on the most likely tokens and give more predictable output; larger values admit more of the low-probability tail and increase diversity.
Theoretical Basis
Top-k Sampling: Zero out all probabilities except the top-k tokens, then renormalize.
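The top-k rule can be sketched in plain Python as follows. This is an illustrative sketch that operates on a single list of probabilities, not the Audiocraft implementation; the names `top_k_filter` and `sample_top_k` and the `rng` parameter are assumptions made for this example.

```python
import random


def top_k_filter(probs, k):
    """Zero out all but the k largest probabilities, then renormalize."""
    if not 1 <= k <= len(probs):
        raise ValueError("k must satisfy 1 <= k <= len(probs)")
    # Token indices ordered from most to least probable; sorted() is stable,
    # so ties are resolved in favour of the lower index.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    if total <= 0.0:
        raise ValueError("the kept probabilities must sum to a positive value")
    return [p / total for p in filtered]


def sample_top_k(probs, k, rng=random):
    """Draw one token index from the top-k filtered distribution."""
    filtered = top_k_filter(probs, k)
    # Tokens with zero weight can never be selected.
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

With `probs = [0.1, 0.4, 0.2, 0.3]` and `k = 2`, only the tokens at indices 1 and 3 remain eligible, and their probabilities are rescaled so that they sum to 1.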
Nucleus (Top-p) Sampling: Sort tokens by descending probability, compute the cumulative distribution, zero out the tokens beyond the p threshold, then renormalize.
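A minimal plain-Python sketch of the nucleus rule is given here, under the assumption that the kept set is the shortest prefix of the sorted tokens whose cumulative probability exceeds p. The names `nucleus_filter` and `sample_nucleus` and the `rng` parameter are hypothetical and chosen for this example; the Audiocraft code may handle the boundary case and batching differently.

```python
import random


def nucleus_filter(probs, p):
    """Keep the smallest high-probability set whose mass exceeds p, then renormalize."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    # Token indices ordered from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    filtered = [0.0] * len(probs)
    cumulative = 0.0
    for i in order:
        filtered[i] = probs[i]
        cumulative += probs[i]
        if cumulative > p:
            # The nucleus is complete; all remaining tokens stay at zero.
            break
    total = sum(filtered)
    if total <= 0.0:
        raise ValueError("the kept probabilities must sum to a positive value")
    return [q / total for q in filtered]


def sample_nucleus(probs, p, rng=random):
    """Draw one token index from the nucleus-filtered distribution."""
    filtered = nucleus_filter(probs, p)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

Unlike top-k, the number of surviving tokens depends on the shape of the distribution: a peaked distribution may keep a single token, while a flat one keeps many. With `p = 1.0` the cumulative sum is never expected to exceed the threshold, so every token is kept.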