Implementation:Facebookresearch Audiocraft AudioGen get pretrained
| Knowledge Sources | |
|---|---|
| Domains | Audio_Generation, Sound_Generation |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
Concrete tool for loading pretrained AudioGen text-to-sound generation models and configuring their generation parameters.
Description
AudioGen provides a high-level, user-facing API for text-to-sound generation. It wraps a compression model and a language model into a unified generation interface inherited from BaseGenModel. The get_pretrained static method loads pretrained weights from the Hugging Face Hub by model name, and set_generation_params configures the sampling strategy (use_sampling, top_k, top_p), temperature, output duration in seconds, and the classifier-free guidance coefficient (cfg_coef).
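To make the roles of cfg_coef, top_k, and temperature more concrete, here is a minimal pure-Python sketch of classifier-free guidance followed by top-k sampling. It is an illustration of the general techniques only, not the code in audiocraft; the function names apply_cfg and sample_top_k are hypothetical, and the guidance formula shown is the common formulation, assumed here rather than taken from the source file.

```python
import math
import random


def apply_cfg(cond_logits, uncond_logits, cfg_coef=3.0):
    """Blend text-conditioned and unconditioned logits.

    Common classifier-free guidance form (assumed, not copied from audiocraft):
    guided = uncond + cfg_coef * (cond - uncond).
    cfg_coef = 1.0 returns the conditional logits unchanged.
    """
    return [u + cfg_coef * (c - u) for c, u in zip(cond_logits, uncond_logits)]


def sample_top_k(logits, top_k=250, temperature=1.0, rng=random):
    """Sample one token index from the top_k highest logits.

    Logits are divided by the temperature first, so values below 1.0
    sharpen the distribution and values above 1.0 flatten it.
    """
    scaled = [value / temperature for value in logits]
    k = min(top_k, len(scaled))
    # Indices of the k largest scaled logits.
    kept = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # Softmax weights over the kept indices, shifted by the max for stability.
    peak = max(scaled[i] for i in kept)
    weights = [math.exp(scaled[i] - peak) for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


guided = apply_cfg([2.0, 0.0], [1.0, 0.0], cfg_coef=3.0)
token = sample_top_k(guided, top_k=1)
```

With top_k=1 the sampler reduces to picking the single highest-scoring token, which is a quick way to check the guidance step in isolation.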
Usage
Import this class when you want to generate environmental sounds from text descriptions using a pretrained AudioGen model.
Code Reference
Source Location
- Repository: Facebookresearch_Audiocraft
- File: audiocraft/models/audiogen.py
- Lines: 1-93
Signature
class AudioGen(BaseGenModel):
    @staticmethod
    def get_pretrained(name: str = 'facebook/audiogen-medium', device=None):
        """Load a pretrained AudioGen model."""

    def set_generation_params(self, use_sampling=True, top_k=250, top_p=0.0,
                              temperature=1.0, duration=10.0, cfg_coef=3.0,
                              two_step_cfg=False, extend_stride=2):
        """Configure generation parameters."""
Import
from audiocraft.models import AudioGen
I/O Contract
Inputs
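| Name | Type | Required | Description |
|---|---|---|---|
| name | str | No | Pretrained model name passed to get_pretrained (default 'facebook/audiogen-medium') |
| device | (untyped) | No | Target device passed to get_pretrained (default None) |
| descriptions | list[str] | Yes | Text descriptions passed to generate, one per output clip |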
Outputs
| Name | Type | Description |
|---|---|---|
| wav | torch.Tensor | Generated audio of shape [B, C, T] (batch, channels, samples) |
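The T dimension follows from the requested duration and the model's sample rate. The helper below is a hypothetical sketch (expected_wav_shape is not part of audiocraft) that assumes mono output at 16 kHz and that the sample count is simply duration times sample rate; the real count depends on the model's internal frame rate and may differ slightly for durations that do not align with it.

```python
def expected_wav_shape(num_descriptions, duration, sample_rate=16000, channels=1):
    """Estimate the [B, C, T] shape of the generated tensor.

    Assumes T = duration * sample_rate, truncated to an integer.
    """
    return (num_descriptions, channels, int(duration * sample_rate))


# Two prompts at 5 seconds each, mono, 16 kHz.
shape = expected_wav_shape(2, 5.0)
```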
Usage Examples
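from audiocraft.models import AudioGen

# Load the pretrained medium checkpoint by name.
model = AudioGen.get_pretrained('facebook/audiogen-medium')

# Generate 5-second clips; other parameters keep their defaults.
model.set_generation_params(duration=5.0)

# One output clip per description.
wav = model.generate(['dog barking in a park', 'rain on a tin roof'])
# wav shape: [2, 1, 80000] at 16 kHz (5 s x 16000 samples/s)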
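To listen to a result, each clip has to be written to disk. The sketch below uses only the Python standard library and assumes one mono clip has already been converted to a flat sequence of floats in [-1, 1], for example with wav[0, 0].cpu().tolist(). The helper name write_mono_wav is hypothetical; audiocraft ships its own audio-writing utilities, which are not shown here.

```python
import struct
import wave


def write_mono_wav(path, samples, sample_rate=16000):
    """Write float samples in [-1, 1] to a 16-bit PCM mono WAV file.

    Values outside [-1, 1] are clipped before conversion.
    """
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack('<h', int(round(clipped * 32767)))
    with wave.open(path, 'wb') as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


# Stand-in data; with a real model, pass wav[0, 0].cpu().tolist() instead.
write_mono_wav('audiogen_sketch.wav', [0.0, 1.0, -1.0, 2.0], sample_rate=16000)
```

Plain clipping is the simplest choice; loudness normalization would be a reasonable alternative when clips vary widely in level.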