Implementation:Facebookresearch Audiocraft JASCO set generation params
Overview
JASCO_set_generation_params configures the multi-source classifier-free guidance coefficients and other generation parameters for JASCO's flow matching generation process.
Implements
Principle:Facebookresearch_Audiocraft_JASCO_Generation_Configuration
API Signature
def set_generation_params(
    self,
    cfg_coef_all: float = 5.0,
    cfg_coef_txt: float = 0.0,
    **kwargs
) -> None
Source: audiocraft/models/jasco.py, lines 66-83
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| cfg_coef_all | float | 5.0 | Weight for the multi-source CFG all-conditions term. Higher values increase adherence to all provided conditions (text + temporal) |
| cfg_coef_txt | float | 0.0 | Weight for the text-only CFG term. When non-zero, adds an additional guidance term that pushes generation toward the text description specifically |
| **kwargs | dict | empty | Additional parameters forwarded to FlowMatchingModel.generate(), such as euler, euler_steps, ode_rtol, ode_atol |
Return Value
Returns None. Modifies self.generation_params in place.
How It Works
from audiocraft.models import JASCO
model = JASCO.get_pretrained('facebook/jasco-chords-drums-400M')
# Default configuration: strong all-conditions guidance, no text-only term
model.set_generation_params(cfg_coef_all=5.0, cfg_coef_txt=0.0)
# Stronger text adherence
model.set_generation_params(cfg_coef_all=3.0, cfg_coef_txt=2.0)
# Use Euler integration with specific step count
model.set_generation_params(cfg_coef_all=5.0, euler=True, euler_steps=100)
# Adjust ODE solver tolerance for quality vs. speed trade-off
model.set_generation_params(cfg_coef_all=5.0, ode_rtol=1e-4, ode_atol=1e-4)
The implementation:
def set_generation_params(self, cfg_coef_all=5.0, cfg_coef_txt=0.0, **kwargs):
    self.generation_params = {
        'cfg_coef_all': cfg_coef_all,
        'cfg_coef_txt': cfg_coef_txt
    }
    self.generation_params.update(kwargs)
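One consequence of the assignment above is that every call rebuilds the dictionary, so passthrough options from an earlier call are not carried over. The stand-in class below (`ParamHolder`, a hypothetical name that is not part of audiocraft) copies only the method body shown above, which makes the behavior easy to check without loading a model.

```python
class ParamHolder:
    """Hypothetical stand-in that mirrors only the method body shown above."""

    def set_generation_params(self, cfg_coef_all=5.0, cfg_coef_txt=0.0, **kwargs):
        # The dictionary is rebuilt from scratch on every call.
        self.generation_params = {
            'cfg_coef_all': cfg_coef_all,
            'cfg_coef_txt': cfg_coef_txt,
        }
        # Extra keyword arguments are stored, unvalidated, next to the CFG weights.
        self.generation_params.update(kwargs)


holder = ParamHolder()
holder.set_generation_params(cfg_coef_all=5.0, euler=True, euler_steps=50)
first = dict(holder.generation_params)

# A second call that omits the Euler options drops them again.
holder.set_generation_params(cfg_coef_all=3.0, cfg_coef_txt=2.0)
second = dict(holder.generation_params)
```

In practice this suggests passing every desired option in a single call rather than layering several calls.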
Forward Pass: How Parameters Are Used
The generation_params dictionary is unpacked into FlowMatchingModel.generate() during the generation call chain:
# In JASCO._generate_tokens():
return self.lm.generate(
    prompt_tokens, attributes,
    callback=callback, max_gen_len=total_gen_len,
    **self.generation_params  # cfg_coef_all, cfg_coef_txt, plus any kwargs
)
Inside FlowMatchingModel.generate(), these parameters control the multi-source CFG preprocessing:
# In FlowMatchingModel.generate():
condition_tensors, cfg_terms = self._multi_source_cfg_preprocess(
    conditions, cfg_coef_all, cfg_coef_txt
)
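The excerpt above does not show how the two coefficients are combined after preprocessing. As a rough illustration only, the sketch below assumes a multi-source CFG formulation in which the prediction is evaluated with all conditions, with text only, and with no conditions, and the three results are mixed with weights cfg_coef_all, cfg_coef_txt, and 1 - cfg_coef_all - cfg_coef_txt. The function `combine_cfg` and its arguments are hypothetical and are not part of audiocraft. Plain Python lists stand in for tensors.

```python
def combine_cfg(v_all, v_txt, v_null, cfg_coef_all=5.0, cfg_coef_txt=0.0):
    """Mix three predictions with weights that sum to 1 (assumed formulation)."""
    # The unconditional term receives whatever weight is left over.
    w_null = 1.0 - cfg_coef_all - cfg_coef_txt
    return [
        cfg_coef_all * a + cfg_coef_txt * t + w_null * n
        for a, t, n in zip(v_all, v_txt, v_null)
    ]


v_all = [1.0, 2.0]   # prediction with text + temporal conditions
v_txt = [0.5, 1.0]   # prediction with the text condition only
v_null = [0.0, 0.0]  # prediction with no conditions

default_mix = combine_cfg(v_all, v_txt, v_null)
text_mix = combine_cfg(v_all, v_txt, v_null, cfg_coef_all=3.0, cfg_coef_txt=2.0)

# When all three predictions agree, guidance leaves the value unchanged.
unchanged = combine_cfg([2.0], [2.0], [2.0], cfg_coef_all=3.0, cfg_coef_txt=2.0)
```

Under this assumption, cfg_coef_txt=0.0 reduces to ordinary single-source classifier-free guidance between the fully conditioned and unconditioned predictions.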
Passthrough Parameters
The following keyword arguments can be passed through **kwargs to control the ODE solver:
| Parameter | Type | Default | Description |
|---|---|---|---|
| euler | bool | False | Use fixed-step Euler integration instead of adaptive ODE solver |
| euler_steps | int | 100 | Number of Euler steps when euler=True |
| ode_rtol | float | 1e-5 | Relative tolerance for the adaptive ODE solver (dopri5) |
| ode_atol | float | 1e-5 | Absolute tolerance for the adaptive ODE solver (dopri5) |
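To make the euler and euler_steps options concrete, the sketch below shows generic fixed-step Euler integration of dz/dt = v(z, t). It is a textbook illustration that assumes integration over the unit interval from t = 0 to t = 1. The names `euler_integrate` and `vector_field` are hypothetical, and the library's own solver may differ in its details.

```python
import math


def euler_integrate(vector_field, z0, steps=100):
    """Integrate dz/dt = vector_field(z, t) from t=0 to t=1 with a fixed step."""
    dt = 1.0 / steps
    z = z0
    for i in range(steps):
        t = i * dt
        # One explicit Euler update: follow the field for a step of length dt.
        z = z + dt * vector_field(z, t)
    return z


# A constant field moves the state by the field value over the unit interval.
constant = euler_integrate(lambda z, t: 2.0, z0=0.0, steps=100)

# For dz/dt = z with z(0) = 1 the exact value at t = 1 is e.
coarse_error = abs(euler_integrate(lambda z, t: z, z0=1.0, steps=10) - math.e)
fine_error = abs(euler_integrate(lambda z, t: z, z0=1.0, steps=100) - math.e)
```

The fine_error value is smaller than coarse_error, which reflects the trade-off behind euler_steps: more steps cost more model evaluations but track the trajectory more closely.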
Comparison with MusicGen Configuration
| Aspect | MusicGen | JASCO |
|---|---|---|
| CFG coefficients | Single cfg_coef | Multi-source: cfg_coef_all + cfg_coef_txt |
| Sampling params | use_sampling, top_k, top_p, temperature | Not applicable (continuous, not discrete) |
| Duration control | User-configurable duration | Fixed from training config (segment_duration) |
| Generation method | Autoregressive token sampling | ODE integration (Euler or adaptive) |