Principle:Facebookresearch Audiocraft D Adaptation Optimization
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Training |
| Last Updated | 2026-02-14 01:00 GMT |
Overview
An automatic learning rate adaptation technique that estimates the distance to the optimal solution online, eliminating the need for manual learning rate tuning.
Description
D-Adaptation methods modify standard optimizers (such as SGD and Adam) to determine the step size automatically. They maintain an online estimate of D, the distance from the initial point to the solution, and scale each step by that estimate. For convex Lipschitz problems the base method provably converges without a tuned learning rate, so the learning rate hyperparameter can usually be left at its default value.
Usage
Use this principle when training models where a learning rate search is expensive or impractical. The convergence guarantees are proven for convex problems; when training deep networks the method is applied as a practical way to avoid manual tuning.
Theoretical Basis
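The method maintains a running estimate d_k of D that is updated at each step. For convex problems each d_k is constructed as a lower bound on D from the gradients observed so far, and the sequence is non-decreasing, so the estimate starts from a small initial value d_0 and only grows.
The effective learning rate is then scaled by d_k, so the step size adapts to the problem geometry.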
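The update described above can be illustrated with a minimal sketch. It assumes the dual-averaging form of D-Adaptation on a one-dimensional convex problem, not the Adam-based variant, and it is not taken from the Audiocraft code. The names dadapt_dual_averaging and subgrad_abs, the test function |x - 3|, and the values of d0 and steps are hypothetical choices made only for this example.

```python
import math


def dadapt_dual_averaging(subgrad, x0, d0=1e-3, steps=2000):
    """Illustrative 1-D dual averaging with a D-Adaptation style estimate.

    subgrad returns a subgradient of a convex function at a point.
    Returns the d-weighted average iterate and the history of d_k.
    """
    g0 = subgrad(x0)
    if g0 == 0.0:
        return x0, [d0]              # x0 is already a minimizer
    gamma = 1.0 / abs(g0)            # gamma_0
    s = 0.0                          # running sum of d_k * g_k
    grad_sq_sum = 0.0                # running sum of g_k^2
    correction = 0.0                 # running sum of gamma_k * d_k^2 * g_k^2
    x, d = x0, d0
    d_hist = [d0]
    avg_num, avg_den = 0.0, 0.0

    for _ in range(steps):
        g = subgrad(x)
        avg_num += d * x             # accumulate the d-weighted average
        avg_den += d
        s += d * g
        correction += gamma * d * d * g * g   # uses gamma_k, before the refresh
        grad_sq_sum += g * g
        gamma = 1.0 / math.sqrt(grad_sq_sum)  # gamma_{k+1}
        if s != 0.0:
            # Candidate lower bound on D; keeping the max makes d_k non-decreasing.
            d_hat = (gamma * s * s - correction) / (2.0 * abs(s))
            d = max(d, d_hat)
        x = x0 - gamma * s           # dual-averaging step, scaled through d_k in s
        d_hist.append(d)

    return avg_num / avg_den, d_hist


def subgrad_abs(x, target=3.0):
    """Subgradient of f(x) = |x - target| (hypothetical test function)."""
    return float((x > target) - (x < target))


# Start at x0 = 0, so the true distance to the solution is D = 3.
x_avg, d_hist = dadapt_dual_averaging(subgrad_abs, x0=0.0)
print("weighted average iterate:", x_avg)
print("final estimate of D:", d_hist[-1])
```

In this sketch no learning rate is supplied: the only scale-related input is the small initial guess d0, and the step size comes from the growing estimate d_k together with the accumulated gradient norms.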