Principle:Facebookresearch Audiocraft D Adaptation Optimization

Knowledge Sources	D-Adaptation
Domains	Optimization, Training
Last Updated	2026-02-14 01:00 GMT

Overview

An automatic learning rate adaptation technique that estimates the distance to the optimal solution online, eliminating the need for manual learning rate tuning.

Description

D-Adaptation methods modify standard optimizers (such as Adam) to determine the step size automatically. They maintain an online estimate of D, the distance from the initial point to a solution, and scale each step by that estimate. The base learning rate can therefore be left at a single default value rather than tuned per problem.

Usage

Use this principle when training models where a learning rate search is expensive or impractical. The method's convergence guarantees are proven for convex Lipschitz problems, so for deep networks they should be read as motivation rather than as a guarantee.

Theoretical Basis

The method maintains an estimate $d_k$ of D that is updated each step, where $g_k$ is the gradient at iterate $x_k$ and $x_0$ is the initial point:

$d_{k} = \max\left(d_{k-1}, \frac{\lvert g_{k}^{T} (x_{k} - x_{0}) \rvert}{\sum_{i=0}^{k} \lVert g_{i} \rVert}\right)$

The effective learning rate is then scaled by $d_k$, so the step size adapts to the problem geometry. Because of the max, $d_k$ never decreases from one step to the next.
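
The update can be illustrated with a minimal Python sketch that evaluates the formula above exactly as written. The helper names (`dot`, `norm`, `update_d`) and the sample numbers are hypothetical and chosen only for illustration; this is not the DAdaptAdam implementation, which tracks additional optimizer state.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm of a sequence."""
    return math.sqrt(dot(a, a))


def update_d(d_prev, g_k, x_k, x_0, grad_norm_sum):
    """Return d_k from d_{k-1} using the max-ratio rule above.

    grad_norm_sum is sum_{i=0}^{k} ||g_i|| and must already include ||g_k||.
    """
    if grad_norm_sum == 0.0:
        return d_prev  # assumption: keep the old estimate when there is no gradient signal
    displacement = [a - b for a, b in zip(x_k, x_0)]
    candidate = abs(dot(g_k, displacement)) / grad_norm_sum
    return max(d_prev, candidate)  # the max keeps the estimate from shrinking


# Hypothetical values for a single step.
x_0 = [0.0, 0.0]
x_k = [3.0, 4.0]
g_k = [1.0, 0.0]
grad_norm_sum = 2.0  # assumed running total, including norm(g_k) == 1.0

d_k = update_d(1.0, g_k, x_k, x_0, grad_norm_sum)  # candidate is |3| / 2 = 1.5
effective_lr = 1.0 * d_k  # base learning rate of 1.0 scaled by d_k
```

In this example the candidate ratio (1.5) exceeds the previous estimate (1.0), so the estimate grows. Had the previous estimate been larger than 1.5, it would have been kept unchanged.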

Related Pages

Implementation:Facebookresearch_Audiocraft_DAdaptAdam
