
Implementation:FMInference FlexLLMGen DeepSpeed LR Schedules

From Leeroopedia


Field | Value
Sources | Repo: FlexLLMGen; Upstream: DeepSpeed
Domains | Training_Optimization, Learning_Rate_Scheduling
Last Updated | 2026-02-09 00:00 GMT

Overview

Vendored DeepSpeed module implementing learning rate schedulers including LR Range Test, OneCycle, WarmupLR, and WarmupDecayLR, with argument parsing and configuration override support.

Description

The lr_schedules.py file (854 lines) is a vendored copy of DeepSpeed's learning rate scheduler implementations, adapted from PyTorch v1.0.1. It provides four scheduler types and infrastructure for configuration-driven scheduler selection.

Key components include:

  • LRRangeTest -- Increases learning rate linearly or in a staircase pattern from a minimum value, used to find the optimal learning rate range for a model. Controlled by lr_range_test_min_lr, lr_range_test_step_rate, lr_range_test_step_size, and lr_range_test_staircase.
  • OneCycle -- Implements the 1Cycle policy: an ascending phase from the minimum to the maximum LR, a descending phase back to the minimum, and an optional final decay phase. Also supports cyclical momentum scheduling (inversely correlated with the LR). Controlled by cycle_min_lr, cycle_max_lr, cycle_first_step_size, cycle_second_step_size, decay_lr_rate, and the corresponding momentum parameters.
  • WarmupLR -- Linear or logarithmic warmup from a minimum LR to a maximum LR over a specified number of steps. After warmup, the LR remains constant. Controlled by warmup_min_lr, warmup_max_lr, warmup_num_steps, and warmup_type (log or linear).
  • WarmupDecayLR -- Extends WarmupLR with linear decay after warmup, decreasing from max LR to the min LR over the remaining total_num_steps.
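The warmup and decay curves described above can be sketched in a few lines. This is an illustrative re-derivation of the behavior, not the vendored implementation; the function names and the exact shape of the logarithmic ramp are assumptions.

```python
import math

def warmup_gamma(step, num_steps, warmup_type="log"):
    # Interpolation factor in [0, 1] during warmup: logarithmic by
    # default, linear otherwise (sketch of the behavior described above).
    if step >= num_steps:
        return 1.0
    if warmup_type == "log":
        return math.log(step + 1) / math.log(num_steps + 1)
    return step / num_steps

def warmup_lr(step, min_lr=0.0, max_lr=0.001, num_steps=1000, warmup_type="log"):
    # WarmupLR-style schedule: ramp from min_lr to max_lr, then hold.
    return min_lr + (max_lr - min_lr) * warmup_gamma(step, num_steps, warmup_type)

def warmup_decay_lr(step, min_lr=0.0, max_lr=0.001,
                    warmup_steps=1000, total_steps=10000, warmup_type="log"):
    # WarmupDecayLR-style schedule: same ramp, then linear decay back
    # toward min_lr, reached at total_steps.
    if step < warmup_steps:
        return warmup_lr(step, min_lr, max_lr, warmup_steps, warmup_type)
    gamma = max(0.0, (total_steps - step) / (total_steps - warmup_steps))
    return min_lr + (max_lr - min_lr) * gamma
```

Note how WarmupLR holds the max LR indefinitely after warmup, while WarmupDecayLR continues down to the minimum as training approaches total_num_steps.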

Supporting infrastructure:

  • add_tuning_arguments -- Adds all LR schedule arguments to an argparse parser.
  • override_*_params -- Functions that merge command-line arguments with configuration dictionary values, with CLI args taking precedence.
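That precedence rule (CLI over config) can be illustrated with a generic sketch; the helper name and signature here are hypothetical and do not match the actual override_*_params functions.

```python
import argparse

def merge_cli_over_config(args, config):
    # Hypothetical helper: start from the config dict, then let any CLI
    # argument that was explicitly set (non-None) win over the config value.
    merged = dict(config)
    for key, value in vars(args).items():
        if value is not None:
            merged[key] = value
    return merged

parser = argparse.ArgumentParser()
parser.add_argument("--warmup_num_steps", type=int, default=None)
args = parser.parse_args(["--warmup_num_steps", "500"])

# CLI overrides warmup_num_steps; untouched config keys pass through.
params = merge_cli_over_config(args, {"warmup_num_steps": 1000,
                                      "warmup_type": "log"})
```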

Usage

Schedulers are selected via the scheduler.type field in the DeepSpeed JSON configuration. The engine creates the appropriate scheduler during initialization. This module is part of the vendored benchmark dependencies in FlexLLMGen.
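A typical scheduler entry in the DeepSpeed JSON configuration looks like the following; the parameter values shown are illustrative, not recommended settings.

```json
{
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": 0.001,
      "warmup_num_steps": 1000,
      "warmup_type": "log"
    }
  }
}
```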

Code Reference

Field | Value
Repository | FlexLLMGen
File | benchmark/third_party/DeepSpeed/deepspeed/runtime/lr_schedules.py
Lines | 1-854
Type | AUTO_KEEP (vendored dependency)

Key constants:

LR_RANGE_TEST = 'LRRangeTest'
ONE_CYCLE = 'OneCycle'
WARMUP_LR = 'WarmupLR'
WARMUP_DECAY_LR = 'WarmupDecayLR'
VALID_LR_SCHEDULES = [LR_RANGE_TEST, ONE_CYCLE, WARMUP_LR, WARMUP_DECAY_LR]

I/O Contract

Inputs

Parameter | Type | Required | Description
optimizer | torch.optim.Optimizer | Yes | The optimizer whose learning rate will be scheduled
warmup_min_lr | float | No | Starting LR for the warmup phase (default: 0)
warmup_max_lr | float | No | Target LR after warmup (default: 0.001)
warmup_num_steps | int | No | Number of warmup steps (default: 1000)
warmup_type | str | No | Warmup curve type: 'log' or 'linear' (default: 'log')
total_num_steps | int | No | Total training steps for decay scheduling

Outputs

Output | Type | Description
scheduler | _LRScheduler | PyTorch-compatible LR scheduler that steps with the optimizer
learning rate | float | Current learning rate at each training step
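The stepping contract can be illustrated without torch: schedulers only read and write optimizer.param_groups, so a stand-in optimizer suffices. Both classes below are minimal hypothetical sketches, not the vendored scheduler classes.

```python
class DummyOptimizer:
    # Stand-in for torch.optim.Optimizer: LR schedulers only touch
    # the "lr" entries in param_groups.
    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]

class LinearWarmupSketch:
    # Minimal scheduler sketch: linear warmup to max_lr, then constant.
    def __init__(self, optimizer, max_lr=0.001, warmup_num_steps=1000):
        self.optimizer = optimizer
        self.max_lr = max_lr
        self.warmup_num_steps = warmup_num_steps
        self.last_step = 0

    def get_lr(self):
        frac = min(1.0, self.last_step / self.warmup_num_steps)
        return [self.max_lr * frac for _ in self.optimizer.param_groups]

    def step(self):
        # Called once per optimizer step; writes the new LR back in place.
        self.last_step += 1
        for group, lr in zip(self.optimizer.param_groups, self.get_lr()):
            group["lr"] = lr

opt = DummyOptimizer(lr=0.0)
sched = LinearWarmupSketch(opt, max_lr=0.001, warmup_num_steps=4)
lrs = []
for _ in range(6):
    sched.step()
    lrs.append(opt.param_groups[0]["lr"])
# lrs ramps over 4 steps, then stays at max_lr.
```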
