Implementation: FMInference FlexLLMGen DeepSpeed LR Schedules
| Field | Value |
|---|---|
| Sources | Repo: FlexLLMGen, Upstream: DeepSpeed |
| Domains | Training_Optimization, Learning_Rate_Scheduling |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A vendored DeepSpeed module implementing four learning rate schedulers (LR Range Test, OneCycle, WarmupLR, and WarmupDecayLR), plus argument parsing and configuration-override support.
Description
The lr_schedules.py file (854 lines) is a vendored copy of DeepSpeed's learning rate scheduler implementations, adapted from PyTorch v1.0.1. It provides four scheduler types and infrastructure for configuration-driven scheduler selection.
Key components include:
- LRRangeTest -- Increases learning rate linearly or in a staircase pattern from a minimum value, used to find the optimal learning rate range for a model. Controlled by lr_range_test_min_lr, lr_range_test_step_rate, lr_range_test_step_size, and lr_range_test_staircase.
- OneCycle -- Implements the 1Cycle policy: an ascending phase from min to max LR, a descending phase back to min, and an optional final decay phase. Also supports cyclical momentum scheduling (inversely correlated with LR). Controlled by cycle_min_lr, cycle_max_lr, cycle_first_step_size, cycle_second_step_size, decay_lr_rate, and corresponding momentum parameters.
- WarmupLR -- Linear or logarithmic warmup from a minimum LR to a maximum LR over a specified number of steps. After warmup, the LR remains constant. Controlled by warmup_min_lr, warmup_max_lr, warmup_num_steps, and warmup_type (log or linear).
- WarmupDecayLR -- Extends WarmupLR with linear decay after warmup, decreasing from the max LR back to the min LR over the steps remaining until total_num_steps is reached.
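The warmup curves described above can be sketched as plain functions of the step count (a simplified model with hypothetical helper names; the real schedulers wrap a torch.optim.Optimizer and scale each parameter group):

```python
import math

def warmup_gamma(step, warmup_num_steps, warmup_type="log"):
    """Fraction of (max_lr - min_lr) applied at `step` during warmup."""
    if step < warmup_num_steps:
        if warmup_type == "log":
            # Logarithmic ramp: fast early growth, flattening near the end.
            return math.log(step + 1) / math.log(warmup_num_steps)
        return step / warmup_num_steps  # linear ramp
    return 1.0  # after warmup, WarmupLR holds the max LR

def warmup_decay_gamma(step, warmup_num_steps, total_num_steps, warmup_type="log"):
    """WarmupDecayLR: same warmup, then linear decay back toward the min LR."""
    if step < warmup_num_steps:
        return warmup_gamma(step, warmup_num_steps, warmup_type)
    return max(0.0, (total_num_steps - step) / (total_num_steps - warmup_num_steps))

def lr_at(step, min_lr, max_lr, gamma_fn, **kw):
    """Current LR: interpolate between min and max by the schedule's gamma."""
    return min_lr + (max_lr - min_lr) * gamma_fn(step, **kw)
```

For example, with the defaults (min 0, max 0.001, 1000 warmup steps), `lr_at(2000, 0.0, 0.001, warmup_gamma, warmup_num_steps=1000)` stays at the max LR, while the decay variant has returned partway toward the min.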
Supporting infrastructure:
- add_tuning_arguments -- Adds all LR schedule arguments to an argparse parser.
- override_*_params -- Functions that merge command-line arguments with configuration dictionary values, with CLI args taking precedence.
Usage
Schedulers are selected via the scheduler.type field in the DeepSpeed JSON configuration. The engine creates the appropriate scheduler during initialization. This module is part of the vendored benchmark dependencies in FlexLLMGen.
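For example, a DeepSpeed JSON configuration selecting WarmupLR might look like this (parameter values are illustrative):

```json
{
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": 0.001,
      "warmup_num_steps": 1000,
      "warmup_type": "log"
    }
  }
}
```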
Code Reference
| Field | Value |
|---|---|
| Repository | FlexLLMGen |
| File | benchmark/third_party/DeepSpeed/deepspeed/runtime/lr_schedules.py |
| Lines | 1-854 |
| Type | AUTO_KEEP (vendored dependency) |
Key constants:
```python
LR_RANGE_TEST = 'LRRangeTest'
ONE_CYCLE = 'OneCycle'
WARMUP_LR = 'WarmupLR'
WARMUP_DECAY_LR = 'WarmupDecayLR'
VALID_LR_SCHEDULES = [LR_RANGE_TEST, ONE_CYCLE, WARMUP_LR, WARMUP_DECAY_LR]
```
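A config-driven engine would typically validate the requested scheduler type against this list before instantiation. A minimal sketch (the validation helper is hypothetical; only the constants mirror the module):

```python
# Constants mirroring the module's scheduler names.
LR_RANGE_TEST = 'LRRangeTest'
ONE_CYCLE = 'OneCycle'
WARMUP_LR = 'WarmupLR'
WARMUP_DECAY_LR = 'WarmupDecayLR'
VALID_LR_SCHEDULES = [LR_RANGE_TEST, ONE_CYCLE, WARMUP_LR, WARMUP_DECAY_LR]

def validate_scheduler_type(sched_type):
    """Reject scheduler.type values that the module does not implement."""
    if sched_type not in VALID_LR_SCHEDULES:
        raise ValueError(
            f"{sched_type!r} is not a valid LR schedule; "
            f"expected one of {VALID_LR_SCHEDULES}")
    return sched_type
```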
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| optimizer | torch.optim.Optimizer | Yes | The optimizer whose learning rate will be scheduled |
| warmup_min_lr | float | No | Starting LR for warmup phase (default: 0) |
| warmup_max_lr | float | No | Target LR after warmup (default: 0.001) |
| warmup_num_steps | int | No | Number of warmup steps (default: 1000) |
| warmup_type | str | No | Warmup curve type: 'log' or 'linear' (default: 'log') |
| total_num_steps | int | WarmupDecayLR only | Total training steps over which warmup plus decay are scheduled |
Outputs
| Output | Type | Description |
|---|---|---|
| scheduler | _LRScheduler | PyTorch-compatible LR scheduler that steps with the optimizer |
| learning rate | float | Current learning rate at each training step |
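The OneCycle LR trajectory described earlier can likewise be modeled as a piecewise-linear function of the step count (a simplified sketch that omits the momentum schedule and the optional decay phase; parameter names mirror the config keys):

```python
def one_cycle_lr(step, cycle_min_lr, cycle_max_lr,
                 cycle_first_step_size, cycle_second_step_size):
    """Piecewise-linear 1Cycle LR: ascend min->max, then descend max->min."""
    if step <= cycle_first_step_size:
        # Ascending phase: min -> max
        frac = step / cycle_first_step_size
        return cycle_min_lr + (cycle_max_lr - cycle_min_lr) * frac
    step2 = step - cycle_first_step_size
    if step2 <= cycle_second_step_size:
        # Descending phase: max -> min
        frac = step2 / cycle_second_step_size
        return cycle_max_lr - (cycle_max_lr - cycle_min_lr) * frac
    return cycle_min_lr  # after the cycle (real module may apply decay here)
```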