Implementation: FMInference FlexLLMGen DeepSpeed LR Schedules
| Field | Value |
|---|---|
| Sources | Repo: FlexLLMGen, Upstream: DeepSpeed |
| Domains | Training_Optimization, Learning_Rate_Scheduling |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A vendored DeepSpeed module implementing four learning rate schedulers (LR Range Test, OneCycle, WarmupLR, and WarmupDecayLR), plus argument parsing and configuration-override support.
Description
The lr_schedules.py file (854 lines) is a vendored copy of DeepSpeed's learning rate scheduler implementations, adapted from PyTorch v1.0.1. It provides four scheduler types and infrastructure for configuration-driven scheduler selection.
Key components include:
- LRRangeTest -- Increases learning rate linearly or in a staircase pattern from a minimum value, used to find the optimal learning rate range for a model. Controlled by lr_range_test_min_lr, lr_range_test_step_rate, lr_range_test_step_size, and lr_range_test_staircase.
- OneCycle -- Implements the 1Cycle policy: an ascending phase from min to max LR, a descending phase back to min, and an optional final decay phase. Also supports cyclical momentum scheduling (inversely correlated with LR). Controlled by cycle_min_lr, cycle_max_lr, cycle_first_step_size, cycle_second_step_size, decay_lr_rate, and corresponding momentum parameters.
- WarmupLR -- Linear or logarithmic warmup from a minimum LR to a maximum LR over a specified number of steps. After warmup, the LR remains constant. Controlled by warmup_min_lr, warmup_max_lr, warmup_num_steps, and warmup_type (log or linear).
- WarmupDecayLR -- Extends WarmupLR with linear decay after warmup, decreasing from the max LR back to the min LR over the steps remaining until total_num_steps is reached.
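The warmup curves described above can be sketched as plain functions of the step count (a simplified model with hypothetical helper names; the real schedulers wrap a torch.optim.Optimizer and scale each parameter group):

```python
import math

def warmup_gamma(step, warmup_num_steps, warmup_type="log"):
    """Fraction of (max_lr - min_lr) applied at `step` during warmup."""
    if step < warmup_num_steps:
        if warmup_type == "log":
            # Logarithmic ramp: fast early growth, flattening near the end.
            return math.log(step + 1) / math.log(warmup_num_steps)
        return step / warmup_num_steps  # linear ramp
    return 1.0  # after warmup, WarmupLR holds the max LR

def warmup_decay_gamma(step, warmup_num_steps, total_num_steps, warmup_type="log"):
    """WarmupDecayLR: same warmup, then linear decay back toward the min LR."""
    if step < warmup_num_steps:
        return warmup_gamma(step, warmup_num_steps, warmup_type)
    return max(0.0, (total_num_steps - step) / (total_num_steps - warmup_num_steps))

def lr_at(step, min_lr, max_lr, gamma_fn, **kw):
    """Current LR: interpolate between min and max by the schedule's gamma."""
    return min_lr + (max_lr - min_lr) * gamma_fn(step, **kw)
```

For example, with the defaults (min 0, max 0.001, 1000 warmup steps), `lr_at(2000, 0.0, 0.001, warmup_gamma, warmup_num_steps=1000)` stays at the max LR, while the decay variant has returned partway toward the min.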
Supporting infrastructure:
- add_tuning_arguments -- Adds all LR schedule arguments to an argparse parser.
- override_*_params -- Functions that merge command-line arguments with configuration dictionary values, with CLI args taking precedence.
Usage
Schedulers are selected via the scheduler.type field in the DeepSpeed JSON configuration. The engine creates the appropriate scheduler during initialization. This module is part of the vendored benchmark dependencies in FlexLLMGen.
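For example, a DeepSpeed JSON configuration selecting WarmupLR might look like this (parameter values are illustrative):

```json
{
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": 0.001,
      "warmup_num_steps": 1000,
      "warmup_type": "log"
    }
  }
}
```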
Code Reference
| Field | Value |
|---|---|
| Repository | FlexLLMGen |
| File | benchmark/third_party/DeepSpeed/deepspeed/runtime/lr_schedules.py |
| Lines | 1-854 |
| Type | AUTO_KEEP (vendored dependency) |
Key constants:
```python
LR_RANGE_TEST = 'LRRangeTest'
ONE_CYCLE = 'OneCycle'
WARMUP_LR = 'WarmupLR'
WARMUP_DECAY_LR = 'WarmupDecayLR'
VALID_LR_SCHEDULES = [LR_RANGE_TEST, ONE_CYCLE, WARMUP_LR, WARMUP_DECAY_LR]
```
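A config-driven engine would typically validate the requested scheduler type against this list before instantiation. A minimal sketch (the validation helper is hypothetical; only the constants mirror the module):

```python
# Constants mirroring the module's scheduler names.
LR_RANGE_TEST = 'LRRangeTest'
ONE_CYCLE = 'OneCycle'
WARMUP_LR = 'WarmupLR'
WARMUP_DECAY_LR = 'WarmupDecayLR'
VALID_LR_SCHEDULES = [LR_RANGE_TEST, ONE_CYCLE, WARMUP_LR, WARMUP_DECAY_LR]

def validate_scheduler_type(sched_type):
    """Reject scheduler.type values that the module does not implement."""
    if sched_type not in VALID_LR_SCHEDULES:
        raise ValueError(
            f"{sched_type!r} is not a valid LR schedule; "
            f"expected one of {VALID_LR_SCHEDULES}")
    return sched_type
```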
I/O Contract
Inputs
| Parameter | Type | Required | Description |
|---|---|---|---|
| optimizer | torch.optim.Optimizer | Yes | The optimizer whose learning rate will be scheduled |
| warmup_min_lr | float | No | Starting LR for warmup phase (default: 0) |
| warmup_max_lr | float | No | Target LR after warmup (default: 0.001) |
| warmup_num_steps | int | No | Number of warmup steps (default: 1000) |
| warmup_type | str | No | Warmup curve type: 'log' or 'linear' (default: 'log') |
| total_num_steps | int | WarmupDecayLR only | Total training steps over which warmup plus decay are scheduled |
Outputs
| Output | Type | Description |
|---|---|---|
| scheduler | _LRScheduler | PyTorch-compatible LR scheduler that steps with the optimizer |
| learning rate | float | Current learning rate at each training step |
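The OneCycle LR trajectory described earlier can likewise be modeled as a piecewise-linear function of the step count (a simplified sketch that omits the momentum schedule and the optional decay phase; parameter names mirror the config keys):

```python
def one_cycle_lr(step, cycle_min_lr, cycle_max_lr,
                 cycle_first_step_size, cycle_second_step_size):
    """Piecewise-linear 1Cycle LR: ascend min->max, then descend max->min."""
    if step <= cycle_first_step_size:
        # Ascending phase: min -> max
        frac = step / cycle_first_step_size
        return cycle_min_lr + (cycle_max_lr - cycle_min_lr) * frac
    step2 = step - cycle_first_step_size
    if step2 <= cycle_second_step_size:
        # Descending phase: max -> min
        frac = step2 / cycle_second_step_size
        return cycle_max_lr - (cycle_max_lr - cycle_min_lr) * frac
    return cycle_min_lr  # after the cycle (real module may apply decay here)
```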