
Implementation:Deepspeedai DeepSpeed CPU Adagrad Header

From Leeroopedia


Knowledge Sources
Domains Optimization, Deep Learning, CPU Computing
Last Updated 2026-02-09 00:00 GMT

Overview

Header file defining the SIMD-accelerated Adagrad optimizer class with AVX2/AVX512 support for CPU-based training.

Description

This header defines the Adagrad_Optimizer class with template-based SIMD implementations of adaptive gradient descent on CPU architectures. The class provides Step_AVX template methods that use AVX intrinsics for vectorized updates, with unroll spans of 1, 4, and 8 (exposed as Step_1, Step_4, and Step_8) that process one, four, or eight SIMD vectors per loop iteration. The implementation selects the AVX512 or AVX256 code path at compile time and supports mixed-precision training with FP16, BFloat16, and FP32 types.
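As a point of reference, the per-element update that the Step_* methods vectorize can be written in scalar form. This is a minimal sketch, not the header's implementation; the function name and the L2-style weight-decay coupling are assumptions for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Scalar reference for one Adagrad step; the SIMD Step_1/Step_4/Step_8 paths
// compute the same per-element update with AVX intrinsics.
void adagrad_step_scalar(float* params, const float* grads, float* exp_avg_sq,
                         std::size_t n, float alpha, float eps,
                         float weight_decay)
{
    for (std::size_t i = 0; i < n; ++i) {
        float g = grads[i];
        if (weight_decay > 0.0f) g += weight_decay * params[i];  // assumed L2 coupling
        exp_avg_sq[i] += g * g;  // accumulate squared gradient (optimizer state)
        params[i] -= alpha * g / (std::sqrt(exp_avg_sq[i]) + eps);
    }
}
```

The accumulated squared gradient makes the effective step size shrink per coordinate as that coordinate's gradients accumulate, which is the defining property of Adagrad.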

Usage

Include this header when implementing or extending CPU-based Adagrad optimization with SIMD acceleration capabilities.

Code Reference

Source Location

Signature

class Adagrad_Optimizer {
public:
    Adagrad_Optimizer(float alpha = 1e-2, float eps = 1e-8, float weight_decay = 0);
    ~Adagrad_Optimizer();

#if defined(__AVX512__) or defined(__AVX256__)
    template <int span, typename ds_params_precision_t, typename ds_state_precision_t>
    void Step_AVX(size_t* rounded_size,
                  ds_params_precision_t* _params,
                  ds_params_precision_t* grads,
                  ds_state_precision_t* _exp_avg_sq,
                  size_t param_size);
#endif

    template <typename ds_params_precision_t, typename ds_state_precision_t>
    void Step_1(ds_params_precision_t* _params,
                ds_params_precision_t* grads,
                ds_state_precision_t* _exp_avg_sq,
                size_t _param_size);

    template <typename ds_params_precision_t, typename ds_state_precision_t>
    void Step_4(ds_params_precision_t* _params,
                ds_params_precision_t* grads,
                ds_state_precision_t* _exp_avg_sq,
                size_t _param_size);

    template <typename ds_params_precision_t, typename ds_state_precision_t>
    void Step_8(ds_params_precision_t* _params,
                ds_params_precision_t* grads,
                ds_state_precision_t* _exp_avg_sq,
                size_t _param_size);

    inline void IncrementStep(size_t step);
    inline void update_state(float lr, float epsilon, float weight_decay);

private:
    float _alpha;
    float _eps;
    float _weight_decay;
    float _betta1_t;
    float _betta2_t;
    size_t _step;
};

Import

#include "cpu_adagrad.h"
#include "simd.h"

I/O Contract

Constructor Parameters

Parameter | Type | Description
alpha | float | Learning rate (default: 1e-2)
eps | float | Small constant for numerical stability (default: 1e-8)
weight_decay | float | Weight decay coefficient (default: 0)

Step_AVX Template Parameters

Parameter | Type | Description
span | int | SIMD vector span factor (1, 4, or 8)
rounded_size | size_t* | Output: number of elements processed with SIMD
_params | ds_params_precision_t* | Model parameters array (in/out)
grads | ds_params_precision_t* | Gradients array (in)
_exp_avg_sq | ds_state_precision_t* | Accumulated squared gradients (in/out)
param_size | size_t | Total number of parameters
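The rounded_size output implies a dispatch contract: the AVX path covers the largest multiple of span times the SIMD register width, and a scalar loop finishes the tail. A minimal sketch of that rounding, assuming the helper name and the 8-lane AVX256 float width as illustrative values:

```cpp
#include <cstddef>

// Hypothetical helper (not from the header): compute how many leading
// elements the SIMD path covers for a given unroll span and SIMD width.
// The remainder [rounded_size, param_size) falls back to scalar updates.
std::size_t simd_rounded_size(std::size_t param_size, int span, int simd_width)
{
    std::size_t chunk = static_cast<std::size_t>(span) * simd_width;
    return (param_size / chunk) * chunk;
}
```

For example, with span 8 and 8 float lanes per AVX256 register, a 1000-element tensor is processed as 960 SIMD elements plus a 40-element scalar tail.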

Supported Type Combinations

Parameters Type | State Type | Description
c10::Half | float | FP16 parameters, FP32 state
c10::Half | c10::Half | FP16 parameters, FP16 state
c10::BFloat16 | float | BF16 parameters, FP32 state (AVX512 only)
c10::BFloat16 | c10::BFloat16 | BF16 parameters, BF16 state (AVX512 only)
float | float | FP32 parameters, FP32 state
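These combinations work because the two-type template lets parameters and optimizer state carry different precisions while the arithmetic is promoted to FP32. A minimal sketch of that pattern (illustrative only; the c10 half types need PyTorch headers, so plain float is used in the test, and the function name is assumed):

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical two-precision step mirroring the header's template pair:
// params_t and state_t may differ, and all math round-trips through FP32.
template <typename params_t, typename state_t>
void step_sketch(params_t* params, const params_t* grads, state_t* exp_avg_sq,
                 std::size_t n, float alpha, float eps)
{
    for (std::size_t i = 0; i < n; ++i) {
        float g = static_cast<float>(grads[i]);            // promote to FP32
        float v = static_cast<float>(exp_avg_sq[i]) + g * g;
        exp_avg_sq[i] = static_cast<state_t>(v);           // store in state precision
        params_t updated = static_cast<params_t>(
            static_cast<float>(params[i]) - alpha * g / (std::sqrt(v) + eps));
        params[i] = updated;                               // store in param precision
    }
}
```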

Usage Examples

#include "cpu_adagrad.h"

#include <vector>

// Create an Adagrad optimizer instance
Adagrad_Optimizer opt(
    /* alpha = */ 0.01f,
    /* eps = */ 1e-8f,
    /* weight_decay = */ 0.0f
);

// Advance the step counter and refresh the hyperparameter state
opt.IncrementStep(1);
opt.update_state(0.01f, 1e-8f, 0.0f);

// Execute an optimizer step with FP32 parameters and FP32 state
size_t param_size = 1024;
std::vector<float> params(param_size);
std::vector<float> grads(param_size);
std::vector<float> exp_avg_sq(param_size);

opt.Step_8(params.data(), grads.data(), exp_avg_sq.data(), param_size);
