Implementation:Microsoft Onnxruntime TrainingUtil Declarations

From Leeroopedia


Knowledge Sources
Domains Training, Models, Utilities
Last Updated 2026-02-10 04:00 GMT

Overview

Declares training utility classes including DataSet, RandomDataSet, TrainingUtil, LossScaler, LearningRateScheduler, and multiple LR schedule implementations (NoWarmup, Cosine, Constant, Linear, Poly).

Description

The `training_util.h` header provides the full declaration surface for the ORT Training model runner utilities:

  • DataSet: A container for training samples. Each sample is a `unique_ptr<vector<OrtValue>>` (typedef `SampleType`). Supports adding data from raw OrtValues or TensorProtos, batching via `GetKthBatch`, random shuffling, and dimension extraction from input shapes via `GetTensorDimensionsFromInputs`. Includes a `RETURN_IF_FAIL` convenience macro for error handling.
  • RandomDataSet: Extends DataSet to generate placeholder data (zero-filled tensors) of specified shapes and types, useful for benchmarking without real data.
  • TrainingUtil: Provides static template methods `CreateCpuMLValue` and `CreateCpuMLScalar` for constructing CPU-backed OrtValues from vectors and scalars. Also includes `GetCpuAllocator()`, `PrintNameMLValMap`, and `PrintTensor` for debugging.
  • LearningRateParameters: Configuration struct with `initial_lr`, `warmup_ratio`, `warmup_mode` (schedule type string), and `feed_name`.
  • LossScaler: Dynamic loss scaling for mixed-precision training with configurable up-scale window, min/max bounds, and checkpoint serialization.
  • LearningRateScheduler: Abstract base class computing `lr = initial_lr * GetLearningRateFactor(cur_ratio, warmup_ratio)`. Five concrete implementations:
 - `NoWarmpScheduler`: Returns constant factor of 1.0.
 - `CosineScheduler`: Linear warmup then cosine decay: `0.5 * (1 + cos(pi * cur_ratio))`.
 - `ConstantScheduler`: Linear warmup then constant factor of 1.0.
 - `LinearScheduler`: Linear warmup then linear decay to 0.
 - `PolyScheduler`: Linear warmup then polynomial decay: `(1 - cur_ratio)^0.5`.
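
The schedule factors listed above can be sketched as plain functions. This is a minimal illustration, not the ORT source: the namespace `lr_sketch`, the function names, and the helper `LearningRate` are hypothetical, and two details are assumptions: that linear decay runs from 1 at the end of warmup to 0 at `cur_ratio == 1`, and that `cur_ratio` is the current step divided by the total step count.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Illustrative re-implementation of the described factors; not the ORT code.
namespace lr_sketch {

constexpr float kPi = 3.14159265358979f;

// Linear ramp from 0 to 1 over the warmup fraction of training.
inline float Warmup(float cur_ratio, float warmup_ratio) {
  assert(warmup_ratio > 0.0f);
  return cur_ratio / warmup_ratio;
}

// Linear warmup, then 0.5 * (1 + cos(pi * cur_ratio)).
inline float CosineFactor(float cur_ratio, float warmup_ratio) {
  if (cur_ratio < warmup_ratio) return Warmup(cur_ratio, warmup_ratio);
  return 0.5f * (1.0f + std::cos(kPi * cur_ratio));
}

// Linear warmup, then a constant factor of 1.
inline float ConstantFactor(float cur_ratio, float warmup_ratio) {
  if (cur_ratio < warmup_ratio) return Warmup(cur_ratio, warmup_ratio);
  return 1.0f;
}

// Linear warmup, then a straight line down to 0 at cur_ratio == 1 (assumed shape).
inline float LinearFactor(float cur_ratio, float warmup_ratio) {
  if (cur_ratio < warmup_ratio) return Warmup(cur_ratio, warmup_ratio);
  assert(warmup_ratio < 1.0f);
  return std::max((1.0f - cur_ratio) / (1.0f - warmup_ratio), 0.0f);
}

// Linear warmup, then (1 - cur_ratio)^0.5, clamped so the root stays real.
inline float PolyFactor(float cur_ratio, float warmup_ratio) {
  if (cur_ratio < warmup_ratio) return Warmup(cur_ratio, warmup_ratio);
  return std::sqrt(std::max(1.0f - cur_ratio, 0.0f));
}

// lr = initial_lr * factor(cur_ratio, warmup_ratio),
// with cur_ratio assumed to be step / total_steps.
inline float LearningRate(float initial_lr, std::size_t step, std::size_t total_steps,
                          float warmup_ratio, float (*factor)(float, float)) {
  assert(total_steps > 0);
  const float cur_ratio = static_cast<float>(step) / static_cast<float>(total_steps);
  return initial_lr * factor(cur_ratio, warmup_ratio);
}

}  // namespace lr_sketch
```

Passing the factor as a function pointer keeps the sketch small; the header itself uses a virtual `GetLearningRateFactor` on an abstract base class for the same purpose.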

Usage

Use this header when building custom training runners or when you need DataSet management, CPU tensor creation, loss scaling, or learning rate scheduling for ORT Training model runner applications.

Code Reference

Source Location

Signature

class DataSet {
 public:
  typedef std::unique_ptr<std::vector<OrtValue>> SampleType;
  DataSet(const std::vector<std::string>& tensor_names);
  virtual ~DataSet();
  size_t NumInputs() const;
  common::Status AddData(SampleType&& single_sample);
  common::Status AddData(const std::vector<ONNX_NAMESPACE::TensorProto>& features);
  virtual size_t NumSamples() const;
  size_t TotalBatch(size_t batch_size) const;
  virtual std::vector<OrtValue> GetKthBatch(size_t batch_size, size_t k_th,
                                            AllocatorPtr allocator = nullptr) const;
  void RandomShuffle();
};

class TrainingUtil {
 public:
  template <typename T>
  static void CreateCpuMLValue(gsl::span<const int64_t> dims,
                               const std::vector<T>& value, OrtValue* p_mlvalue,
                               AllocatorPtr alloc = nullptr);
  template <typename T>
  static void CreateCpuMLScalar(const T value, OrtValue* p_mlvalue,
                                AllocatorPtr alloc = nullptr);
  static AllocatorPtr GetCpuAllocator();
  static void PrintNameMLValMap(const NameMLValMap& mlvalue_map);
  static void PrintTensor(const std::string& name, const Tensor& tensor,
                          std::ostream& os = std::cout);
};

class LossScaler {
 public:
  LossScaler(const std::string loss_scale_input_name, bool is_dynamic_scale,
             float loss_scale = 65536.f, size_t up_scale_window = 2000,
             float min_loss_scale = 1.0f, float max_loss_scale = 16777216.f);
  void UpdateLossScale(bool is_all_finite);
  std::string SaveToString() const;
  Status LoadFromString(const std::string& input);
};

class LearningRateScheduler {
 public:
  float GetLearningRate(const size_t current_step) const;
  virtual float GetLearningRateFactor(float cur_ratio, float warmp_ratio) const = 0;
  static std::unique_ptr<LearningRateScheduler> Create(LearningRateParameters& lr_params,
                                                       size_t training_step_count);
};
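
To make the `LossScaler` declaration above more concrete, here is a minimal sketch of one common dynamic loss scaling policy. The class `LossScalerSketch` is hypothetical, and the policy itself is an assumption rather than something the declarations confirm: halve the scale when a step produces non-finite gradients, and double it after `up_scale_window` consecutive finite steps, clamped to the min/max bounds. The default values mirror those in the declared constructor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for illustration; not the ORT LossScaler.
class LossScalerSketch {
 public:
  explicit LossScalerSketch(float loss_scale = 65536.f,
                            std::size_t up_scale_window = 2000,
                            float min_loss_scale = 1.0f,
                            float max_loss_scale = 16777216.f)
      : loss_scale_(loss_scale),
        up_scale_window_(up_scale_window),
        min_loss_scale_(min_loss_scale),
        max_loss_scale_(max_loss_scale) {
    assert(min_loss_scale_ <= max_loss_scale_);
  }

  // Assumed policy: back off by 2x on overflow, grow by 2x after a stable window.
  void Update(bool is_all_finite) {
    if (!is_all_finite) {
      loss_scale_ = std::max(loss_scale_ * 0.5f, min_loss_scale_);
      stable_steps_ = 0;
      return;
    }
    if (++stable_steps_ >= up_scale_window_) {
      loss_scale_ = std::min(loss_scale_ * 2.0f, max_loss_scale_);
      stable_steps_ = 0;
    }
  }

  float loss_scale() const { return loss_scale_; }

 private:
  float loss_scale_;
  std::size_t up_scale_window_;
  float min_loss_scale_;
  float max_loss_scale_;
  std::size_t stable_steps_ = 0;  // consecutive steps with finite gradients
};
```

In a training loop, `Update` would be called once per step with the result of the gradient finiteness check, and the current scale would be fed to the graph input named by `loss_scale_input_name`.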

Import

#include "orttraining/models/runner/training_util.h"

I/O Contract

| Method | Inputs | Outputs | Description |
| --- | --- | --- | --- |
| DataSet::GetKthBatch | batch_size, k_th, allocator | vector<OrtValue> | Returns batched tensors for the k-th batch |
| TrainingUtil::CreateCpuMLValue | dims, value, OrtValue* | void | Creates a CPU tensor OrtValue from a vector |
| LossScaler::UpdateLossScale | is_all_finite (bool) | void | Dynamically adjusts loss scale based on gradient status |
| LearningRateScheduler::GetLearningRate | current_step | float | Returns scheduled LR for the given step |
| CosineScheduler::GetLearningRateFactor | cur_ratio, warmup_ratio | float | Cosine decay: 0.5*(1+cos(pi*cur_ratio)) after warmup |
| PolyScheduler::GetLearningRateFactor | cur_ratio, warmup_ratio | float | Polynomial decay: (1-cur_ratio)^0.5 after warmup |
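
The batch arithmetic behind `DataSet::TotalBatch` and `DataSet::GetKthBatch` can be pictured with index math alone. The helpers below are hypothetical and operate on sample indices rather than OrtValues. Two behaviors are assumptions not stated on this page: that a trailing partial batch counts as a batch, and that sample indices wrap around to the start when a batch runs past the last sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of batches needed to cover num_samples, counting a final
// partial batch (ceiling division). Returns 0 for a zero batch size.
inline std::size_t TotalBatchSketch(std::size_t num_samples, std::size_t batch_size) {
  if (batch_size == 0) return 0;
  return (num_samples + batch_size - 1) / batch_size;
}

// Sample indices that would make up the k-th batch.
// Assumption: indices wrap modulo num_samples instead of truncating the batch.
inline std::vector<std::size_t> KthBatchIndices(std::size_t num_samples,
                                                std::size_t batch_size,
                                                std::size_t k_th) {
  assert(num_samples > 0);
  batch_size = std::min(batch_size, num_samples);
  std::vector<std::size_t> indices;
  indices.reserve(batch_size);
  for (std::size_t i = 0; i < batch_size; ++i) {
    indices.push_back((k_th * batch_size + i) % num_samples);
  }
  return indices;
}
```

With 10 samples and a batch size of 4, this sketch yields 3 batches, and the last one reuses samples 0 and 1 to stay full-sized.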

Usage Examples

#include "orttraining/models/runner/training_util.h"

using namespace onnxruntime::training;

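// Create a 2x2 CPU tensor. The shape is held in a named vector because
// gsl::span cannot be constructed from a braced initializer list.
OrtValue value;
const std::vector<int64_t> dims = {2, 2};
std::vector<float> data = {1.0f, 2.0f, 3.0f, 4.0f};
TrainingUtil::CreateCpuMLValue<float>(dims, data, &value);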

// Create a scalar
OrtValue scalar;
TrainingUtil::CreateCpuMLScalar<float>(0.001f, &scalar);

// LR scheduling
LearningRateParameters params{0.001f, 0.1f, "Cosine", "Learning_Rate"};
auto scheduler = LearningRateScheduler::Create(params, 10000);
float lr = scheduler->GetLearningRate(5000);  // cosine-decayed LR at step 5000
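
As a rough check of the last call, the expected value can be worked out by hand. This assumes `cur_ratio` is `current_step` divided by `training_step_count`, which the declarations on this page do not state, and relies on step 5000 lying past the 10% warmup phase:

```latex
\begin{aligned}
\text{cur\_ratio} &= \frac{5000}{10000} = 0.5 \\
\text{factor} &= 0.5\,\bigl(1 + \cos(\pi \cdot 0.5)\bigr) = 0.5 \\
\text{lr} &= \text{initial\_lr} \times \text{factor} = 0.001 \times 0.5 = 0.0005
\end{aligned}
```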
