

Implementation:Microsoft Onnxruntime OrtTrainingPythonModule

From Leeroopedia


Knowledge Sources
Domains: Training, Python, Bindings
Last Updated: 2026-02-10 04:00 GMT

Overview

Implements the pybind11 module for ORT Training Python bindings, including execution provider management, environment initialization, and the PYBIND11_MODULE entry point.

Description

This source file (`orttraining_python_module.cc`) is the main entry point for the ORT Training Python package (`onnxruntime_pybind11_state`). It provides:

  • ORTTrainingPythonEnv: A class that manages the ORT environment for training Python sessions. It maintains:
      - A list of available training execution providers (built-in + dynamically registered).
      - An execution provider instance cache (keyed by provider type + hash) for reuse across sessions.
      - External EP registration info for dynamic provider loading.
  • Execution Provider Hashing: `GetProviderInstanceHash` computes a hash for CPU (always 0), CUDA (based on `CUDAExecutionProviderInfo`), and dynamic EPs (via a `ProviderHashFunc` loaded from shared libraries). This enables EP instance caching.
  • Provider Resolution: `ResolveExtraProviderOptions` merges registered external EP options with user-provided options. `GetOrCreateExecutionProvider` checks the cache before creating new EP instances.
  • Environment Creation: `CreateOrtEnv` initializes the ORT environment with Python telemetry projection, WARNING log level, and optional global thread pool. It also initializes the shared provider bridge on non-Apple, non-minimal builds.
  • PYBIND11_MODULE: The module entry point registers:
      - Global methods, object methods, OrtValue methods, sparse tensor methods, IO binding methods, adapter format methods, and training-specific methods.
      - `_register_provider_lib`: Registers external EP shared libraries.
      - `get_available_providers`: Returns available EP types.
      - `get_version_string` / `get_build_info`: Version queries.
      - `clear_training_ep_instances`: Clears cached EP instances.
      - `has_collective_ops`: Returns whether MPI+NCCL collective ops are available.
      - Lazy tensor methods (when `ENABLE_LAZY_TENSOR` is defined).
      - Quantization module.
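
The caching behaviour described above (instances keyed by provider type + hash, registered defaults merged with user options, cache checked before creation) can be modelled with a small Python sketch. This is not the C++ implementation: `ProviderCache`, `instance_hash`, and the `factory` callback are hypothetical names introduced for illustration, and the rule that user-provided options override registered defaults is an assumption, since the description only says the two are merged.

```python
from typing import Callable, Dict, Tuple

ProviderOptions = Dict[str, str]


def instance_hash(provider_type: str, options: ProviderOptions) -> int:
    """Hypothetical stand-in for the hashing step described above."""
    if provider_type == "CPUExecutionProvider":
        return 0  # the description states the CPU hash is always 0
    # Assumption: other providers hash their resolved options.
    return hash(tuple(sorted(options.items())))


class ProviderCache:
    """Toy model of an EP instance cache keyed by (provider type, hash)."""

    def __init__(self) -> None:
        self._instances: Dict[Tuple[str, int], object] = {}
        self._registered_defaults: Dict[str, ProviderOptions] = {}

    def register_external(self, provider_type: str, defaults: ProviderOptions) -> None:
        # Remember default options for a dynamically registered provider.
        self._registered_defaults[provider_type] = dict(defaults)

    def resolve_options(self, provider_type: str, user: ProviderOptions) -> ProviderOptions:
        merged = dict(self._registered_defaults.get(provider_type, {}))
        merged.update(user)  # assumption: user-provided values win
        return merged

    def get_or_create(
        self,
        provider_type: str,
        user: ProviderOptions,
        factory: Callable[[ProviderOptions], object],
    ) -> object:
        options = self.resolve_options(provider_type, user)
        key = (provider_type, instance_hash(provider_type, options))
        if key not in self._instances:  # check the cache before creating
            self._instances[key] = factory(options)
        return self._instances[key]

    def clear(self) -> None:
        # Drops every cached instance, like clear_training_ep_instances.
        self._instances.clear()
```

In this model, two requests for the same provider type with options that resolve to the same hash share one instance, which is the reuse across sessions that the cache is meant to provide.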

Usage

This module is automatically loaded when importing `onnxruntime.training` in Python. It provides the low-level C++ bindings that the Python training APIs are built upon.

Code Reference

Source Location

Signature

class ORTTrainingPythonEnv {
 public:
  ORTTrainingPythonEnv(OrtEnvPtr ort_env);
  const OrtEnv& GetORTEnv() const;
  OrtEnv& GetORTEnv();
  std::shared_ptr<IExecutionProvider> GetExecutionProviderInstance(
      const std::string& provider_type, size_t hash);
  void AddExecutionProvider(const std::string& provider_type, size_t hash,
                            std::unique_ptr<IExecutionProvider> execution_provider);
  void RegisterExtExecutionProviderInfo(const std::string& provider_type,
                                        const std::string& provider_lib_path,
                                        const ProviderOptions& default_options);
  const std::vector<std::string>& GetAvailableTrainingExecutionProviderTypes();
  void ClearExecutionProviderInstances();
};

PYBIND11_MODULE(onnxruntime_pybind11_state, m);

void ORTTrainingRegisterExecutionProviders(InferenceSession* sess,
    const std::vector<std::string>& provider_types,
    const ProviderOptionsMap& provider_options_map);

Import

#include "orttraining/python/orttraining_pybind_common.h"
#include "python/onnxruntime_pybind_mlvalue.h"

I/O Contract

| Function/Method | Inputs | Outputs | Description |
|---|---|---|---|
| PYBIND11_MODULE | (module initialization) | Python module | Creates the onnxruntime_pybind11_state training module |
| GetOrCreateExecutionProvider | provider_type, options, session_options | shared_ptr<IExecutionProvider> | Returns cached or newly created EP instance |
| _register_provider_lib | name, shared_lib_path, default_options | void | Registers an external EP shared library for dynamic loading |
| get_available_providers | (none) | vector<string> | Returns list of available execution provider names |
| has_collective_ops | (none) | bool | Returns true if MPI+NCCL are available |

Usage Examples

# Python usage (this module is loaded automatically)
import onnxruntime as ort

# Get the execution provider types available in this build
providers = ort.get_available_providers()

# Register an external EP shared library
# ("MyCustomEP" and the library path are placeholders)
ort._register_provider_lib("MyCustomEP", "/path/to/libcustom_ep.so", {})

# Check whether MPI+NCCL collective ops are available
has_collectives = ort.has_collective_ops()
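
Since `get_available_providers` reports only what the current build supports, callers usually filter their preferred providers against that list before creating a session. The following is a minimal sketch of that pattern; `select_providers` is a hypothetical helper, not part of the ONNX Runtime API, and the fallback list used when `onnxruntime` is not installed is a stand-in so the snippet runs on its own.

```python
from typing import Iterable, List


def select_providers(available: Iterable[str], preferred: Iterable[str]) -> List[str]:
    """Keep preferred providers that are available, in preference order."""
    available_set = set(available)
    chosen = [p for p in preferred if p in available_set]
    # Append the CPU provider as a last-resort fallback when the build has it.
    if "CPUExecutionProvider" not in chosen and "CPUExecutionProvider" in available_set:
        chosen.append("CPUExecutionProvider")
    return chosen


try:
    import onnxruntime as ort

    available = ort.get_available_providers()
except ImportError:
    # Stand-in so the sketch runs without onnxruntime installed.
    available = ["CPUExecutionProvider"]

providers = select_providers(available, ["CUDAExecutionProvider", "CPUExecutionProvider"])
print(providers)
```

The resulting list can be passed wherever a provider list is expected, so the same script runs on CPU-only and CUDA-enabled builds.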
