Implementation:Microsoft Onnxruntime OrtTrainingPythonModule
| Knowledge Sources | |
|---|---|
| Domains | Training, Python, Bindings |
| Last Updated | 2026-02-10 04:00 GMT |
Overview
Implements the pybind11 module for ORT Training Python bindings, including execution provider management, environment initialization, and the PYBIND11_MODULE entry point.
Description
This source file (`orttraining_python_module.cc`) is the main entry point for the ORT Training Python extension module (`onnxruntime_pybind11_state`). It provides:
- ORTTrainingPythonEnv: A class that manages the ORT environment for training Python sessions. It maintains:
  - A list of available training execution providers (built-in + dynamically registered).
  - An execution provider instance cache (keyed by provider type + hash) for reuse across sessions.
  - External EP registration info for dynamic provider loading.
- Execution Provider Hashing: `GetProviderInstanceHash` computes a hash for CPU (always 0), CUDA (based on `CUDAExecutionProviderInfo`), and dynamic EPs (via a `ProviderHashFunc` loaded from shared libraries). This enables EP instance caching.
- Provider Resolution: `ResolveExtraProviderOptions` merges registered external EP options with user-provided options. `GetOrCreateExecutionProvider` checks the cache before creating new EP instances.
- Environment Creation: `CreateOrtEnv` initializes the ORT environment with Python telemetry projection, WARNING log level, and optional global thread pool. It also initializes the shared provider bridge on non-Apple, non-minimal builds.
- PYBIND11_MODULE: The module entry point registers:
  - Global methods, object methods, OrtValue methods, sparse tensor methods, IO binding methods, adapter format methods, and training-specific methods.
  - `_register_provider_lib`: Registers external EP shared libraries.
  - `get_available_providers`: Returns available EP types.
  - `get_version_string` / `get_build_info`: Version queries.
  - `clear_training_ep_instances`: Clears cached EP instances.
  - `has_collective_ops`: Returns whether MPI+NCCL collective ops are available.
  - Lazy tensor methods (when ENABLE_LAZY_TENSOR is defined).
  - Quantization module.
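The caching and option-merging behaviour described above can be modelled in a few lines of Python. This is a minimal sketch, not the C++ implementation: the names `EpCache`, `provider_instance_hash`, and `resolve_extra_provider_options` are hypothetical, the non-CPU hash is simply derived from the option items, and the rule that user-provided options win on a key collision is an assumption made for illustration only.

```python
from typing import Callable, Dict, Tuple

ProviderOptions = Dict[str, str]


def provider_instance_hash(provider_type: str, options: ProviderOptions) -> int:
    """Hypothetical stand-in for GetProviderInstanceHash."""
    if provider_type == "CPUExecutionProvider":
        return 0  # per the description above, the CPU hash is always 0
    # Assumption: equal option sets should map to the same hash value.
    return hash(tuple(sorted(options.items())))


def resolve_extra_provider_options(
    registered: ProviderOptions, user: ProviderOptions
) -> ProviderOptions:
    """Merge registered external-EP defaults with user-provided options."""
    merged = dict(registered)
    merged.update(user)  # assumption: user-provided values take precedence
    return merged


class EpCache:
    """Toy model of an EP instance cache keyed by (provider type, hash)."""

    def __init__(self) -> None:
        self._instances: Dict[Tuple[str, int], object] = {}
        self.created = 0  # counts how many instances the factory built

    def get_or_create(
        self,
        provider_type: str,
        options: ProviderOptions,
        factory: Callable[[str, ProviderOptions], object],
    ) -> object:
        key = (provider_type, provider_instance_hash(provider_type, options))
        instance = self._instances.get(key)
        if instance is None:
            # Cache miss: build a new instance and remember it for later sessions.
            instance = factory(provider_type, options)
            self._instances[key] = instance
            self.created += 1
        return instance

    def clear(self) -> None:
        """Drop all cached instances (compare clear_training_ep_instances)."""
        self._instances.clear()


if __name__ == "__main__":
    cache = EpCache()
    make = lambda ptype, opts: {"type": ptype, "options": dict(opts)}
    first = cache.get_or_create("CUDAExecutionProvider", {"device_id": "0"}, make)
    second = cache.get_or_create("CUDAExecutionProvider", {"device_id": "0"}, make)
    print(first is second)  # True: the second call reuses the cached instance
```

Keying on both the provider type and a hash of its configuration lets two sessions with identical settings share one instance, while sessions with different settings still get separate ones.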
Usage
This module is automatically loaded when importing `onnxruntime.training` in Python. It provides the low-level C++ bindings that the Python training APIs are built upon.
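Because the bindings are only present in a build that ships them, calling code may want to probe for the package before relying on it. The following is a minimal sketch under that assumption; the helper name `available_providers_or_empty` is hypothetical, and it goes through the top-level `onnxruntime` package rather than `onnxruntime.training`.

```python
from typing import List


def available_providers_or_empty() -> List[str]:
    """Return the available execution provider names, or [] if onnxruntime is absent."""
    try:
        import onnxruntime as ort  # importing the package loads the native bindings
    except ImportError:
        # Assumption: a missing package is treated as "no providers available".
        return []
    return list(ort.get_available_providers())


if __name__ == "__main__":
    print(available_providers_or_empty())
```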
Code Reference
Source Location
- Repository: Microsoft_Onnxruntime
- File: orttraining/orttraining/python/orttraining_python_module.cc
- Lines: 1-327
Signature
class ORTTrainingPythonEnv {
 public:
  ORTTrainingPythonEnv(OrtEnvPtr ort_env);
  const OrtEnv& GetORTEnv() const;
  OrtEnv& GetORTEnv();
  std::shared_ptr<IExecutionProvider> GetExecutionProviderInstance(
      const std::string& provider_type, size_t hash);
  void AddExecutionProvider(const std::string& provider_type, size_t hash,
                            std::unique_ptr<IExecutionProvider> execution_provider);
  void RegisterExtExecutionProviderInfo(const std::string& provider_type,
                                        const std::string& provider_lib_path,
                                        const ProviderOptions& default_options);
  const std::vector<std::string>& GetAvailableTrainingExecutionProviderTypes();
  void ClearExecutionProviderInstances();
};
PYBIND11_MODULE(onnxruntime_pybind11_state, m);
void ORTTrainingRegisterExecutionProviders(InferenceSession* sess,
                                           const std::vector<std::string>& provider_types,
                                           const ProviderOptionsMap& provider_options_map);
Import
#include "orttraining/python/orttraining_pybind_common.h"
#include "python/onnxruntime_pybind_mlvalue.h"
I/O Contract
| Function/Method | Inputs | Outputs | Description |
|---|---|---|---|
| PYBIND11_MODULE | (module initialization) | Python module | Creates the onnxruntime_pybind11_state training module |
| GetOrCreateExecutionProvider | provider_type, options, session_options | shared_ptr<IExecutionProvider> | Returns cached or newly created EP instance |
| _register_provider_lib | name, shared_lib_path, default_options | void | Registers an external EP shared library for dynamic loading |
| get_available_providers | (none) | vector<string> | Returns list of available execution provider names |
| has_collective_ops | (none) | bool | Returns true if MPI+NCCL are available |
Usage Examples
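# Python usage (this module is loaded automatically)
import onnxruntime as ort
# Low-level bindings; private helpers such as _register_provider_lib are
# reached through this module rather than the top-level package.
from onnxruntime.capi import _pybind_state as C
# Get available providers
providers = ort.get_available_providers()
# Register external EP (name, shared library path, default options)
C._register_provider_lib("MyCustomEP", "/path/to/libcustom_ep.so", {})
# Check collective ops
has_collectives = ort.has_collective_ops()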