Implementation:Dotnet Machinelearning OneDalAlgorithms
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Native Interop, Hardware Acceleration |
| Last Updated | 2026-02-09 12:00 GMT |
Overview
C++ wrapper library exposing Intel oneDAL (oneAPI Data Analytics Library) accelerated implementations of Decision Forest, Logistic Regression, and Ridge Regression algorithms for consumption by the ML.NET managed runtime via P/Invoke.
Description
OneDalAlgorithms.cpp provides a native bridge between ML.NET's managed C# training pipelines and Intel's oneDAL library, which delivers hardware-optimized implementations of core machine learning algorithms. The file exports five primary functions that cover three algorithm families:
- Decision Forest (classification and regression) -- Trains an ensemble of decision trees using oneDAL's optimized parallel tree construction. Results are extracted via custom RegressorNodeVisitor and ClassifierNodeVisitor helper classes that traverse the trained tree structure and serialize node data (split features, thresholds, leaf values) back to managed memory.
- Logistic Regression (L-BFGS optimizer) -- Trains a logistic regression model using the Limited-memory Broyden-Fletcher-Goldfarb-Shanno optimization algorithm, which is well-suited for large feature spaces where full Hessian computation is impractical.
- Ridge Regression (online/streaming) -- Supports incremental training through a two-phase API: ridgeRegressionOnlineCompute accepts data in batches (partial results are accumulated internally), and ridgeRegressionOnlineFinalize merges partial results to produce the final model coefficients.
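The two-phase ridge flow can be illustrated without oneDAL. The sketch below is a stand-in, not the library's implementation. It assumes the partial result is the pair of running sums X^T X and X^T y (with a constant-1 bias column appended), and that finalization solves the regularized normal equations while leaving the bias unpenalized. The names RidgePartial, accumulate, and finalize are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical partial result: running sums over every batch seen so far.
// Index d (the last one) corresponds to the constant-1 bias column.
struct RidgePartial {
    std::size_t d;            // number of feature columns
    std::vector<double> xtx;  // (d+1) x (d+1), row-major
    std::vector<double> xty;  // (d+1)
    explicit RidgePartial(std::size_t cols)
        : d(cols), xtx((cols + 1) * (cols + 1), 0.0), xty(cols + 1, 0.0) {}
};

// Phase 1: fold one row-major batch (rows x d) into the running sums.
void accumulate(RidgePartial& p, const double* data, const double* labels,
                std::size_t rows) {
    const std::size_t n = p.d + 1;
    std::vector<double> x(n);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < p.d; ++c) x[c] = data[r * p.d + c];
        x[p.d] = 1.0;  // bias column
        for (std::size_t i = 0; i < n; ++i) {
            p.xty[i] += x[i] * labels[r];
            for (std::size_t j = 0; j < n; ++j) p.xtx[i * n + j] += x[i] * x[j];
        }
    }
}

// Phase 2: solve (X^T X + l2 * I') w = X^T y, where I' skips the bias entry.
// Returns the weights in [0, d) and the bias at index d.
// Assumes the system is non-singular (no zero pivot after pivoting).
std::vector<double> finalize(const RidgePartial& p, double l2) {
    const std::size_t n = p.d + 1;
    std::vector<double> a = p.xtx, b = p.xty;
    for (std::size_t i = 0; i < p.d; ++i) a[i * n + i] += l2;

    // Gaussian elimination with partial pivoting.
    for (std::size_t k = 0; k < n; ++k) {
        std::size_t piv = k;
        for (std::size_t i = k + 1; i < n; ++i)
            if (std::fabs(a[i * n + k]) > std::fabs(a[piv * n + k])) piv = i;
        if (piv != k) {
            for (std::size_t j = 0; j < n; ++j) std::swap(a[k * n + j], a[piv * n + j]);
            std::swap(b[k], b[piv]);
        }
        for (std::size_t i = k + 1; i < n; ++i) {
            const double f = a[i * n + k] / a[k * n + k];
            for (std::size_t j = k; j < n; ++j) a[i * n + j] -= f * a[k * n + j];
            b[i] -= f * b[k];
        }
    }

    // Back substitution.
    std::vector<double> w(n);
    for (std::size_t k = n; k-- > 0;) {
        double s = b[k];
        for (std::size_t j = k + 1; j < n; ++j) s -= a[k * n + j] * w[j];
        w[k] = s / a[k * n + k];
    }
    return w;
}
```

Because the sums are additive, feeding the rows in one batch or in several yields the same partial result, which is what makes the batch-then-finalize split possible.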
All exported functions are templated to support both float (single precision) and double (double precision) data types, allowing the managed layer to choose the appropriate precision for the workload.
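A function with C linkage cannot itself be a template, so precision-generic native code is commonly surfaced to P/Invoke as a template implementation plus thin unmangled wrappers. The sketch below shows that general pattern only; the function names (columnMeansImpl, columnMeansFloat, columnMeansDouble) are hypothetical and are not taken from OneDalAlgorithms.cpp.

```cpp
#include <cstddef>

// Precision-generic implementation (hypothetical example: per-column means
// of a row-major numRows x numColumns matrix).
template <typename FPType>
void columnMeansImpl(const FPType* data, int numRows, int numColumns, FPType* out) {
    for (int c = 0; c < numColumns; ++c) {
        FPType sum = 0;
        for (int r = 0; r < numRows; ++r)
            sum += data[static_cast<std::size_t>(r) * numColumns + c];
        out[c] = numRows > 0 ? sum / static_cast<FPType>(numRows) : FPType(0);
    }
}

// C-linkage wrappers: one unmangled symbol per precision, each of which a
// managed DllImport declaration could bind to by name.
extern "C" void columnMeansFloat(const float* data, int numRows, int numColumns,
                                 float* out) {
    columnMeansImpl<float>(data, numRows, numColumns, out);
}

extern "C" void columnMeansDouble(const double* data, int numRows, int numColumns,
                                  double* out) {
    columnMeansImpl<double>(data, numRows, numColumns, out);
}
```

The managed side then picks the wrapper that matches the element type of its buffers, and the numerical code is written once.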
Usage
Use these native functions when training ML.NET models on Intel hardware where oneDAL acceleration is available. They are not meant to be called from user code: the managed trainers (e.g., FastForestRegressionTrainer, FastForestBinaryClassificationTrainer) detect oneDAL availability and dispatch to them. Acceleration is most effective for:
- Large datasets where vectorized computation yields significant speedups
- Decision forest training with many trees and deep splits
- Ridge regression on streaming data that arrives in batches
Code Reference
Source Location
- Repository: Dotnet_Machinelearning
- File: src/Native/OneDalNative/OneDalAlgorithms.cpp
- Lines: 1-824
Signature
// Decision Forest Regression
EXPORT_API(void) decisionForestRegressionCompute(
    int numColumns,
    int numRows,
    float* dataPtr,
    float* labelsPtr,
    int numTrees,
    int maxTreeDepth,
    int minObservationsInLeafNode,
    int maxBins,
    int seed,
    float* featureImportancePtr,
    int* treeNodeCount,
    float** treeNodeSplitValues,
    int** treeNodeFeatureIndices,
    float** treeNodeLeafValues,
    int** treeNodeLeftChildren,
    int** treeNodeRightChildren
);

// Decision Forest Classification
EXPORT_API(void) decisionForestClassificationCompute(
    int numColumns,
    int numRows,
    float* dataPtr,
    float* labelsPtr,
    int numClasses,
    int numTrees,
    int maxTreeDepth,
    int minObservationsInLeafNode,
    int maxBins,
    int seed,
    float* featureImportancePtr,
    int* treeNodeCount,
    float** treeNodeSplitValues,
    int** treeNodeFeatureIndices,
    float** treeNodeLeafValues,
    int** treeNodeLeftChildren,
    int** treeNodeRightChildren
);

// Logistic Regression (L-BFGS)
EXPORT_API(void) logisticRegressionLBFGSCompute(
    int numColumns,
    int numRows,
    float* dataPtr,
    float* labelsPtr,
    int numClasses,
    float l1Regularization,
    float l2Regularization,
    int maxIterations,
    float* weightsPtr,
    float* biasPtr
);

// Ridge Regression Online (batch)
EXPORT_API(void*) ridgeRegressionOnlineCompute(
    int numColumns,
    int numRows,
    float* dataPtr,
    float* labelsPtr,
    float l2Regularization,
    void* partialResult
);

// Ridge Regression Online (finalize)
EXPORT_API(void) ridgeRegressionOnlineFinalize(
    int numColumns,
    void* partialResult,
    float* weightsPtr,
    float* biasPtr
);
Import
// P/Invoke declarations (managed side)
[DllImport("OneDalNative", EntryPoint = "decisionForestRegressionCompute")]
internal static extern void DecisionForestRegressionCompute(
    int numColumns, int numRows,
    IntPtr dataPtr, IntPtr labelsPtr,
    int numTrees, int maxTreeDepth,
    int minObservationsInLeafNode, int maxBins, int seed,
    IntPtr featureImportancePtr,
    IntPtr treeNodeCount, IntPtr treeNodeSplitValues,
    IntPtr treeNodeFeatureIndices, IntPtr treeNodeLeafValues,
    IntPtr treeNodeLeftChildren, IntPtr treeNodeRightChildren);

[DllImport("OneDalNative", EntryPoint = "ridgeRegressionOnlineCompute")]
internal static extern IntPtr RidgeRegressionOnlineCompute(
    int numColumns, int numRows,
    IntPtr dataPtr, IntPtr labelsPtr,
    float l2Regularization, IntPtr partialResult);

[DllImport("OneDalNative", EntryPoint = "ridgeRegressionOnlineFinalize")]
internal static extern void RidgeRegressionOnlineFinalize(
    int numColumns, IntPtr partialResult,
    IntPtr weightsPtr, IntPtr biasPtr);
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| numColumns | int | Yes | Number of feature columns in the training data |
| numRows | int | Yes | Number of rows (instances) in the current data batch |
| dataPtr | float*/double* | Yes | Pointer to row-major feature matrix of size numRows x numColumns |
| labelsPtr | float*/double* | Yes | Pointer to label array of size numRows |
| numTrees | int | Yes (forest) | Number of trees to build in the ensemble |
| maxTreeDepth | int | Yes (forest) | Maximum depth of each decision tree |
| minObservationsInLeafNode | int | Yes (forest) | Minimum number of samples required at a leaf node |
| maxBins | int | Yes (forest) | Maximum number of bins for histogram-based splitting |
| seed | int | Yes (forest) | Random seed for reproducibility |
| numClasses | int | Yes (classification) | Number of target classes |
| l1Regularization | float | Yes (logistic) | L1 regularization coefficient for sparsity |
| l2Regularization | float | Yes (logistic/ridge) | L2 regularization coefficient for weight shrinkage |
| maxIterations | int | Yes (logistic) | Maximum number of L-BFGS iterations |
| partialResult | void* | No (ridge) | Pointer to accumulated partial results from previous batches; NULL for first batch |
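Because dataPtr is row-major, any contiguous range of rows is itself a valid smaller matrix, so a caller can hand over batches by pointer offset without copying. The sketch below illustrates that indexing; BatchView, splitIntoBatches, and at are hypothetical helpers, not part of the native library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of one batch of rows.
struct BatchView {
    const float* data;    // first feature of the batch's first row
    const float* labels;  // label of the batch's first row
    int numRows;          // rows in this batch
};

// Element (row, col) of a row-major matrix with numColumns columns.
inline float at(const float* data, int numColumns, int row, int col) {
    return data[static_cast<std::size_t>(row) * numColumns + col];
}

// Split numRows rows into consecutive batches of at most batchRows rows.
// Returns an empty list when batchRows is not positive.
std::vector<BatchView> splitIntoBatches(const float* data, const float* labels,
                                        int numRows, int numColumns, int batchRows) {
    std::vector<BatchView> batches;
    if (batchRows <= 0) return batches;
    for (int start = 0; start < numRows; start += batchRows) {
        const int rows = std::min(batchRows, numRows - start);
        batches.push_back({data + static_cast<std::size_t>(start) * numColumns,
                           labels + start, rows});
    }
    return batches;
}
```

Each view's data, labels, and numRows correspond to the dataPtr, labelsPtr, and numRows arguments of one batch call, with the first call passing NULL for partialResult as described in the table.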
Outputs
| Name | Type | Description |
|---|---|---|
| featureImportancePtr | float* | Array of feature importance scores (forest algorithms) |
| treeNodeCount | int* | Number of nodes per tree in the ensemble |
| treeNodeSplitValues | float** | Split threshold values for each internal node |
| treeNodeFeatureIndices | int** | Feature index used for splitting at each internal node |
| treeNodeLeafValues | float** | Predicted values at leaf nodes |
| treeNodeLeftChildren | int** | Left child node indices for tree traversal |
| treeNodeRightChildren | int** | Right child node indices for tree traversal |
| weightsPtr | float* | Trained model weight vector (logistic/ridge regression) |
| biasPtr | float* | Trained model bias/intercept term (logistic/ridge regression) |
| return (ridgeRegressionOnlineCompute) | void* | Opaque pointer to accumulated partial results for subsequent batches |
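To show how the flat per-tree arrays can be consumed, the sketch below walks one tree and averages several. The node encoding is an assumption, since the table does not specify it: node 0 is taken as the root, a node counts as a leaf when its left-child index is negative, and a sample goes left when its feature value is less than or equal to the split value. predictTree and predictForest are hypothetical names.

```cpp
#include <cstddef>

// Walk a single tree stored as parallel arrays indexed by node id.
// Assumed encoding (not confirmed by the source): root at index 0,
// leftChildren[node] < 0 marks a leaf, "<=" sends the sample left.
float predictTree(const float* splitValues, const int* featureIndices,
                  const float* leafValues, const int* leftChildren,
                  const int* rightChildren, const float* features) {
    int node = 0;
    while (leftChildren[node] >= 0) {
        const float v = features[featureIndices[node]];
        node = (v <= splitValues[node]) ? leftChildren[node] : rightChildren[node];
    }
    return leafValues[node];
}

// Average the per-tree outputs, as is usual for a regression forest.
// Each outer array holds one pointer per tree (the float** / int** outputs).
float predictForest(int numTrees,
                    const float* const* splitValues,
                    const int* const* featureIndices,
                    const float* const* leafValues,
                    const int* const* leftChildren,
                    const int* const* rightChildren,
                    const float* features) {
    if (numTrees <= 0) return 0.0f;
    float sum = 0.0f;
    for (int t = 0; t < numTrees; ++t)
        sum += predictTree(splitValues[t], featureIndices[t], leafValues[t],
                           leftChildren[t], rightChildren[t], features);
    return sum / static_cast<float>(numTrees);
}
```

Index-based child links keep every array free of native pointers, which is what makes this layout straightforward to marshal across the managed boundary.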
Helper Classes
RegressorNodeVisitor
Traverses trained regression decision trees produced by oneDAL and extracts node information (split feature, split value, leaf prediction) into flat arrays suitable for marshalling back to managed code.
ClassifierNodeVisitor
Traverses trained classification decision trees and extracts node information including class probability distributions at leaf nodes.
Both visitors implement the oneDAL TreeNodeVisitor interface and are invoked by the model.traverseDF() method after training completes.
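The flattening step can be sketched independently of oneDAL. In the example below, NodeVisitor is a hypothetical callback interface standing in for the library's visitor type, and Node with traverseDepthFirst is a toy tree used only to drive it. The visitor receives nodes in depth-first pre-order and reconstructs child links from that order alone, which is one plausible way to fill the flat arrays and is not claimed to match the actual helper classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical callback interface, invoked once per node in pre-order.
struct NodeVisitor {
    virtual ~NodeVisitor() = default;
    virtual void onSplitNode(std::size_t level, int featureIndex, double splitValue) = 0;
    virtual void onLeafNode(std::size_t level, double response) = 0;
};

// Parallel arrays indexed by node id; -1 means "no child" or "no feature".
struct FlatTree {
    std::vector<int> featureIndices;
    std::vector<double> splitValues;
    std::vector<double> leafValues;
    std::vector<int> leftChildren;
    std::vector<int> rightChildren;
};

class FlatteningVisitor : public NodeVisitor {
public:
    void onSplitNode(std::size_t, int featureIndex, double splitValue) override {
        open_.push_back(append(featureIndex, splitValue, 0.0));
    }
    void onLeafNode(std::size_t, double response) override {
        append(-1, 0.0, response);
    }
    const FlatTree& tree() const { return tree_; }

private:
    // Append a node and link it to the innermost split still missing a child.
    int append(int featureIndex, double splitValue, double leafValue) {
        const int id = static_cast<int>(tree_.featureIndices.size());
        tree_.featureIndices.push_back(featureIndex);
        tree_.splitValues.push_back(splitValue);
        tree_.leafValues.push_back(leafValue);
        tree_.leftChildren.push_back(-1);
        tree_.rightChildren.push_back(-1);
        if (!open_.empty()) {
            const int parent = open_.back();
            if (tree_.leftChildren[parent] < 0) {
                tree_.leftChildren[parent] = id;   // first child seen is the left one
            } else {
                tree_.rightChildren[parent] = id;  // second child completes the parent
                open_.pop_back();
            }
        }
        return id;
    }

    FlatTree tree_;
    std::vector<int> open_;  // stack of split nodes awaiting children
};

// Toy tree and pre-order driver used only to exercise the visitor.
struct Node {
    bool isLeaf;
    int featureIndex;
    double value;  // split threshold or leaf response
    const Node* left;
    const Node* right;
};

void traverseDepthFirst(const Node& n, NodeVisitor& v, std::size_t level = 0) {
    if (n.isLeaf) {
        v.onLeafNode(level, n.value);
        return;
    }
    v.onSplitNode(level, n.featureIndex, n.value);
    traverseDepthFirst(*n.left, v, level + 1);
    traverseDepthFirst(*n.right, v, level + 1);
}
```

Node ids are assigned in visit order, so the root is always index 0 and each array can be copied to a caller-provided buffer as a single contiguous block.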
Usage Examples
// Training a decision forest regressor via the managed trainer
// (which internally calls decisionForestRegressionCompute).
// FastForest has no maximumTreeDepth argument; tree size is
// bounded through numberOfLeaves instead.
var forestPipeline = mlContext.Transforms.Concatenate("Features", featureColumns)
    .Append(mlContext.Regression.Trainers.FastForest(
        numberOfLeaves: 20,
        numberOfTrees: 100,
        minimumExampleCountPerLeaf: 5));
var forestModel = forestPipeline.Fit(trainingData);
// When Intel oneDAL is available, the FastForest trainer
// dispatches to the native OneDalAlgorithms functions
// for hardware-accelerated training.

// Ridge regression with streaming batches
// (internally uses ridgeRegressionOnlineCompute + ridgeRegressionOnlineFinalize).
// ML.NET has no OnlineRidgeRegression trainer; this assumes the Ols trainer
// from the Microsoft.ML.Mkl.Components package is the managed entry point.
var ridgePipeline = mlContext.Transforms.Concatenate("Features", featureColumns)
    .Append(mlContext.Regression.Trainers.Ols(
        new OlsTrainer.Options { L2Regularization = 0.1f }));
// Each batch calls ridgeRegressionOnlineCompute with the partial results so far;
// a final call to ridgeRegressionOnlineFinalize produces the model.
var ridgeModel = ridgePipeline.Fit(trainingData);