Implementation:Dotnet Machinelearning Binary Classification Trainers
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, Classification, Supervised Learning |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tools provided by ML.NET for training binary classification models with the FastTree, LightGBM, and SDCA logistic regression algorithms.
Description
ML.NET exposes binary classification trainers through the BinaryClassificationCatalog.Trainers property on MLContext. Each trainer is an IEstimator<ITransformer> that can be appended to a feature engineering pipeline. Calling Fit on the pipeline trains the model end-to-end.
- FastTree implements gradient boosted decision trees using the FastRank algorithm. It is effective for medium-sized tabular datasets and provides built-in feature importance.
- LightGBM wraps the LightGBM library, which uses histogram-based splitting and leaf-wise tree growth for faster training and often better accuracy on large datasets.
- SdcaLogisticRegression implements stochastic dual coordinate ascent for L2-regularized logistic regression. It is efficient for high-dimensional sparse data and produces a linear model.
All trainers produce a calibrated model that outputs PredictedLabel (bool), Score (raw output), and Probability (calibrated [0,1] value).
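The three output columns can be consumed through a PredictionEngine. The sketch below is illustrative: it assumes an MLContext (mlContext), a trained model (the ITransformer returned by Fit), and the ModelInput class from the Basic Example later on this page; the ModelOutput class name and the sample feature values are hypothetical.

```csharp
using Microsoft.ML;

// Assumes `mlContext` (MLContext), `model` (ITransformer from Fit), and a
// ModelInput class matching the training schema, as in the Basic Example.
var engine = mlContext.Model.CreatePredictionEngine<ModelInput, ModelOutput>(model);

var prediction = engine.Predict(
    new ModelInput { Feature1 = 0.5f, Feature2 = 1.2f, Feature3 = -3f });
Console.WriteLine(
    $"{prediction.PredictedLabel} (p = {prediction.Probability:0.000})");

// Column names are fixed by ML.NET's binary classification output schema.
public class ModelOutput
{
    public bool PredictedLabel { get; set; } // thresholded decision
    public float Score { get; set; }         // raw model output
    public float Probability { get; set; }   // calibrated value in [0, 1]
}
```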
Usage
Append a trainer to a feature engineering pipeline. Use FastTree or LightGBM for tree-based models on structured/tabular data; use SdcaLogisticRegression for linear models on sparse or high-dimensional data. FastTree and LightGBM require the Microsoft.ML.FastTree and Microsoft.ML.LightGbm NuGet packages, respectively; SdcaLogisticRegression ships with the core Microsoft.ML package.
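As an illustration of the sparse, high-dimensional case, a minimal sketch pairing text featurization with SdcaLogisticRegression (the Review class, column layout, and file name are all hypothetical):

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 1);
var data = mlContext.Data.LoadFromTextFile<Review>("reviews.tsv", hasHeader: true);

// FeaturizeText expands raw text into a wide, sparse n-gram vector,
// which the linear SDCA trainer handles efficiently.
var pipeline = mlContext.Transforms.Text
    .FeaturizeText("Features", nameof(Review.Text))
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());

var model = pipeline.Fit(data);

public class Review
{
    [LoadColumn(0)] public string Text { get; set; }
    [LoadColumn(1), ColumnName("Label")] public bool IsPositive { get; set; }
}
```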
Code Reference
Source Location
- Repository: ML.NET
- File: src/Microsoft.ML.FastTree/FastTree.cs:L30+
- File: src/Microsoft.ML.LightGbm/LightGbmBinaryTrainer.cs:L30+
- File: src/Microsoft.ML.StandardTrainers/Standard/SdcaBinary.cs:L30+
Signature
// Gradient boosted decision trees (FastTree / FastRank)
public FastTreeBinaryTrainer FastTree(
    string labelColumnName = "Label",
    string featureColumnName = "Features",
    string exampleWeightColumnName = null,
    int numberOfLeaves = 20,
    int numberOfTrees = 100,
    int minimumExampleCountPerLeaf = 10,
    double learningRate = 0.2)

// LightGBM gradient boosting
public LightGbmBinaryTrainer LightGbm(
    string labelColumnName = "Label",
    string featureColumnName = "Features",
    string exampleWeightColumnName = null,
    int? numberOfLeaves = null,
    int? minimumExampleCountPerLeaf = null,
    double? learningRate = null,
    int numberOfIterations = 100)

// SDCA logistic regression (linear model)
public SdcaLogisticRegressionBinaryTrainer SdcaLogisticRegression(
    string labelColumnName = "Label",
    string featureColumnName = "Features",
    string exampleWeightColumnName = null,
    float? l2Regularization = null,
    float? l1Regularization = null,
    int? maximumNumberOfIterations = null)

// Fit the pipeline to produce a trained model
ITransformer Fit(IDataView data)
Import
using Microsoft.ML;
// Additional imports for specific trainers:
using Microsoft.ML.Trainers.FastTree; // for FastTree
using Microsoft.ML.Trainers.LightGbm; // for LightGBM
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| labelColumnName | string | No | Name of the boolean label column. Default: "Label". |
| featureColumnName | string | No | Name of the feature vector column. Default: "Features". |
| exampleWeightColumnName | string | No | Name of the example weight column. Default: null (uniform weights). |
| numberOfLeaves | int | No | Maximum number of leaves per tree (FastTree/LightGBM). Default: 20 (FastTree). |
| numberOfTrees / numberOfIterations | int | No | Number of boosting rounds. Default: 100. |
| learningRate | double | No | Step size for boosting. Default: 0.2 (FastTree). |
| minimumExampleCountPerLeaf | int | No | Minimum training examples required per leaf. Default: 10 (FastTree). |
| l2Regularization | float? | No | L2 regularization weight for SDCA. Default: auto-tuned. |
| l1Regularization | float? | No | L1 regularization weight for SDCA. Default: auto-tuned. |
| data (Fit) | IDataView | Yes | Training data with label and features columns. |
Outputs
| Name | Type | Description |
|---|---|---|
| (trainer return) | IEstimator<ITransformer> | Estimator that can be appended to a pipeline and fitted. |
| (Fit return) | ITransformer | Trained model that produces PredictedLabel (bool), Score (float), and Probability (float) columns. |
Usage Examples
Basic Example
using Microsoft.ML;
using Microsoft.ML.Data;

public class ModelInput
{
    [LoadColumn(0)] public float Feature1 { get; set; }
    [LoadColumn(1)] public float Feature2 { get; set; }
    [LoadColumn(2)] public float Feature3 { get; set; }
    [LoadColumn(3), ColumnName("Label")] public bool Label { get; set; }
}

var mlContext = new MLContext(seed: 42);
var data = mlContext.Data.LoadFromTextFile<ModelInput>(
    "data.csv", separatorChar: ',', hasHeader: true);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

// Option 1: FastTree
var pipelineFastTree = mlContext.Transforms
    .Concatenate("Features", "Feature1", "Feature2", "Feature3")
    .Append(mlContext.BinaryClassification.Trainers.FastTree(
        numberOfLeaves: 30,
        numberOfTrees: 200,
        learningRate: 0.1));
var modelFastTree = pipelineFastTree.Fit(split.TrainSet);

// Option 2: LightGBM
var pipelineLgbm = mlContext.Transforms
    .Concatenate("Features", "Feature1", "Feature2", "Feature3")
    .Append(mlContext.BinaryClassification.Trainers.LightGbm(
        numberOfLeaves: 31,
        learningRate: 0.05,
        numberOfIterations: 300));
var modelLgbm = pipelineLgbm.Fit(split.TrainSet);

// Option 3: SDCA logistic regression
var pipelineSdca = mlContext.Transforms
    .Concatenate("Features", "Feature1", "Feature2", "Feature3")
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
        maximumNumberOfIterations: 100));
var modelSdca = pipelineSdca.Fit(split.TrainSet);
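The three models trained above can be compared on the held-out split. The continuation below scores the test set and reports standard calibrated metrics; evaluating only the FastTree model and the model.zip file name are illustrative choices.

```csharp
// Score the test split with the FastTree model and compute
// calibrated binary classification metrics.
var predictions = modelFastTree.Transform(split.TestSet);
var metrics = mlContext.BinaryClassification.Evaluate(predictions);

Console.WriteLine($"AUC:      {metrics.AreaUnderRocCurve:0.000}");
Console.WriteLine($"Accuracy: {metrics.Accuracy:0.000}");
Console.WriteLine($"F1:       {metrics.F1Score:0.000}");

// Persist the chosen model together with its input schema for later scoring.
mlContext.Model.Save(modelFastTree, split.TrainSet.Schema, "model.zip");
```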
Related Pages
Implements Principle
Requires Environment
- Environment:Dotnet_Machinelearning_Dotnet_SDK_And_Runtime
- Environment:Dotnet_Machinelearning_Native_Build_Toolchain
- Environment:Dotnet_Machinelearning_Platform_Architecture_Support
- Environment:Dotnet_Machinelearning_OneDal_Acceleration