Implementation:Dotnet Machinelearning Binary Classification Trainers

From Leeroopedia


Knowledge Sources
Domains: Machine Learning, Classification, Supervised Learning
Last Updated: 2026-02-09 00:00 GMT

Overview

Concrete tools for training binary classification models using FastTree, LightGBM, and SDCA logistic regression algorithms, provided by ML.NET.

Description

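ML.NET exposes binary classification trainers through mlContext.BinaryClassification.Trainers, that is, the Trainers property of the BinaryClassificationCatalog that MLContext holds in its BinaryClassification property. Each trainer method returns an estimator (an IEstimator<TTransformer>) that can be appended to a feature engineering pipeline. Calling Fit on the pipeline fits the feature transforms and trains the model in a single call.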

  • FastTree implements gradient boosted decision trees using the FastRank algorithm. It is effective for medium-sized tabular datasets and provides built-in feature importance.
  • LightGBM wraps the LightGBM library, which uses histogram-based splitting and leaf-wise tree growth for faster training and often better accuracy on large datasets.
  • SdcaLogisticRegression implements stochastic dual coordinate ascent for regularized logistic regression (L2, with optional L1 via l1Regularization). It is efficient for high-dimensional sparse data and produces a linear model.

All trainers produce a calibrated model that outputs PredictedLabel (bool), Score (raw output), and Probability (calibrated [0,1] value).

Usage

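Append a trainer to a feature engineering pipeline. Use FastTree or LightGBM for tree-based models on structured/tabular data. Use SdcaLogisticRegression for linear models on sparse or high-dimensional data. FastTree and LightGBM ship in separate NuGet packages (Microsoft.ML.FastTree and Microsoft.ML.LightGbm) that must be referenced in addition to Microsoft.ML; SdcaLogisticRegression is included in the core Microsoft.ML package.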

Code Reference

Source Location

  • Repository: ML.NET
  • File: src/Microsoft.ML.FastTree/FastTree.cs:L30+
  • File: src/Microsoft.ML.LightGbm/LightGbmBinaryTrainer.cs:L30+
  • File: src/Microsoft.ML.StandardTrainers/Standard/SdcaBinary.cs:L30+

Signature

// These are extension methods on BinaryClassificationCatalog.BinaryClassificationTrainers,
// so they are called as mlContext.BinaryClassification.Trainers.FastTree(...), and so on.

// Gradient boosted decision trees (FastTree / FastRank)
public static FastTreeBinaryTrainer FastTree(
    this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,
    string labelColumnName = "Label",
    string featureColumnName = "Features",
    string exampleWeightColumnName = null,
    int numberOfLeaves = 20,
    int numberOfTrees = 100,
    int minimumExampleCountPerLeaf = 10,
    double learningRate = 0.2)

// LightGBM gradient boosting
public static LightGbmBinaryTrainer LightGbm(
    this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,
    string labelColumnName = "Label",
    string featureColumnName = "Features",
    string exampleWeightColumnName = null,
    int? numberOfLeaves = null,
    int? minimumExampleCountPerLeaf = null,
    double? learningRate = null,
    int numberOfIterations = 100)

// SDCA logistic regression (linear model)
public static SdcaLogisticRegressionBinaryTrainer SdcaLogisticRegression(
    this BinaryClassificationCatalog.BinaryClassificationTrainers catalog,
    string labelColumnName = "Label",
    string featureColumnName = "Features",
    string exampleWeightColumnName = null,
    float? l2Regularization = null,
    float? l1Regularization = null,
    int? maximumNumberOfIterations = null)

// Fit the pipeline to produce a trained model
ITransformer Fit(IDataView data)

Import

using Microsoft.ML;

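// The FastTree, LightGbm, and SdcaLogisticRegression extension methods are declared
// in the Microsoft.ML namespace. The namespaces below are only needed when naming
// trainer types directly (for example, their Options classes):
using Microsoft.ML.Trainers;           // SdcaLogisticRegressionBinaryTrainer
using Microsoft.ML.Trainers.FastTree;  // FastTreeBinaryTrainer (Microsoft.ML.FastTree package)
using Microsoft.ML.Trainers.LightGbm;  // LightGbmBinaryTrainer (Microsoft.ML.LightGbm package)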
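
The trainer-specific namespaces matter mainly when a trainer type is named explicitly. As a minimal sketch, assuming the Microsoft.ML.FastTree package is referenced, the overload of FastTree that takes a FastTreeBinaryTrainer.Options object can be used like this (the hyperparameter values are illustrative, not recommendations):

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers.FastTree; // required here because FastTreeBinaryTrainer.Options is named

var mlContext = new MLContext(seed: 42);

// Illustrative settings only; the column names match the trainer defaults.
var options = new FastTreeBinaryTrainer.Options
{
    LabelColumnName = "Label",
    FeatureColumnName = "Features",
    NumberOfLeaves = 30,
    NumberOfTrees = 200,
    LearningRate = 0.1,
    MinimumExampleCountPerLeaf = 10
};

// Same catalog entry point as the simple overload, but configured through Options.
var trainer = mlContext.BinaryClassification.Trainers.FastTree(options);
```

The simple overloads shown under Signature need only using Microsoft.ML; the Options route is useful when a setting is not exposed as a method parameter.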

I/O Contract

Inputs

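Name | Type | Required | Description
labelColumnName | string | No | Name of the boolean label column. Default: "Label".
featureColumnName | string | No | Name of the feature vector column. Default: "Features".
exampleWeightColumnName | string | No | Name of the example weight column. Default: null (uniform weights).
numberOfLeaves | int (FastTree), int? (LightGBM) | No | Maximum number of leaves per tree. Default: 20 (FastTree), null (LightGBM; the trainer picks a value).
numberOfTrees / numberOfIterations | int | No | Number of boosting rounds: numberOfTrees for FastTree, numberOfIterations for LightGBM. Default: 100.
learningRate | double (FastTree), double? (LightGBM) | No | Step size for boosting. Default: 0.2 (FastTree), null (LightGBM; the trainer picks a value).
minimumExampleCountPerLeaf | int (FastTree), int? (LightGBM) | No | Minimum training examples required per leaf. Default: 10 (FastTree), null (LightGBM; the trainer picks a value).
l2Regularization | float? | No | L2 regularization weight for SDCA. Default: null (inferred from the data set).
l1Regularization | float? | No | L1 regularization weight for SDCA. Default: null (inferred from the data set).
maximumNumberOfIterations | int? | No | Maximum number of passes over the training data for SDCA. Default: null (the trainer picks a value).
data (Fit) | IDataView | Yes | Training data containing the label and feature columns.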

Outputs

Name | Type | Description
(trainer return) | IEstimator<ITransformer> | Estimator that can be appended to a pipeline and fitted.
(Fit return) | ITransformer | Trained model that produces PredictedLabel (bool), Score (float), and Probability (float) columns.
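
To make the output columns concrete, here is a minimal sketch that reads PredictedLabel, Score, and Probability for a single row through a prediction engine. The Point and Prediction classes and the toy data are hypothetical and exist only for this illustration; SdcaLogisticRegression is used because it needs no package beyond Microsoft.ML.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

var mlContext = new MLContext(seed: 1);

// Hypothetical toy data: the label is true when X is above 0.5.
var rows = Enumerable.Range(0, 200)
    .Select(i => new Point { X = i / 200f, Label = i / 200f > 0.5f })
    .ToList();

var pipeline = mlContext.Transforms
    .Concatenate("Features", nameof(Point.X))
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());

var model = pipeline.Fit(mlContext.Data.LoadFromEnumerable(rows));

// The prediction engine maps output columns to Prediction properties by name.
var engine = mlContext.Model.CreatePredictionEngine<Point, Prediction>(model);
var result = engine.Predict(new Point { X = 0.9f });

Console.WriteLine(
    $"PredictedLabel={result.PredictedLabel}, Score={result.Score}, Probability={result.Probability}");

// Hypothetical input row type.
public class Point
{
    public float X { get; set; }
    public bool Label { get; set; }
}

// Hypothetical output type; property names match the model's output columns.
public class Prediction
{
    public bool PredictedLabel { get; set; }
    public float Score { get; set; }
    public float Probability { get; set; }
}
```

Score is the raw model output, while Probability is the calibrated value in [0,1].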

Usage Examples

Basic Example

using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 42);

// Load the CSV and hold out 20% of the rows for testing.
var data = mlContext.Data.LoadFromTextFile<ModelInput>(
    "data.csv", separatorChar: ',', hasHeader: true);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

// Option 1: FastTree (requires the Microsoft.ML.FastTree package)
var pipelineFastTree = mlContext.Transforms
    .Concatenate("Features", "Feature1", "Feature2", "Feature3")
    .Append(mlContext.BinaryClassification.Trainers.FastTree(
        numberOfLeaves: 30,
        numberOfTrees: 200,
        learningRate: 0.1));

var modelFastTree = pipelineFastTree.Fit(split.TrainSet);

// Option 2: LightGBM (requires the Microsoft.ML.LightGbm package)
var pipelineLgbm = mlContext.Transforms
    .Concatenate("Features", "Feature1", "Feature2", "Feature3")
    .Append(mlContext.BinaryClassification.Trainers.LightGbm(
        numberOfLeaves: 31,
        learningRate: 0.05,
        numberOfIterations: 300));

var modelLgbm = pipelineLgbm.Fit(split.TrainSet);

// Option 3: SDCA logistic regression (included in Microsoft.ML)
var pipelineSdca = mlContext.Transforms
    .Concatenate("Features", "Feature1", "Feature2", "Feature3")
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
        maximumNumberOfIterations: 100));

var modelSdca = pipelineSdca.Fit(split.TrainSet);

// In a single-file program with top-level statements, type declarations
// must come after the statements.
public class ModelInput
{
    [LoadColumn(0)] public float Feature1 { get; set; }
    [LoadColumn(1)] public float Feature2 { get; set; }
    [LoadColumn(2)] public float Feature3 { get; set; }
    [LoadColumn(3), ColumnName("Label")] public bool Label { get; set; }
}
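
The basic example holds out a test set but stops at Fit. The following sketch shows one way to score held-out rows and compute metrics with mlContext.BinaryClassification.Evaluate. To stay self-contained it uses synthetic in-memory data instead of data.csv; the DemoRow class and the labeling rule are hypothetical and not part of the original example.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

var mlContext = new MLContext(seed: 42);

// Hypothetical synthetic data: the label is true when Feature1 + Feature2 > 1.
var random = new Random(0);
var rows = Enumerable.Range(0, 500).Select(_ =>
{
    var f1 = (float)random.NextDouble();
    var f2 = (float)random.NextDouble();
    return new DemoRow { Feature1 = f1, Feature2 = f2, Label = f1 + f2 > 1f };
}).ToList();

var data = mlContext.Data.LoadFromEnumerable(rows);
var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

var pipeline = mlContext.Transforms
    .Concatenate("Features", nameof(DemoRow.Feature1), nameof(DemoRow.Feature2))
    .Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression());

var model = pipeline.Fit(split.TrainSet);

// Score the held-out rows, then compute calibrated binary classification metrics.
var predictions = model.Transform(split.TestSet);
var metrics = mlContext.BinaryClassification.Evaluate(predictions);

Console.WriteLine($"Accuracy: {metrics.Accuracy:F3}");
Console.WriteLine($"AUC:      {metrics.AreaUnderRocCurve:F3}");
Console.WriteLine($"F1:       {metrics.F1Score:F3}");
Console.WriteLine($"LogLoss:  {metrics.LogLoss:F3}");

// Hypothetical row type for the synthetic data.
public class DemoRow
{
    public float Feature1 { get; set; }
    public float Feature2 { get; set; }
    public bool Label { get; set; }
}
```

Evaluate reads the Label, Score, Probability, and PredictedLabel columns by their default names, so the same two lines apply to the FastTree and LightGBM models from the basic example.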
