
Principle:Dotnet Machinelearning Experiment Configuration

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, AutoML
Last Updated 2026-02-09 00:00 GMT

Overview

An AutoML experiment is configured through a fluent builder pattern that specifies the search constraints. The experiment object encapsulates the entire AutoML loop, including trial management, metric evaluation, and early stopping.

Description

An AutoML experiment is the orchestrating object that drives the full model selection and tuning process. Before execution, it must be configured with several essential parameters:

  • Training time budget: The maximum wall-clock time the experiment is allowed to run. This serves as the primary stopping criterion. Longer budgets allow more trials and typically yield better results, but with diminishing returns.
  • Optimization metric: The evaluation criterion that the experiment seeks to maximize (or minimize). For binary classification, common metrics include Accuracy, AUC (Area Under the ROC Curve), F1Score, and PositivePrecision. The metric choice directly influences which model the experiment considers "best."
  • Dataset: The training and validation data splits. The experiment uses the training set to fit models and the validation set to evaluate them. An optional sub-sampling flag can reduce training data size for faster iteration during early trials.
  • Sweepable pipeline: The pipeline search space defining which algorithms and hyperparameter ranges to explore.
  • Tuner algorithm: The search strategy (Bayesian optimization, grid search, random search, or a custom tuner) that proposes new hyperparameter configurations based on past trial results.

The fluent builder pattern allows each configuration method to return the experiment object itself, enabling method chaining. This produces readable, declarative configuration code where the intent is immediately clear from the call chain.

Usage

Configure an AutoML experiment after constructing the sweepable pipeline and loading the dataset. Use the fluent API to set the time budget, metric, data, and pipeline in a single chained expression. Adjust the time budget based on dataset size: small datasets (under 10,000 rows) may converge in 30-60 seconds, while large datasets may require minutes or hours. Always specify a validation dataset to avoid overfitting during the search.
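As a sketch, the configuration above can be expressed with ML.NET's AutoML API (the Microsoft.ML.AutoML package). The file path, column layout, and the ModelInput schema below are placeholders, and exact method names may vary between package versions:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

// Placeholder input schema: adjust columns to match the actual dataset.
public class ModelInput
{
    [LoadColumn(0)] public bool Label { get; set; }
    [LoadColumn(1, 10)] [VectorType(10)] public float[] Features { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Dataset: split into training and validation sets; the validation
        // set guards against overfitting during the search.
        IDataView data = mlContext.Data.LoadFromTextFile<ModelInput>(
            "data.csv", hasHeader: true, separatorChar: ',');
        var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

        // Sweepable pipeline: featurization plus candidate binary classifiers.
        SweepablePipeline pipeline = mlContext.Auto()
            .Featurizer(data, labelColumnName: "Label")
            .Append(mlContext.Auto().BinaryClassification(labelColumnName: "Label"));

        // Fluent configuration: pipeline, metric, time budget, and data in one chain.
        AutoMLExperiment experiment = mlContext.Auto().CreateExperiment();
        experiment
            .SetPipeline(pipeline)
            .SetBinaryClassificationMetric(BinaryClassificationMetric.Accuracy,
                                           labelColumn: "Label")
            .SetTrainingTimeInSeconds(60)
            .SetDataset(split.TrainSet, split.TestSet);

        TrialResult result = await experiment.RunAsync();
        Console.WriteLine($"Best metric: {result.Metric}");
    }
}
```

A tuner is not set explicitly here; when omitted, the experiment falls back to the library's default search strategy.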

Theoretical Basis

The experiment configuration defines the optimization problem that the AutoML loop solves:

Maximize:  metric(model, D_validation)
Subject to:
  - model in PipelineSearchSpace(sweepable_pipeline)
  - wall_clock_time <= time_budget
  - tuner_strategy in {Bayesian, GridSearch, Random, Custom}

The fluent builder pattern is a software design technique where each setter method returns this, enabling call chaining:

experiment
  .SetBudget(seconds)
  .SetMetric(metric)
  .SetData(train, validation)
  .SetPipeline(pipeline)

This is equivalent to setting each property independently but produces more readable and less error-prone code. The builder accumulates state incrementally and validates completeness only at execution time, deferring errors to the point where all required parameters should be present.
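The return-this convention and the deferred validation can be sketched with a minimal builder. The names here are illustrative, not the ML.NET API:

```csharp
using System;

// Minimal sketch of the fluent builder pattern: each setter mutates state
// and returns `this`; completeness is validated only when Run() is called.
public class ExperimentBuilder
{
    private int? _budgetSeconds;
    private string? _metric;

    public ExperimentBuilder SetBudget(int seconds)
    {
        _budgetSeconds = seconds;
        return this; // returning `this` is what enables chaining
    }

    public ExperimentBuilder SetMetric(string metric)
    {
        _metric = metric;
        return this;
    }

    public void Run()
    {
        // Deferred validation: required parameters are checked here, not in the setters.
        if (_budgetSeconds is null || _metric is null)
            throw new InvalidOperationException("Experiment is not fully configured.");
        Console.WriteLine($"Running for {_budgetSeconds}s, optimizing {_metric}.");
    }
}
```

Usage mirrors the chain above: new ExperimentBuilder().SetBudget(60).SetMetric("Accuracy").Run();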

The metric determines the loss function that the tuner minimizes. Internally, metrics are often converted to a loss value (typically 1 - metric for maximization metrics) so that the tuner always minimizes:

loss = 1 - metric_value    (for metrics where higher is better)
loss = metric_value        (for metrics where lower is better)
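The conversion amounts to a one-line helper (illustrative, not part of any library):

```csharp
using System;

// Convert a metric value into a loss the tuner can always minimize.
static double ToLoss(double metricValue, bool higherIsBetter) =>
    higherIsBetter ? 1.0 - metricValue : metricValue;

Console.WriteLine(ToLoss(0.93, higherIsBetter: true));  // accuracy 0.93 -> loss ~ 0.07
Console.WriteLine(ToLoss(0.42, higherIsBetter: false)); // RMSE 0.42 -> loss stays 0.42
```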

Related Pages

Implemented By
